On Mon, Oct 23, 2006 at 11:41:18AM -0400, Pete Wyckoff wrote:
> That infamous noncontig test fails now that I have upgraded
> mpich2 to 1.0.4p1.  Is the pvfs2 ROMIO in there up to date?

Sadly, I found a bunch of bugs about a day after 1.0.4p1 hit the FTP
site.  You'll need the attached patch until the next release.  It
fixes the noncontig test and several other tests (all of which are
now in the nightly builds, so this won't happen again).

Apply this patch in the src/mpi/romio directory. 

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
---------------------
PatchSet 1115 
Date: 2006/08/25 16:54:13
Author: robl
Branch: MPICH2_1_0_4p0
Tag: (none) 
Log:
from HEAD: inspect file type, not memory type

Members: 
        adio/ad_pvfs2/ad_pvfs2_read.c:1.21.2.2->1.21.2.3 
        adio/ad_pvfs2/ad_pvfs2_write.c:1.23.2.2->1.23.2.3 

Index: romio/adio/ad_pvfs2/ad_pvfs2_read.c
diff -u romio/adio/ad_pvfs2/ad_pvfs2_read.c:1.21.2.2 romio/adio/ad_pvfs2/ad_pvfs2_read.c:1.21.2.3
--- romio/adio/ad_pvfs2/ad_pvfs2_read.c:1.21.2.2        Wed Aug  2 16:14:01 2006
+++ romio/adio/ad_pvfs2/ad_pvfs2_read.c Fri Aug 25 10:54:13 2006
@@ -139,7 +139,7 @@
      * are actually contiguous and do not need the expensive workarond */
     if (!filetype_is_contig) {
        flat_file = ADIOI_Flatlist;
-       while (flat_buf->type != fd->filetype) flat_file = flat_file->next;
+       while (flat_file->type != fd->filetype) flat_file = flat_file->next;
        if (flat_file->count == 1)
            filetype_is_contig = 1;
     }
Index: romio/adio/ad_pvfs2/ad_pvfs2_write.c
diff -u romio/adio/ad_pvfs2/ad_pvfs2_write.c:1.23.2.2 romio/adio/ad_pvfs2/ad_pvfs2_write.c:1.23.2.3
--- romio/adio/ad_pvfs2/ad_pvfs2_write.c:1.23.2.2       Wed Aug  2 16:14:04 2006
+++ romio/adio/ad_pvfs2/ad_pvfs2_write.c        Fri Aug 25 10:54:13 2006
@@ -160,7 +160,7 @@
      * are actually contiguous and do not need the expensive workarond */
     if (!filetype_is_contig) {
        flat_file = ADIOI_Flatlist;
-       while (flat_buf->type != fd->filetype) flat_file = flat_file->next;
+       while (flat_file->type != fd->filetype) flat_file = flat_file->next;
        if (flat_file->count == 1)
            filetype_is_contig = 1;
     }
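For anyone reading PatchSet 1115 without the source handy, here is a minimal standalone sketch of the lookup it corrects: walk a linked list until the node for the file's datatype turns up, then treat the type as contiguous if that node holds a single block. The loop condition has to test the same cursor the loop advances, which is what the one-line change restores. The struct and function names are hypothetical stand-ins, not ROMIO's definitions, and the NULL guard is an addition of this sketch that the original loop does not have.

```c
#include <stddef.h>

/* Hypothetical stand-in for a flattened-datatype list node;
 * the real ROMIO structure has more fields. */
struct flat_node {
    int type;                /* stand-in for an MPI_Datatype handle */
    int count;               /* number of contiguous blocks */
    struct flat_node *next;
};

/* Walk the list until the node describing `wanted` is found.
 * The condition inspects `cur`, the cursor that is advanced.
 * Returns NULL when no node matches (a guard added for this sketch). */
const struct flat_node *find_flat(const struct flat_node *head, int wanted)
{
    const struct flat_node *cur = head;
    while (cur != NULL && cur->type != wanted)
        cur = cur->next;
    return cur;
}

/* A type whose flattened form is a single block can take the
 * contiguous path. Returns 1 for that case, 0 otherwise. */
int is_effectively_contig(const struct flat_node *head, int filetype)
{
    const struct flat_node *node = find_flat(head, filetype);
    return node != NULL && node->count == 1;
}
```

Calling is_effectively_contig(list_head, filetype) on a small hand-built list is enough to see the behaviour.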
---------------------
PatchSet 1116 
Date: 2006/08/25 16:57:21
Author: robl
Branch: MPICH2_1_0_4p0
Tag: (none) 
Log:
from HEAD: fixing "obviously broken" code actually introduced a bug.

Members: 
        adio/ad_pvfs/ad_pvfs_read.c:1.17->1.17.2.1 
        adio/ad_pvfs/ad_pvfs_write.c:1.19->1.19.2.1 
        adio/ad_pvfs2/ad_pvfs2_read.c:1.21.2.3->1.21.2.4 
        adio/ad_pvfs2/ad_pvfs2_write.c:1.23.2.3->1.23.2.4 

Index: romio/adio/ad_pvfs/ad_pvfs_read.c
diff -u romio/adio/ad_pvfs/ad_pvfs_read.c:1.17 romio/adio/ad_pvfs/ad_pvfs_read.c:1.17.2.1
--- romio/adio/ad_pvfs/ad_pvfs_read.c:1.17      Fri Jun  9 11:42:44 2006
+++ romio/adio/ad_pvfs/ad_pvfs_read.c   Fri Aug 25 10:57:21 2006
@@ -541,8 +541,6 @@
                max_mem_list = mem_list_count;
            if (max_file_list < file_list_count)
                max_file_list = file_list_count;
-           if (max_mem_list == MAX_ARRAY_SIZE)
-               break;
        } /* while (size_read < bufsize) */
 
        mem_offsets = (char **)ADIOI_Malloc(max_mem_list*sizeof(char *));
Index: romio/adio/ad_pvfs/ad_pvfs_write.c
diff -u romio/adio/ad_pvfs/ad_pvfs_write.c:1.19 romio/adio/ad_pvfs/ad_pvfs_write.c:1.19.2.1
--- romio/adio/ad_pvfs/ad_pvfs_write.c:1.19     Fri Jun  9 11:42:44 2006
+++ romio/adio/ad_pvfs/ad_pvfs_write.c  Fri Aug 25 10:57:25 2006
@@ -881,8 +881,6 @@
                max_mem_list = mem_list_count;
            if (max_file_list < file_list_count)
                max_file_list = file_list_count;
-           if (max_mem_list == MAX_ARRAY_SIZE)
-               break;
        } /* while (size_wrote < bufsize) */
 
        mem_offsets = (char **)ADIOI_Malloc(max_mem_list*sizeof(char *));
Index: romio/adio/ad_pvfs2/ad_pvfs2_read.c
diff -u romio/adio/ad_pvfs2/ad_pvfs2_read.c:1.21.2.3 romio/adio/ad_pvfs2/ad_pvfs2_read.c:1.21.2.4
--- romio/adio/ad_pvfs2/ad_pvfs2_read.c:1.21.2.3        Fri Aug 25 10:54:13 2006
+++ romio/adio/ad_pvfs2/ad_pvfs2_read.c Fri Aug 25 10:57:29 2006
@@ -675,8 +675,6 @@
                max_mem_list = mem_list_count;
            if (max_file_list < file_list_count)
                max_file_list = file_list_count;
-           if (max_mem_list == MAX_ARRAY_SIZE)
-               break;
        } /* while (size_read < bufsize) */
 
        /* one last check before we actually carry out the operation:
Index: romio/adio/ad_pvfs2/ad_pvfs2_write.c
diff -u romio/adio/ad_pvfs2/ad_pvfs2_write.c:1.23.2.3 romio/adio/ad_pvfs2/ad_pvfs2_write.c:1.23.2.4
--- romio/adio/ad_pvfs2/ad_pvfs2_write.c:1.23.2.3       Fri Aug 25 10:54:13 2006
+++ romio/adio/ad_pvfs2/ad_pvfs2_write.c        Fri Aug 25 10:57:30 2006
@@ -727,8 +727,6 @@
                max_mem_list = mem_list_count;
            if (max_file_list < file_list_count)
                max_file_list = file_list_count;
-           if (max_mem_list == MAX_ARRAY_SIZE)
-               break;
        } /* while (size_wrote < bufsize) */
 
        /* one last check before we actually carry out the operation:
---------------------
PatchSet 1117 
Date: 2006/08/25 16:58:37
Author: robl
Branch: MPICH2_1_0_4p0
Tag: (none) 
Log:
from HEAD: null out a pointer to fix a bug in EXCL case

Members: 
        adio/ad_pvfs2/ad_pvfs2_open.c:1.26->1.26.2.1 

Index: romio/adio/ad_pvfs2/ad_pvfs2_open.c
diff -u romio/adio/ad_pvfs2/ad_pvfs2_open.c:1.26 romio/adio/ad_pvfs2/ad_pvfs2_open.c:1.26.2.1
--- romio/adio/ad_pvfs2/ad_pvfs2_open.c:1.26    Mon Jun 12 11:06:33 2006
+++ romio/adio/ad_pvfs2/ad_pvfs2_open.c Fri Aug 25 10:58:37 2006
@@ -209,6 +209,7 @@
     if (o_status.error != 0)
     { 
        ADIOI_Free(pvfs2_fs);
+       fd->fs_ptr = NULL;
        *error_code = MPIO_Err_create_code(MPI_SUCCESS,
                                           MPIR_ERR_RECOVERABLE,
                                           myname, __LINE__,
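As a rough illustration of the idea behind PatchSet 1117: once the filesystem-specific state has been freed on an error path, clearing the field that pointed at it means any later code sees NULL instead of a stale address. The sketch below assumes a later cleanup step might look at the pointer again; the patch log does not spell out the exact failure, and the struct and function names here are hypothetical, with plain free() standing in for ADIOI_Free.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the file descriptor structure;
 * only the one field relevant to the sketch is shown. */
struct file_handle {
    void *fs_ptr;   /* filesystem-specific state, owned by the handle */
};

/* Release the filesystem-specific state and clear the field, so a
 * later check or free sees NULL rather than a dangling pointer. */
void release_fs_state(struct file_handle *fd)
{
    free(fd->fs_ptr);      /* free(NULL) is a no-op, so repeat calls are safe */
    fd->fs_ptr = NULL;
}
```

With the field cleared, calling release_fs_state() a second time on the same handle is harmless, which would not be true if the stale pointer were left in place.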
---------------------
PatchSet 1118 
Date: 2006/08/25 17:02:47
Author: robl
Branch: MPICH2_1_0_4p0
Tag: (none) 
Log:
from HEAD: improved error reporting

Members: 
        test/noncontig_coll2.c:1.12->1.12.8.1 

Index: romio/test/noncontig_coll2.c
diff -u romio/test/noncontig_coll2.c:1.12 romio/test/noncontig_coll2.c:1.12.8.1
--- romio/test/noncontig_coll2.c:1.12   Fri Jul 22 18:08:31 2005
+++ romio/test/noncontig_coll2.c        Fri Aug 25 11:02:47 2006
@@ -472,13 +472,19 @@
     MPI_File_set_view(fh, 0, MPI_INT, newtype, "native", info);
 
     for (i=0; i<SIZE; i++) buf[i] = SEEDER(mynod,i,SIZE);
-    MPI_File_write_all(fh, buf, 1, newtype, &status);
+    errcode = MPI_File_write_all(fh, buf, 1, newtype, &status);
+    if (errcode != MPI_SUCCESS) {
+           handle_error(errcode, "nc mem - nc file: MPI_File_write_all");
+    }
 
     MPI_Barrier(MPI_COMM_WORLD);
 
     for (i=0; i<SIZE; i++) buf[i] = -1;
 
-    MPI_File_read_at_all(fh, 0, buf, 1, newtype, &status);
+    errcode = MPI_File_read_at_all(fh, 0, buf, 1, newtype, &status);
+    if (errcode != MPI_SUCCESS) {
+           handle_error(errcode, "nc mem - nc file: MPI_File_read_at_all");
+    }
 
     /* the verification for N compute nodes is tricky. Say we have 3
      * processors.  
@@ -523,13 +529,19 @@
                   info, &fh);
 
     for (i=0; i<SIZE; i++) buf[i] = SEEDER(mynod,i,SIZE);
-    MPI_File_write_at_all(fh, mynod*(SIZE/nprocs)*sizeof(int), buf, 1, newtype, &status);
+    errcode = MPI_File_write_at_all(fh, mynod*(SIZE/nprocs)*sizeof(int), 
+                   buf, 1, newtype, &status);
+    if (errcode != MPI_SUCCESS)
+           handle_error(errcode, "nc mem - c file: MPI_File_write_at_all");
 
     MPI_Barrier(MPI_COMM_WORLD);
 
     for (i=0; i<SIZE; i++) buf[i] = -1;
 
-    MPI_File_read_at_all(fh, mynod*(SIZE/nprocs)*sizeof(int), buf, 1, newtype, &status);
+    errcode = MPI_File_read_at_all(fh, mynod*(SIZE/nprocs)*sizeof(int), 
+                   buf, 1, newtype, &status);
+    if (errcode != MPI_SUCCESS)
+           handle_error(errcode, "nc mem - c file: MPI_File_read_at_all");
 
     /* just like as above */
     for (i=0; i<mynod; i++ ) {
@@ -567,13 +579,17 @@
     MPI_File_set_view(fh, 0, MPI_INT, newtype, "native", info);
 
     for (i=0; i<SIZE; i++) buf[i] = SEEDER(mynod, i, SIZE);
-    MPI_File_write_all(fh, buf, SIZE, MPI_INT, &status);
+    errcode = MPI_File_write_all(fh, buf, SIZE, MPI_INT, &status);
+    if (errcode != MPI_SUCCESS)
+           handle_error(errcode, "c mem - nc file: MPI_File_write_all");
 
     MPI_Barrier(MPI_COMM_WORLD);
 
     for (i=0; i<SIZE; i++) buf[i] = -1;
 
-    MPI_File_read_at_all(fh, 0, buf, SIZE, MPI_INT, &status);
+    errcode = MPI_File_read_at_all(fh, 0, buf, SIZE, MPI_INT, &status);
+    if (errcode != MPI_SUCCESS)
+           handle_error(errcode, "c mem - nc file: MPI_File_read_at_all");
 
     /* same crazy checking */
     for (i=0; i<SIZE; i++) {
---------------------
PatchSet 1119 
Date: 2006/08/25 17:03:30
Author: robl
Branch: MPICH2_1_0_4p0
Tag: (none) 
Log:
from HEAD: need to disable ncache for some workloads

Members: 
        adio/ad_pvfs2/ad_pvfs2_common.c:1.16->1.16.2.1 

Index: romio/adio/ad_pvfs2/ad_pvfs2_common.c
diff -u romio/adio/ad_pvfs2/ad_pvfs2_common.c:1.16 romio/adio/ad_pvfs2/ad_pvfs2_common.c:1.16.2.1
--- romio/adio/ad_pvfs2/ad_pvfs2_common.c:1.16  Mon Jun 12 11:06:33 2006
+++ romio/adio/ad_pvfs2/ad_pvfs2_common.c       Fri Aug 25 11:03:30 2006
@@ -48,6 +48,7 @@
 {
     int ret;
     static char myname[] = "ADIOI_PVFS2_INIT";
+    char * ncache_timeout;
 
     /* do nothing if we've already fired up the pvfs2 interface */
     if (ADIOI_PVFS2_Initialized != MPI_KEYVAL_INVALID) {
@@ -55,6 +56,13 @@
        return;
     }
 
+    /* for consistency, we should disable the pvfs2 ncache.  If the
+     * environtment variable is already set, assume a  user knows it
+     * won't be a problem */
+    ncache_timeout = getenv("PVFS2_NCACHE_TIMEOUT");
+    if (ncache_timeout == NULL )
+       setenv("PVFS2_NCACHE_TIMEOUT", "0", 1);
+
     ret = PVFS_util_init_defaults();
     if (ret < 0 ) {
        *error_code = MPIO_Err_create_code(MPI_SUCCESS,
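The PVFS2_NCACHE_TIMEOUT hunk in PatchSet 1119 follows a common pattern: supply a default for an environment variable, but leave it alone if the user already set it. A minimal sketch of that pattern in isolation, using only POSIX getenv() and setenv(); the helper name set_env_default is made up for illustration and is not part of ROMIO or PVFS2.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict C modes */
#include <stdlib.h>

/* Give `name` the value `value` only when it is not already present
 * in the environment. Returns 0 on success (including the case where
 * the variable was already set) and -1 if setenv() fails. */
int set_env_default(const char *name, const char *value)
{
    if (getenv(name) != NULL)
        return 0;                    /* respect the user's setting */
    return setenv(name, value, 1);
}
```

Something like set_env_default("PVFS2_NCACHE_TIMEOUT", "0") would be called before the library initializes. Passing 0 as the third argument to setenv() gives the same "do not overwrite" behaviour in a single call; the explicit getenv() check mirrors how the patch is written.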