Hi all,

I'm hitting a brick wall debugging the linkshare segfaults I'm seeing.

These logs are from my 2-way x86_64 box, but I'm seeing similar issues on
a G5 (ppc64):

HUGETLB_SHARE=2 xB.linkshare 2 (32):    PASS
HUGETLB_SHARE=2 xB.linkshare 2 (64):    PASS
HUGETLB_SHARE=1 xB.linkshare 2 (32):    FAIL    2 of 2 children exited abnormally
HUGETLB_SHARE=1 xB.linkshare 2 (64):    FAIL    2 of 2 children exited abnormally
HUGETLB_SHARE=2 xBDT.linkshare 2 (32):  PASS
HUGETLB_SHARE=2 xBDT.linkshare 2 (64):  PASS
HUGETLB_SHARE=1 xBDT.linkshare 2 (32):  FAIL    2 of 2 children exited abnormally
HUGETLB_SHARE=1 xBDT.linkshare 2 (64):  FAIL    2 of 2 children exited abnormally

All 4 failures are segmentation faults that we caught.

I'm including below four outputs from my testing, as well as the current
patch I'm using.

[1] the patch I'm using, which is applied on top of the patch I sent to
the list earlier to rm -rf files in /mnt/hugetlbfs between runs of
elfshare_test.

[2] the output of `make func` from the libhuge testsuite. We get 3
PASSes and 5 FAILs.

[3] the dmesg output from the `make func` run, with a dmesg -c before.
For some reason we're not catching some segfaults?

[4] the output of `make funcv` from the libhuge testsuite (which should
only change the verbosity of the tests). Note that we now get 2 PASSes
and 6 FAILs.

[5] the dmesg output from the `make funcv` run, with a dmesg -c before.
We're again not catching all the segfaults.

Three problems, then. One, obviously, is why linkshare is segfaulting.
Two is why func and funcv differ. And three is why we are not catching
all the segfaults with the SIGSEGV handler.

Thanks,
Nish

[1]

diff --git a/tests/linkshare.c b/tests/linkshare.c
index 3461a7d..308fd75 100644
--- a/tests/linkshare.c
+++ b/tests/linkshare.c
@@ -16,6 +16,7 @@
  * License along with this library; if not, write to the Free Software
  * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
  */
+#define _GNU_SOURCE
 
 #include <stdio.h>
 #include <stdlib.h>
@@ -24,6 +25,8 @@
 #include <time.h>
 #include <errno.h>
 #include <limits.h>
+#include <signal.h>
+#include <string.h>
 #include <sys/types.h>
 #include <sys/mman.h>
 #include <sys/shm.h>
@@ -134,15 +137,35 @@ static ino_t do_test(struct test_entry *
        return get_addr_inode(te->data);
 }
 
+void signal_handler(int signum, siginfo_t *si, void *context)
+{
+       verbose_printf("Process %d got a Segmentation Fault at address %p\n",
+                                               getpid(), si->si_addr);
+       exit(RC_FAIL);
+}
+
+static struct sigaction sa = {
+       .sa_sigaction = signal_handler,
+       .sa_flags = SA_SIGINFO,
+};
+
 int main(int argc, char *argv[], char *envp[])
 {
        int i;
        int shmid;
        ino_t *shm;
        int num_sharings;
+       int status;
+       int child_failed = 0;
+       int ret;
 
        test_init(argc, argv);
 
+       ret = sigaction(SIGSEGV, &sa, NULL);
+       if (ret < 0)
+               FAIL("Installing SIGSEGV handler failed: %s",
+                                               strerror(errno));
+
        if (argc == 2) {
                /*
                 * first process
@@ -150,7 +173,7 @@ int main(int argc, char *argv[], char *e
                 */
                char *env;
                pid_t *children;
-               int ret, j;
+               int j;
                /* both default to 0 */
                int sharing = 0, elfmap_off = 0;
 
@@ -216,12 +239,28 @@ int main(int argc, char *argv[], char *e
                        }
                }
                for (i = 0; i < num_sharings; i++) {
-                       ret = waitpid(children[i], NULL, 0);
+                       ret = waitpid(children[i], &status, 0);
                        if (ret < 0) {
                                shmctl(shmid, IPC_RMID, NULL);
                                shmdt(shm);
                                FAIL("waitpid failed: %s", strerror(errno));
                        }
+                       if (WIFEXITED(status) && WEXITSTATUS(status) != 0) {
+                               child_failed++;
+                               verbose_printf("Child %d exited with non-zero status: %d\n",
+                                               i + 1, WEXITSTATUS(status));
+                       }
+                       if (WIFSIGNALED(status)) {
+                               child_failed++;
+                               verbose_printf("Child %d killed by signal: %s\n", i + 1,
+                                               strsignal(WTERMSIG(status)));
+                       }
+               }
+               if (child_failed) {
+                       shmctl(shmid, IPC_RMID, NULL);
+                       shmdt(shm);
+                       FAIL("%d of %d children exited abnormally",
+                                       child_failed, num_sharings);
                }
                for (i = 0; i < NUM_TESTS; i++) {
                        ino_t base = shm[i];

[2]

HUGETLB_SHARE=2 xB.linkshare 2 (32):    PASS
HUGETLB_SHARE=2 xB.linkshare 2 (64):    PASS
HUGETLB_SHARE=1 xB.linkshare 2 (32):    FAIL    2 of 2 children exited abnormally
HUGETLB_SHARE=1 xB.linkshare 2 (64):    FAIL    2 of 2 children exited abnormally
HUGETLB_SHARE=2 xBDT.linkshare 2 (32):  PASS
HUGETLB_SHARE=2 xBDT.linkshare 2 (64):  FAIL    2 of 2 children exited abnormally
HUGETLB_SHARE=1 xBDT.linkshare 2 (32):  FAIL    2 of 2 children exited abnormally
HUGETLB_SHARE=1 xBDT.linkshare 2 (64):  FAIL    2 of 2 children exited abnormally

[3]

[92721.340866] xBDT.linkshare[23592]: segfault at 00002b0f78891b40 rip 00002b0f78891b40 rsp 00007fff5db771f8 error 14
[92721.342742] xBDT.linkshare[23593]: segfault at 00002b0f78891b40 rip 00002b0f78891b40 rsp 00007fffdad02388 error 14

[4]

HUGETLB_SHARE=2 xB.linkshare 2 (32):
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x9000000-0x9010048  (filesz=0) (prot = 0x7)
libhugetlbfs: Got unpopulated shared fd -- Preparing
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0 bytes from 0x9000000...
...done
libhugetlbfs: Copying extra 0x8 bytes from 0x9000000...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xB.linkshare", pid 23827
Segment remapping enabled, sharing = 2
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x9000000-0x9010048  (filesz=0) (prot = 0x7)
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x9000000-0x9010048  (filesz=0) (prot = 0x7)
Starting testcase "xB.linkshare", pid 23828
Starting testcase "xB.linkshare", pid 23829
PASS

HUGETLB_SHARE=2 xB.linkshare 2 (64):
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x1000000-0x1010050  (filesz=0) (prot = 0x7)
libhugetlbfs: Got unpopulated shared fd -- Preparing
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0 bytes from 0x1000000...
...done
libhugetlbfs: Copying extra 0x10 bytes from 0x1000000...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xB.linkshare", pid 23836
Segment remapping enabled, sharing = 2
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x1000000-0x1010050  (filesz=0) (prot = 0x7)
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x1000000-0x1010050  (filesz=0) (prot = 0x7)
Child 1 killed by signal: Segmentation fault
Child 2 killed by signal: Segmentation fault
FAIL    2 of 2 children exited abnormally

HUGETLB_SHARE=1 xB.linkshare 2 (32):
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x9000000-0x9010048  (filesz=0) (prot = 0x7)
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0 bytes from 0x9000000...
...done
libhugetlbfs: Copying extra 0x8 bytes from 0x9000000...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xB.linkshare", pid 23857
Segment remapping enabled, sharing = 1
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr libhugetlbfs: Hugepage segment 0 (phdr 4): 0x9000000-0x9010048  (filesz=0) (prot = 0x7)
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0 bytes from 0x9000000...
...done
libhugetlbfs: Copying extra 0x8 bytes from 0x9000000...
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0 bytes from 0x9000000...
...done
libhugetlbfs: Copying extra 0x8 bytes from 0x9000000...
...done
libhugetlbfs: Prepare succeeded
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xB.linkshare", pid 23858
Process 23858 got a Segmentation Fault at address 0x5556f008
Starting testcase "xB.linkshare", pid 23859
Process 23859 got a Segmentation Fault at address 0x5556f028
Child 1 exited with non-zero status: 2
Child 2 exited with non-zero status: 2
FAIL    2 of 2 children exited abnormally

HUGETLB_SHARE=1 xB.linkshare 2 (64):
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x1000000-0x1010050  (filesz=0) (prot = 0x7)
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0 bytes from 0x1000000...
...done
libhugetlbfs: Copying extra 0x10 bytes from 0x1000000...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xB.linkshare", pid 23872
Segment remapping enabled, sharing = 1
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x1000000-0x1010050  (filesz=0) (prot = 0x7)
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 4): 0x1000000-0x1010050  (filesz=0) (prot = 0x7)
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0 bytes from 0x1000000...
...done
libhugetlbfs: Copying extra 0x10 bytes from 0x1000000...
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0 bytes from 0x1000000...
...done
libhugetlbfs: Copying extra 0x10 bytes from 0x1000000...
...done
libhugetlbfs: Prepare succeeded
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xB.linkshare", pid 23873
Process 23873 got a Segmentation Fault at address 0x2abb87143010
Child 1 exited with non-zero status: 2
Starting testcase "xB.linkshare", pid 23874
Process 23874 got a Segmentation Fault at address 0x2b03ebc73050
Child 2 exited with non-zero status: 2
FAIL    2 of 2 children exited abnormally

HUGETLB_SHARE=2 xBDT.linkshare 2 (32):
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x8000000-0x8012198  (filesz=0x12198) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x9000000-0x90203a8  (filesz=0x10350) (prot = 0x7)
libhugetlbfs: Got unpopulated shared fd -- Preparing
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0x12198 bytes from 0x8000000...
...done
libhugetlbfs: Prepare succeeded
libhugetlbfs: Got unpopulated shared fd -- Preparing
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0x10350 bytes from 0x9000000...
...done
libhugetlbfs: Copying extra 0x8 bytes from 0x9010360...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xBDT.linkshare", pid 23893
Segment remapping enabled, sharing = 2
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x8000000-0x8012198  (filesz=0x12198) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x9000000-0x90203a8  (filesz=0x10350) (prot = 0x7)
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x8000000-0x8012198  (filesz=0x12198) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x9000000-0x90203a8  (filesz=0x10350) (prot = 0x7)
Starting testcase "xBDT.linkshare", pid 23894
Starting testcase "xBDT.linkshare", pid 23895
PASS

HUGETLB_SHARE=2 xBDT.linkshare 2 (64):
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x1000000-0x101266c  (filesz=0x1266c) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x2000000-0x2020590  (filesz=0x1053c) (prot = 0x7)
libhugetlbfs: Got unpopulated shared fd -- Preparing
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0x1266c bytes from 0x1000000...
...done
libhugetlbfs: Prepare succeeded
libhugetlbfs: Got unpopulated shared fd -- Preparing
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0x1053c bytes from 0x2000000...
...done
libhugetlbfs: Copying extra 0x10 bytes from 0x2010540...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xBDT.linkshare", pid 23902
Segment remapping enabled, sharing = 2
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x1000000-0x101266c  (filesz=0x1266c) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x2000000-0x2020590  (filesz=0x1053c) (prot = 0x7)
libhugetlbfs: HUGETLB_SHARE=2, sharing enabled for all segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x1000000-0x101266c  (filesz=0x1266c) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x2000000-0x2020590  (filesz=0x1053c) (prot = 0x7)
Child 1 killed by signal: Segmentation fault
Child 2 killed by signal: Segmentation fault
FAIL    2 of 2 children exited abnormally

HUGETLB_SHARE=1 xBDT.linkshare 2 (32):
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x8000000-0x8012198  (filesz=0x12198) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x9000000-0x90203a8  (filesz=0x10350) (prot = 0x7)
libhugetlbfs: Got unpopulated shared fd -- Preparing
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0x12198 bytes from 0x8000000...
...done
libhugetlbfs: Prepare succeeded
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0x10350 bytes from 0x9000000...
...done
libhugetlbfs: Copying extra 0x8 bytes from 0x9010360...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xBDT.linkshare", pid 23923
Segment remapping enabled, sharing = 1
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x8000000-0x8012198  (filesz=0x12198) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x9000000-0x90203a8  (filesz=0x10350) (prot = 0x7)
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0x10350 bytes from 0x9000000...
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x8000000-0x8012198  (filesz=0x12198) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x9000000-0x90203a8  (filesz=0x10350) (prot = 0x7)
libhugetlbfs: Mapped hugeseg at 0x55800000. Copying 0x10350 bytes from 0x9000000...
...done
libhugetlbfs: Copying extra 0x8 bytes from 0x9010360...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xBDT.linkshare", pid 23925
Process 23925 got a Segmentation Fault at address 0x5556f020
...done
libhugetlbfs: Copying extra 0x8 bytes from 0x9010360...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xBDT.linkshare", pid 23924
Process 23924 got a Segmentation Fault at address 0x5556f000
Child 1 exited with non-zero status: 2
Child 2 exited with non-zero status: 2
FAIL    2 of 2 children exited abnormally

HUGETLB_SHARE=1 xBDT.linkshare 2 (64):
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x1000000-0x101266c  (filesz=0x1266c) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x2000000-0x2020590  (filesz=0x1053c) (prot = 0x7)
libhugetlbfs: Got unpopulated shared fd -- Preparing
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0x1266c bytes from 0x1000000...
...done
libhugetlbfs: Prepare succeeded
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0x1053c bytes from 0x2000000...
...done
libhugetlbfs: Copying extra 0x10 bytes from 0x2010540...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xBDT.linkshare", pid 23938
Segment remapping enabled, sharing = 1
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x1000000-0x101266c  (filesz=0x1266c) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x2000000-0x2020590  (filesz=0x1053c) (prot = 0x7)
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0x1053c bytes from 0x2000000...
libhugetlbfs: HUGETLB_SHARE=1, sharing enabled for only read-only segments
libhugetlbfs: Hugepage segment 0 (phdr 2): 0x1000000-0x101266c  (filesz=0x1266c) (prot = 0x5)
libhugetlbfs: Hugepage segment 1 (phdr 3): 0x2000000-0x2020590  (filesz=0x1053c) (prot = 0x7)
libhugetlbfs: Mapped hugeseg at 0x2aaaaac00000. Copying 0x1053c bytes from 0x2000000...
...done
libhugetlbfs: Copying extra 0x10 bytes from 0x2010540...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xBDT.linkshare", pid 23940
Process 23940 got a Segmentation Fault at address 0x2b33f127f040
...done
libhugetlbfs: Copying extra 0x10 bytes from 0x2010540...
...done
libhugetlbfs: Prepare succeeded
Starting testcase "xBDT.linkshare", pid 23939
Process 23939 got a Segmentation Fault at address 0x2b8f03d29000
Child 1 exited with non-zero status: 2
Child 2 exited with non-zero status: 2
FAIL    2 of 2 children exited abnormally

[5]

[92773.453097] xB.linkshare[23838]: segfault at 00002b980301fb88 rip 00002ae67e48f315 rsp 00007fff2c9856d0 error 4
[92773.456395] xB.linkshare[23837]: segfault at 00002b980301fb88 rip 00002b3e89b2f315 rsp 00007fff212e7030 error 4
[92779.769836] xBDT.linkshare[23903]: segfault at 00002b7604c72b40 rip 00002b7604c72b40 rsp 00007fff5df255b8 error 14
[92779.772422] xBDT.linkshare[23904]: segfault at 00002b7604c72b40 rip 00002b7604c72b40 rsp 00007fff479fd098 error 14
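
For reading the "error" field in the dmesg lines of [3] and [5], here is a
small sketch that decodes it. It assumes the kernel prints the x86
page-fault error code in hex (so "14" means 0x14) and uses the architectural
bit layout of that code. The function name describe_fault and the PF_*
macro names are mine, for illustration only.

```c
#include <stdio.h>
#include <stdlib.h>

/* x86 page-fault error-code bits (names chosen for this sketch). */
#define PF_PROT  0x01	/* 0: page not present, 1: protection violation */
#define PF_WRITE 0x02	/* 0: read access, 1: write access */
#define PF_USER  0x04	/* 0: kernel mode, 1: user mode */
#define PF_RSVD  0x08	/* reserved bit set in a page-table entry */
#define PF_INSTR 0x10	/* fault was an instruction fetch */

/*
 * Decode the hex string after "error" in a segfault dmesg line into a
 * short description. Returns 0 on success, -1 if hex_code is not a hex
 * number.
 */
static int describe_fault(const char *hex_code, char *out, size_t outlen)
{
	char *end;
	unsigned long code = strtoul(hex_code, &end, 16);

	if (end == hex_code || *end != '\0')
		return -1;

	snprintf(out, outlen, "%s-mode %s, %s%s",
		 (code & PF_USER) ? "user" : "kernel",
		 (code & PF_INSTR) ? "instruction fetch" :
		 (code & PF_WRITE) ? "write" : "read",
		 (code & PF_PROT) ? "protection violation" : "page not present",
		 (code & PF_RSVD) ? ", reserved bit set" : "");
	return 0;
}
```

Under that assumption, describe_fault("14", ...) yields "user-mode
instruction fetch, page not present", which fits the xBDT lines where rip
equals the faulting address, and describe_fault("4", ...) yields "user-mode
read, page not present" for the xB lines.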

-- 
Nishanth Aravamudan <[EMAIL PROTECTED]>
IBM Linux Technology Center

_______________________________________________
Libhugetlbfs-devel mailing list
Libhugetlbfs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libhugetlbfs-devel