Re: [polyml] Segmentation Fault When Porting

David Matthews Mon, 18 Jan 2016 11:04:32 -0800

James,

I've managed to set up a big-endian mips debian virtual machine using qemu inside a virtual debian machine in virtualbox on Windows. Despite all the layers of virtualisation it works and, more importantly, Poly/ML actually builds successfully. It does crash with some larger examples, such as Tests/Succeed/Test133.ML, and I've seen some other crashes in the garbage-collector. I suspect that there is a problem with endian-ness somewhere but it may be possible to narrow this down with gdb.
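
For readers unfamiliar with this class of bug, here is a minimal sketch of how a word written on one byte order can be misread on the other. It is not taken from the Poly/ML sources: the function names (hostIsBigEndian, loadLittleEndian64, loadBigEndian64, loadHostOrder64) and the example value are hypothetical and chosen only for illustration, and nothing here is claimed to be the actual cause of the crashes discussed in this thread.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of these are Poly/ML functions.

// Returns true when the least significant byte of an integer is stored last.
bool hostIsBigEndian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);   // look at the first byte in memory
    return first == 0;
}

// Decode eight bytes stored least-significant-byte first.
// The result is the same on every host, whatever its byte order.
std::uint64_t loadLittleEndian64(const unsigned char *p) {
    std::uint64_t v = 0;
    for (int i = 7; i >= 0; --i)
        v = (v << 8) | p[i];
    return v;
}

// Decode eight bytes stored most-significant-byte first.
std::uint64_t loadBigEndian64(const unsigned char *p) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];
    return v;
}

// Copy the bytes straight into an integer. This is what code that ignores
// byte order effectively does, so the result depends on the host.
std::uint64_t loadHostOrder64(const unsigned char *p) {
    std::uint64_t v = 0;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

Given the bytes ef cd ab 89 67 45 23 01, loadLittleEndian64 yields 0x0123456789abcdef on any machine, whereas loadHostOrder64 yields that value only on a little-endian host and 0xefcdab8967452301 on a big-endian one. A value misread in this way and then used as an address or an offset would point somewhere unrelated to the intended object.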


Regards,
David

On 16/01/2016 17:01, James Clarke wrote:

Hi David,
I just tried building on mipsel, and that compiles and passes the test suite 
with the same compiler flags. Endianness is looking like a strong candidate, 
given that the only architectures it fails on are big-endian, although compiler 
optimisations are “responsible”. I shall see if a very old version works on 
big-endian mips; if so, I will try and do a git bisect, otherwise it might have 
to be some painful debugging.

Regards,
James

On 15 Jan 2016, at 11:59, James Clarke <[email protected]> wrote:

They are all big-endian. I haven't tried mipsel; that could help narrow it 
down. One thing making me not so sure it's an endianness issue is that you 
support 32-bit PowerPC, and that runs properly. Also the mips builds are broken 
by GCC's optimisations; adding -fno-omit-frame-pointer made it work for some 
reason, if I remember correctly.

James

On 15 Jan 2016, at 11:29, David Matthews <[email protected]> wrote:

I wish I could help but there's not much I can suggest.  The only idea that 
occurs to me is that there is some endian-ness issue that has crept in.  Are 
these little-endian or big-endian?  In theory the interpreter should work on 
both big-endian and little-endian but I've only tested the most recent version 
on X86.  Have a look at an earlier version of Poly/ML and see if you have any 
more success with that.

David

On 12/01/2016 14:52, James Clarke wrote:
Hi,
I’ve been trying to port Poly/ML to mips and IBM’s S/390 (the 64-bit version, often 
referred to as s390x). For both, I tried just adding an extra case in configure.ac, along 
with corresponding HOSTARCHITECTURE macros and cases in libpolyml/elfexport.cpp. However, 
these all seem to segfault when polyimport is run when building (both with 5.5.2 and git 
commit ee26375, "Merge branch ‘PICTest’"). I can’t seem to get a meaningful 
stack trace out of the mips segfault, but it crashes just after “Use: basis/Socket.sml” 
is printed. However, on s390x, it crashes before anything is printed, and valgrind gave 
me the following (with no errors before this point) when running ee26375’s polyimport:

==16138== Thread 3:
==16138== Invalid read of size 8
==16138==    at 0x489EA50: Offset (globals.h:315)
==16138==    by 0x489EA50: GetConstSegmentForCode (globals.h:344)
==16138==    by 0x489EA50: GetConstSegmentForCode (globals.h:350)
==16138==    by 0x489EA50: ConstPtrForCode (globals.h:355)
==16138==    by 0x489EA50: buildStackList(TaskData*, PolyWord*, PolyWord*) (run_time.cpp:413)
==16138==    by 0x489EC87: exceptionToTraceException(TaskData*, SaveVecEntry*) (run_time.cpp:471)
==16138==    by 0x48AC9ED: IntTaskData::SwitchToPoly() (interpret.cpp:877)
==16138==    by 0x48ACC33: IntTaskData::EnterPolyCode() (interpret.cpp:1428)
==16138==    by 0x489324D: NewThreadFunction(void*) (processes.cpp:1128)
==16138==    by 0x48E591D: start_thread (pthread_create.c:335)
==16138==    by 0x4C8CEA9: ??? (in /lib/s390x-linux-gnu/libc-2.21.so)
==16138==  Address 0xe000000005ab5b38 is not stack'd, malloc'd or (recently) free'd
==16138==
==16138==
==16138== Process terminating with default action of signal 11 (SIGSEGV)
==16138==  Access not within mapped region at address 0xE000000005AB5000
==16138==    at 0x489EA50: Offset (globals.h:315)
==16138==    by 0x489EA50: GetConstSegmentForCode (globals.h:344)
==16138==    by 0x489EA50: GetConstSegmentForCode (globals.h:350)
==16138==    by 0x489EA50: ConstPtrForCode (globals.h:355)
==16138==    by 0x489EA50: buildStackList(TaskData*, PolyWord*, PolyWord*) (run_time.cpp:413)
==16138==    by 0x489EC87: exceptionToTraceException(TaskData*, SaveVecEntry*) (run_time.cpp:471)
==16138==    by 0x48AC9ED: IntTaskData::SwitchToPoly() (interpret.cpp:877)
==16138==    by 0x48ACC33: IntTaskData::EnterPolyCode() (interpret.cpp:1428)
==16138==    by 0x489324D: NewThreadFunction(void*) (processes.cpp:1128)
==16138==    by 0x48E591D: start_thread (pthread_create.c:335)
==16138==    by 0x4C8CEA9: ??? (in /lib/s390x-linux-gnu/libc-2.21.so)
==16138==  If you believe this happened as a result of a stack
==16138==  overflow in your program's main thread (unlikely but
==16138==  possible), you can try to increase the size of the
==16138==  main thread stack using the --main-stacksize= flag.
==16138==  The main thread stack size used in this run was 8388608.

(the ??? for libc is because valgrind does not yet understand compressed debug 
info; I removed a whole load of warnings to that effect)

Have you ever come across anything like this? Do you have any thoughts for 
where to start with hunting this down?

Regards,
James Clarke

_______________________________________________
polyml mailing list
[email protected]
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
