Hi All,

I am working on building/porting Julia on the ppc64le architecture, using
Ubuntu 14.10 on ppc64le hardware. While compiling the Julia code (master
branch) I was getting a segmentation fault. I was able to work around it
by turning on the ‘MEMDEBUG’ flag in the ‘src/options.h’
file.


I decided to dig further into this issue and find the root cause of the
segmentation fault, so I started studying Julia's memory management.
I have a couple of questions about it that I would like to discuss
here.

1. The ‘REGION_PG_COUNT’ macro is defined using the value 4096. What is
the significance of 4096? If 4096 indicates the page size, then this code
is valid on amd64/x86_64, where the page size is 4k, but it may misbehave
on PPC64, where the page size is 64k. Basically, I want to discuss the
impact of a large page size on the Julia code and what else I need to take
into consideration while porting Julia to PPC64le.
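
To make the page-size concern concrete, here is a minimal C sketch. It is an
illustration only, not Julia's actual code: the 16 KiB allocator page size and
all function names are assumptions. It shows how one 64k OS page can hold
several allocator pages, so the OS page may only be released when every
allocator page inside it is free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical allocator page size (16 KiB); an assumption for illustration. */
#define SKETCH_GC_PAGE_SZ ((size_t)1 << 14)

/* OS page size as reported by the kernel; falls back to 4096 on error. */
static size_t os_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    return sz > 0 ? (size_t)sz : 4096;
}

/* Number of allocator pages that share one OS page (at least 1). */
static size_t gc_pages_per_os_page(size_t os_page, size_t gc_page)
{
    return os_page > gc_page ? os_page / gc_page : 1;
}

/* Round an address down to the start of the OS page containing it.
 * Assumes os_page is a power of two. */
static uintptr_t os_page_start(uintptr_t addr, size_t os_page)
{
    return addr & ~((uintptr_t)os_page - 1);
}

/* True only if every allocator page sharing an OS page with page `idx`
 * is marked free. Assumes page 0 starts on an OS page boundary. */
static bool safe_to_release(const bool *gc_page_free, size_t n_pages,
                            size_t idx, size_t per_os_page)
{
    assert(per_os_page > 0);
    size_t first = idx - idx % per_os_page;
    for (size_t i = first; i < first + per_os_page; i++) {
        if (i >= n_pages || !gc_page_free[i])
            return false;
    }
    return true;
}
```

With a 4k OS page, `gc_pages_per_os_page(os_page_size(), SKETCH_GC_PAGE_SZ)`
is 1 and each allocator page can be released on its own. With a 64k OS page it
is 4, and releasing memory for one free allocator page without a check like
`safe_to_release` could discard live data in its neighbours.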

2. For the past few days I have been working on understanding Julia's
memory management scheme. I find it a somewhat difficult and time-consuming
process, though I have had some success. Is there any official or unofficial
document that would help me understand it?

Any suggestions/pointers on the above points are much appreciated.

-Mahesh

On Tue, Aug 18, 2015 at 8:11 PM, Jameson Nash <[email protected]> wrote:

> It is a considerable performance impact to run with MEMDEBUG, but
> otherwise has no side-effects. It is not necessary to run with this flag in
> production (and probably not helpful either, since you wouldn't have a
> debugger attached).
>
>
> On Tue, Aug 18, 2015 at 9:53 AM Mahesh Waidande <
> [email protected]> wrote:
>
>> Hi Jameson,
>>
>> Thanks for explaining memory allocation on PPC and for the pointers on
>> resolving the segmentation fault. The pointers are really helpful and I am
>> working through them. I was able to compile the Julia master branch after
>> turning on the ‘MEMDEBUG’ flag in options.h; compilation went smoothly and
>> I can see the Julia prompt. I will continue working on finding the root
>> cause of the segmentation fault, which occurs during Julia initialization.
>>
>>
>> I think turning on the ‘MEMDEBUG’ flag will reduce Julia's performance
>> a bit, since with MEMDEBUG no memory pools are used and every allocation
>> is treated as big.
>>
>>
>> Apart from the performance issue, I have a few questions that I would
>> like to discuss:
>> 1. Apart from the performance hit, is any other functionality affected by
>> turning on the ‘MEMDEBUG’ flag? In other words, what are the side effects
>> of turning it on?
>> 2. Should I use this setting (MEMDEBUG turned on) in a production
>> environment or in release mode?
>>
>>
>>
>> -Mahesh
>>
>> On Fri, Aug 14, 2015 at 10:03 PM, Jameson Nash <[email protected]> wrote:
>>
>>> It's a JIT copy of a julia function named "new". The last time this
>>> error popped up, it was due to an error in the free_page function logic
>>> that computes whether it is safe to free the current page (since PPC uses
>>> large pages). One place to check, then, is to ensure the invalid pointer
>>> hadn't accidentally been deleted by an madvise(DONTNEED) for an unrelated
>>> page free operation.
>>>
>>> Beyond that, I would suggest trying with the `MEMDEBUG` turned on in
>>> options.h (which will also disable the `free_page` function).
>>>
>>> Also, when you have gdb running, there are many more useful things to
>>> print than just the backtrace. For starters, I would suggest looking at
>>> `disassemble` and `info registers`. Also, go `up` on the stack trace and
>>> look at `jl_(f->linfo)`, `jl_(jl_uncompress_ast(f->linfo, f->linfo->ast))`,
>>> and `jl_(args[0])` / `jl_(args[1])`
>>>
>>>
>>> On Fri, Aug 14, 2015 at 9:07 AM Mahesh Waidande <
>>> [email protected]> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am working on building/porting Julia on the ppc64le architecture. I am
>>>> using Ubuntu 14.10 on ppc64le hardware. While compiling the Julia
>>>> code (master branch) I am getting a segmentation fault. I tried to debug
>>>> it with tools like gdb/vgdb, valgrind, electric-fence,
>>>> etc., but I am not able to find its root cause. I need some
>>>> help/pointers/suggestions on how to resolve it.
>>>>
>>>> Here are some details which will help you diagnose the problem:
>>>>
>>>> 1. Machine details :
>>>> $ uname -a
>>>> Linux pts00433-vm1 3.16.0-30-generic #40-Ubuntu SMP Mon Jan 12 22:07:11
>>>> UTC 2015 ppc64le ppc64le ppc64le GNU/Linux
>>>> $
>>>>
>>>> 2. Snapshot of ‘make debug’ log
>>>> make[1]: Leaving directory '/home/test/Mahesh/julia/julia/base'
>>>> make[1]: Entering directory '/home/test/Mahesh/julia/julia'
>>>>  cd base && /home/test/Mahesh/julia/julia/usr/bin/julia-debug -C native
>>>> --output-ji /home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji -f
>>>> coreimg.jl
>>>>
>>>>   Electric Fence 2.2 Copyright (C) 1987-1999 Bruce Perens <
>>>> [email protected]>
>>>> Segmentation fault
>>>> Makefile:175: recipe for target
>>>> '/home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji' failed
>>>> make[1]: ***
>>>> [/home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji] Error 139
>>>> make[1]: Leaving directory '/home/test/Mahesh/julia/julia'
>>>> Makefile:64: recipe for target 'julia-inference' failed
>>>> make: *** [julia-inference] Error 2
>>>>
>>>> 3. gdb stack trace
>>>> test@pts00433-vm1:~/Mahesh/julia/julia/base$ gdb --args
>>>> /home/test/Mahesh/julia/julia/usr/bin/julia-debug -C native --output-ji
>>>> /home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji -f coreimg.jl
>>>> GNU gdb (Ubuntu 7.8-1ubuntu4) 7.8.0.20141001-cvs
>>>> Copyright (C) 2014 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later <
>>>> http://gnu.org/licenses/gpl.html>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law.  Type "show
>>>> copying"
>>>> and "show warranty" for details.
>>>> This GDB was configured as "powerpc64le-linux-gnu".
>>>> Type "show configuration" for configuration details.
>>>> For bug reporting instructions, please see:
>>>> <http://www.gnu.org/software/gdb/bugs/>.
>>>> Find the GDB manual and other documentation resources online at:
>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>> For help, type "help".
>>>> Type "apropos word" to search for commands related to "word"...
>>>> Reading symbols from
>>>> /home/test/Mahesh/julia/julia/usr/bin/julia-debug...done.
>>>> (gdb) b repl.c:532
>>>> Breakpoint 1 at 0x10003a34: file repl.c, line 532.
>>>> (gdb) r
>>>> Starting program: /home/test/Mahesh/julia/julia/usr/bin/julia-debug -C
>>>> native --output-ji
>>>> /home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji -f coreimg.jl
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library
>>>> "/lib/powerpc64le-linux-gnu/libthread_db.so.1".
>>>>
>>>>   Electric Fence 2.2 Copyright (C) 1987-1999 Bruce Perens <
>>>> [email protected]>
>>>>
>>>> Breakpoint 1, main (argc=7, argv=0x3ffffffff478) at repl.c:533
>>>> 533     {
>>>> (gdb) c
>>>> Continuing.
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> 0x00003fffb6970078 in julia.new_0 ()
>>>> (gdb) where
>>>> #0  0x00003fffb6970078 in julia.new_0 ()
>>>> #1  0x00003fffb6b3b820 in jl_apply (f=0x3ffd9ac1de10,
>>>> args=0x3fffffffde28, nargs=2) at julia.h:1263
>>>> #2  0x00003fffb6b4137c in jl_trampoline (F=0x3ffd9ac1de10,
>>>> args=0x3fffffffde28, nargs=2) at builtins.c:979
>>>> #3  0x00003fffb6b2b084 in jl_apply (f=0x3ffd9ac1de10,
>>>> args=0x3fffffffde28, nargs=2) at julia.h:1263
>>>> #4  0x00003fffb6b328d0 in jl_apply_generic (F=0x3ffd9ac1dd90,
>>>> args=0x3fffffffde28, nargs=2) at gf.c:1675
>>>> #5  0x00003fffb6c2d9a0 in jl_apply (f=0x3ffd9ac1dd90,
>>>> args=0x3fffffffde28, nargs=2) at julia.h:1263
>>>> #6  0x00003fffb6c2e014 in do_call (f=0x3ffd9ac1dd90,
>>>> args=0x3ffd9ac215a8, nargs=2, eval0=0x0, locals=0x0, nl=0, ngensym=0)
>>>>     at interpreter.c:65
>>>> #7  0x00003fffb6c2eec4 in eval (e=0x3ffd9ac1ddd0, locals=0x0, nl=0,
>>>> ngensym=0) at interpreter.c:212
>>>> #8  0x00003fffb6c2dc20 in jl_interpret_toplevel_expr (e=0x3ffd9ac1ddd0)
>>>> at interpreter.c:27
>>>> #9  0x00003fffb6c55eac in jl_toplevel_eval_flex (e=0x3ffd9ac1ddb0,
>>>> fast=1) at toplevel.c:524
>>>> #10 0x00003fffb6c56260 in jl_parse_eval_all (fname=0x3fffb7950158
>>>> "boot.jl", len=8) at toplevel.c:574
>>>> #11 0x00003fffb6c56510 in jl_load (fname=0x3fffb7950158 "boot.jl",
>>>> len=8) at toplevel.c:614
>>>> #12 0x00003fffb6c3cf58 in _julia_init (rel=JL_IMAGE_JULIA_HOME) at
>>>> init.c:1107
>>>> #13 0x00003fffb6c3f38c in julia_init (rel=JL_IMAGE_JULIA_HOME) at
>>>> task.c:252
>>>> #14 0x0000000010003af8 in main (argc=1, argv=0x3ffffffff4a8) at
>>>> repl.c:601
>>>> (gdb) q
>>>> A debugging session is active.
>>>>
>>>>         Inferior 1 [process 26906] will be killed.
>>>>
>>>> Quit anyway? (y or n) y
>>>> test@pts00433-vm1:~/Mahesh/julia/julia/base$
>>>>
>>>> 4. The segmentation fault occurs during Julia initialization. During
>>>> initialization Julia compiles some jl files, and the segmentation fault
>>>> occurs while compiling ‘int.jl’.
>>>>
>>>> I extracted the above information by inspecting the Julia code and
>>>> attaching valgrind to the julia-debug binary.
>>>> $ valgrind -v --vgdb=yes --vgdb-error=0 --leak-check=full
>>>> --show-leak-kinds=all --log-file=valgrind-test.log
>>>> /home/test/Mahesh/julia/julia/usr/bin/julia-debug -C native --output-ji
>>>> /home/test/Mahesh/julia/julia/usr/lib/julia/inference0.ji -f coreimg.jl
>>>>
>>>>   Electric Fence 2.2 Copyright (C) 1987-1999 Bruce Perens <
>>>> [email protected]>
>>>> essentials.jl
>>>> reflection.jl
>>>> options.jl
>>>> promotion.jl
>>>> tuple.jl
>>>> range.jl
>>>> expr.jl
>>>> error.jl
>>>> bool.jl
>>>> number.jl
>>>> int.jl
>>>>
>>>> signal (11): Segmentation fault
>>>> $
>>>>
>>>>
>>>> I have a couple of questions/doubts:
>>>>
>>>> a.) When I search for ‘julia.new_0 ()’ [gdb - frame 0] in the Julia and
>>>> dependency sources, I cannot find it. My guess is that Julia creates
>>>> this function at run time for initialization. I would like to hear your
>>>> comments/suggestions on how to debug this or where I need to look, or a
>>>> quick word on the initialization process.
>>>>
>>>> b.) When I try to step into the jl_apply() [gdb - frame 1] function
>>>> (with the step/s command of gdb), I get the segmentation fault and
>>>> cannot step into the function. So my question is: how can I validate
>>>> the address ‘0x3ffd9ac1de10’, or the contents present at
>>>> ‘0x3ffd9ac1de10’? I assume ‘0x3ffd9ac1de10’ is a virtual address, because
>>>> I see the same address in the stack trace every time.
>>>>
>>>>
>>>> c.) Is this segmentation fault a cascading effect of something going
>>>> wrong at the pre-initialization stage? If I want to put a check on it,
>>>> where should I put it? Any specific code snippet?
>>>>
>>>> d.) I observe strange behavior while compiling the Julia source. For
>>>> debugging purposes I inserted some printf statements in main() [gdb -
>>>> frame 14]. That did the trick: I did not get any segmentation fault, the
>>>> code compiled smoothly and I got the Julia prompt. But when I remove the
>>>> printf statements altogether, I observe the segmentation fault again.
>>>> Even a single printf statement does the trick. Any comments on this?
>>>>
>>>>
>>>> I have attached all the logs that I have, i.e. the make debug log, gdb
>>>> stack trace and valgrind log, to this mail. I am continuing to
>>>> investigate this; any suggestions/pointers are much appreciated.
>>>>
>>>> -Mahesh
>>>>
>>>
>>
