Re: guile 3 update, june 2018 edition

2018-07-17 Thread dsmich


 dsm...@roadrunner.com wrote: 
> Ok! now getting past the "make -j" issue, but I'm still getting a segfault.

And now commit e6461cf1b2b63e3ec9a2867731742db552b61b71 has gotten past the 
segfault.

Wooo!

-Dale




Re: guile 3 update, june 2018 edition

2018-07-05 Thread Andy Wingo
Hi :)

On Mon 02 Jul 2018 11:28, l...@gnu.org (Ludovic Courtès) writes:

> Andy Wingo writes:
>
>> My current plan is that the frame overhead will still be two slots: the
>> saved previous FP, and the saved return address.  Right now the return
>> address is always a bytecode address.  In the future it will be bytecode
>> or native code.  Guile will keep a runtime routine marking regions of
>> native code so it can tell, if it needs to, whether an RA is bytecode or native
>> code, for debugging reasons; but in normal operation, Guile won't need to
>> know.  The interpreter will tier up to JIT code through an adapter frame
>> that will do impedance matching over virtual<->physical addresses.  To
>> tier down to the interpreter (e.g. when JIT code calls interpreted
>> code), the JIT will simply return to the interpreter, which will pick up
>> state from the virtual IP, SP, and FP saved in the VM state.
>
> What will the “adapter frame” look like?

Aah, sadly it won't work like this.  Somehow I was thinking of an
adapter frame on the C stack.  However an adapter frame corresponds to a
continuation, so it would have to have the life of a continuation, so it
would have to be on the VM stack.  I don't think I want adapter frames
on the VM stack, so I have to scrap this.  More below...

>> We do walk the stack from Scheme sometimes, notably when making a
>> backtrace.  So, we'll make the runtime translate the JIT return
>> addresses to virtual return addresses in the frame API.  To Scheme, it
>> will be as if all things were interpreted.
>
> Currently you can inspect the locals of a stack frame.  Will that be
> possible with frames corresponding to native code? (I suppose that’d be
> difficult.)

Yes, because native code manipulates the VM stack in exactly the same
way as bytecode.  Eventually we should do register allocation and avoid
always writing values to the stack, but that is down the road.

>> My current problem is knowing when a callee has JIT code.  Say you're in
>> JITted function F which calls G.  Can you directly jump to G's native
>> code, or is G not compiled yet and you need to use the interpreter?  I
>> haven't solved this yet.  "Known calls" that use call-label and similar
>> can of course eagerly ensure their callees are JIT-compiled, at
>> compilation time.  Unknown calls are the problem.  I don't know whether
>> to consider reserving another word in scm_tc7_program objects for JIT
>> code.  I have avoided JIT overhead elsewhere and would like to do so
>> here as well!
>
> In the absence of a native code pointer in scm_tc7_program objects, how
> will libguile find the native code for a given program?

This is a good question and it was not clear to me when I wrote this!  I
think I have a solution now but it involves memory overhead.  Oh well.

Firstly, I propose to add a slot to stack frames.  Stack frames will now
store the saved FP, the virtual return address (vRA), and the machine
return address (mRA).  When in JIT code, a return will check if the
mRA is nonzero, and if so jump to that mRA.  Otherwise it will return
from JIT, and the interpreter should continue.

Likewise, when the interpreter does a function return and the mRA is
nonzero, it should return by entering JIT code at that address.
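To make the return protocol concrete, here is a minimal C sketch of the
idea, not Guile's actual code.  The struct layout and the names
frame_header, return_action and plan_return are made up for
illustration; only the three saved words (FP, vRA, mRA) and the rule
"nonzero mRA means resume in native code" come from the description
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-word frame header; field names are illustrative. */
struct frame_header {
  struct frame_header *saved_fp; /* caller's frame pointer */
  const uint32_t *vra;           /* virtual (bytecode) return address */
  const uint8_t *mra;            /* machine return address, NULL if none */
};

enum return_target { RETURN_TO_INTERPRETER, RETURN_TO_JIT };

/* What a return should do next, whichever tier is executing it. */
struct return_action {
  enum return_target target;
  const uint32_t *vip; /* bytecode address the interpreter would resume at */
  const uint8_t *mip;  /* native address to jump to, if target is JIT */
};

/* Decide how to return from FRAME: jump to the mRA when it is nonzero,
   otherwise fall back to the interpreter at the vRA. */
static struct return_action
plan_return (const struct frame_header *frame)
{
  struct return_action action;

  assert (frame != NULL);
  action.vip = frame->vra;
  action.mip = frame->mra;
  action.target = frame->mra != NULL ? RETURN_TO_JIT : RETURN_TO_INTERPRETER;
  return action;
}
```

The same decision applies in both directions: JIT code that finds a
zero mRA leaves native code, and an interpreter return that finds a
nonzero mRA enters it.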

When building an interpreter-only Guile (Guile without JIT) or an
AOT-only Guile (doesn't exist currently), we could configure Guile to
not reserve this extra stack word.  However that would be a different
ABI: a .go file built with interpreter-only Guile wouldn't work on
Guile-with-JIT, because interpreter-only Guile would think stack frames
only need two reserved words, whereas Guile-with-JIT would write three
words.  To avoid the complication, for 3.0 I think we should just use
3-word frames all the time.

So, that's returns.  Other kinds of non-local returns like
abort-to-prompt, resuming delimited continuations, or calling
undelimited continuations would work similarly: the continuation would
additionally record an mRA, and resuming would jump there instead, if
appropriate.

Now, calls.  One reason I wanted to avoid an extra program word was
that scm_tc7_program objects don't exist in a one-to-one
relationship with code.  "Well-known" procedures get compiled by closure
optimization to be always called via call-label or tail-call-label -- so
some code doesn't have program objects.  On the other hand, closures
mean that some code has many program objects.

So I thought about using side tables indexed by code; or inline
"maybe-tier-up-here" instructions, which would reference a code pointer
location, that if nonzero, would be the JIT code.
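As a rough illustration of the "code pointer location" idea, the C
sketch below checks a per-function slot on entry and uses native code
only when the slot is nonzero.  Everything here is hypothetical
(tier_up_slot, call_double, and the stand-in "interpreted" and "native"
functions); it only shows the shape of the check, not how Guile would
store or fill the slot.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an entry point into JIT-compiled code. */
typedef long (*native_entry) (long arg);

/* Hypothetical writable slot associated with one piece of code.
   NULL means "no JIT code yet". */
struct tier_up_slot {
  native_entry jit_code;
};

/* Stand-in for interpreting the function's bytecode. */
static long
interpret_double (long arg)
{
  return arg * 2;
}

/* Stand-in for the JIT-compiled version; counts how often it runs. */
static int native_calls;

static long
native_double (long arg)
{
  native_calls++;
  return arg * 2;
}

/* The "maybe-tier-up-here" check: use native code if the slot is set,
   otherwise keep interpreting. */
static long
call_double (const struct tier_up_slot *slot, long arg)
{
  assert (slot != NULL);
  if (slot->jit_code != NULL)
    return slot->jit_code (arg);
  return interpret_double (arg);
}
```

With an empty slot, call_double runs the interpreted path; once the
slot is set to native_double, the same call goes through native code
and gives the same result.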

However I see now that really we need to optimize for the JIT-to-JIT
call case, as by definition that's going to be the hot case.  Of course
call-label from JIT can do an unconditional jmp.  But calling a program
object... how do we do this?  This is complicated by code pages being
read-only, so we don't have space to store a pointer in 

Re: guile 3 update, june 2018 edition

2018-07-02 Thread Ludovic Courtès
Hello!

Andy Wingo writes:

> The news is that the VM has been completely converted over to call out
> to the Guile runtime through an "intrinsics" vtable.  For some
> intrinsics, the compiler will emit specialized call-intrinsic opcodes.
> (There's one of these opcodes for each intrinsic function type.)  For
> others that are a bit more specialized, like the intrinsic used in
> call-with-prompt, the VM calls out directly to the intrinsic.
>
> The upshot is that we're now ready to do JIT compilation.  JIT-compiled
> code will use the intrinsics vtable to embed references to runtime
> routines.  In some future, AOT-compiled code can keep the intrinsics
> vtable in a register, and call indirectly through that register.

Exciting!  It sounds like a really good strategy because it means that
the complex instructions don’t have to be implemented in lightning
assembly by hand, which would be a pain.

> My current plan is that the frame overhead will still be two slots: the
> saved previous FP, and the saved return address.  Right now the return
> address is always a bytecode address.  In the future it will be bytecode
> or native code.  Guile will keep a runtime routine marking regions of
> native code so it can tell, if it needs to, whether an RA is bytecode or native
> code, for debugging reasons; but in normal operation, Guile won't need to
> know.  The interpreter will tier up to JIT code through an adapter frame
> that will do impedance matching over virtual<->physical addresses.  To
> tier down to the interpreter (e.g. when JIT code calls interpreted
> code), the JIT will simply return to the interpreter, which will pick up
> state from the virtual IP, SP, and FP saved in the VM state.

What will the “adapter frame” look like?

> We do walk the stack from Scheme sometimes, notably when making a
> backtrace.  So, we'll make the runtime translate the JIT return
> addresses to virtual return addresses in the frame API.  To Scheme, it
> will be as if all things were interpreted.

Currently you can inspect the locals of a stack frame.  Will that be
possible with frames corresponding to native code? (I suppose that’d be
difficult.)

> My current problem is knowing when a callee has JIT code.  Say you're in
> JITted function F which calls G.  Can you directly jump to G's native
> code, or is G not compiled yet and you need to use the interpreter?  I
> haven't solved this yet.  "Known calls" that use call-label and similar
> can of course eagerly ensure their callees are JIT-compiled, at
> compilation time.  Unknown calls are the problem.  I don't know whether
> to consider reserving another word in scm_tc7_program objects for JIT
> code.  I have avoided JIT overhead elsewhere and would like to do so
> here as well!

In the absence of a native code pointer in scm_tc7_program objects, how
will libguile find the native code for a given program?

Thanks for sharing this plan!  Good times ahead!

Ludo’.




Re: guile 3 update, june 2018 edition

2018-07-01 Thread dsmich
Ok! now getting past the "make -j" issue, but I'm still getting a segfault.

Here is a backtrace from the core dump.

Line 25:
#25 0x7efeb518b09f in scm_error (key=0x563599bbb120, subr=subr@entry=0x0, 
message=message@entry=0x7efeb521c0cd "Unbound variable: ~S", 
args=0x563599f8f260, 
rest=rest@entry=0x4) at error.c:62

Looks kinda suspicious.  Should subr be 0x0 there?


Thread 4 (Thread 0x7efeb282c700 (LWP 10059)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at 
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x7efeb4401cc7 in GC_wait_marker () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#2  0x7efeb43f85ca in GC_help_marker () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#3  0x7efeb440033c in GC_mark_thread () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#4  0x7efeb49f6494 in start_thread (arg=0x7efeb282c700) at 
pthread_create.c:333
#5  0x7efeb4738acf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Thread 3 (Thread 0x7efeb382e700 (LWP 10057)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at 
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x7efeb4401cc7 in GC_wait_marker () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#2  0x7efeb43f85ca in GC_help_marker () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#3  0x7efeb440033c in GC_mark_thread () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#4  0x7efeb49f6494 in start_thread (arg=0x7efeb382e700) at 
pthread_create.c:333
#5  0x7efeb4738acf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Thread 2 (Thread 0x7efeb302d700 (LWP 10058)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at 
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x7efeb4401cc7 in GC_wait_marker () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#2  0x7efeb43f85ca in GC_help_marker () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#3  0x7efeb440033c in GC_mark_thread () from 
/usr/lib/x86_64-linux-gnu/libgc.so.1
#4  0x7efeb49f6494 in start_thread (arg=0x7efeb302d700) at 
pthread_create.c:333
#5  0x7efeb4738acf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Thread 1 (Thread 0x7efeb565d740 (LWP 10034)):
#0  0x7efeb51afd26 in scm_maybe_resolve_module 
(name=name@entry=0x563599f8f140) at modules.c:195
#1  0x7efeb51b01bf in scm_public_variable (module_name=0x563599f8f140, 
name=0x563599da50e0) at modules.c:656
#2  0x7efeb518036a in init_print_frames_var_and_frame_to_stack_vector_var 
() at backtrace.c:103
#3  0x7efeb49fd739 in __pthread_once_slow (once_control=0x7efeb545d828 
, init_routine=0x7efeb5180340 
)
at pthread_once.c:116
#4  0x7efeb49fd7e5 in __GI___pthread_once 
(once_control=once_control@entry=0x7efeb545d828 , 
init_routine=init_routine@entry=0x7efeb5180340 
) at pthread_once.c:143
#5  0x7efeb51801b0 in display_backtrace_body (a=0x7ffe2b3b7ea0) at 
backtrace.c:218
#6  0x7efeb520040f in vm_regular_engine (thread=0x563599b14dc0) at 
vm-engine.c:610
#7  0x7efeb52046d3 in scm_call_n (proc=proc@entry=0x563599da5aa0, 
argv=argv@entry=0x0, nargs=nargs@entry=0) at vm.c:1440
#8  0x7efeb518cab9 in scm_call_0 (proc=proc@entry=0x563599da5aa0) at 
eval.c:489
#9  0x7efeb51f8cd6 in catch (tag=tag@entry=0x404, thunk=0x563599da5aa0, 
handler=0x563599da5940, pre_unwind_handler=0x4) at throw.c:144
#10 0x7efeb51f9015 in scm_catch_with_pre_unwind_handler 
(key=key@entry=0x404, thunk=, handler=, 
pre_unwind_handler=)
at throw.c:262
#11 0x7efeb51f91cf in scm_c_catch (tag=tag@entry=0x404, 
body=body@entry=0x7efeb5180190 , 
body_data=body_data@entry=0x7ffe2b3b7ea0, 
handler=handler@entry=0x7efeb5180580 , 
handler_data=handler_data@entry=0x563599baf000, 
pre_unwind_handler=pre_unwind_handler@entry=0x0, 
pre_unwind_handler_data=0x0) at throw.c:387
#12 0x7efeb51f91de in scm_internal_catch (tag=tag@entry=0x404, 
body=body@entry=0x7efeb5180190 , 
body_data=body_data@entry=0x7ffe2b3b7ea0, 
handler=handler@entry=0x7efeb5180580 , 
handler_data=handler_data@entry=0x563599baf000) at throw.c:396
#13 0x7efeb5180185 in scm_display_backtrace_with_highlights 
(stack=stack@entry=0x563599da5b60, port=port@entry=0x563599baf000, 
first=first@entry=0x4, 
depth=depth@entry=0x4, highlights=highlights@entry=0x304) at backtrace.c:277
#14 0x7efeb51f8fec in handler_message (tag=tag@entry=0x563599bbb120, 
args=args@entry=0x563599c0cdb0, handler_data=) at throw.c:548
#15 0x7efeb51f93cb in scm_handle_by_message (handler_data=, 
tag=0x563599bbb120, args=0x563599c0cdb0) at throw.c:585
#16 0x7efeb51f94fe in default_exception_handler (args=0x563599c0cdb0, 
k=0x563599bbb120) at throw.c:174
#17 throw_without_pre_unwind (tag=0x563599bbb120, args=0x563599c0cdb0) at 
throw.c:248
#18 0x7efeb520040f in vm_regular_engine (thread=0x563599b14dc0) at 
vm-engine.c:610
#19 0x7efeb52046d3 in scm_call_n (proc=proc@entry=0x563599baf9c0, 
argv=, nargs=5) at vm.c:1440
#20 0x7efeb518ce4b in scm_apply_0 

Re: guile 3 update, june 2018 edition

2018-06-29 Thread dsmich
Greetings Andy!

 Andy Wingo  wrote: 
> Hi,
> 
> Just wanted to give an update on Guile 3 developments.  Last note was
> here:
> 
>   https://lists.gnu.org/archive/html/guile-devel/2018-04/msg4.html
> 
> The news is that the VM has been completely converted over to call out
> to the Guile runtime through an "intrinsics" vtable.  For some
> intrinsics, the compiler will emit specialized call-intrinsic opcodes.
> (There's one of these opcodes for each intrinsic function type.)  For
> others that are a bit more specialized, like the intrinsic used in
> call-with-prompt, the VM calls out directly to the intrinsic.

Very exciting!

However, master is not building for me. :(

  git clean -dxf; ./autogen.sh && ./configure && make -j5

gives me

  SNARF  atomic.x
  SNARF  backtrace.x
  SNARF  boolean.x
In file included from atomic.c:29:0:
extensions.h:26:30: fatal error: libguile/libpath.h: No such file or directory
 #include "libguile/libpath.h"
  ^
compilation terminated.
Makefile:3893: recipe for target 'atomic.x' failed
make[2]: *** [atomic.x] Error 1



Maybe some dependency tuning is needed?

 

So.  Building without -j :

  make clean; make

gives a segfault when generating the docs


  SNARF  regex-posix.doc
  GEN  guile-procedures.texi
Uncaught exception:
Backtrace:
/bin/bash: line 1: 13428 Broken pipe cat alist.doc array-handle.doc 
array-map.doc arrays.doc async.doc atomic.doc backtrace.doc boolean.doc 
bitvectors.doc bytevectors.doc chars.doc control.doc continuations.doc 
debug.doc deprecated.doc deprecation.doc dynl.doc dynwind.doc eq.doc error.doc 
eval.doc evalext.doc expand.doc extensions.doc fdes-finalizers.doc feature.doc 
filesys.doc fluids.doc foreign.doc fports.doc gc-malloc.doc gc.doc gettext.doc 
generalized-arrays.doc generalized-vectors.doc goops.doc gsubr.doc 
guardians.doc hash.doc hashtab.doc hooks.doc i18n.doc init.doc ioext.doc 
keywords.doc list.doc load.doc macros.doc mallocs.doc memoize.doc modules.doc 
numbers.doc objprop.doc options.doc pairs.doc ports.doc print.doc procprop.doc 
procs.doc promises.doc r6rs-ports.doc random.doc rdelim.doc read.doc rw.doc 
scmsigs.doc script.doc simpos.doc smob.doc sort.doc srcprop.doc srfi-1.doc 
srfi-4.doc srfi-13.doc srfi-14.doc srfi-60.doc stackchk.doc stacks.doc 
stime.doc strings.doc strorder.doc strports.doc struct.doc symbols.doc 
syntax.doc threads.doc throw.doc trees.doc unicode.doc uniform.doc values.doc 
variable.doc vectors.doc version.doc vports.doc weak-set.doc weak-table.doc 
weak-vector.doc dynl.doc posix.doc net_db.doc socket.doc regex-posix.doc
 13429 Segmentation fault  | GUILE_AUTO_COMPILE=0 ../meta/build-env 
guild snarf-check-and-output-texi > guile-procedures.texi
Makefile:3910: recipe for target 'guile-procedures.texi' failed

This is

$ git describe
v2.2.2-504-gb5dcdf2e2

And gcc is
$ gcc --version
gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516

On an up to date Debian 9.4 system:
$ uname -a
Linux debmetrix 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07) x86_64 
GNU/Linux


-Dale





guile 3 update, june 2018 edition

2018-06-29 Thread Andy Wingo
Hi,

Just wanted to give an update on Guile 3 developments.  Last note was
here:

  https://lists.gnu.org/archive/html/guile-devel/2018-04/msg4.html

The news is that the VM has been completely converted over to call out
to the Guile runtime through an "intrinsics" vtable.  For some
intrinsics, the compiler will emit specialized call-intrinsic opcodes.
(There's one of these opcodes for each intrinsic function type.)  For
others that are a bit more specialized, like the intrinsic used in
call-with-prompt, the VM calls out directly to the intrinsic.

The upshot is that we're now ready to do JIT compilation.  JIT-compiled
code will use the intrinsics vtable to embed references to runtime
routines.  In some future, AOT-compiled code can keep the intrinsics
vtable in a register, and call indirectly through that register.
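For readers who have not seen this pattern, here is a small C sketch of
a table of runtime routines called through one function type.  It is
not Guile's real intrinsics table: the value type, the two routines and
the names intrinsics, runtime_intrinsics and call_binary_intrinsic are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t value; /* stand-in for a Scheme value */

/* One intrinsic function type: two values in, one value out. */
typedef value (*binary_intrinsic) (value, value);

enum { INTRINSIC_ADD, INTRINSIC_SUB, BINARY_INTRINSIC_COUNT };

/* Hypothetical table of runtime routines, grouped by function type. */
struct intrinsics {
  binary_intrinsic binary[BINARY_INTRINSIC_COUNT];
};

static value
runtime_add (value a, value b)
{
  return a + b;
}

static value
runtime_sub (value a, value b)
{
  return a - b;
}

static const struct intrinsics runtime_intrinsics = {
  { runtime_add, runtime_sub }
};

/* What a call-intrinsic opcode for this function type might do: the
   instruction carries an index, and the VM calls through the table. */
static value
call_binary_intrinsic (const struct intrinsics *table, size_t index,
                       value a, value b)
{
  assert (table != NULL && index < (size_t) BINARY_INTRINSIC_COUNT);
  return table->binary[index] (a, b);
}
```

Because every call goes through the table pointer, generated code only
needs that one address (embedded, or kept in a register) to reach any
runtime routine.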

My current plan is that the frame overhead will still be two slots: the
saved previous FP, and the saved return address.  Right now the return
address is always a bytecode address.  In the future it will be bytecode
or native code.  Guile will keep a runtime routine marking regions of
native code so it can tell, if it needs to, whether an RA is bytecode or native
code, for debugging reasons; but in normal operation, Guile won't need to
know.  The interpreter will tier up to JIT code through an adapter frame
that will do impedance matching over virtual<->physical addresses.  To
tier down to the interpreter (e.g. when JIT code calls interpreted
code), the JIT will simply return to the interpreter, which will pick up
state from the virtual IP, SP, and FP saved in the VM state.
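One simple way to "mark regions of native code" is a list of address
ranges that the runtime consults when it needs to classify a return
address.  The C sketch below assumes a small fixed-size table and
made-up names (native_region, register_native_region,
is_native_address); it is only meant to show the range check, not
how Guile would actually track JIT code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NATIVE_REGIONS 16

/* Half-open address range [start, end) holding native code. */
struct native_region {
  uintptr_t start;
  uintptr_t end;
};

static struct native_region native_regions[MAX_NATIVE_REGIONS];
static size_t native_region_count;

/* Record a region of native code.  Returns 1 on success, 0 if the
   (hypothetical) fixed-size table is full. */
static int
register_native_region (const void *start, size_t len)
{
  assert (start != NULL && len > 0);
  if (native_region_count == MAX_NATIVE_REGIONS)
    return 0;
  native_regions[native_region_count].start = (uintptr_t) start;
  native_regions[native_region_count].end = (uintptr_t) start + len;
  native_region_count++;
  return 1;
}

/* Return 1 if RA points into a registered native region; otherwise it
   is treated as a bytecode address. */
static int
is_native_address (const void *ra)
{
  uintptr_t addr = (uintptr_t) ra;
  size_t i;

  for (i = 0; i < native_region_count; i++)
    if (addr >= native_regions[i].start && addr < native_regions[i].end)
      return 1;
  return 0;
}
```

A linear scan is enough for a debugging-only query like this; a sorted
table with binary search would be the obvious next step if lookups
became frequent.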

We do walk the stack from Scheme sometimes, notably when making a
backtrace.  So, we'll make the runtime translate the JIT return
addresses to virtual return addresses in the frame API.  To Scheme, it
will be as if all things were interpreted.

This strategy relies on the JIT being a simple code generator, not an
optimizer -- the state of the stack whether JIT or interpreted is the
same.  We can consider relaxing this in the future.

My current problem is knowing when a callee has JIT code.  Say you're in
JITted function F which calls G.  Can you directly jump to G's native
code, or is G not compiled yet and you need to use the interpreter?  I
haven't solved this yet.  "Known calls" that use call-label and similar
can of course eagerly ensure their callees are JIT-compiled, at
compilation time.  Unknown calls are the problem.  I don't know whether
to consider reserving another word in scm_tc7_program objects for JIT
code.  I have avoided JIT overhead elsewhere and would like to do so
here as well!

For actual JIT code generation, I think my current plan is to import a
copy of GNU lightning into Guile's source, using git-subtree merges.
Lightning is fine for our purposes as we only need code generation, not
optimization, and it supports lots of architectures: ARM, MIPS, PPC,
SPARC, x86 / x86-64, IA64, HPPA, AArch64, S390, and Alpha.

Lightning will be built statically into libguile.  This has the
advantage that we always know the version being used, and we are able to
extend lightning without waiting for distros to pick up a new version.
Already we will need to extend it to support atomic ops.  Subtree merges
should allow us to pick up upstream improvements without too much pain.
This strategy also allows us to drop lightning in the future if that's
the right thing.  Basically from the user POV it should be transparent.
The whole thing will be behind an --enable-jit / --disable-jit configure
option.  When it is working we can consider enabling shared lightning
usage.

Happy hacking,

Andy