Re: persistant state in guile-log

2016-01-27 Thread Andrew Gaylard

Hi Stefan,

This is definitely something that's of interest to me.  Closures are one 
of the great strengths of scheme, and have been very useful to me in the 
past.


I'd love it for guile to have an "official" way to work with them, 
including your load/save semantics.  What's the format of the saved 
state?  Is it human-readable, or readable only by guile?


In addition it would be great for guile to be able to reason about the 
current state: e.g. walk the call stack and examine what's in it, or 
even change it.  (Perhaps this is already possible, I haven't checked.)


--
Andrew

On 27/01/2016 10:13, Stefan Israelsson Tampe wrote:

Hi all,

In guile 2.1 the positions of code segments are fixed relative to certain 
vectors, and this makes it possible to store indexes of code segments and 
thereby persist closures. I took advantage of this because state in 
guile-log means that we must do exactly that: persist not only data 
structures like structs, lists, vectors, vhashes etc., but also closures. 
So now one can do cool things in guile-prolog like:


prolog>  X=1,stall,Y=2.
prolog> .setp 1
prolog> .savep
prolog> .quit

stis> guile-prolog
prolog> .loadp
prolog>  .refp 1
prolog> .cont

  X=1
  Y=2

prolog>

This is the interface:
----------------------
(.setp key)   associate the current state with key
(.refp key)   reinstate the state referenced by key as the current state
(.savep)      save all referenced states to disk
(.loadp)      load new referenced states from disk
(.cont)       continue a stalled predicate from the current state

I can make this persistence code into a library.  Anyone interested?

Oh, the security implications of this are horrible, but I don't pretend 
that guile-log is secure, so I don't care. What's more demanding is that 
it depends on groveling into guile's internal data structures. Therefore 
I am requesting an official way of persisting closures. What's your take 
on that? Would you guys want to supply such a mechanism, or is it bad 
practice?

regards
Stefan




Re: Memory accounting in libgc

2014-03-12 Thread Andrew Gaylard

On 03/12/14 08:57, Mark H Weaver wrote:

Andy Wingo wi...@pobox.com writes:


How does this affect libgc?

First of all, it gives an answer to the question of "how much memory
does an object use" -- simply stop the world, mark the heap in two parts
(the first time ignoring the object in question, the second time
starting from the object), and subtract the live heap size of the former
from the latter.  Libgc could do this without too much problem, it seems
to me, on objects of any kind.  It would be a little extra code but it
could be useful.  Or not?  Dunno.

This could be generalized to the far more useful question: "How much
memory does this set of objects use?", although that's a slippery
question that might better be formulated as "How much memory would be
freed if this set of objects were no longer needed?".

For example, suppose you have a large data structure that is referenced
from two small header objects, A and B.  If you ask "How much memory
does A use?", the answer will be the size of the small header, and ditto
for B.  Without being able to ask the more general question, there's no
way to find out how much would be freed by releasing both.

  Mark

Agreed.  In order to build industrial-strength applications in guile, it's
important to be able to answer questions such as "what is causing my
process' memory usage to grow?"
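
The "how much would be freed if this set of objects were no longer needed"
question can be illustrated on a toy heap.  The following is only a sketch
under stated assumptions: struct obj, mark(), live_bytes() and
retained_bytes() are hypothetical names invented here, the heap is a small
fixed array, and nothing in it is libgc or Guile API.  It marks from the
roots twice, once normally and once refusing to enter the set, and
subtracts the two live sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJS 16
#define MAX_REFS 4

/* Hypothetical toy heap object: a size plus a few outgoing references. */
struct obj {
    size_t size;
    int    nrefs;
    int    refs[MAX_REFS];   /* indices into the heap array */
};

/* Mark everything reachable from `start`, never entering objects
   flagged in `skip`. */
static void mark(const struct obj *heap, int start,
                 const unsigned char *skip, unsigned char *marked)
{
    if (skip[start] || marked[start])
        return;
    marked[start] = 1;
    for (int i = 0; i < heap[start].nrefs; i++)
        mark(heap, heap[start].refs[i], skip, marked);
}

/* Total size of the objects reachable from `roots`, ignoring `skip`. */
static size_t live_bytes(const struct obj *heap, int nobjs,
                         const int *roots, int nroots,
                         const unsigned char *skip)
{
    unsigned char marked[MAX_OBJS];
    size_t total = 0;

    assert(nobjs <= MAX_OBJS);
    memset(marked, 0, sizeof marked);
    for (int i = 0; i < nroots; i++)
        mark(heap, roots[i], skip, marked);
    for (int i = 0; i < nobjs; i++)
        if (marked[i])
            total += heap[i].size;
    return total;
}

/* Bytes that would be freed if every object in `set` were released:
   live size of the whole heap minus live size with the set ignored. */
size_t retained_bytes(const struct obj *heap, int nobjs,
                      const int *roots, int nroots,
                      const int *set, int nset)
{
    unsigned char none[MAX_OBJS] = {0};
    unsigned char skip[MAX_OBJS] = {0};

    for (int i = 0; i < nset; i++)
        skip[set[i]] = 1;
    return live_bytes(heap, nobjs, roots, nroots, none)
         - live_bytes(heap, nobjs, roots, nroots, skip);
}
```

With a root that references two 16-byte headers A and B, both pointing at
one 1000-byte structure, asking about A alone or B alone yields 16 bytes
each, while asking about the set {A, B} yields 1032 bytes.  That is the
asymmetry Mark describes.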

--
Andrew Gaylard




Re: The 2.0.9 VM cores in enqueue (threads.c:309)

2013-04-29 Thread Andrew Gaylard

On 04/28/13 03:07, Daniel Hartwig wrote:

On 28 April 2013 03:57, Andrew Gaylard a...@computer.org wrote:

Those 0x304 values look dodgy to me, and explain why the
SCM_SETCDR causes an invalid memory access.


0x304 is SCM_EOL.

Hi Daniel,

Thanks for the feedback.

Are you saying that the 0x304 values are fine, and the problem lies 
elsewhere?

(e.g. heap corruption, ...)

--
Andrew
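
One way to read the 0x304 values against the enqueue() listing from the
original report: the dumped q has word_0 (its car) equal to SCM_EOL while
word_1 (its cdr) is non-null, so line 309 would apply SCM_SETCDR to
SCM_EOL.  The sketch below models that with plain C structs.  It is an
illustration only: struct cell, struct queue, queue_header_ok() and
toy_enqueue() are hypothetical names, NULL stands in for SCM_EOL, and no
libguile code is used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Guile pair; NULL plays the role of SCM_EOL. */
struct cell {
    void        *car;
    struct cell *cdr;
};

/* Queue header laid out like the `q` passed to enqueue():
   the cdr points at the first element cell, the car at the last one. */
struct queue {
    struct cell *last;    /* car of q */
    struct cell *first;   /* cdr of q */
};

/* Returns 1 when the header is consistent: both ends empty or both set. */
int queue_header_ok(const struct queue *q)
{
    return (q->first == NULL) == (q->last == NULL);
}

/* Same branch structure as threads.c lines 306-310, on the toy types. */
void toy_enqueue(struct queue *q, struct cell *c)
{
    assert(queue_header_ok(q));   /* a header like the dumped q fails here */
    c->cdr = NULL;
    if (q->first == NULL)
        q->first = c;
    else
        q->last->cdr = c;         /* line 309: dereferences the tail */
    q->last = c;
}
```

A header whose first pointer is set but whose last pointer is empty, like
the one gdb printed, is rejected by queue_header_ok().  That would point
at whatever left the header in that state rather than at enqueue() itself,
though this sketch cannot say what that was.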




The 2.0.9 VM cores in enqueue (threads.c:309)

2013-04-27 Thread Andrew Gaylard

Hi guile hackers,

I'm experiencing the VM coring in a repeatable manner.

My application launches a number of threads, which pass objects
from one thread to another via queues (ice-9 q).  To ensure thread-
safety, the queues are actually accessed via (container async-queue)
from guile-lib-0.2.2; see:

http://git.savannah.gnu.org/gitweb/?p=guile-lib.git;a=blob;f=src/container/async-queue.scm;h=82841f12eefe42ef6dacbbca8f0057723964323b;hb=HEAD

The idea is that if one thread adds an object to a queue, while another
is taking an object off a queue, a mutex will (or should) ensure that only
one thread alters the underlying queue objects at a time.
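
The locking discipline described above can be sketched in C with a pthread
mutex.  This is a minimal illustration, not the guile-lib implementation:
struct locked_queue, lq_init(), lq_push() and lq_pop() are hypothetical
names, and nodes are allocated by the caller to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct node {
    void        *item;
    struct node *next;
};

/* Hypothetical mutex-guarded FIFO queue. */
struct locked_queue {
    pthread_mutex_t lock;
    struct node    *head;
    struct node    *tail;
    size_t          length;
};

void lq_init(struct locked_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = q->tail = NULL;
    q->length = 0;
}

/* Append a caller-allocated node; every pointer update happens while
   the lock is held, so a concurrent pop never sees a half-linked node. */
void lq_push(struct locked_queue *q, struct node *n, void *item)
{
    assert(n != NULL);
    n->item = item;
    n->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail == NULL)
        q->head = n;
    else
        q->tail->next = n;
    q->tail = n;
    q->length++;
    pthread_mutex_unlock(&q->lock);
}

/* Remove and return the oldest node, or NULL when the queue is empty. */
struct node *lq_pop(struct locked_queue *q)
{
    pthread_mutex_lock(&q->lock);
    struct node *n = q->head;
    if (n != NULL) {
        q->head = n->next;
        if (q->head == NULL)
            q->tail = NULL;
        q->length--;
    }
    pthread_mutex_unlock(&q->lock);
    return n;
}
```

The guarantee only holds if every access to the underlying queue goes
through the locked functions; any code path that touches head or tail
directly bypasses the mutex.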

I've built guile with --enable-debug, and compiled with -ggdb3.
After the VM cores, gdb reveals this (apologies for the long lines):

(gdb) bt
#0  0x7e77b5f4 in enqueue (q=0x1010892c0, t=0x1018aac20) at 
threads.c:309
#1  0x7e77bc20 in block_self (queue=0x1010892c0, 
sleep_object=0x1010892d0, mutex=0x1019eef00, waittime=0x0) at threads.c:452
#2  0x7e77df50 in fat_mutex_lock (mutex=0x1010892d0, 
timeout=0x0, owner=0x904, ret=0x734f92ac) at threads.c:1473
#3  0x7e77e0e0 in scm_lock_mutex_timed (m=0x1010892d0, 
timeout=0x904, owner=0x904) at threads.c:1513
#4  0x7e78e9f4 in vm_regular_engine (vm=0x1018aabd0, 
program=0x7e94a4d0 scm_lock_mutex_timed__subr_raw_cell, 
argv=0x734fa2c0, nargs=3) at vm-i-system.c:858
#5  0x7e7b2ea0 in scm_c_vm_run (vm=0x1018aabd0, 
program=0x1003a3720, argv=0x734fa2a8, nargs=3) at vm.c:753
#6  0x7e68b8ac in scm_call_3 (proc=0x1003a3720, arg1=0x404, 
arg2=0x101980cc0, arg3=0x1011fac40) at eval.c:500
#7  0x7e7810c0 in scm_catch (key=0x404, thunk=0x101980cc0, 
handler=0x1011fac40) at throw.c:73
#8  0x7e77cc60 in really_launch (d=0x7fffa6f0) at 
threads.c:1009
#9  0x7e67b390 in c_body (d=0x734fb9b8) at 
continuations.c:511
#10 0x7e781564 in apply_catch_closure (clo=0x101fd30c0, 
args=0x304) at throw.c:146

#11 0x7e73cc6c in apply_1 (smob=0x101fd30c0, a=0x304) at smob.c:142
#12 0x7e78e9b0 in vm_regular_engine (vm=0x1018aabd0, 
program=0x1002c8700, argv=0x734fb690, nargs=2) at vm-i-system.c:855
#13 0x7e7b2ea0 in scm_c_vm_run (vm=0x1018aabd0, 
program=0x1003a3720, argv=0x734fb670, nargs=4) at vm.c:753
#14 0x7e68b91c in scm_call_4 (proc=0x1003a3720, arg1=0x404, 
arg2=0x101fd30c0, arg3=0x101fd30a0, arg4=0x101fd3080) at eval.c:507
#15 0x7e7811f4 in scm_catch_with_pre_unwind_handler (key=0x404, 
thunk=0x101fd30c0, handler=0x101fd30a0, pre_unwind_handler=0x101fd3080) 
at throw.c:86
#16 0x7e781664 in scm_c_catch (tag=0x404, 
body=0x7e67b364 c_body, body_data=0x734fb9b8, 
handler=0x7e67b3ac c_handler, handler_data=0x734fb9b8, 
pre_unwind_handler=0x7e67b438 pre_unwind_handler, 
pre_unwind_handler_data=0x1002ccaf0) at throw.c:213
#17 0x7e67b14c in scm_i_with_continuation_barrier 
(body=0x7e67b364 c_body, body_data=0x734fb9b8, 
handler=0x7e67b3ac c_handler, handler_data=0x734fb9b8, 
pre_unwind_handler=0x7e67b438 pre_unwind_handler, 
pre_unwind_handler_data=0x1002ccaf0) at continuations.c:449
#18 0x7e67b52c in scm_c_with_continuation_barrier 
(func=0x7e77cb74 really_launch, data=0x7fffa6f0) at 
continuations.c:545
#19 0x7e77c924 in with_guile_and_parent 
(base=0x734fbb50, data=0x734fbc18) at threads.c:908
#20 0x7e32e138 in GC_call_with_stack_base () from 
/opt/cs/components/3rd/bdw-gc/7.2.7e16628s16377h0398/lib/libgc.so.1
#21 0x7e77ca40 in scm_i_with_guile_and_parent 
(func=0x7e77cb74 really_launch, data=0x7fffa6f0, 
parent=0x100272d80) at threads.c:951
#22 0x7e77cce0 in launch_thread (d=0x7fffa6f0) at 
threads.c:1019
#23 0x7e337e00 in GC_inner_start_routine () from 
/opt/cs/components/3rd/bdw-gc/7.2.7e16628s16377h0398/lib/libgc.so.1
#24 0x7e32e138 in GC_call_with_stack_base () from 
/opt/cs/components/3rd/bdw-gc/7.2.7e16628s16377h0398/lib/libgc.so.1
#25 0x7e33ba64 in GC_start_routine () from 
/opt/cs/components/3rd/bdw-gc/7.2.7e16628s16377h0398/lib/libgc.so.1

#26 0x7c9d8b04 in _lwp_start () from /lib/64/libc.so.1

(gdb) list
304   SCM c = scm_cons (t, SCM_EOL);
305   SCM_CRITICAL_SECTION_START;
306   if (scm_is_null (SCM_CDR (q)))
307 SCM_SETCDR (q, c);
308   else
309 SCM_SETCDR (SCM_CAR (q), c);
310   SCM_SETCAR (q, c);
311   SCM_CRITICAL_SECTION_END;
312   return c;
313 }
(gdb) p q
$21 = (SCM) 0x1010892c0
(gdb) p c
$22 = (SCM) 0x103aa4ad0
(gdb) p SCM_IMP(q)
$23 = 0
(gdb) p SCM_IMP(c)
$24 = 0
(gdb) p SCM2PTR(q)
$25 = (scm_t_cell *) 0x1010892c0
(gdb) p *SCM2PTR(q)
$26 = {word_0 = 0x304, word_1 = 0x1039c4c20}
(gdb) p SCM2PTR(c)
$27 = (scm_t_cell *) 0x103aa4ad0
(gdb) p *SCM2PTR(c)
$28 = {word_0 = 

Re: [PATCH] Bindings for ‘sendfile’

2013-03-21 Thread Andrew Gaylard

On 03/21/13 11:15, Ludovic Courtès wrote:

Noah Lavine noah.b.lav...@gmail.com skribis:

I've thought for a while that if I had time (which I know I won't) I would
make a module called (linux) with bindings for non-POSIX Linux kernel
features. What do you think of this idea? If so, what do you think of
putting sendfile there and expanding it with other functions as we need
them?

I’ve thought about it, but ended up with making sendfile work whether or
not the syscall is available (just like glibc does, after all).

So for this particular case, I’d rather keep it in the global name
space.  There’s also the untold argument that even if sendfile(2) is
unavailable, the loop written in C is going to be faster than the
equivalent bytecode.

Just another datapoint: Solaris 10 has sendfile.  So it's not just a 
Linux feature.


--
Andrew



Re: Core dump when throwing an exception from a resumed partial continuation

2013-03-21 Thread Andrew Gaylard

On 03/21/13 11:43, Andy Wingo wrote:

On Fri 15 Mar 2013 22:01, Brent Pinkney b...@4dst.com writes:


When I resume the continuation in another thread, all works perfectly
UNLESS the continued execution throws an exception.
Then guile exits with a core dump.

By contrast if I resume the continuation in the same thread and then
throw an exception all works as expected.

I think I know what this is.

So, a delimited continuation should capture that part of the dynamic
environment made in its extent.  (See Oleg Kiselyov and Chung-Chieh
Shan's Delimited Dynamic Binding paper.)  That is what Guile does, for
fluids, prompts, and dynamic-wind blocks.

Our implementation of exception handling uses a fluid,
%exception-handler (boot-9.scm:86).  However that fluid references a
stack of exception handlers on the heap.  There is the problem: an
exception in a reinstated delimited continuation will walk
the captured exception handler stack from the heap, not from its own
dynamic environment.  Therefore it could abort to a continuation that is
not present on the new thread.

The solution is to have the exception handler find the next handler from
the dynamic environment.  This will need a new primitive to walk the
dynamic stack, I think.

I can't look at this atm as I broke my arm (!) and so typing is tough.
For now as a workaround I suggest you put a catch #t in each of your
delimited continuations.  This way all throws will be handled by catches
established by the continuation.

Regards,

Andy

Andy,

Thanks for giving this some thought -- sorry to hear about your arm!

This does shed some light on things. If I change this:

(throw 'oops) ; should not crash the vm

to this:

(catch #t
  (λ ()
    (throw 'oops)) ; should not crash the vm
  (λ ()
    (display "Success!")(newline))) ; never reached

the VM still cores; Success is never shown. However, you've probably
spotted my mistake: the handler should be (λ (key . args) ... ).

But this core shows up differently in the stack-trace in gdb:

#0 scm_error (key=0x1001854c0, subr=0x0, message=0x7e7ef518 
Wrong number of arguments to ~A, args=0x100db95b0, rest=0x4) at 
error.c:62


... which is exactly the exception one would expect. Fixing the handler 
thus:


(catch #t
  (λ ()
    (throw 'oops)) ; should not crash the vm
  (λ (key . args)
    (display "Success!")(newline))) ; works!

...solves the problem, and the VM doesn't core any more.

So it seems that although we *did* have a catch around our resumption,
there must have been some (different) error in its handler, which caused a
second exception, which caused the VM to crash.

Unfortunately, the test-case we made handles this second exception fine.
It'd be great to be able to distill this problem down to a pithy test-case.
(Our app is 4500 lines and still growing, so it's not really a candidate to
send to the list.)

The same problem happens (VM cores) if I do this:

(catch 'not-oops
  (λ ()
    (throw 'oops)) ; should not crash the vm
  (λ (key . args)
    (display "Success!")(newline))) ; never reached

So your answer to surround the resumption with a (catch #t ...) is a
good workaround. For our code, anyway.

(I'm now off to go read 
http://www.cs.indiana.edu/~sabry/papers/delim-dyn-bind.pdf :)

--
Andrew




Re: Core dump when throwing an exception from a resumed partial continuation

2013-03-19 Thread Andrew Gaylard

On 03/15/13 23:30, Andy Wingo wrote:

On Fri 15 Mar 2013 22:01, Brent Pinkney b...@4dst.com writes:


I am using partial continuations to resume a computation when an
external system returns with an answer.
I am using (call-with-prompt ...) and (abort-to-prompt)

When I resume the continuation in another thread, all works perfectly

Neat :)


UNLESS the continued execution throws an exception.
Then guile exits with a core dump.

That's not good!  Can you work up a short test case?


We've tried to create a short test-case.  Unfortunately, it doesn't seem 
to trigger the core.  However, the app we're creating triggers the 
core-dump every time. So, to dig into this problem, I built a debuggable VM.


What we see in the debuggable cores is the first backtrace.  You'll note 
that aside from the stack overflow at frame #3, the pattern of "Abort to 
unknown prompt" is repeated /ad infinitum/. Well, certainly to a stack 
depth of 28,000 :).  So the stack overflow is understandable.  I guess 
the question is, "why does guile get stuck in a loop aborting to an 
unknown prompt?".


This is on Linux x86 Ubuntu 12.04, both 32- and 64-bit.  The same code 
crashes the same VM at the same point on Solaris SPARC 64-bit, but that 
core does not appear to show this repetitive pattern.  When I say "the 
same VM", I mean it: all dependencies except for the kernel and libc 
are built from identical sources, using as near as possible the same 
configure flags:


   gcc-4.7.2
   bdw-gc-7.2d
   libtool-2.2.10
   gmp-5.0.2
   libiconv-1.14
   libunistring-0.9.3
   libffi-3.0.10
   readline-6.1
   guile-2.0.7

Ubuntu's guile also shows the same problem.

To understand how guile gets into this state, I put a breakpoint in the 
VM at the point where it first calls abort. That reveals the second 
backtrace below.  This shows what happens immediately before the VM goes 
bananas, and fills up the stack.  Which is exactly what happens when gdb 
allows guile to continue beyond the breakpoint.


I then tried stepping through the scm_c_abort code in frame #2, and it 
indeed does not find anything in the wind list.  Certainly, the list 
returned by scm_i_dynwinds has 12 entries in it.  It's just that none of 
them match.


I'd be really grateful for any help on this -- as you can tell, I'm not 
a VM hacker!


--
Andrew

#0  0x0033b416 in __kernel_vsyscall ()
#1  0x004ff1df in __GI_raise (sig=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64

#2  0x00502825 in __GI_abort () at abort.c:91
#3  0x00a106b7 in vm_error_stack_overflow (vp=0x9c9cfc0) at vm.c:516
#4  0x00a204a4 in vm_regular_engine (vm=0x9cb29e8, program=0x93b50d0, 
argv=0xac055e70, nargs=4) at vm-engine.c:166
#5  0x00a34d00 in scm_c_vm_run (vm=0x9cb29e8, program=0x93b50d0, 
argv=0xac055e70, nargs=4) at vm.c:741
#6  0x00a3585e in scm_call_with_vm (vm=0x9cb29e8, proc=0x93b50d0, 
args=0x304) at vm.c:1033
#7  0x009763e1 in scm_apply (proc=0x93b50d0, arg1=0x95dd4c8, 
args=0x95dd4c8) at eval.c:748
#8  0x00975f7c in scm_apply_1 (proc=0x93b50d0, arg1=0x937df10, 
args=0x95dd4d0) at eval.c:588

#9  0x00a0ba1d in scm_throw (key=0x937df10, args=0x95dd4d0) at throw.c:104
#10 0x00a102ff in vm_error (msg=0xa6631d VM: Too many arguments, 
arg=0x16) at vm.c:414

#11 0x00a105e6 in vm_error_too_many_args (nargs=5) at vm.c:490
#12 0x00a11a42 in vm_regular_engine (vm=0x9cb29e8, program=0x93b50d0, 
argv=0xac056770, nargs=5) at vm-engine.c:104
#13 0x00a34d00 in scm_c_vm_run (vm=0x9cb29e8, program=0x93b50d0, 
argv=0xac056770, nargs=5) at vm.c:741
#14 0x00a3585e in scm_call_with_vm (vm=0x9cb29e8, proc=0x93b50d0, 
args=0x304) at vm.c:1033
#15 0x009763e1 in scm_apply (proc=0x93b50d0, arg1=0x95dd550, 
args=0x95dd550) at eval.c:748
#16 0x00975f7c in scm_apply_1 (proc=0x93b50d0, arg1=0x9362130, 
args=0x95dd558) at eval.c:588

#17 0x00a0ba1d in scm_throw (key=0x9362130, args=0x95dd558) at throw.c:104
#18 0x00a0c097 in scm_ithrow (key=0x9362130, args=0x95dd558, noreturn=1) 
at throw.c:441
#19 0x009735bf in scm_error_scm (key=0x9362130, subr=0x99105b0, 
message=0x99105c0, args=0x95dd5c8, data=0x4) at error.c:95
#20 0x00973576 in scm_error (key=0x9362130, subr=0xa4cd3b abort, 
message=0xa4cd23 Abort to unknown prompt, args=0x95dd5c8, rest=0x4) at 
error.c:62
#21 0x00973b6b in scm_misc_error (subr=0xa4cd3b abort, 
message=0xa4cd23 Abort to unknown prompt, args=0x95dd5c8) at error.c:316
#22 0x0096aef5 in scm_c_abort (vm=0x9cb29e8, tag=0x9c08af0, n=5, 
argv=0xac056960, cookie=6614) at control.c:209

#23 0x00a0fe36 in vm_abort (vm=0x9cb29e8, n=0, vm_cookie=6614) at vm.c:264
#24 0x00a18942 in vm_regular_engine (vm=0x9cb29e8, program=0x93b5260, 
argv=0xac0571f4, nargs=6) at vm-i-system.c:1528
#25 0x00a34d00 in scm_c_vm_run (vm=0x9cb29e8, program=0x93b50d0, 
argv=0xac0571e0, nargs=5) at vm.c:741
#26 0x00a3585e in scm_call_with_vm (vm=0x9cb29e8, proc=0x93b50d0, 
args=0x304) at vm.c:1033
#27 0x009763e1 in scm_apply (proc=0x93b50d0, arg1=0x95dd678, 
args=0x95dd678) at eval.c:748
#28 0x00975f7c in scm_apply_1 (proc=0x93b50d0, 

Re: My Guile 2.0.8 TODO list

2013-03-06 Thread Andrew Gaylard

On 03/05/13 23:14, Mark H Weaver wrote:

FYI, here's what I'm hoping to get into Guile 2.0.8.

 Mark


2.0.8 TODO
==========

* [SUBMITTED] Refactor pending numerics patches.

* [SUBMITTED] Implement Dybvig and Burger's algorithm for printing
   floats.

* [NEEDS REVISION] Fix BOM handling.

* #!optional and #!rest reader handling.

* Add command-line option to augment %load-compiled-path.

* Change 'sqrt' to return an exact rational when possible.

* Optimize overflow check in scm_product (i.e. avoid the division).

* Add call/ec and let/ec.

* Implement optional arguments to vector-copy (SRFI-43)

* Make sure that 'syntax-parameterize' reports an error (or at least a
   warning) if the associated identifier is not defined as a syntax
   parameter.

* [Ludovic?] Fix par-map and par-for-each to not overflow the stack for
   large lists.

* [Daniel Hartwig?] Support Relative URIs in (web uri)

* [Ian Price?] Fix flawed index range check in bytevector accessors.


Hi Mark,

Two suggestions for 2.0.8:

It'd be great to get Andy's (oop goops save) patch in.
See http://lists.gnu.org/archive/html/guile-user/2013-01/msg00088.html

Also, Nala's colourised REPL patch is pretty cool; see
https://lists.gnu.org/archive/html/guile-devel/2012-12/msg00062.html

--
Andrew




Re: [patch] get 1.8.8 to build on Solaris 10u9

2011-04-29 Thread Andrew Gaylard
[resending -- this time to the list.  Sorry for the noise.]

On Thu, Apr 28, 2011 at 7:55 PM, Andy Wingo wi...@pobox.com wrote:
 Hi Andrew,

 On Thu 28 Apr 2011 17:33, Andrew Gaylard a...@computer.org writes:

 With the attached patch, I can build and run guile-1.8.8 on Solaris.
 It seems that the old logic that used USRSTACK no longer works,
 so I took it out.

 Tested on Solaris 10u9, on both SPARC64 and x86_64.

 Thanks for the patch.  Do you have access to other versions of Solaris?
 We would need to test this patch under them as well.

 Andy

Hi Andy,

I've tested on a Solaris-9 SPARC zone, with these results:

gmake[4]: Entering directory
`/export/home/andrewg/guile/branches/1.8.8/src/guile-1.8.8/test-suite/standalone'
PASS: test-system-cmds
PASS: test-require-extension
PASS: test-bad-identifiers
PASS: test-num2integral
PASS: test-round
PASS: test-gh
PASS: test-asmobs
PASS: test-list
FAIL: test-unwind
PASS: test-conversion
PASS: test-fast-slot-ref
PASS: test-use-srfi
PASS: test-scm-c-read
PASS: test-scm-take-locale-symbol
PASS: test-with-guile-module
PASS: test-scm-with-guile
==
1 of 16 tests failed
Please report to bug-gu...@gnu.org
==

This occurs with and without the patch to guile-1.8.8/libguile/threads.c.
Without the patch to guile-1.8.8/libguile/gc_os_dep.c, the build doesn't
even complete:

gc_os_dep.c: In function `scm_get_stack_base':
gc_os_dep.c:1909: error: `USERLIMIT' undeclared (first use in this function)
gc_os_dep.c:1909: error: (Each undeclared identifier is reported only once
gc_os_dep.c:1909: error: for each function it appears in.)

So this patch is good to go, I think.

I've tried debugging the core left behind by test-unwind by rebuilding with -g,
but I get this far, after which I'm stuck:

Core was generated by
`/export/home/andrewg/guile/branches/1.8.8/src/guile-1.8.8/test-suite/standalone'.
Program terminated with signal 11, Segmentation fault.
#0  0x7f8bf714 in scm_i_dowinds (to=0xfd740, delta=-1,
turn_func=0x7f8b6764 copy_stack, data=0xffbfec10) at dynwind.c:303
303   if (FRAME_P (wind_elt))
(gdb) bt
#0  0x7f8bf714 in scm_i_dowinds (to=0xfd740, delta=-1,
turn_func=0x7f8b6764 copy_stack, data=0xffbfec10) at dynwind.c:303
#1  0x7f8bf6f0 in scm_i_dowinds (to=0xfd750, delta=-2,
turn_func=0x7f8b6764 copy_stack, data=0xffbfec10) at dynwind.c:300
#2  0x7f8b6854 in copy_stack_and_call (continuation=0x105668, val=0x4,
dst=0xffbfeec4) at continuations.c:222
#3  0x7f8b698c in scm_dynthrow (cont=0xfd758, val=0x4) at continuations.c:275
#4  0x7f8b6758 in grow_stack (cont=0x7f9bb6c4, val=0xffbfefd0) at
continuations.c:187
#5  0x7f8b6974 in scm_dynthrow (cont=0x0, val=0x0) at continuations.c:271
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) list
298   SCM wind_key;
299
300   scm_i_dowinds (SCM_CDR (to), 1 + delta, turn_func, data);
301   wind_elt = SCM_CAR (to);
302
303   if (FRAME_P (wind_elt))
304 {
305   if (!FRAME_REWINDABLE_P (wind_elt))
306 scm_misc_error ("dowinds",
307 "cannot invoke continuation from this context",
(gdb) p wind_elt
$1 = (SCM) 0x0
(gdb) p to
$2 = (SCM) 0xfd740

I'm pretty sure this is not due to the patch to guile-1.8.8/libguile/threads.c,
since it happens with and without it.

-- 
Andrew



Re: [patch] get 1.8.8 to build on Solaris 10u9 -- Solaris stack layout

2011-04-29 Thread Andrew Gaylard
Hi, I don't know if this is useful, but here's some more background...

The old code in guile-1.8.8/libguile/gc_os_dep.c used to do this:

#   define STACKBOTTOM ((ptr_t) USRSTACK)

... which is mentioned in the Solaris-10 headers...

$ find /usr/include/ | xargs grep USERLIMIT
/usr/include/sys/vmparam.h:#define  USRSTACK        USERLIMIT
/usr/include/sys/vmparam.h:#define  USRSTACK32      USERLIMIT32
/usr/include/sys/param.h:#define    USERLIMIT       _userlimit
/usr/include/sys/param.h:#define    USERLIMIT32     _userlimit32

However, I can't find the _userlimit or _userlimit32 symbols on Solaris 9 or 10.
So I guess Sun/Oracle's changed how this works.  Hence my patch to
guile-1.8.8/libguile/gc_os_dep.c .

I've found these two links which make it pretty clear what the Solaris
stack layout is,
at least in 10 and 11:

http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/sun4/os/startup.c#490
http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/i86pc/os/startup.c#347

If I understand them correctly, the ASCII-art diagrams indicate that
the stack grows
downwards from USERLIMIT, which is set to various values depending on x86/SPARC
and 32-/64-bit,  towards ss_sp.  The maximum size that the stack can grow to is
set by ulimit, and is read in the ss_size field of the struct
populated by stack_getbounds().

Using the stack_getbounds() call, we can do some checks.  This is on
Solaris-10/SPARC:

$ cat ./stackbounds.c
#include <ucontext.h>
#include <stdio.h>

int main( int argc, char* argv[] )
{
    stack_t stack;
    stack_getbounds( &stack );
    printf( "ss_sp=%p\nss_size=%dkiB\nuserlimit=%p\n",
        stack.ss_sp,
        (int)( stack.ss_size/1024 ),
        (void *)( (char *)stack.ss_sp + stack.ss_size ) );
    return 0;
}

$ gcc -m64 ./stackbounds.c

$ ./a.out
ss_sp=7f80
ss_size=8192kiB
userlimit=8000

$ gcc -m32 ./stackbounds.c

$ ./a.out
ss_sp=ff40
ss_size=8192kiB
userlimit=ffc0

Clearly, the userlimit values detected here match those of the diagrams in
the Solaris source files. (Solaris-9/SPARC gives the identical output as for
the 32-bit case;  I don't have a 64-bit gcc on the sol-9 box, so I
can't test that.)

On Solaris-10/x86, we get this:

$ gcc -m64 ./stackbounds.c

$ ./a.out
ss_sp=fd7ffee0
ss_size=16384kiB
userlimit=fd7fffe0

$ gcc -m32 ./stackbounds.c

$ ./a.out
ss_sp=7048000
ss_size=16384kiB
userlimit=8048000

Here, the 64-bit value is pretty close (2048kiB) to the diagram's value of
0xFD80., and the 32-bit value matches exactly.
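
The arithmetic behind the "userlimit" column is just ss_sp + ss_size for a
downward-growing stack.  The sketch below restates it as a portable helper
so the 32-bit x86 figures above can be checked without a Solaris box.
stack_base() is a hypothetical name, and its parameters merely mirror the
ss_sp and ss_size fields that stack_getbounds() fills in; it does not call
any Solaris API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the lowest stack address (ss_sp) and the
   maximum stack size (ss_size), return the address the stack starts
   from.  For a downward-growing stack that is the top of the region. */
uintptr_t stack_base(uintptr_t ss_sp, size_t ss_size, int grows_up)
{
    assert(ss_size <= UINTPTR_MAX - ss_sp);   /* no address wrap-around */
    return grows_up ? ss_sp : ss_sp + ss_size;
}
```

For the 32-bit x86 run, stack_base(0x7048000, 16384 * 1024, 0) gives
0x8048000, which is the userlimit value printed above.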

So I'm pretty confident that my patch to guile-1.8.8/libguile/threads.c
does the right thing.  The patch to guile-1.8.8/libguile/gc_os_dep.c
is harder to analyse, since it appears that the HEURISTIC code
installs a SEGV handler, and probes around to find the end of the
stack.  However, I tried disabling the heuristics, and setting
# define STACKBOTTOM ((ptr_t)(0xff40))
in line 714 with no change in the result of test-unwind.  So I'm
pretty sure that this test's failing for other reasons.  And TBH, I'm
reluctant to invest a lot of time in getting guile to work on Solaris-9 :)

I hope this helps,
- Andrew



[patch] get 1.8.8 to build on Solaris 10u9

2011-04-28 Thread Andrew Gaylard
Hi,

With the attached patch, I can build and run guile-1.8.8 on Solaris.
It seems that the old logic that used USRSTACK no longer works,
so I took it out.

Tested on Solaris 10u9, on both SPARC64 and x86_64.

- Andrew
--- guile-1.8.8/libguile/gc_os_dep.c.orig	Mon Dec 13 19:25:01 2010
+++ guile-1.8.8/libguile/gc_os_dep.c	Fri Apr 15 14:03:13 2011
@@ -714,11 +714,8 @@
 /*  # define STACKBOTTOM ((ptr_t)(_start)) worked through 2.7,  */
 /*  but reportedly breaks under 2.8.  It appears that the stack */
 /*  base is a property of the executable, so this should not break  */
 /*  old executables.*/
-/*  HEURISTIC2 probably works, but this appears to be preferable.   */
-#   include <sys/vm.h>
-#   define STACKBOTTOM ((ptr_t) USRSTACK)
 #	ifndef USE_MMAP
 #	define USE_MMAP
 #	endif
 #   ifdef USE_MMAP


[patch] implement scm_init_guile for 1.8.8 on Solaris 10u9

2011-04-28 Thread Andrew Gaylard
Hi,

The attached patch implements the scm_init_guile function on Solaris.
The detection of the stack parameters is done via a new(ish) Solaris
function, stack_getbounds() -- see
http://download.oracle.com/docs/cd/E19253-01/816-5168/stack-getbounds-3c/index.html
for details.

Tested on Solaris 10u9, on both SPARC64 and x86_64.

- Andrew

PS: now, on to get 2.0.1 working...
--- guile-1.8.8/libguile/threads.c.orig	Mon Dec 13 19:24:40 2010
+++ guile-1.8.8/libguile/threads.c	Wed Apr 27 20:07:34 2011
@@ -689,8 +689,25 @@
 {
   return scm_get_stack_base ();
 }
 
+#elif defined (sun)
+
+#define HAVE_GET_THREAD_STACK_BASE
+#include <ucontext.h>
+static SCM_STACKITEM *
+get_thread_stack_base ()
+{
+  stack_t stack;
+  stack_getbounds( &stack );
+
+#if SCM_STACK_GROWS_UP
+  return stack.ss_sp;
+#else
+  return stack.ss_sp + stack.ss_size;
+#endif
+}
+
 #endif /* pthread methods of get_thread_stack_base */
 
 #else /* !SCM_USE_PTHREAD_THREADS */