> From: Taylor R Campbell <campb...@mumble.net>
> Date: Sun, 19 May 2013 17:08:52 +0000
>
> [...]
>
> For microcode primitives, it is not a priori the case that
> interrupts are disabled on entry.
Le Machine is in what state? Not "in a callout"? It is not up and running, creating callback tokens and making callouts to register them with the toolkit? You'll have to lay a usage scenario on me, bro. And explain why we are talking about "microcode primitives" now, and not "callbacks". If you want to energize Le Machine (initted with a disk-save continuation?) from a call"back" with no callout, that's... wild. Good luck. I hope I can follow all that AND the thing that I can't even spell: lugjmp?

> I will admit that I haven't looked very closely at your FFI's
> implementation. The magic happening in the macros is very hard to
> follow (C-INCLUDE, for example, expands to nothing -- it seems that
> you are abusing macros for side effects rather than expansion),

I needed the info at syntax-time, and had no use for it at run-time. Most of the magic simplifies C type declarations (using the entire set of included declarations) to determine whether to use c-peek-char or c-peek-pointer when the programmer has only said "GSList->next".

> and using it without installing things in
> $PREFIX/lib/mit-scheme-$ARCH doesn't seem to be supported.

You can install shims in the first (existing) directory on your MITSCHEME_LIBRARY_PATH, i.e. anywhere that exists. I thought I said something like that in the manual, but now I can't find it. Maybe I'll just twiddle the example to install in $HOME/.scheme-9.1/... I decided not to search all of MITSCHEME_LIBRARY_PATH to make it hard to load inconsistent -const.bin and -shim.so files (from different directories along the path), but I could change that, especially if I had some code to check a hash tag or sump'n.

> It's also not clear to me why Scheme needs to memorize so much
> information about the C platform's ABI

Constants, sizes and offsets are hard-coded -- "filled in". I don't want to callout to e.g. Scm_gslist_next just to see if I've reached the end of a Glib list.
I want to peek instead, something that could be inlined into a few instructions, NOT something that requires multiple insults to the hallowed CPU pipeline (obligatory interjection: "All hail the pipeline!") like register flipping and stack switching.

> (and the grovelling mechanism will get in the way of any attempt at
> cross-compilation), when you're already generating C code for the
> shims.

I didn't notice any problems while cross-compiling. My Gtk interface works in i386, x86_64 and (32 *and* 64 bit) C. ?

Geez, YOU recommended the grovelling mechanism to me (in 2006)!

> From: Taylor R Campbell <campb...@mumble.net>
> Date: Thu, 10 Aug 2006 20:24:35 +0000
>
> [...]
> Other Lisp FFI utilities, such as sb-grovel[1], cffi-grovel[2], and
> s48-grovel[3], (hmm, notice a trend here?), take this approach:
> generate C code to generate Lisp code with the appropriate values
> filled in.

I have some developer-level documentation that has grown stale over the years, but most of the following is still accurate. Perhaps it can help. You might just skip to "@node C FFI Callbacks"...

@node C FFI, Gtk, Microcode, Top
@chapter The C Libraries' Foreign Function Interface

@insertcopying
@end ifnottex

@menu
* C FFI Modifications:: Changes to the stock Scheme machine.
* C FFI Callouts:: Details of the code generated for callout trampolines.
* C FFI Callbacks:: Details of the code generated for callback trampolines.
* C FFI Build:: Building the microcode. Installing the Scheme code.
@end menu

This chapter describes how to add a Foreign Function Interface to an MIT/GNU Scheme v7.7.90+ build. It also provides an overview of the implementation of the FFI, including especially the callout and callback trampolines that are generated. It is assumed the reader is familiar with the FFI at the user level.

@c @xref{Top,, Introduction, mit-scheme-ffi, FFI Users' Manual}.
@c In HTML, I see "See Introduction." with "Introduction" linked to
@c http://birkholz.chandler.az.us/~matt/Scheme/FFI/mit-scheme-ffi.html#Top
@c In Info, I see "*Note Introduction:" followed by "(mit-scheme-ffi)Top".
@xref{Top,, The FFI Users' Manual, mit-scheme-ffi, The FFI Users' Manual}.

Most of the FFI code is concerned with loading and analyzing the C type information, the @file{.cdecl} files. The resulting data structure (a c-includes record) is used by the code generator and the syntax expanders. It contains indices of the declared C types, constants and the @code{alien-function} address caches. Toolkit data addresses (aliens) are the only other runtime object. The c-includes record is not needed once @code{c-generate} has been run and all syntax expanded.

[organization of source code files/packages]

The rest of this section looks at the FFI's data types and its groveler. Subsequent sections discuss the modifications to the Scheme machine, the callout and callback trampolines, and how to build the entire system.

@section Runtime Objects

@strong{Aliens} are Scheme wrappers for C data structures. Each contains a memory address split into two fixnum halves. An alien may have a C type description attached, for debugging purposes or perhaps some future runtime type checking facility.

@strong{Alien functions} are used by the @code{C-call} syntax to cache trampoline entry addresses. They are implemented by a named vector type so that they can be fasdump/loaded. Some attempt is made to share these objects among multiple @code{C-call} syntax expansions. The cached entry addresses are only valid during the current process, so each alien function includes a @code{band-id} member. When the (possibly fasloaded) band ID does not match the current band's ID, the cache is invalid. The runtime system's @code{dld-*} procedures are used to fill the cache (on demand).

@section Syntax Time

@strong{Cdecls} are expressions found in @file{.cdecl} files.
They are read by the @code{include-cdecls} procedure and assembled into a @code{c-includes} data structure.

@strong{Ctypes} are the validated cdecls found in the @code{c-includes} structure. They are examined using a set of abstract procedures. An example of each expression is given below with the procedure that recognizes it.

@multitable {@code{(struct Name (Member ctype)...)}} {ctype/struct-named?}
@item @code{char} @tab ctype/basic?
@item @code{(const char)} @tab ctype/const?
@item @code{(* char)} @tab ctype/pointer?
@item @code{(struct Name)} @tab ctype/struct-name?
@item @code{(struct (Member ctype)...)} @tab ctype/struct-anon?
@item @code{(struct Name (Member ctype)...)} @tab ctype/struct-named?
@item @code{(union Name)} @tab ctype/union-name?
@item @code{(union (Member ctype)...)} @tab ctype/union-anon?
@item @code{(union Name (Member ctype)...)} @tab ctype/union-named?
@item @code{(enum Name)} @tab ctype/enum-name?
@item @code{(enum (Member)...)} @tab ctype/enum-anon?
@item @code{(enum Name (Member)...)} @tab ctype/enum-named?
@end multitable

Note that the target types of pointer types are not currently validated.

@section Groveler

The @code{c-generate} procedure reads a @file{@i{library}.cdecl} (and included) file(s) and writes three new ones.

@table @file
@item @i{library}.c
gets the callout and callback trampolines.
@item @i{library}-types.bin
gets a fasdump of the @code{c-includes} structure @emph{without} the @code{enum-values} and @code{struct-values} members. These are loaded from the @file{@i{library}-const.scm} file generated by the groveler.
@item @i{library}-constants.c
gets the groveler.
@end table

The groveler is the C program that outputs C constants in Scheme syntax. It generates a @file{@i{library}-const.scm} file that can be (fas)loaded by the @code{C-include} syntax. The @file{.scm} file should contain a list of two things. The first is an association list of enum constant values indexed by constant name.
The second contains the size of each C @code{struct} type and the offset and type of each struct member. This information is repeated for any aliases. For example, these two Cdecls

@example
(struct A (B int) (C int))
(typedef D (struct A))
@end example

produce the following list of struct values.

@example
((sizeof (struct |A|)) . 8)
((offset (struct |A|) |B|) . 0)
((offset (struct |A|) |C|) . 4)
((sizeof |D|) . 8)
((offset |D| |B|) . 0)
((offset |D| |C|) . 4)
@end example

The @code{C-include} syntax loads a @code{c-includes} structure from a @file{@i{library}-types.bin} file and adds to it the enum and struct values loaded from a @file{@i{library}-const.bin} file.

@node C FFI Modifications, C FFI Callouts, C FFI Overview, Top
@section Modifications

This FFI adds several new primitives to the Scheme machine. These can be found in the @file{pruxffi.c} and @file{pruxffi.h} files. It also requires a few changes to the @code{Interpret()} function itself, adding an argument, support for two new aborts, and a @code{callback-handler} slot in the fixed objects vector. The complete set of patches can be found in the @file{microcode.patch} file, which modifies the following files.

@itemize @bullet
@item @file{Makefile.in}
Primarily adds a rule for the new @file{pruxffi.o} object. Several other changes support the @file{prhello} example.
@item @file{boot.c}
The C data stack (@code{callout_obstack}) is initialized, e.g. next to the initialization of @code{scratch_obstack}. Also, the @code{Interpret} function's old @code{pop_return_p} parameter is back.
@item @file{configure} and @file{configure.ac}
Adds the @file{pruxffi} module whenever @file{pruxdld} is available. Changing @file{configure} as well as @file{configure.ac} means you will not need to run @code{autoconf}.
@item @file{const.h}
Add @code{PRIM_RETURN_TO_C} and @code{PRIM_ABORT_TO_C}, two new ways of exiting the interpreter and leaving it ready for re-entry via @code{Interpret(1)}.
@item @file{extern.h}
Add declarations for @code{callout_obstack} and @code{find_primitive_cname}. Modify the declaration of @code{Interpret}.
@item @file{fixobj.h}
Add a @code{callback-handler} slot to the fixed objects vector.
@item @file{interp.c}
Add a @code{pop_return_p} parameter to @code{Interpret}. Implement the new @code{PRIM_RETURN_TO_C} and @code{PRIM_ABORT_TO_C} aborts.
@item @file{primutl.c}
A @code{find_primitive_cname} function is needed. There is a similar function, @code{find_primitive} that takes a Scheme string. A few modifications turn it into @code{find_primitive_cname}, in terms of which @code{find_primitive} is easily re-implemented.
@item @file{utabmd.scm}
Add the @code{callback-handler} slot.
@end itemize

@heading @file{pruxffi.c}

This file extends the microcode with the following primitives and functions.

@itemize @bullet
@item
The @code{c-peek-} and @code{c-poke-} primitives for each of the basic C types.
@item
The @code{c-peek-cstring} and @code{c-peek-cstringp} primitives help deal with the ubiquitous, null-terminated @code{* char} data.
@item
Utility primitives @code{c-malloc} and @code{c-free}.
@item
The callout primitives @code{c-call} and @code{c-call-continue}.
@item
The callback primitives @code{run-callback} and @code{return-to-c}.
@item
Functions referenced by the generated trampolines, like @code{callout_continue} and @code{Setup_Callback}.
@end itemize

@heading @file{pruxffi.h}

The @file{pruxffi.h} file includes macros that implement a C data stack, @verb{"CStack"}, abstraction with methods like @verb{"CStack_Push"}, @verb{"CStack_LPop"} and @verb{"CStack_Pop_Frame"}. The push method is used in the first half of callout trampolines to save return values from the C library. The pop methods are used in the second half while converting the C values to Scheme values.
The abstraction is implemented on an obstack, @verb{"callout_obstack"}, used simply as an automatically growing contiguous memory segment (with base and top pointers). The implementation never uses @verb{"obstack_finish"} --- just @verb{"obstack_grow"}.

@node C FFI Callouts, C FFI Callbacks, C FFI Modifications, C FFI
@section Callouts

Callout trampolines are split into two parts. The first part is run by the @code{c-call} primitive. It converts the Scheme arguments and calls the C function, saving the returned value on a C data stack. Then it arranges for the second part to run by hacking its continuation and aborting. The hack substitutes the @code{c-call-continue} primitive for @code{c-call} in the primitive apply frame at the top of the Scheme stack. The abort causes the interpreter to retry the primitive application, this time applying @code{c-call-continue}.

The second part, run by the @code{c-call-continue} primitive, pops the C function's return value off the C data stack and conses the corresponding Scheme return value. The pop is delayed until all consing is complete, making this part restartable after a GC abort. If the consing does abort for GC, any heap addresses used in the first part of the trampoline (during argument marshalling) will be invalidated, but this second part (return value consing) does not use (actually has no access to) these invalid pointers. Once the Scheme value is successfully constructed, the @code{c-call-continue} primitive can return ``normally'', as though from the call to @code{c-call}.

For each @code{extern} cdecl, e.g.

@smallexample
(extern (* GtkWidget) gtk_window_new (type GtkWindowType))
@end smallexample

the @code{gen-callout-trampolines} procedure generates a two-part callout trampoline. The first part might look like this.

@verbatim
void
Scm_gtk_window_new (void)
{
  /* Declare C args and return value. */
  GtkWidget * ret0;
  GtkWindowType type;

  /* Init C args. Aborts are OK; they will restart this function.
   */
  if (GET_LEXPR_ACTUALS < 3)
    {
      signal_error_from_primitive (ERR_WRONG_NUMBER_OF_ARGUMENTS);
    }
  type = arg_integer (3);

  /* Call the C function, but first swap c-call-continue for c-call
     and back out of the primitive. No more aborts! */
  prepare_callout_continuation ();
  ret0 = gtk_window_new (type);
  prepare_for_callout_results ();

  /* Save C return value. */
  CStack_Push (GtkWidget *, ret0);
  callout_continue (&Scm_continue_gtk_window_new);
  /* NOTREACHED */
}
@end verbatim

The matching second part might look like this.

@verbatim
SCHEME_OBJECT
Scm_continue_gtk_window_new (void)
{
  /* Declare. */
  char * tos0;
  GtkWidget * ret0;
  SCHEME_OBJECT ret0a;

  /* Restore. */
  CStack_top_of_results (tos0);
  CStack_LPop (GtkWidget *, ret0, tos0);

  /* Return. */
  set_alien_address (ARG_REF (2), (void*)ret0);
  ret0a = UNSPECIFIC;
  pop_callout (tos0);
  return (ret0a);
}
@end verbatim

The above example does not actually cons in the second part, but it easily could with something as simple as @code{long_to_integer}.

The @code{c-call-continue} primitive must manage the C data stack carefully to stay restartable. It decrements a local top-of-stack pointer while popping the C results. It does not actually pop the frame off the stack until it has successfully consed all the results.

The callout trampolines are GC abortable and restartable. They do not hold onto pointers into the Scheme stack. After an abort, they load their arguments again from the freshly-GCed Scheme stack.

@node C FFI Callbacks, C FFI Build, C FFI Callouts, C FFI
@section Callbacks

Callback trampolines are also split into two parts, to accommodate GC aborts. The first part is registered with the toolkit, and runs outside the interpreter --- no consing --- no GC aborting. It calls @code{Interpret(1)} after hacking the Scheme stack like an interrupt. It pushes a couple zero-arity primitive application frames. The first applies the @code{return-to-c} primitive and the second applies @code{run-callback}.
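The restartable pop that the trampolines rely on (read results through a local top-of-stack pointer, and pop the frame only after consing has succeeded) can be sketched in plain C. This is a toy under stated assumptions, not the @file{pruxffi.h} macros: the names @code{toy_cstack}, @code{toy_push}, @code{toy_lpop}, @code{toy_pop_frame} and @code{toy_continue} are hypothetical, the stack grows with @code{realloc} rather than an obstack, and a flag stands in for a GC abort.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the CStack abstraction: one growable,
   contiguous byte segment with a base pointer and a top offset. */
typedef struct {
  char *base;
  size_t top;       /* bytes currently in use */
  size_t capacity;  /* bytes allocated */
} toy_cstack;

/* Push SIZE bytes, growing the segment when needed. */
static void toy_push (toy_cstack *s, const void *value, size_t size)
{
  if (s->top + size > s->capacity) {
    size_t cap = s->capacity ? s->capacity * 2 : 64;
    while (cap < s->top + size)
      cap *= 2;
    char *p = realloc (s->base, cap);
    if (p == NULL)
      abort ();
    s->base = p;
    s->capacity = cap;
  }
  memcpy (s->base + s->top, value, size);
  s->top += size;
}

/* "Local" pop: read through a caller-owned offset.  The stack itself
   is not modified, so the read can be repeated after an abort. */
static void toy_lpop (const toy_cstack *s, size_t *local_top,
                      void *out, size_t size)
{
  assert (*local_top >= size);
  *local_top -= size;
  memcpy (out, s->base + *local_top, size);
}

/* Commit: actually drop everything above LOCAL_TOP. */
static void toy_pop_frame (toy_cstack *s, size_t local_top)
{
  s->top = local_top;
}

/* Toy second half of a callout.  Returns 0 when the simulated GC abort
   fires (the saved value stays on the stack for a retry), 1 once the
   result has been produced and the frame popped. */
static int toy_continue (toy_cstack *s, int simulate_gc_abort, long *result)
{
  size_t tos = s->top;
  long ret;
  toy_lpop (s, &tos, &ret, sizeof ret);
  if (simulate_gc_abort)
    return 0;
  *result = ret;
  toy_pop_frame (s, tos);
  return 1;
}
```

Pushing one @code{long} and calling @code{toy_continue} twice, first with the abort flag set and then without, leaves the value available for the retry and empties the stack only on the second call.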
The second part is run in the interpreter by the @code{run-callback} primitive. It conses the callback arguments and applies the Scheme callback handler (from the fixed objects vector). It is restartable, to accommodate GC aborts during construction of the arguments. When finished, it leaves the callback's return value in the value register. The interpreter then applies @code{return-to-c} and control returns to the first part of the callback trampoline, which converts the Scheme value register and returns an equivalent C value to the toolkit.

For each callback cdecl, e.g.

@smallexample
(callback gint delete_event
  (window (* GtkWidget))
  (event (* GdkEventAny))
  (ID gpointer))
@end smallexample

the @code{gen-callback-trampolines} procedure generates a callback trampoline and a restartable kernel. The trampoline shown here is for a simpler callback, @code{clicked} (one widget argument plus the ID, no return value), rather than for the @code{delete_event} declaration above; it looks something like this.

@verbatim
void
Scm_clicked (GtkWidget * widget, gpointer ID)
{
  Start_Callback ();
  CStack_Push (gpointer, ID);
  CStack_Push (GtkWidget *, widget);
  Run_Callback ((uint)ID, (CallbackKernel)&Scm_kernel_clicked);
  return;
}
@end verbatim

The corresponding kernel looks like this.

@verbatim
static void
Scm_kernel_clicked (void)
{
  /* Declare. */
  GtkWidget * widget;
  gpointer ID;
  SCHEME_OBJECT alien0;
  SCHEME_OBJECT arglist0;
  char * tos0;

  /* Init. */
  tos0 = CStack_TOS ();
  CStack_LPop_Kernel_Check (&Scm_kernel_clicked, tos0);
  CStack_LPop (GtkWidget *, widget, tos0);
  CStack_LPop (gpointer, ID, tos0);
  arglist0 = EMPTY_LIST;

  /* Construct. */
  alien0 = cons_alien ((void*)widget);
  arglist0 = cons (alien0, arglist0);
  Setup_Callback ((uint)ID, arglist0);
  CStack_Pop_Frame (tos0);
  PRIMITIVE_ABORT (PRIM_APPLY);
}
@end verbatim

The Scheme callback handler looks up the registered closure and runs the closure without preemption, returning from the @code{run-callback} primitive with a Scheme value. The interpreter continues with the application of the @code{return-to-c} primitive, which immediately returns from @code{Interpret()}.
The trampoline can then convert the Scheme value register to a C value and return it to the toolkit. @heading Callouts during callbacks during callouts... Callbacks usually arrive during a callout. The first part of the callout trampoline is careful to ``canonicalize the interpreter context'' before calling out, so that the Scheme stack and registers are in a GCable state. The callout tramp. can call the toolkit, the toolkit can call a callback tramp., and the callback tramp. can push its interrupt frame and @emph{recursively} enter @code{Interpret()}. Once inside the interpreter, with complete frames on the stack, GC aborts can be handled as callback arguments are consed. During a callback the toolkit is blocked waiting for @code{Interpret()} to execute the @code{return-to-c} primitive. It is possible for MIT Scheme to switch threads and (perhaps permanently!) abandon that continuation. The generic callback handler arranges for the current thread to run without preemption until the Scheme callback procedure returns. If an error is signaled in a callback, the standard error handler is invoked, and the error REPL can be used to debug the situation with the toolkit blocked. Here is the Scheme debugger's forward-trace (continuation trace) from a breakpoint in a callback of the example ``Hello, World!'' program. It shows the @code{return-to-c} primitive apply frame which waits to return values to the toolkit, and after that, a @code{c-call-continue} primitive apply frame, which continues with the reduction of a callout to @code{gtk_main}. @verbatim ; hello::clicked #[alien 2 #f 0x08112eb8] clicked ;To continue, call RESTART with an option number: ; (RESTART 2) => Return from BKPT. ; (RESTART 1) => Return to read-eval-print level 1. 2 bkpt> (debug) There are 10 subproblems on the stack. Subproblem level: 0 (this is the lowest subproblem level) Expression (from stack): (begin ### (call-alien (quote #[alien-function 3 Scm_gtk_label_set_text]) label (list->string (reverse! 
(string->list text))))) subproblem being executed (marked by ###): (bkpt (quote clicked)) Environment created by a LET special form applied to: ("!dlroW ,olleH") There is no execution history for this subproblem. You are now in the debugger. Type q to quit, ? for commands. 3 debug> h SL# Procedure-name Expression 0 (begin (bkpt (quote clicked)) (call-alien (quo ... 1 (begin (low-format "; hello::clicked ~S\n" wid ... 2 (let ((value (thunk))) (set-thread/execution-s ... 3 (return-to-c) 4 (c-call-continue (quote #[alien-function 5 Scm ... 5 (begin (call-alien (quote #[alien-function 6 S ... 6 %repl-eval (let ((value (hook/repl-eval s-expression envi ... 7 %repl-eval/write (hook/repl-write (%repl-eval s-expression envi ... 8 (begin (if (queue-empty? queue) (let ((environ ... 9 loop (loop (bind-abort-restart cmdl (lambda () (der ... 3 debug> K Choose an option by number: 2: Return from BKPT. 1: Return to read-eval-print level 1. Option number (1 through 2 inclusive): 2 @end verbatim Here is gdb's backtrace from the same point. It shows the recursive call @code{Interpret(1)} and the callout via @code{Prim_c_call}. @verbatim #0 0xb7f0e410 in __kernel_vsyscall () #1 0xb7cf5bcb in poll () from /lib/tls/i686/cmov/libc.so.6 #2 0x0809fb55 in OS_test_select_registry (registry=0x80ed178, blockp=1) at uxio.c:486 #3 0x08098c90 in Prim_test_selreg () at prosio.c:309 #4 0x0809502e in primitive_apply_internal (primitive=1610613295) at utils.c:861 #5 0x080b8b32 in comutil_primitive_apply (DSU_result=0xbff24594, primitive_raw=1610613295, ignore2=1359640, ignore3=5402588, ignore4=0) at cmpint.c:772 #6 0x080a7fb0 in scheme_to_interface_proceed () #7 0xbff24594 in ?? 
()
#8  0x080b80fb in apply_compiled_procedure () at cmpint.c:436
#9  0x0807cc85 in Interpret (pop_return_p=1) at interp.c:1102
#10 0x080aa8a3 in run_callback (callback_id=2, kernel=0xb7f0901d <Scm_kernel_clicked>) at pruxffi.c:762
#11 0xb7f093ab in Scm_clicked (widget=0x8112eb8, ID=0x2) at prhello.c:643
#12 0xb75dbaff in g_cclosure_marshal_VOID__VOID () from /usr/lib/libgobject-2.0.so.0
#13 0xb75ce759 in g_closure_invoke () from /usr/lib/libgobject-2.0.so.0
[...25 Gtk frames...]
#39 0xb754e577 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#40 0xb789c264 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#41 0xb7f08a2d in Scm_gtk_main () at prhello.c:526
#42 0x080a9c2e in Prim_c_call () at pruxffi.c:491
#43 0x0809502e in primitive_apply_internal (primitive=1610612799) at utils.c:861
#44 0x0807c5c1 in Interpret (pop_return_p=0) at interp.c:1009
#45 0x0806b645 in Do_Enter_Interpreter () at boot.c:301
#46 0x0806b668 in Enter_Interpreter () at boot.c:309
#47 0x0806b631 in start_scheme () at boot.c:295
#48 0x0806ae84 in main (argc=1, argv=0xbff26174) at boot.c:132
@end verbatim

@heading Callback stack requirement.

The first part of a callback tramp. pushes two, zero-arity primitive apply frames on the stack (8 words). It cannot GC abort to get more stack --- it is running ``outside'' of the interpreter. Thus it @emph{will} fail if there is no room on the stack. A warning is emitted on stderr in that case.

@c TODO!!! In the future, the required stack space might be guaranteed by the callout tramps.

@heading Callback alien fixups.

Aliens are normal records with a record type. However callback trampolines consing alien (pointer) callback arguments will not bother to track down the record type's ``dispatch-tag''. They will simply return a 3 element vector. The Scheme callback dispatcher can more easily munge the vector into a record, and there should be no mistaking a vector for an alien. The trampolines do not create Scheme vectors otherwise.
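The Runtime Objects section says an alien holds its memory address split into two fixnum halves. A minimal sketch of that idea in C follows, assuming the split is a plain high/low division at half the pointer width. The names @code{toy_alien}, @code{toy_alien_from_pointer} and @code{toy_alien_to_pointer} are hypothetical, and the actual field layout in the microcode (including how the halves sit in the 3 element vector) is not shown in this document.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each half carries half of the pointer's bits, so either
   half fits comfortably in a tagged fixnum. */
#define TOY_HALF_BITS (sizeof (uintptr_t) * 4)

/* Hypothetical alien payload: the address as two small integers. */
typedef struct {
  uintptr_t high;
  uintptr_t low;
} toy_alien;

/* Split a C pointer into its high and low halves. */
static toy_alien toy_alien_from_pointer (const void *p)
{
  uintptr_t bits = (uintptr_t) p;
  toy_alien a;
  a.high = bits >> TOY_HALF_BITS;
  a.low = bits & (((uintptr_t) 1 << TOY_HALF_BITS) - 1);
  return a;
}

/* Recombine the halves into the original pointer. */
static void *toy_alien_to_pointer (toy_alien a)
{
  return (void *) ((a.high << TOY_HALF_BITS) | a.low);
}
```

Converting the address of any live object to a @code{toy_alien} and back should yield the same pointer, and both halves stay below @code{2^TOY_HALF_BITS}.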