Re: thoughts on native code
Hi Stefan,

Just my idea about an assembler in Scheme. Sounds interesting. If it's done properly, it can be very promising to use Scheme itself to directly emit machine instructions. This would also be interesting for meta compilation in the future (think of aiding GCC).

So you are thinking about an assembler for x86? Perhaps I can help out on this one. I would like to do this part, as I haven't been able to help with other parts besides voicing my ideas (anyway, I am doing embedded development these days).

The only open question is the syntax, I believe: should it be AT&T-like, Intel-like, or should we leave this domain and do something new? I would go for instructions like this (using macros):

    (let ((target :x686))
      (assemble target
        ((mov long 100 EAX)
         (mov long 200 EBX)
         (add long EBX EAX))))

This would give back the native machine code instructions. Perhaps special constructions can be made to return partially complete instructions (such as missing labels or calls to Guile procedures...)

Sjoerd

On 11/12/2012 10:50 PM, Stefan Israelsson Tampe wrote:
> Thanks for your mail Noah,
>
> Yeah, libjit is quite interesting. But having played around with an assembler in Scheme, I do not want to go back to C or C++ land. The only problem is that we need a GNU Scheme assembler, and right now I use sbcl's assembler ported to Scheme. We could perhaps use weinholt's assembler in industria as well, if he could sign papers to make it GNU.
>
> For the register allocation part I would really like to play a little in Scheme to explore the idea you saw in my previous mail in this thread. Again, I think it's natural to have these features in Scheme, and I do not want to mess around in C land too much.
>
> Am I wrong?
>
> Cheers
> Stefan
>
> On Sat, Nov 10, 2012 at 11:49 PM, Noah Lavine noah.b.lav...@gmail.com wrote:
> > Hello,
> >
> > I assume compressed native is the idea you wrote about in your last email, where we generate native code which is a sequence of function calls to VM operations.
> >
> > I really like that idea. As you said, it uses the instruction cache better. But it also fixes something I was worried about, which is that it's a lot of work to port an assembler to a new architecture, so we might end up not supporting many native architectures. But it seems much easier to make an assembler that only knows how to make call instructions and branches. So we could support compressed native on lots of architectures, and maybe uncompressed native only on some.
> >
> > If you want a quick way to do compressed native with reasonable register allocation, GNU libjit might work. I used it a couple years ago for a JIT project that we never fully implemented. I chose it over GNU Lightning specifically because it did register allocation. It implements a full assembler, not just calls, which could also be nice later.
> >
> > Noah
> >
> > On Sat, Nov 10, 2012 at 5:06 PM, Stefan Israelsson Tampe stefan.ita...@gmail.com wrote:
> > > I would like to continue the discussion about native code. Some facts are, for example, consider this
> > >
> > >     (define (f x)
> > >       (let loop ((s 0) (i 0))
> > >         (if (eq? i x)
> > >             s
> > >             (loop (+ s i) (+ i 1)))))
> > >
> > > The timings for (f 1) ~ (f 100M) are
> > >
> > >     1) current vm          : 2.93s
> > >     2) rtl                 : 1.67s
> > >     3) compressed native   : 1.15s
> > >     4) uncompressed native : 0.54s
> > >
> > >     sbcl = compressed native + better register allocation (normal optimization level) : 0.68s
> > >
> > > Note that for this example the call overhead is close to 5ns per iteration, meaning that if we combined 4) with better register handling, the potential is to get this loop to run in 0.2s. That means the loop has the potential of running 500M iterations in one second without sacrificing safety and without needing an extraterrestrial code analyzer. Note also that the native code for compressed native is smaller than the rtl code by some factor, and if we could make use of registers in a better way we would end up with even less overhead.
> > >
> > > Note that compressed native is a very simple mechanism to gain some speed and also improve memory usage in the instruction flow. Also, the assembler is very simplistic and it would not be too much hassle to port a new instruction format to that environment. Also, it's probably possible to handle the complexity of the code in pure C for the stubs and, by compiling them in a special way, make sure they output a format that can be combined with the meta information in special registers needed to make the execution of the compiled Scheme effective.
> > >
> > > This study also shows that there is a clear benefit to being able to use the computer's registers,
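As a rough check of the figures quoted above, the per-iteration cost can be worked out with a few lines of Scheme. This is only a back-of-the-envelope sketch: the names `iterations`, `timings`, `ns-per-iteration` and `iterations-per-second` are made up for illustration, and the count of 100M iterations is read off the `(f 100M)` benchmark.

```scheme
;; Back-of-the-envelope check of the quoted timings (illustrative only).
(define iterations 100000000)             ; 100M loop iterations

(define timings                            ; seconds, as quoted in the mail
  '((current-vm          . 2.93)
    (rtl                 . 1.67)
    (compressed-native   . 1.15)
    (uncompressed-native . 0.54)
    (sbcl                . 0.68)))

;; Nanoseconds spent per loop iteration for a total time in seconds.
(define (ns-per-iteration seconds)
  (/ (* seconds 1e9) iterations))

;; Iterations completed per second for a total time in seconds.
(define (iterations-per-second seconds)
  (/ iterations seconds))

;; Print the per-iteration cost of every measured variant.
(for-each
 (lambda (entry)
   (display (car entry))
   (display ": ")
   (display (ns-per-iteration (cdr entry)))
   (display " ns/iteration")
   (newline))
 timings)
```

With these definitions, the hoped-for 0.2 s run corresponds to about 2 ns per iteration, which is where the figure of 500M iterations per second comes from. The gap between compressed and uncompressed native (1.15 s versus 0.54 s) works out to roughly 6 ns per iteration, in line with the call overhead of "close to 5ns" mentioned in the mail.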
Re: thoughts on native code
Arg, I prematurely sent that mail; well, here is the continuation.

4. I prefer to have an environment like

    (assemble target
      (inst jmp label:)
      (inst mov b a)
      label:
      (inst mov b c))

This makes the labels stand out and makes for a nice read of the assembler.

You can see how weinholt solves the assembling issue in industria. MIT Scheme also has an assembler that you can learn from. I did not like the syntax of that one, but you may be able to port it over to Guile; it is GNU and may be the shortest path to success. I like the environment in the sbcl assembler. I ported that over in the aschm repo, and you can see a lot of assembler in the nativ/vm/insts.scm file in that repo on gitorious. We cannot use that one because it's not GNU, although it's open source, and the code is somewhat unclean.

It would also be nice if we could concentrate on a restricted set of instructions from the beginning and make sure to support more architectures instead.

What do you think?

/Stefan

On Thu, Nov 15, 2012 at 11:19 AM, Sjoerd van Leent Privé svanle...@gmail.com wrote:
> Hi Stefan,
>
> Just my idea about an assembler in Scheme. Sounds interesting. If it's done properly, it can be very promising to use Scheme itself to directly emit machine instructions. This would also be interesting for meta compilation in the future (think of aiding GCC).
>
> So you are thinking about an assembler for x86? Perhaps I can help out on this one. I would like to do this part, as I haven't been able to help with other parts besides voicing my ideas (anyway, I am doing embedded development these days).
>
> The only open question is the syntax, I believe: should it be AT&T-like, Intel-like, or should we leave this domain and do something new? I would go for instructions like this (using macros):
>
>     (let ((target :x686))
>       (assemble target
>         ((mov long 100 EAX)
>          (mov long 200 EBX)
>          (add long EBX EAX))))
>
> This would give back the native machine code instructions.
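To make the `(assemble target ...)` style with inline labels from this message more concrete, here is a minimal sketch of how labels mixed into the instruction stream could be resolved in two passes. Everything in it is hypothetical: `collect-labels` and `resolve-labels` are invented names, every instruction is assumed to occupy exactly one slot, and no real encoding is produced. It is not how the sbcl, Industria or MIT Scheme assemblers work.

```scheme
;; Toy illustration of label resolution; not a real assembler.
;; A program is a list whose elements are either a label (a bare
;; symbol such as label:) or an instruction (a list such as
;; (inst mov b a)).  Every instruction is assumed to take one slot.

;; Pass 1: map each label to the index of the instruction after it.
(define (collect-labels program)
  (let loop ((items program) (index 0) (labels '()))
    (cond ((null? items) (reverse labels))
          ((symbol? (car items))
           (loop (cdr items) index
                 (cons (cons (car items) index) labels)))
          (else
           (loop (cdr items) (+ index 1) labels)))))

;; Pass 2: drop the labels and replace label operands by their index.
(define (resolve-labels program)
  (let ((labels (collect-labels program)))
    (define (resolve operand)
      (let ((hit (and (symbol? operand) (assq operand labels))))
        (if hit (cdr hit) operand)))
    (let loop ((items program) (out '()))
      (cond ((null? items) (reverse out))
            ((symbol? (car items)) (loop (cdr items) out))
            (else
             (loop (cdr items)
                   (cons (map resolve (car items)) out)))))))

(define example
  '((inst jmp label:)
    (inst mov b a)
    label:
    (inst mov b c)))

(resolve-labels example)
;; => ((inst jmp 2) (inst mov b a) (inst mov b c))
```

A real assembler would have to deal with variable-length encodings, where the size of a jump can depend on the distance to its target, so the single counting pass used here is a simplification.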
Re: thoughts on native code
Before anyone spends any more time on this, I want to make it clear that although I very much appreciate Stefan's pioneering spirit, and some of his ideas are likely to be incorporated, Stefan's work on native compilation is an independent project of his, and is unlikely to be merged into the official Guile. Andy and I have been planning a different approach.

Mark
Re: thoughts on native code
Hi,

Yes, this is preliminary work; what I'm doing is an investigation, trying things out. Bear that in mind :-)

As for the assembler, it can be really good to support one that comes with Guile, so I do not see this work as research work but as service work to propose components that can be included in Guile.

Mark, don't you agree that my higher-level research here is experimental, but that you will need to have some kind of assembler in the end to have a sane working environment for outputting native code?

/Stefan

On Thu, Nov 15, 2012 at 6:50 PM, Mark H Weaver m...@netris.org wrote:
> Before anyone spends any more time on this, I want to make it clear that although I very much appreciate Stefan's pioneering spirit, and some of his ideas are likely to be incorporated, Stefan's work on native compilation is an independent project of his, and is unlikely to be merged into the official Guile. Andy and I have been planning a different approach.
>
> Mark
Re: thoughts on native code
Hello,

Regarding the assembler, if I were to actually hack something ;-), I’d choose Sassy [0], along with Industria’s disassemblers [1].

Ludo’.

[0] http://sassy.sourceforge.net/
[1] http://weinholt.se/industria/
Re: Adding to the end of the load path
l...@gnu.org (Ludovic Courtès) writes:

> Hi!
>
> Sorry for the delay.
>
> Andreas Rottmann a.rottm...@gmx.at skribis:
>
> > Ian Price ianpric...@googlemail.com writes:
> >
> > [...]
> >
> > > Andreas Rottmann suggested something similar in February[1].
> >
> > I've attached a patch implementing that suggestion, FWIW.
> >
> > > I don't have any concrete proposals better than what Andreas has suggested, but I felt I should make this post to the list, lest Mark and I forget.
> >
> > [...]
> >
> > > 1. http://lists.gnu.org/archive/html/guile-devel/2012-02/msg00038.html
>
> Like Mark, I’m not comfortable with changing the meaning of the empty string in the load path, and the behavior of ‘%parse-path’.

I agree with that -- there's quite a risk of breaking existing setups this way.

> I pretty much like Mark’s suggestion of using ‘...’ as a special marker, even though that’s a valid file name.

Well, there's a workaround -- specifying ./... as an escape sequence for ... if you really need to have a three-dot relative directory in the path.

> How would that work for you?

I would like the approach using separate _SUFFIX variables better, as it doesn't have this special case. While I don't think the special case will be a problem for a user setting the environment variables themselves, if you want to set them programmatically, you now have to treat ... specially, escaping it as mentioned above, to be general.

However, I can live with that, but maybe we can have it both ways:

- Add the _SUFFIX environment variables, making it clear in the docs that they are supported only from Guile 2.0.7 onward.

- Additionally, add ... as a special marker, but mention it is just provided to support Guile < 2.0.7, and should not be used in code that needs to depend on Guile 2.0.7 or newer for other reasons (e.g. reliance on another added feature or significant bugfix).

I'm not sure how the deprecation strategy is employed exactly, but we could mark the ... feature as deprecated right away, or at least in master, and remove it in 2.2 or 2.4. This would also kind of lift the burden from programs manipulating *_LOAD_PATH programmatically, as they can still be general wrt. _undeprecated_ features.

Opinions?

Regards, Rotty
--
Andreas Rottmann -- http://rotty.xx.vu/
Re: Adding to the end of the load path
Hi Andreas,

Andreas Rottmann a.rottm...@gmx.at skribis:

> l...@gnu.org (Ludovic Courtès) writes:

[...]

> > I pretty much like Mark’s suggestion of using ‘...’ as a special marker, even though that’s a valid file name.
>
> Well, there's a workaround -- specifying ./... as an escape sequence for ... if you really need to have a three-dot relative directory in the path.

Right. In general, it’s a bad idea to use relative file names in such variables anyway.

> > How would that work for you?
>
> I would like the approach using separate _SUFFIX variables better, as it doesn't have this special case.

OK. I dislike the proliferation of environment variables, but yeah, it might be somewhat less ugly than ‘...’.

> - Add the _SUFFIX environment variables, making it clear in the docs that they are supported only from Guile 2.0.7 onward.
>
> - Additionally, add ... as a special marker, but mention it is just provided to support Guile < 2.0.7, and should not be used in code that needs to depend on Guile 2.0.7 or newer for other reasons (e.g. reliance on another added feature or significant bugfix).

Blech, that second part is terrible.

Mark is right that ‘...’ is the only workable solution, in terms of compatibility. So we need that one.

A potential problem is that in .bashrc, shell scripts, etc., it’s going to be hard to make sure that ‘...’ remains first, when that’s what you want, because you’ll inevitably find legacy code that does things like:

    export GUILE_LOAD_PATH=$HOME/soft/share/guile/site/2.0${GUILE_LOAD_PATH:+:}$GUILE_LOAD_PATH

thereby moving ‘...’ further away. Pfff, this is really terrible.

Mark: WDYT?

Ludo’.
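The ordering problem described above can be reproduced without a shell. The following sketch mimics the legacy `export GUILE_LOAD_PATH=...` idiom in Scheme; `prepend-to-path` is a made-up helper and the directory names are placeholders chosen for the example.

```scheme
;; Mimic the legacy shell idiom
;;   export GUILE_LOAD_PATH=$DIR${GUILE_LOAD_PATH:+:}$GUILE_LOAD_PATH
;; which adds a ":" separator only when the old value is non-empty.
(define (prepend-to-path dir path)
  (if (string-null? path)
      dir
      (string-append dir ":" path)))

;; The user deliberately put the '...' marker first.
(define initial "...:/home/user/guile")

;; A legacy script then prepends its own directory.
(define updated
  (prepend-to-path "/home/user/soft/share/guile/site/2.0" initial))

(string-split updated #\:)
;; => ("/home/user/soft/share/guile/site/2.0" "..." "/home/user/guile")
```

The marker is still present, but it is no longer the first element, which is exactly the situation the message worries about.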
Re: Adding to the end of the load path
Hi Andreas,

Andreas Rottmann a.rottm...@gmx.at writes:

> l...@gnu.org (Ludovic Courtès) writes:
>
> > I pretty much like Mark’s suggestion of using ‘...’ as a special marker, even though that’s a valid file name.
>
> Well, there's a workaround -- specifying ./... as an escape sequence for ... if you really need to have a three-dot relative directory in the path.
>
> > How would that work for you?
>
> I would like the approach using separate _SUFFIX variables better, as it doesn't have this special case.

As I wrote earlier, I certainly agree that the _SUFFIX approach is cleaner. Unfortunately, we need a solution that will work nicely with earlier versions of Guile.

> While I don't think the special case will be a problem for a user setting the environment variables themselves, if you want to set them programmatically, you now have to treat ... specially, escaping it as mentioned above, to be general.

Note that PATH-style variables are already not general, because they provide no way to include filenames containing ':' (a colon).

In general, it's best to avoid setting GUILE_LOAD_PATH programmatically, because it will affect more than just the instance of Guile you intended; it will also affect any subprocesses that use Guile. It's better to use -L, which is fully general without any special cases, or to modify %load-path within the program itself.

> However, I can live with that, but maybe we can have it both ways:
>
> - Add the _SUFFIX environment variables, making it clear in the docs that they are supported only from Guile 2.0.7 onward.

Yes, I agree this is a good idea.

> - Additionally, add ... as a special marker, but mention it is just provided to support Guile < 2.0.7, and should not be used in code that needs to depend on Guile 2.0.7 or newer for other reasons (e.g. reliance on another added feature or significant bugfix).

Again, these environment variables are not specific to any particular piece of code. They are usually associated with an entire user account.

> I'm not sure how the deprecation strategy is employed exactly, but we could mark the ... feature as deprecated right away, or at least in master, and remove it in 2.2 or 2.4.

I don't think we can mark it deprecated until versions of Guile older than 2.0.7 have become very rare, which won't be until at least 2017 (due to Ubuntu 12.04 LTS), and then it will need to be deprecated for a couple more years before we can get rid of it entirely.

Therefore, I think it's premature to emphasize the transient nature of the ... marker. Like it or not, we'll probably be stuck with it for 7 or 8 years.

Does that make sense?

Regards,
Mark
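For readers following the ‘...’ discussion, here is one possible reading of the proposal as a small Scheme procedure. It is a sketch under assumptions, not the code that ended up in Guile: `splice-load-path` is a hypothetical name, the marker is taken to stand for the position of the built-in directories, and a path without the marker is assumed to keep the old behaviour of putting user entries first.

```scheme
;; Hypothetical sketch of the '...' marker; not Guile's implementation.
;; PATH is the colon-separated value of an environment variable and
;; DEFAULTS is the list of built-in directories.
(define (splice-load-path path defaults)
  (let ((entries (if (string-null? path)
                     '()
                     (string-split path #\:))))
    (if (member "..." entries)
        ;; Replace the first '...' by the defaults, keeping the order
        ;; of the entries on either side of it.
        (let loop ((rest entries) (before '()))
          (if (string=? (car rest) "...")
              (append (reverse before) defaults (cdr rest))
              (loop (cdr rest) (cons (car rest) before))))
        ;; No marker: user entries come first, as before.
        (append entries defaults))))

(define defaults '("/usr/share/guile/2.0" "/usr/share/guile/site"))

(splice-load-path "/opt/a:...:/opt/b" defaults)
;; => ("/opt/a" "/usr/share/guile/2.0" "/usr/share/guile/site" "/opt/b")

(splice-load-path "/opt/a" defaults)
;; => ("/opt/a" "/usr/share/guile/2.0" "/usr/share/guile/site")

;; With the escape suggested in the thread, "./..." names a real
;; directory called "..." and is not treated as the marker.
(splice-load-path "./...:/opt/a" defaults)
;; => ("./..." "/opt/a" "/usr/share/guile/2.0" "/usr/share/guile/site")
```

Because the marker is just another path element, older Guile versions would presumably treat it as an ordinary (most likely nonexistent) directory, which is what makes this approach compatible with existing installations.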
Re: thoughts on native code
Sjoerd van Leent Privé svanle...@gmail.com writes:

> Hi Stefan,
>
> Just my idea about an assembler in Scheme. Sounds interesting. If it's done properly, it can be very promising to use Scheme itself to directly emit machine instructions. This would also be interesting for meta compilation in the future (think of aiding GCC).
>
> So you are thinking about an assembler for x86? Perhaps I can help out on this one. I would like to do this part, as I haven't been able to help with other parts besides voicing my ideas (anyway, I am doing embedded development these days).
>
> The only open question is the syntax, I believe: should it be AT&T-like, Intel-like, or should we leave this domain and do something new? I would go for instructions like this (using macros):
>
>     (let ((target :x686))
>       (assemble target
>         ((mov long 100 EAX)
>          (mov long 200 EBX)
>          (add long EBX EAX))))
>
> This would give back the native machine code instructions. Perhaps special constructions can be made to return partially complete instructions (such as missing labels or calls to Guile procedures...)

Regarding the assembler: I have a working AVR assembler [0] in my "avrth" AVR Forth implementation [1]. That assembler also employs the "partially complete instructions" idea you mentioned, having a symbol resolution step. It also supports a simple evaluator for assembly-time expressions. Maybe you find it interesting :-).

Here's an example snippet, from the Forth implementation's runtime:

    (define-primitive-vocable vocabulary:primitive 1ms (vm)
      (scheme (sleep-seconds 0.001))
      (assembly
        (ldi zl (lo8 (/ cpu-frequency 4000)))
        (ldi zh (hi8 (/ cpu-frequency 4000)))
        (sbiw zl 42) ;internal plus forth kernel overhead
        PFA_1MS1
        (sbiw zl 1)
        (brne PFA_1MS1)))

The assembler code is the `(assembly ...)' part, but above that you can see the "runnable documentation", written in Scheme ;-). You may also have noticed the expressions used, i.e. `(lo8 ...)' and `(hi8 ...)' -- these are evaluated at assembly time by the `assembler-eval' as found in [0].

While I'm probably too time-starved to really help with Guile's assembler, I'd be interested in having, and maybe working on (if time permits), a Scheme-based assembler targeting ARM platforms, so I might chime in on this area at some point.

It should be fine to use my code (from a copyright point of view) as a basis for something to be incorporated into Guile (even though that file is marked GPLv2, not GPLv2+), as long as you take care to eliminate the actual AVR instruction generation code. This part (i.e., the section marked "Code emitters") is mostly originally transcribed from the assembler [2] written in Forth included in AmForth [3], which is GPLv2 (only, unfortunately). However, I'm not even sure if the AmForth copyright is still applicable to that part of my code, as the code has been transformed substantially by transcription. That might well be a moot point; we probably don't want to target 8-bit microcontrollers as Guile platforms anyway ;-).

[0] http://rotty.xx.vu/gitweb/?p=scheme/avrth.git;a=blob;f=assembler.sls;hb=HEAD
[1] http://rotty.xx.vu/gitweb/?p=scheme/avrth.git;a=summary
[2] http://amforth.svn.sourceforge.net/viewvc/amforth/trunk/lib/assembler.frt?revision=1301&view=markup
[3] http://amforth.sourceforge.net/

Regards, Rotty
--
Andreas Rottmann -- http://rotty.xx.vu/
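To illustrate what evaluating `(lo8 ...)` and `(hi8 ...)` at assembly time might look like, here is a small stand-alone evaluator. It is only a sketch and not avrth's `assembler-eval`: the procedure name `asm-eval`, the association-list environment, the treatment of `/` as integer division and the 8 MHz value for `cpu-frequency` are all assumptions made for the example.

```scheme
;; Toy assembly-time expression evaluator; a sketch, not avrth's
;; `assembler-eval'.  ENV is an association list of known symbols.
(define (asm-eval expr env)
  (cond ((number? expr) expr)
        ((symbol? expr)
         (let ((binding (assq expr env)))
           (if binding
               (cdr binding)
               (error "unresolved symbol" expr))))
        ((pair? expr)
         (let ((args (map (lambda (e) (asm-eval e env)) (cdr expr))))
           (case (car expr)
             ((+) (apply + args))
             ((-) (apply - args))
             ((*) (apply * args))
             ;; Assumption: division is integer division.
             ((/) (apply quotient args))
             ;; Low and high byte of a 16-bit value.
             ((lo8) (logand (car args) #xff))
             ((hi8) (logand (ash (car args) -8) #xff))
             (else (error "unknown operator" (car expr))))))
        (else (error "bad expression" expr))))

(define env '((cpu-frequency . 8000000)))   ; assumed 8 MHz clock

(asm-eval '(lo8 (/ cpu-frequency 4000)) env)   ; 2000 = #x07d0, so 208
(asm-eval '(hi8 (/ cpu-frequency 4000)) env)   ; 7
```

A symbol that is missing from the environment raises an error here. An assembler with a separate symbol resolution step, as described in the message, could instead leave such an expression unevaluated until the symbol becomes known.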
Re: Adding to the end of the load path
Mark H Weaver m...@netris.org skribis:

> > However, I can live with that, but maybe we can have it both ways:
> >
> > - Add the _SUFFIX environment variables, making it clear in the docs that they are supported only from Guile 2.0.7 onward.
>
> Yes, I agree this is a good idea.

But then, what would happen when GUILE_LOAD_PATH_SUFFIX is present *and* GUILE_LOAD_PATH contains ‘...’?

Seems like a can of worms to me.

Ludo’.
Re: Adding to the end of the load path
Hello,

This is coming late in the discussion, but I'd like to suggest a somewhat different approach. I hope this is helpful.

It seems to me that in the end, the module-lookup system may need to be more complex than having regular and suffix lookup paths. For instance, one of the big concerns here was reducing the number of stat() calls. What if we know that some load directories only contain certain modules? We might want a way for the user to say "all the (foo ...) modules live in ~/foo, but you don't have to look for any other modules there". Or what if I want to use a backup version of a module that's also included in the regular Guile distribution, because I haven't ported my code to a new version yet (yes, I should use module versions, but I don't)? There might be more complicated scenarios too.

Given that the module-lookup system is fundamentally complicated, I'm going to suggest that we *don't* try to make it all configurable by environment variables. If people want full control of lookups, they can write a site-wide Guile init file or a personal ~/.guile. The regular load path would still be part of the system, and that would be configurable by an environment variable, so legacy setups would continue to work. However, I'd be happy saying that if you want to access all of the functionality, you need to write Scheme code.

Let's start by adding Scheme interfaces to the functionality we want, and maybe not worry about environment variables if they're complicated.

What do you think?

Noah

On Thu, Nov 15, 2012 at 5:44 PM, Mark H Weaver m...@netris.org wrote:
> Hi Andreas,
>
> Andreas Rottmann a.rottm...@gmx.at writes:
>
> > l...@gnu.org (Ludovic Courtès) writes:
> >
> > > I pretty much like Mark’s suggestion of using ‘...’ as a special marker, even though that’s a valid file name.
> >
> > Well, there's a workaround -- specifying ./... as an escape sequence for ... if you really need to have a three-dot relative directory in the path.
> >
> > > How would that work for you?