Re: thoughts on native code

2012-11-15 Thread Sjoerd van Leent Privé

Hi Stefan,

Just my idea about an assembler in Scheme. Sounds interesting. If it's 
done properly, it can be very promising to use scheme itself to directly 
emit machine instructions. This would also be interesting for meta 
compilation in the future (think of aiding GCC).


So you are thinking about an assembler for x86? Perhaps I can help out 
on this one. I would like to do this part, as I haven't been able to aid 
on other parts besides voicing my ideas (anyways, I am on embedded 
development these days.)


The only open question is the syntax, I believe: should it be AT&T-like, 
Intel-like, or should we leave this domain and do something new? I would go 
for instructions like this (using macros):


(let ((target :x686))
  (assemble target
    ((mov long 100 EAX)
     (mov long 200 EBX)
     (add long EBX EAX))))

Giving back the native machine code instructions. Perhaps special 
constructions can be made to return partially complete instructions 
(such as missing labels or calls to guile procedures...)
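
To make the proposal a bit more concrete, here is a minimal sketch of what such a form could boil down to. It is only an illustration under stated assumptions: the names `reg-code`, `imm32`, `encode-inst` and `assemble-ia32` are hypothetical, the target is assumed to be 32-bit x86, operands are read source first (AT&T order), and only the two instruction forms from the example above are handled.

```scheme
;; Hypothetical sketch, not an existing Guile API.
;; Register numbers as used in IA-32 instruction encodings.
(define (reg-code reg)
  (case reg
    ((EAX) 0) ((ECX) 1) ((EDX) 2) ((EBX) 3)
    (else (error "unsupported register" reg))))

;; Encode a non-negative 32-bit immediate as four little-endian bytes.
(define (imm32 n)
  (let loop ((n n) (count 4) (acc '()))
    (if (= count 0)
        (reverse acc)
        (loop (quotient n 256) (- count 1) (cons (modulo n 256) acc)))))

;; Encode one instruction given as (op size src dst).
(define (encode-inst inst)
  (let ((op (car inst)) (src (caddr inst)) (dst (cadddr inst)))
    (cond
     ;; mov imm32 -> reg: opcode B8+rd followed by the immediate
     ((and (eq? op 'mov) (number? src))
      (cons (+ #xB8 (reg-code dst)) (imm32 src)))
     ;; add reg -> reg: opcode 01 /r, ModRM = 11 src dst
     ((and (eq? op 'add) (symbol? src))
      (list #x01 (+ #xC0 (* 8 (reg-code src)) (reg-code dst))))
     (else (error "unsupported instruction" inst)))))

;; Concatenate the byte lists of all instructions.
(define (assemble-ia32 insts)
  (apply append (map encode-inst insts)))

(define code
  (assemble-ia32 '((mov long 100 EAX)
                   (mov long 200 EBX)
                   (add long EBX EAX))))
;; code => (184 100 0 0 0 187 200 0 0 0 1 216)
```

A real assembler would of course return a bytevector and dispatch on the target, but a plain list of bytes keeps the sketch easy to inspect at the REPL.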


Sjoerd


On 11/12/2012 10:50 PM, Stefan Israelsson Tampe wrote:

Thanks for your mail Noah,

Yes, libjit is quite interesting. But having played around with an assembler 
in Scheme, I do not want to go back to C or C++ land. The only problem is 
that we need a GNU Scheme assembler, and right now I use SBCL's assembler 
ported to Scheme. We could perhaps use Weinholt's assembler in Industria as 
well, if he could sign papers to make it GNU. For the register allocation 
part I would really like to play a little in Scheme to explore the idea you 
saw in my previous mail in this thread. Again, I think it's natural to have 
these features in Scheme, and I do not want to mess around in C land too 
much.


Am I wrong?

Cheers
Stefan


On Sat, Nov 10, 2012 at 11:49 PM, Noah Lavine noah.b.lav...@gmail.com wrote:


Hello,

I assume compressed native is the idea you wrote about in your
last email, where we generate native code which is a sequence of
function calls to VM operations.

I really like that idea. As you said, it uses the instruction
cache better. But it also fixes something I was worried about,
which is that it's a lot of work to port an assembler to a new
architecture, so we might end up not supporting many native
architectures. But it seems much easier to make an assembler that
only knows how to make call instructions and branches. So we could
support compressed native on lots of architectures, and maybe
uncompressed native only on some.

If you want a quick way to do compressed native with reasonable
register allocation, GNU libjit might work. I used it a couple
years ago for a JIT project that we never fully implemented. I
chose it over GNU Lightning specifically because it did register
allocation. It implements a full assembler, not just calls, which
could also be nice later.

Noah



On Sat, Nov 10, 2012 at 5:06 PM, Stefan Israelsson Tampe
stefan.ita...@gmail.com wrote:

I would like to continue the discussion about native code.

Some facts. For example, consider this:
(define (f x)
  (let loop ((s 0) (i 0))
    (if (eq? i x) s (loop (+ s i) (+ i 1)))))

The timings for (f 1)  ~ (f 100M) is

1) current vm : 2.93s
2) rtl  : 1.67s
3) compressed native : 1.15s
4) uncompressed native : 0.54s

sbcl = compressed native + better register allocations (normal
optimization level) : 0.68s

To note is that for this example the call overhead is close to
5ns per iteration, meaning that
if we combined 4 with better register handling, the potential
is to get this loop to run at 0.2s, which means
that the loop has the potential of running 500M iterations in
one second without sacrificing safety and without
an extraterrestrial code analyzer. Also to note is that the
native code for the compressed native is smaller than the
rtl code by some factor and if we could make use of registers
in a better way we would end up with even less overhead.
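
As a quick sanity check on these numbers, the arithmetic can be spelled out as below. This is only a sketch: the variable and procedure names are made up for the illustration, and the only inputs are the timings quoted above for 100 million iterations. It gives roughly 29.3, 16.7, 11.5 and 5.4 ns per iteration for the four variants, and a 0.2 s run corresponds to about 500M iterations per second.

```scheme
;; Timings quoted above, in seconds, for 100 million iterations.
(define iterations 100000000)
(define timings
  '((current-vm . 2.93) (rtl . 1.67)
    (compressed-native . 1.15) (uncompressed-native . 0.54)))

;; Nanoseconds spent per loop iteration for a given total time.
(define (ns-per-iteration seconds)
  (/ (* seconds 1e9) iterations))

;; Association list of variant name -> nanoseconds per iteration.
(define per-iteration
  (map (lambda (entry)
         (cons (car entry) (ns-per-iteration (cdr entry))))
       timings))

;; Iterations per second if the whole loop ran in 0.2 s.
(define projected-rate (/ iterations 0.2))
```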

To note is that compressed native is a very simple mechanism
to gain some speed and also improve on memory
usage in the instruction flow. Also, the assembler is very
simplistic and it would not be too much hassle to port a new
instruction format to that environment. Also, it's probably
possible to handle the complexity of the code in pure C
for the stubs and, by compiling them in a special way, make sure
they output a format that can be combined
with the meta information in special registers needed to make
the execution of the compiled Scheme effective.

This study also shows that there is a clear benefit to be able
to use the computer's registers, 

Re: thoughts on native code

2012-11-15 Thread Stefan Israelsson Tampe
Argh, I sent that mail prematurely. Well, here is the continuation.

4. I prefer to have an environment like
   (assemble target
     (inst jmp label:)
     (inst mov b a)
    label:
     (inst mov b c))
This makes the labels stand out and makes for a nice read of the assembler.
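
One way such inline labels could be handled is a classic two-pass scheme, sketched below. Everything here is hypothetical: `collect-labels` and `resolve-labels` are made-up names, bare symbols in the instruction list are taken to be labels, and every instruction is assumed to occupy a fixed number of bytes, which a real assembler would have to compute per instruction.

```scheme
;; Hypothetical sketch: bare symbols in the instruction list are labels,
;; lists are instructions. For simplicity every instruction is assumed
;; to occupy INST-SIZE bytes; a real assembler would compute each size.
(define inst-size 4)

;; Pass 1: map each label to the offset of the instruction following it.
(define (collect-labels items)
  (let loop ((items items) (offset 0) (labels '()))
    (cond ((null? items) (reverse labels))
          ((symbol? (car items))
           (loop (cdr items) offset
                 (cons (cons (car items) offset) labels)))
          (else (loop (cdr items) (+ offset inst-size) labels)))))

;; Pass 2: drop the label markers and replace label operands by offsets.
(define (resolve-labels items)
  (let ((labels (collect-labels items)))
    (let loop ((items items) (out '()))
      (cond ((null? items) (reverse out))
            ((symbol? (car items)) (loop (cdr items) out))
            (else
             (loop (cdr items)
                   (cons (map (lambda (operand)
                                (let ((hit (assq operand labels)))
                                  (if hit (cdr hit) operand)))
                              (car items))
                         out)))))))

(define resolved
  (resolve-labels '((inst jmp label:)
                    (inst mov b a)
                    label:
                    (inst mov b c))))
;; resolved => ((inst jmp 8) (inst mov b a) (inst mov b c))
```

Operands that are not found in the label table are left untouched, so the same mechanism could return partially complete instructions when a label is defined elsewhere.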

You can see how Weinholt solves the assembling issue in Industria. MIT
Scheme also has an assembler that you can learn from. I did not like the syntax
of that one, though, but you may be able to port it over to Guile; it is GNU
and may be the shortest path to success.

I like the environment in the SBCL assembler. I ported that over in the
aschm repo, and you can see a lot of assembler
in the nativ/vm/insts.scm file in that repo on Gitorious. We cannot use
that one because it's not GNU, although
it's open source; the code is also somewhat unclean.

It would also be nice if we could concentrate on a restricted set of
instruction support from the beginning and make sure to support more
architectures instead.

What do you think?

/Stefan


On Thu, Nov 15, 2012 at 11:19 AM, Sjoerd van Leent Privé 
svanle...@gmail.com wrote:

 [...]

Re: thoughts on native code

2012-11-15 Thread Mark H Weaver
Before anyone spends any more time on this, I want to make it clear that
although I very much appreciate Stefan's pioneering spirit, and some of
his ideas are likely to be incorporated, Stefan's work on native
compilation is an independent project of his, and is unlikely to be
merged into the official Guile.  Andy and I have been planning a
different approach.

  Mark



Re: thoughts on native code

2012-11-15 Thread Stefan Israelsson Tampe
Hi,

Yes, this is pre-work, and what I'm doing is an investigation, trying things
out. Bear that in mind :-)

For the assembler, it can be really good to support one that comes with
Guile, so I do not see this
work as research work but as service work to propose components that
can be included in Guile.

Mark, don't you agree that my higher-level research here is
experimental, but that you will need to have
some kind of assembler in the end to have a sane working environment to
output native code?

/Stefan


On Thu, Nov 15, 2012 at 6:50 PM, Mark H Weaver m...@netris.org wrote:

 Before anyone spends any more time on this, I want to make it clear that
 although I very much appreciate Stefan's pioneering spirit, and some of
 his ideas are likely to be incorporated, Stefan's work on native
 compilation is an independent project of his, and is unlikely to be
 merged into the official Guile.  Andy and I have been planning a
 different approach.

   Mark



Re: thoughts on native code

2012-11-15 Thread Ludovic Courtès
Hello,

Regarding the assembler, if I were to actually hack something ;-), I’d
choose Sassy [0], along with Industria’s disassemblers [1].

Ludo’.

[0] http://sassy.sourceforge.net/
[1] http://weinholt.se/industria/




Re: Adding to the end of the load path

2012-11-15 Thread Andreas Rottmann
l...@gnu.org (Ludovic Courtès) writes:

 Hi!

 Sorry for the delay.

 Andreas Rottmann a.rottm...@gmx.at skribis:

 Ian Price ianpric...@googlemail.com writes:

 [...]

 Andreas Rottmann suggested something similar in February[1].

 I've attached a patch implementing that suggestion, FWIW.

 I don't have any concrete proposals better than what Andreas has
 suggested, but I felt I should make this post to the list, lest Mark and
 I forget.


 [...]

 1. http://lists.gnu.org/archive/html/guile-devel/2012-02/msg00038.html

 Like Mark, I’m not comfortable with changing the meaning of the empty
 string in the load path, and the behavior of ‘%parse-path’.

I agree with that -- there's quite a risk of breaking existing setups this
way.

 I pretty much like Mark’s suggestion of using ‘...’ as a special marker,
 even though that’s a valid file name.

Well, there's a workaround -- specifying ./... as an escape sequence
for ... if you really need to have a three-dot relative directory in
the path.
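
For illustration, the marker semantics discussed here could be sketched as follows. This is a hypothetical reading of the proposal, not the code that ships with Guile: `split-at-marker` and `expand-load-path` are made-up names, and the default directory is an example value. Entries before `...` go in front of the defaults, entries after it go behind, and `./...` is treated as an ordinary directory because it is not literally equal to the marker.

```scheme
;; Hypothetical sketch of the '...' marker discussed in this thread.
;; Returns (before . after); AFTER is #f when no marker is present.
(define (split-at-marker entries)
  (let loop ((rest entries) (before '()))
    (cond ((null? rest) (cons (reverse before) #f))
          ((string=? (car rest) "...")
           (cons (reverse before) (cdr rest)))
          (else (loop (cdr rest) (cons (car rest) before))))))

;; Splice DEFAULTS in place of the marker, or append it when the
;; marker is absent (the pre-existing behaviour).
(define (expand-load-path path-string defaults)
  (let* ((parts (split-at-marker (string-split path-string #\:)))
         (before (car parts))
         (after (cdr parts)))
    (if after
        (append before defaults after)
        (append before defaults))))

(define defaults '("/usr/share/guile/2.0"))   ; example value only
(define with-marker (expand-load-path "/a:...:/b" defaults))
(define escaped     (expand-load-path "/a:./...:/b" defaults))
;; with-marker => ("/a" "/usr/share/guile/2.0" "/b")
;; escaped     => ("/a" "./..." "/b" "/usr/share/guile/2.0")
```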

 How would that work for you?

I would like the approach using separate _SUFFIX variables better, as it
doesn't have this special case.  While I don't think the special case
will be a problem for a user setting the environment variables
themselves, if you want to set them programmatically, you now have to
consider treating ... specially, escaping it as mentioned above, to be
general.  However, I can live with that, but maybe we can have it both
ways:

- Add the _SUFFIX environment variables, making it clear in the docs
  that they are supported only from Guile 2.0.7 onward.

- Additionally, add ... as a special marker, but mention it is just
  provided to support Guile < 2.0.7, and should not be used in code that
  needs to depend on Guile 2.0.7 or newer for other reasons
  (e.g. reliance on another added feature or significant bugfix).

I'm not sure how the deprecation strategy is employed exactly, but we
could mark the ... feature as deprecated right away, or at least in
master, and remove it in 2.2 or 2.4. This would also kind of lift the
burden from programs manipulating *_LOAD_PATH programmatically, as they
can still be general wrt. _undeprecated_ features.  Opinions?

Regards, Rotty
-- 
Andreas Rottmann -- http://rotty.xx.vu/



Re: Adding to the end of the load path

2012-11-15 Thread Ludovic Courtès
Hi Andreas,

Andreas Rottmann a.rottm...@gmx.at skribis:

 l...@gnu.org (Ludovic Courtès) writes:

[...]

 I pretty much like Mark’s suggestion of using ‘...’ as a special marker,
 even though that’s a valid file name.

 Well, there's a workaround -- specifying ./... as an escape sequence
 for ... if you really need to have a three-dot relative directory in
 the path.

Right.  In general, it’s a bad idea to use relative file names in such
variables anyway.

 How would that work for you?

 I would like the approach using separate _SUFFIX variables better, as it
 doesn't have this special case.

OK.  I dislike the proliferation of environment variables, but yeah, it
might be somewhat less ugly than ‘...’.

 - Add the _SUFFIX environment variables, making it clear in the docs
   that they are supported only from Guile 2.0.7 onward.

 - Additionally, add ... as a special marker, but mention it is just
   provided to support Guile < 2.0.7, and should not be used in code that
   needs to depend on Guile 2.0.7 or newer for other reasons
   (e.g. reliance on another added feature or significant bugfix).

Blech, that second part is terrible.

Mark is right that ‘...’ is the only workable solution, in terms of
compatibility.  So we need that one.

A potential problem is that in .bashrc, shell scripts, etc., it’s going
to be hard to make sure that ‘...’ remains first, when that’s what you
want, because you’ll inevitably find legacy code that does things like:

  export GUILE_LOAD_PATH=$HOME/soft/share/guile/site/2.0${GUILE_LOAD_PATH:+:}$GUILE_LOAD_PATH

thereby moving ‘...’ further away.

Pfff, this is really terrible.

Mark: WDYT?

Ludo’.



Re: Adding to the end of the load path

2012-11-15 Thread Mark H Weaver
Hi Andreas,

Andreas Rottmann a.rottm...@gmx.at writes:

 l...@gnu.org (Ludovic Courtès) writes:

 I pretty much like Mark’s suggestion of using ‘...’ as a special marker,
 even though that’s a valid file name.

 Well, there's a workaround -- specifying ./... as an escape sequence
 for ... if you really need to have a three-dot relative directory in
 the path.

 How would that work for you?

 I would like the approach using separate _SUFFIX variables better, as it
 doesn't have this special case.

As I wrote earlier, I certainly agree that the _SUFFIX approach is
cleaner.  Unfortunately, we need a solution that will work nicely with
earlier versions of Guile.

 While I don't think the special case
 will be a problem for a user setting the environment variables
 themselves, if you want to set them programmatically, you now have to
 consider treating ... specially, escaping it as mentioned above, to be
 general.

Note that PATH-style variables are already not general, because they
provide no way to include filenames containing ':' (a colon).

In general, it's best to avoid setting GUILE_LOAD_PATH programmatically,
because it will affect more than just the instance of Guile you
intended; it will also affect any subprocesses that use Guile.  It's
better to use -L which is fully general without any special cases, or to
modify %load-path within the program itself.
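
As a small illustration of the last option, appending a directory to the end of the load path from within a program could look like the sketch below. The directory name is a made-up example; the change only affects the running Guile process and is not inherited by subprocesses.

```scheme
;; Append a directory to the end of the load path for this process only.
;; The directory name is a made-up example.
(define extra-dir "/opt/example/share/guile/site")

(set! %load-path (append %load-path (list extra-dir)))
```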

 However, I can live with that, but maybe we can have it both
 ways:

 - Add the _SUFFIX environment variables, making it clear in the docs
   that they are supported only from Guile 2.0.7 onward.

Yes, I agree this is a good idea.

 - Additionally, add ... as a special marker, but mention it is just
   provided to support Guile < 2.0.7, and should not be used in code that
   needs to depend on Guile 2.0.7 or newer for other reasons
   (e.g. reliance on another added feature or significant bugfix).

Again, these environment variables are not specific to any particular
piece of code.  They are usually associated with an entire user account.

 I'm not sure how the deprecation strategy is employed exactly, but we
 could mark the ... feature as deprecated right away, or at least in
 master, and remove it in 2.2 or 2.4.

I don't think we can mark it deprecated until versions of Guile older
than 2.0.7 have become very rare, which won't be until at least 2017
(due to Ubuntu 12.04 LTS), and then it will need to be deprecated for a
couple more years before we can get rid of it entirely.  Therefore, I
think it's premature to emphasize the transient nature of the ...
marker.  Like it or not, we'll probably be stuck with it for 7 or 8
years.

Does that make sense?

Regards,
  Mark



Re: thoughts on native code

2012-11-15 Thread Andreas Rottmann
Sjoerd van Leent Privé svanle...@gmail.com writes:

 Hi Stefan,

 Just my idea about an assembler in Scheme. Sounds interesting. If it's
 done properly, it can be very promising to use scheme itself to
 directly emit machine instructions. This would also be interesting for
 meta compilation in the future (think of aiding GCC).

 So you are thinking about an assembler for x86? Perhaps I can help out
 on this one. I would like to do this part, as I haven't been able to
 aid on other parts besides voicing my ideas (anyways, I am on embedded
 development these days.)

 The only open question is the syntax, I believe: should it be AT&T-like,
 Intel-like, or should we leave this domain and do something new? I would
 go for instructions like this (using macros):

 (let ((target :x686))
   (assemble target
     ((mov long 100 EAX)
      (mov long 200 EBX)
      (add long EBX EAX))))

 Giving back the native machine code instructions. Perhaps special
 constructions can be made to return partially complete instructions
 (such as missing labels or calls to guile procedures...)

Regarding the assembler: I have a working AVR assembler [0] in my
avrth AVR Forth implementation [1]. That assembler also employs this
partially complete instructions idea you mentioned, having a symbol
resolution step. It also supports a simple evaluator for Assembly-time
expressions.  Maybe you find it interesting :-).

Here's an example snippet, from the Forth implementation's runtime:

(define-primitive-vocable vocabulary:primitive 1ms (vm)
  (scheme
   (sleep-seconds 0.001))
  (assembly
   (ldi zl (lo8 (/ cpu-frequency 4000)))
   (ldi zh (hi8 (/ cpu-frequency 4000)))
   (sbiw zl 42) ;internal plus forth kernel overhead
   PFA_1MS1
   (sbiw zl 1)
   (brne PFA_1MS1)))

The assembler code is the `(assembly ...)' part, but above that you can
see the runnable documentation, written in Scheme ;-).  You may also have
noticed the expressions used, i.e. `(lo8 ...)' and `(hi8 ...)' -- these
are evaluated at assembly-time by the `assembler-eval' as found in [0].
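
To give an idea of what assembly-time evaluation amounts to, here is a minimal sketch. It is hypothetical and not taken from avrth: the real `assembler-eval' may work differently, `eval-operand` is a made-up name, `/` is assumed to mean integer division, and the 16 MHz clock is chosen only for the example.

```scheme
;; Hypothetical sketch of assembly-time evaluation; the real
;; `assembler-eval' in avrth may work differently.
(define (lo8 n) (modulo n 256))                  ; low byte of a 16-bit value
(define (hi8 n) (modulo (quotient n 256) 256))   ; high byte

;; Evaluate an operand expression against an alist of known constants.
(define (eval-operand expr env)
  (cond ((number? expr) expr)
        ((symbol? expr)
         (let ((binding (assq expr env)))
           (if binding
               (cdr binding)
               (error "unbound constant" expr))))
        ((pair? expr)
         (let ((args (map (lambda (e) (eval-operand e env)) (cdr expr))))
           (case (car expr)
             ((/) (apply quotient args))   ; assumed integer division
             ((lo8) (apply lo8 args))
             ((hi8) (apply hi8 args))
             (else (error "unknown operator" (car expr))))))
        (else (error "bad operand" expr))))

;; Assumed clock of 16 MHz, chosen only for the example.
(define env '((cpu-frequency . 16000000)))
(define low  (eval-operand '(lo8 (/ cpu-frequency 4000)) env))
(define high (eval-operand '(hi8 (/ cpu-frequency 4000)) env))
;; 16000000 / 4000 = 4000 = #x0FA0, so low => 160 and high => 15
```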

While I'm probably too time-starved to really help with Guile's
assembler, I'd be interested in having, and maybe working on (if time
permits), a Scheme-based assembler targeting ARM platforms, so I might
chime in this area at some point.

It should be fine to use my code (from a copyright point of view) as a
basis for something to be incorporated into Guile (even though that file is
marked GPLv2, not GPLv2+), as long as you take care to eliminate the
actual AVR instruction generation code.  This part (i.e., the section
marked "Code emitters") is mostly originally transcribed from the
assembler [2] written in Forth included in AmForth [3], which is GPLv2
(only, unfortunately).  However, I'm not even sure if the AmForth
copyright is still applicable to that part of my code, as the code has
been transformed substantially by transcription.  That might well be a
moot point, we probably don't want to target 8-bit microcontrollers as
Guile platforms anyway ;-).

[0] http://rotty.xx.vu/gitweb/?p=scheme/avrth.git;a=blob;f=assembler.sls;hb=HEAD
[1] http://rotty.xx.vu/gitweb/?p=scheme/avrth.git;a=summary
[2] http://amforth.svn.sourceforge.net/viewvc/amforth/trunk/lib/assembler.frt?revision=1301&view=markup
[3] http://amforth.sourceforge.net/

Regards, Rotty
-- 
Andreas Rottmann -- http://rotty.xx.vu/



Re: Adding to the end of the load path

2012-11-15 Thread Ludovic Courtès
Mark H Weaver m...@netris.org skribis:

 However, I can live with that, but maybe we can have it both
 ways:

 - Add the _SUFFIX environment variables, making it clear in the docs
   that they are supported only from Guile 2.0.7 onward.

 Yes, I agree this is a good idea.

But then, what would happen when GUILE_LOAD_PATH_SUFFIX is present *and*
GUILE_LOAD_PATH contains ‘...’?

Seems like a can of worms to me.

Ludo’.



Re: Adding to the end of the load path

2012-11-15 Thread Noah Lavine
Hello,

This is coming late in the discussion, but I'd like to suggest a somewhat
different approach. I hope this is helpful.

It seems to me that in the end, the module-lookup system may need to be
more complex than having regular and suffix lookup paths. For instance, one
of the big concerns here was reducing the number of stat() calls. What if
we know that some load directories only contain certain modules? We might
want a way for the user to say all the (foo ...) modules live in ~/foo,
but you don't have to look for any other modules there. Or what if I want
to use a backup version of a module that's also included in the regular
Guile distribution, because I haven't ported my code to a new version yet
(yes, I should use module versions, but I don't)? There might be more
complicated scenarios too.

Given that the module-lookup system is fundamentally complicated, I'm going
to suggest that we *don't* try to make it all configurable by environment
variables. If people want full control of lookups, they can write a
site-wide Guile init file or a personal ~/.guile. The regular load-path
would still be part of the system, and that would be configurable by an
environment variable, so legacy setups would continue to work. However, I'd
be happy saying that if you wanted to access all of the functionality, you
need to write Scheme code. Let's start by adding Scheme interfaces to the
functionality we want, and maybe not worry about environment variables if
they're complicated.
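
To sketch what such a Scheme interface might look like, consider a table that restricts where modules with a given name prefix are searched. Everything below is hypothetical and not an existing Guile interface: `prefix-dirs`, `prefix?` and `search-dirs` are made-up names, and the directory is an example standing in for ~/foo.

```scheme
;; Hypothetical sketch, not an existing Guile interface: a table mapping
;; module-name prefixes to the only directories that need searching.
(define prefix-dirs
  '(((foo) . ("/home/user/foo"))))

;; True when PREFIX is a leading sublist of NAME.
(define (prefix? prefix name)
  (cond ((null? prefix) #t)
        ((null? name) #f)
        ((eq? (car prefix) (car name))
         (prefix? (cdr prefix) (cdr name)))
        (else #f)))

;; Directories to search for module NAME, falling back to DEFAULT-PATH
;; when no prefix in the table matches.
(define (search-dirs name default-path)
  (let loop ((table prefix-dirs))
    (cond ((null? table) default-path)
          ((prefix? (car (car table)) name) (cdr (car table)))
          (else (loop (cdr table))))))
```

With this table, `(search-dirs '(foo bar) %load-path)` would return only the foo directory, while any other module name falls back to the regular load path, so fewer directories need to be probed with stat().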

What do you think?
Noah



On Thu, Nov 15, 2012 at 5:44 PM, Mark H Weaver m...@netris.org wrote:

 [...]