Re: [Guile-commits] GNU Guile branch, string_abstraction2, updated. 823e444052817ee120d87a3575acb4f767f17475

2009-05-28 Thread Ludovic Courtès
Hello,

Andy Wingo wi...@pobox.com writes:

> This is complicated in Guile by #!. A reasonable thing would be to have
> the reader have a bit on whether it actually saw an expression yet or
> not. If not, ^;+ [^\n]*coding: ... would set the file's encoding.

I think it would make sense to follow Emacs' specification of file-local
variables as closely as possible (info "(emacs) Specifying File
Variables"), as well as its naming scheme for encodings as shown by
`M-x list-coding-systems'.

Thanks,
Ludo'.




2009-05-28 Thread Andy Wingo
Hi,

On Thu 28 May 2009 16:37, l...@gnu.org (Ludovic Courtès) writes:

> Andy Wingo wi...@pobox.com writes:
>
>> This is complicated in Guile by #!. A reasonable thing would be to have
>> the reader have a bit on whether it actually saw an expression yet or
>> not. If not, ^;+ [^\n]*coding: ... would set the file's encoding.
>
> I think it would make sense to follow Emacs' specification of file-local
> variables as closely as possible (info "(emacs) Specifying File
> Variables"), as well as its naming scheme for encodings as shown by
> `M-x list-coding-systems'.

Good points, although I wonder how Emacs does the right thing regarding
coding: when the variable list is at the end of a file. But certainly
recognizing it in the first two lines of the file would be robust and
would follow Emacs.

Andy
-- 
http://wingolog.org/
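
The "first two lines" rule discussed above can be illustrated with a
minimal sketch. It is written in Python purely for illustration and is
not Guile code; the pattern loosely follows the ^;+ ... coding: regexp
from the thread rather than the full Emacs file-variable rules, and the
names CODING_RE, find_coding and max_lines are hypothetical.

```python
import re

# Hypothetical pattern: a Lisp-style comment line that carries "coding: NAME".
# This is a simplification, not the complete Emacs file-variable syntax.
CODING_RE = re.compile(rb"^;+.*?coding:\s*([-\w.]+)")


def find_coding(source: bytes, max_lines: int = 2, default: str = "utf-8") -> str:
    """Return the encoding named in the first `max_lines` lines, else `default`."""
    for line in source.splitlines()[:max_lines]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return default


# A script whose "#! ... !#" header pushes the comment to the third line.
script = b"#!/usr/local/bin/guile -s\n!#\n;; coding: latin-1\n(display 1)\n"
two_line_result = find_coding(script)                 # header hides the comment
wider_result = find_coding(script, max_lines=12)      # a wider scan finds it
```

With a two-line window the "#!" header hides the declaration, which is
the complication raised at the start of the thread; widening the window
is one possible way around it.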




2009-05-28 Thread Ludovic Courtès
Hello,

Mike Gran spk...@yahoo.com writes:

> This all means that grepping the coding is a true preprocessing
> step, divorced from the reader.

Not necessarily.  The encoding can be stored in a fluid, or associated
with the current input port, and modified by `scm_read ()' as it
encounters encoding meta-data.

Thanks,
Ludo'.
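
The idea of associating the encoding with the input port and letting
the reader update it as it meets encoding meta-data can be sketched as
follows. This is a toy model in Python, not Guile's `scm_read ()'; the
Port class, read_lines and CODING_RE are hypothetical names, and the
default of UTF-8 is an assumption.

```python
import re

# Hypothetical pattern for a comment line that declares an encoding.
CODING_RE = re.compile(rb"^;+.*?coding:\s*([-\w.]+)")


class Port:
    """Toy input port: raw bytes plus a mutable `encoding` attribute."""

    def __init__(self, data: bytes, encoding: str = "utf-8"):
        self.lines = data.splitlines()
        self.encoding = encoding


def read_lines(port: Port):
    """Decode line by line, switching port.encoding when a coding comment is seen."""
    for raw in port.lines:
        match = CODING_RE.match(raw)
        if match:
            # The declaration takes effect for this line and all later ones.
            port.encoding = match.group(1).decode("ascii")
        yield raw.decode(port.encoding)


# Three Greek letters encoded as ISO-8859-7; these bytes are not valid UTF-8.
greek = "\u03b1\u03b2\u03b3"
data = b";; coding: iso-8859-7\n(display \"" + greek.encode("iso-8859-7") + b"\")\n"
port = Port(data)
decoded = list(read_lines(port))
```

Because the state lives on the port, no separate preprocessing pass is
needed; a dynamically scoped variable (a fluid) would be an alternative
home for the same state.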





2009-05-26 Thread Andy Wingo
On Tue 26 May 2009 00:22, l...@gnu.org (Ludovic Courtès) writes:

> However, this relies on eval-after-read semantics.  That is, if the
> whole file is read at once, *then* evaluated, that won't work, right?

Or read at once, *then* compiled, *then* evaluated; or even, read one
expression at a time, compiled one at a time, but evaluated all of a
piece.

A
-- 
http://wingolog.org/




2009-05-26 Thread Mike Gran


> From: Andy Wingo wi...@pobox.com
>
> On Tue 26 May 2009 00:22, l...@gnu.org (Ludovic Courtès) writes:
>
>> However, this relies on eval-after-read semantics.  That is, if the
>> whole file is read at once, *then* evaluated, that won't work, right?
>
> Or read at once, *then* compiled, *then* evaluated; or even, read one
> expression at a time, compiled one at a time, but evaluated all of a
> piece.

If one can't depend on the order of evaluation, then the source encoding
has to become a pragma that is preprocessed.

The reader could probably preprocess the file, looking for the text
"coding: X" within a comment in the top dozen lines of a source code
file, or perhaps for a line that is explicitly "#pragma coding: X" in
the top few lines of a file.




2009-05-25 Thread Ludovic Courtès
Hello,

Michael Gran spk...@yahoo.com writes:

> add tests for encoding/decoding wide strings

Nice!

Just a bit of cosmetic nitpicking:

>   * test-suite/tests/encoding_utf8.test: new

Please use hyphens instead of underscores in file names, for
consistency.

> +(setlocale LC_ALL "en_US.utf8")

[...]

> +(setencoding "ASCII")

[...]

> +(setencoding "ISO-8859-7")

Do these modify the encoding used by the underlying port?  If so, I'd
rather explicitly use a fluid, as is done for `current-reader'.

However, this relies on eval-after-read semantics.  That is, if the
whole file is read at once, *then* evaluated, that won't work, right?

> +(setlocale LC_ALL "es_MX.ISO-8859-1")

Not everyone has this locale.  ;-)

> +(with-test-prefix
> + "internal encoding"
> +
> + (pass-if "ultima"
> +   (string=? s1 (string-ints #xfa #x6c #x74 #x69 #x6d #x61)))

Please indent as is done in other files.

Thanks!
Ludo'.




2009-05-25 Thread Mike Gran
On Tue, 2009-05-26 at 00:22 +0200, Ludovic Courtès wrote:
> Hello,
>
> Just a bit of cosmetic nitpicking:
>
>>   * test-suite/tests/encoding_utf8.test: new
>
> Please use hyphens instead of underscores in file names, for
> consistency.

OK

 
>> +(setlocale LC_ALL "en_US.utf8")
>
> [...]
>
>> +(setencoding "ASCII")
>
> [...]
>
>> +(setencoding "ISO-8859-7")
>
> Do these modify the encoding used by the underlying port?  If so, I'd
> rather explicitly use a fluid, as is done for `current-reader'.

For now, I have only one global port encoding, so setlocale and
setencoding modify all subsequent port I/O.

 
> However, this relies on eval-after-read semantics.  That is, if the
> whole file is read at once, *then* evaluated, that won't work, right?

The reader has to learn the encoding of a file by reading the file.
The way I have it set up right now, source gets evaluated sequentially
and the reader changes encoding when setlocale or setencoding is
encountered.  Kludgy, but simple.

I think Python sets the source encoding using a magic comment.  That's
another way to go: have the reader scan the comment blocks for a magic
comment before trying to evaluate the file.

 
>> +(setlocale LC_ALL "es_MX.ISO-8859-1")
>
> Not everyone has this locale.  ;-)

Viva la raza!

 
>> +(with-test-prefix
>> + "internal encoding"
>> +
>> + (pass-if "ultima"
>> +   (string=? s1 (string-ints #xfa #x6c #x74 #x69 #x6d #x61)))
>
> Please indent as is done in other files.

OK

 
> Thanks!
> Ludo'.

Thanks,

Mike
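
The remark above about Python's magic comment can be checked with
Python's own standard library, which exposes the detection step as
tokenize.detect_encoding. The short sketch below only wraps that call;
the wrapper name python_source_encoding is hypothetical. Python looks
at the first two lines only and falls back to UTF-8 when no declaration
is found.

```python
import io
import tokenize


def python_source_encoding(source: bytes) -> str:
    """Return the encoding Python's tokenizer would pick for `source`."""
    # detect_encoding reads at most two lines through the readline callable.
    encoding, _lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)
    return encoding


declared = python_source_encoding(b"# -*- coding: iso-8859-7 -*-\nx = 1\n")
undeclared = python_source_encoding(b"x = 1\n")
```

The detection runs on raw bytes before any tokenizing or evaluation
takes place, so it does not depend on eval-after-read semantics.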