Hi :)
On Mon 27 Feb 2017 17:07, Eli Zaretskii writes:
From: Andy Wingo
Date: Sun, 26 Feb 2017 22:20:31 +0100
In Scheme, strings are sequences of characters. Encoding and decoding
is only needed when going to and from bytes. Guile supports a finite
number of
> Date: Mon, 27 Feb 2017 20:24:19 +0000 (GMT)
> From: Jan Wedekind
> cc: Eli Zaretskii , guile-user@gnu.org
>
> The encoding support of the Ruby programming language [1] is IMHO pretty
> good. It can handle different encodings for source code, input/output,
>
Hi :)
On Mon 27 Feb 2017 17:07, Eli Zaretskii writes:
>> From: Andy Wingo
>> Date: Sun, 26 Feb 2017 22:20:31 +0100
>>
>> In Scheme, strings are sequences of characters. Encoding and decoding
>> is only needed when going to and from bytes. Guile supports a
> From: Andy Wingo
> Cc: Chris Vine , guile-user@gnu.org
> Date: Sun, 26 Feb 2017 21:58:00 +0100
>
> On Wed 15 Feb 2017 18:07, Eli Zaretskii writes:
>
> > the [Emacs] MS-Windows port pretends towards Emacs internals that file
> >
On Mon 27 Feb 2017 13:09, David Kastrup writes:
> Andy Wingo writes:
>> I seriously invite you to read the fine manual, specifically the first
>> four subsections of this node:
>>
>>
>>
Andy Wingo writes:
> On Mon 27 Feb 2017 10:10, David Kastrup writes:
>
>>> String ports have nothing to do with the discussion AFAIU. (Ports in
>>> Guile are sequences of bytes also. They may be accessed using
>>> textual interfaces as well.
>>
>> They can
Hello,
On Mon 27 Feb 2017 10:10, David Kastrup writes:
> Andy Wingo writes:
>
>> Legacy programs don't use codepoints >255.
>
> Sort of a moot point when Guile makes the decision to interpret external
> files with codepoints >255. Not every data processed by a
Andy Wingo writes:
> Hello,
>
> I feel the need to correct points in this mail for the benefit of
> guile-user. No reply is needed.
>
> On Wed 15 Feb 2017 00:58, David Kastrup writes:
>
>> Mike Gran writes:
>>
>>> But, for what it is worth, the
Hello,
I feel the need to correct points in this mail for the benefit of
guile-user. No reply is needed.
On Wed 15 Feb 2017 00:58, David Kastrup writes:
> Mike Gran writes:
>
>> But, for what it is worth, the Latin-1/UCS-32 design decision came
>> from a
Hi,
On Wed 15 Feb 2017 18:07, Eli Zaretskii writes:
> the [Emacs] MS-Windows port pretends towards Emacs internals that file
> names are encoded in UTF-8, and shadows relevant system APIs that
> accept or return file names, like fopen, opendir/readdir, stat,
> etc. with its own
On Fri, Feb 17, 2017 at 10:04:29AM +0100, David Kastrup wrote:
[...]
> You can load an executable into an Emacs buffer and do a
> search-and-replace on UTF-8 strings, then save again. Assuming that the
> replacement has been by a string of the same
Marko Rauhamaa writes:
> Eli Zaretskii :
>>> From: Marko Rauhamaa
>>> Python uses the surrogate hole in the middle of the Unicode range to
>>> represent such stray bytes, but only when naming files.
>>
>> IMO, it makes no sense to limit this to
> From: Marko Rauhamaa
> Cc: d...@gnu.org, guile-user@gnu.org
> Date: Fri, 17 Feb 2017 10:46:32 +0200
>
> > IMO, it makes no sense to limit this to file names, because (a) you
> > don't always know on all levels of the code which string is a file
> > name or a part thereof;
Eli Zaretskii :
>> From: Marko Rauhamaa
>> Python uses the surrogate hole in the middle of the Unicode range to
>> represent such stray bytes, but only when naming files.
>
> IMO, it makes no sense to limit this to file names, because (a) you
> don't always know on
> From: Marko Rauhamaa
> Cc: Eli Zaretskii , guile-user@gnu.org
> Date: Thu, 16 Feb 2017 23:13:35 +0200
>
> Python uses the surrogate hole in the middle of the Unicode range to
> represent such stray bytes, but only when naming files.
IMO, it makes no sense to
> From: David Kastrup
> Cc: Marko Rauhamaa , guile-user@gnu.org
> Date: Thu, 16 Feb 2017 21:52:48 +0100
>
> Eli Zaretskii writes:
>
> > Yes, to be viable in real-life situation, Guile needs to support
> > character strings with occasional embedded
Eli Zaretskii writes:
>> From: Marko Rauhamaa
>> Cc: d...@gnu.org, guile-user@gnu.org
>> Date: Thu, 16 Feb 2017 21:35:12 +0200
>>
>> >> If emacs managed to restore a binary/text unification (and infect Guile
>> >> in the process), that would be quite an
> From: Marko Rauhamaa
> Cc: d...@gnu.org, guile-user@gnu.org
> Date: Thu, 16 Feb 2017 21:35:12 +0200
>
> >> If emacs managed to restore a binary/text unification (and infect Guile
> >> in the process), that would be quite an accomplishment.
> >
> > I don't understand what
Eli Zaretskii :
>> From: Marko Rauhamaa
>> You could leave character strings to application libraries for
>> newsreaders, IRC clients etc, and have a separate byte string data
>> type for the system interface.
>
> I don't know what you mean by "application
Mike Gran writes:
> On Thursday, February 16, 2017 9:39 AM, Marko Rauhamaa
> wrote:
>> Eli Zaretskii :
>
>>> You assume that Emacs concatenates strings by just splicing its bytes.
>>> But that's a far cry from what Emacs does, precisely to
> From: Marko Rauhamaa
> Cc: d...@gnu.org, guile-user@gnu.org
> Date: Thu, 16 Feb 2017 20:38:31 +0200
>
> Eli Zaretskii :
>
> > In any case, this is unrelated to how strings are implemented, because
> > the basic level of string implementation _must_ support
Eli Zaretskii :
> In any case, this is unrelated to how strings are implemented, because
> the basic level of string implementation _must_ support binary,
> character by character (and byte by byte) comparison. Otherwise, you
> won't be able to compare file names equal, for example,
> From: Marko Rauhamaa
> Cc: d...@gnu.org, guile-user@gnu.org
> Date: Thu, 16 Feb 2017 18:38:48 +0200
>
> Eli Zaretskii :
>
> > Why is that a problem? Unicode generally mandates that equivalent
> > character (a.k.a. "codepoint") sequences shall be handled the
> From: Marko Rauhamaa
> Cc: guile-user@gnu.org
> Date: Thu, 16 Feb 2017 18:35:48 +0200
>
> Eli Zaretskii :
>
> > You assume that Emacs concatenates strings by just splicing its bytes.
> > But that's a far cry from what Emacs does, precisely to countermand
> >
Eli Zaretskii :
> Why is that a problem? Unicode generally mandates that equivalent
> character (a.k.a. "codepoint") sequences shall be handled the same by
> applications, both while processing the text (e.g., searching it etc.)
> and when displaying it.
As I just said in another
> From: Marko Rauhamaa
> Cc: to...@tuxteam.de, guile-user@gnu.org
> Date: Thu, 16 Feb 2017 09:02:09 +0200
>
> Eli Zaretskii :
>
> >> From: Marko Rauhamaa
> >> Cc: to...@tuxteam.de, guile-user@gnu.org
> >> Date: Thu, 16 Feb 2017 08:15:57 +0200
> From: Marko Rauhamaa
> Date: Thu, 16 Feb 2017 14:14:41 +0200
> Cc: guile-user@gnu.org
>
> (On the other side of the equation, expressing a filename in Unicode may
> not produce an unambiguous code point sequence... <http://unicode.org/faq/normalization.html>)
Why is that a
> From: Marko Rauhamaa
> Cc: guile-user@gnu.org
> Date: Thu, 16 Feb 2017 09:16:21 +0200
>
> If I understood it correctly, someone just told us emacs maps illegal
> UTF-8 to another form of illegal UTF-8 and back. That's better in that
> it's bytes to bytes (leaving Unicode
David Kastrup :
> Marko Rauhamaa writes:
>> And the point of bringing concatenation into the discussion was that
>> remapping byte sequences to byte sequences breaks concatenation
>> additivity:
>>
>>     U(x) + U(y) = U(x + y)
>
> But Emacs' implementation doesn't
Marko Rauhamaa writes:
> David Kastrup :
>> It's still irrelevant since split does not _use_ the existing file name
>> for constructing new file names.
>
> Split was just an example of a command that concatenates bytes sequences
> to get pathnames, nothing more.
>
David Kastrup :
> It's still irrelevant since split does not _use_ the existing file name
> for constructing new file names.
Split was just an example of a command that concatenates bytes sequences
to get pathnames, nothing more.
Such concatenation is commonplace in Linux programs
Marko Rauhamaa writes:
> David Kastrup :
>
>> Marko Rauhamaa writes:
>>> You probably cannot produce valid UTF-8 out of invalid UTF-8 snippets
>>> with split(1). However split(1) does form filenames out of its
>>> arguments by concatenation:
>>>
David Kastrup :
> Marko Rauhamaa writes:
>> You probably cannot produce valid UTF-8 out of invalid UTF-8 snippets
>> with split(1). However split(1) does form filenames out of its
>> arguments by concatenation:
>>
>> split --additional-suffix=suffix file
Marko Rauhamaa writes:
> David Kastrup :
>
>> Marko Rauhamaa writes:
>>> That operation fails if you try to translate the snippets to strings
>>> before concatenation. Such concatenation operations are commonplace
>>> when dealing with filenames
David Kastrup :
> Marko Rauhamaa writes:
>> That operation fails if you try to translate the snippets to strings
>> before concatenation. Such concatenation operations are commonplace
>> when dealing with filenames (eg, split(1)).
>
> split(1) does not "deal with
Marko Rauhamaa writes:
> Eli Zaretskii :
>
>> Btw, if by "UCS-2" you meant to say that only characters within the
>> BMP are supported in file names on Windows, then this is wrong
>
> No, I'm claiming Windows allows pathnames to contain isolated surrogate
> code
Eli Zaretskii :
> Btw, if by "UCS-2" you meant to say that only characters within the
> BMP are supported in file names on Windows, then this is wrong
No, I'm claiming Windows allows pathnames to contain isolated surrogate
code points, which cannot be decoded back to Unicode with
Eli Zaretskii :
>> From: Marko Rauhamaa
>> Cc: to...@tuxteam.de, guile-user@gnu.org
>> Date: Thu, 16 Feb 2017 08:15:57 +0200
>>
>> It is possible to have illegal Unicode even in Windows filenames, ie,
>> filenames not expressible using Guile's strings.
>
> Is it
> Date: Thu, 16 Feb 2017 08:29:14 +0200
> From: Eli Zaretskii
> Cc: guile-user@gnu.org
>
> > From: Marko Rauhamaa
> > Cc: to...@tuxteam.de, guile-user@gnu.org
> > Date: Thu, 16 Feb 2017 08:15:57 +0200
> >
> > It is possible to have illegal Unicode even in
> From: Marko Rauhamaa
> Cc: to...@tuxteam.de, guile-user@gnu.org
> Date: Thu, 16 Feb 2017 08:15:57 +0200
>
> It is possible to have illegal Unicode even in Windows filenames, ie,
> filenames not expressible using Guile's strings.
Is it really possible? Can you show a code
Eli Zaretskii :
>> From: Marko Rauhamaa
>> Cc: to...@tuxteam.de, guile-user@gnu.org
>> Date: Wed, 15 Feb 2017 23:04:52 +0200
>>
>> Eli Zaretskii :
>>
>> > At the file system level (for NTFS volumes at least) Windows file
>> > names are always
> Date: Wed, 15 Feb 2017 22:15:52 +0100
> From: to...@tuxteam.de
> Cc: guile-user@gnu.org
>
> > > > A possible solution would be to decode each mount point's part as it
> > > > is being resolved.
> > >
> > > ...which can only be based on guesswork: there's no reliable info on
> > > the encoding
> From: Marko Rauhamaa
> Cc: to...@tuxteam.de, guile-user@gnu.org
> Date: Wed, 15 Feb 2017 23:04:52 +0200
>
> Eli Zaretskii :
>
> > At the file system level (for NTFS volumes at least) Windows file
> > names are always UTF-16 encoded, and Windows just "knows"
Eli Zaretskii :
> At the file system level (for NTFS volumes at least) Windows file
> names are always UTF-16 encoded, and Windows just "knows" that.
Hm, I had the impression NTFS filenames were UCS-2 (<https://en.wikipedia.org/wiki/Talk%3AUTF-16/UCS-2>).
Marko
> Date: Wed, 15 Feb 2017 21:20:56 +0100
> From: to...@tuxteam.de
> Cc: guile-user@gnu.org
>
> > > Most notably, the whole path might cross several mount points, thus
> > > the whole path can well have fragments coming from several file systems.
> >
> > A possible solution would be to decode each
> Date: Wed, 15 Feb 2017 21:07:53 +0100
> From: to...@tuxteam.de
> Cc: to...@tuxteam.de, d...@gnu.org, guile-user@gnu.org
>
> > It took many years because those smart, experienced, and patient
> > people made bad decisions, twice, and had to correct them later, which
> > required rewriting
On Wed, Feb 15, 2017 at 06:59:14PM +0200, Eli Zaretskii wrote:
> > Date: Wed, 15 Feb 2017 10:18:32 +0100
> > From:
[...]
> > Most notably, the whole path might cross several mount points, thus
> > the whole path can well have
On Wed, Feb 15, 2017 at 07:04:10PM +0200, Eli Zaretskii wrote:
> > Date: Wed, 15 Feb 2017 11:10:33 +0100
> > From:
> > Cc: guile-user@gnu.org
> >
> > Yes, Emacs is the text specialist.
> >
> > It has taken years and a bunch of
Eli Zaretskii :
>> Date: Wed, 15 Feb 2017 10:18:32 +0100
>> From:
>> I think the only sane way to see a Linux file system path is the way
>> Linux sees it: as a byte string.
>
> This would lose a lot in 99% of use cases. You are, in effect,
> suggesting a "reverse
> Date: Wed, 15 Feb 2017 10:18:32 +0100
> From:
>
> > Filenames and locales are not necessarily related. When you access a
> > networked file system, you get the filename encoding you are given,
> > which may or may not be the same as the particular locale encoding on
> > your
On Wed, Feb 15, 2017 at 01:11:57PM +0000, Chris Vine wrote:
> On Wed, 15 Feb 2017 13:41:24 +0100
> wrote:
> [snip]
> > What I don't like about the fluid is that it still doesn't give you
> > an escape hatch in hard cases (your USB
On Wed, 15 Feb 2017 13:41:24 +0100
wrote:
[snip]
> What I don't like about the fluid is that it still doesn't give you
> an escape hatch in hard cases (your USB stick example).
The program would just have to document that any mount point must have
a path in a character set (eg
On Wed, Feb 15, 2017 at 12:13:09PM +0000, Chris Vine wrote:
> On Wed, 15 Feb 2017 12:48:20 +0100
> wrote:
> > On Wed, Feb 15, 2017 at 10:15:33AM +0000, Chris Vine wrote:
> [snip]
> > > I would prefer guile to make the filename
:
> On Wed, Feb 15, 2017 at 12:58:41AM +0100, David Kastrup wrote:
>> LilyPond is getting removed from Debian and other distributions
>> because it is still hopeless to get it to run under Guile-2 (the
>> experimental support has encoding and stability problems and runs
>> about
On Wed, Feb 15, 2017 at 10:15:33AM +0000, Chris Vine wrote:
[...]
> I don't disagree. My purpose was to point out that in the modern
> world of networking and plug-in devices, locales and filenames are
> disjoint.
>
> The glib approach is better
Marko Rauhamaa writes:
> David Kastrup :
>
>> If you tell Emacs that some external entity is in UTF-8, it will
>> represent all valid UTF-8 sequences as properly decoded characters,
>> and it has special codes for all bytes not part of valid UTF-8.
>>
>> As a
David Kastrup :
> If you tell Emacs that some external entity is in UTF-8, it will
> represent all valid UTF-8 sequences as properly decoded characters,
> and it has special codes for all bytes not part of valid UTF-8.
>
> As a result, it works with valid UTF-8 perfectly as expected
On Wed, 15 Feb 2017 10:18:32 +0100
wrote:
> On Tue, Feb 14, 2017 at 10:19:14PM +0000, Chris Vine wrote:
[snip]
> > Filenames and locales are not necessarily related. When you access
> > a networked file system, you get the filename encoding you are
> > given, which may or may
On Wed, Feb 15, 2017 at 12:58:41AM +0100, David Kastrup wrote:
[...]
> Not just on yours. LilyPond is probably the largest application using
> Guile as its extension language, with pretty much the worst impacts of
> Guile-2 design decisions. So
On Wed, Feb 15, 2017 at 10:54:06AM +0100, David Kastrup wrote:
> writes:
[...]
> > Not easy.
>
> If you tell Emacs that some external entity is in UTF-8, it will
> represent all valid UTF-8 sequences as properly decoded
On Tue, Feb 14, 2017 at 10:19:14PM +0000, Chris Vine wrote:
> On Tue, 14 Feb 2017 21:52:01 +0000 (UTC)
> Mike Gran wrote:
> [snip]
> > > In particular, filenames are *not*, nor can they be mapped to,
> > > Unicode
> >
> > >
Chris Vine :
> On Tue, 14 Feb 2017 21:52:01 +0000 (UTC)
> Mike Gran wrote:
>> True. Linux should follow OpenBSD and make all locales UTF-8.
>
> Filenames and locales are not necessarily related.
Linux *could* force that reality.
> When you access
Linas Vepstas skribis:
> On Mon, Jan 30, 2017 at 1:27 PM, David Kastrup wrote:
>> Marko Rauhamaa writes:
>>> David Kastrup :
Marko Rauhamaa writes:
> Guile's mistake was to move to Unicode strings
On Tue, 14 Feb 2017 21:52:01 +0000 (UTC)
Mike Gran wrote:
[snip]
> > In particular, filenames are *not*, nor can they be mapped to,
> > Unicode
>
> > strings in Linux.
>
> True. Linux should follow OpenBSD and make all locales UTF-8.
Filenames and locales are not
Mike Gran :
> On Tuesday, February 14, 2017 1:07 PM, Marko Rauhamaa
> wrote:
>> Unicode strings are a special data type that have relatively little
>> practical use. Byte strings are much more fundamental. C's "char *"
>> is perfect.
>
> Human language itself
Mike Gran :
> The great difficulty with the UTF-8 Guile prototype was the need to
> interrogate every string access or index to decide if it was a
> codepoint index or a byte index.
Unicode strings are a special data type that have relatively little
practical use. Byte strings
On Mon, Jan 30, 2017 at 1:27 PM, David Kastrup wrote:
> Marko Rauhamaa writes:
>> David Kastrup :
>>> Marko Rauhamaa writes:
>>>> Guile's mistake was to move to Unicode strings in the operating system
>>>> interface.
>>>
>>> Emacs
On Mon, Jan 30, 2017 at 10:46:26PM +0200, Marko Rauhamaa wrote:
[...]
> You are jumping the gun. Linux won't be there for a long time if ever.
> Nothing prevents a pathname, or a command-line argument, or an
> environment variable, or the standard
Eli Zaretskii writes:
>> Date: Mon, 30 Jan 2017 20:42:38 +0000 (UTC)
>> From: Mike Gran
>> Cc: "guile-user@gnu.org"
>>
>> Earlier in the 2.0.x release series, Guile had a hack where it started
>> up in a Latin-1 encoding, which would be
> Date: Mon, 30 Jan 2017 20:42:38 +0000 (UTC)
> From: Mike Gran
> Cc: "guile-user@gnu.org"
>
> Earlier in the 2.0.x release series, Guile had a hack where it started
> up in a Latin-1 encoding, which would be capable of storing any
> 8-bit string of bytes,
Eli Zaretskii :
>> From: Marko Rauhamaa
>>
>> UTF-8 beautifully bridges the interpretation gap between 8-bit character
>> strings and text. However, the interpretation step should be done in the
>> application and not in the programming language.
>
> You can't do
On Monday, January 30, 2017 12:00 PM, Eli Zaretskii wrote:
> Actually, the need arises even sooner. Consider how load-path is set
> up during startup: it starts with the directory from which Emacs was
> invoked, either from argv[0] or by looking up PATH. Either way, you
> get a
> Date: Mon, 30 Jan 2017 21:32:41 +0200
> From: Eli Zaretskii
> Cc: guile-user@gnu.org
>
> > Hm, I know that XEmacs-Mule emphatically does not have unibyte strings
> > (and Stephen considers them a complication and abomination that should
> > never have been left in Emacs), so it
> From: Marko Rauhamaa
> Date: Mon, 30 Jan 2017 21:01:31 +0200
> Cc: guile-user@gnu.org
>
> UTF-8 beautifully bridges the interpretation gap between 8-bit character
> strings and text. However, the interpretation step should be done in the
> application and not in the
> From: David Kastrup
> Cc: ma...@pacujo.net, guile-user@gnu.org
> Date: Mon, 30 Jan 2017 20:00:03 +0100
>
> Eli Zaretskii writes:
>
> > One other crucial detail is that Emacs also has unibyte strings
> > (arrays of bytes), which are necessary during startup, when
Marko Rauhamaa writes:
> David Kastrup :
>
>> Marko Rauhamaa writes:
>>> Guile's mistake was to move to Unicode strings in the operating system
>>> interface.
>>
>> Emacs uses an UTF-8 based encoding internally [...]
>
> C uses 8-bit characters.
David Kastrup :
> Marko Rauhamaa writes:
>> Guile's mistake was to move to Unicode strings in the operating system
>> interface.
>
> Emacs uses an UTF-8 based encoding internally [...]
C uses 8-bit characters. That is a model worth emulating.
UTF-8 beautifully
> From: David Kastrup
> Date: Mon, 30 Jan 2017 19:32:14 +0100
> Cc: guile-user@gnu.org
>
> Emacs uses an UTF-8 based encoding internally: basically, valid UTF-8 is
> represented as itself, there is a number of coding points beyond the
> actual limit of UTF-8 that is used for
Eli Zaretskii writes:
>> From: David Kastrup
>> Date: Mon, 30 Jan 2017 19:32:14 +0100
>> Cc: guile-user@gnu.org
>>
>> Emacs uses an UTF-8 based encoding internally: basically, valid UTF-8 is
>> represented as itself, there is a number of coding points beyond the
>>
Marko Rauhamaa writes:
> David Kastrup :
>
>> But at any rate, this cannot easily be fixed since Guile uses libraries
>> for encoding/decoding that cannot deal reproducibly with improper byte
>> patterns.
>
> Guile's mistake was to move to Unicode strings in the
David Kastrup :
> But at any rate, this cannot easily be fixed since Guile uses libraries
> for encoding/decoding that cannot deal reproducibly with improper byte
> patterns.
Guile's mistake was to move to Unicode strings in the operating system
interface.
> The problem here is
l...@gnu.org (Ludovic Courtès) writes:
[...]
> However, in 2.0, the current locale is *not* installed; you have to
> either call ‘setlocale’ explicitly (like in C), or set this environment
> variable (info "(guile) Environment Variables"):
>
> GUILE_INSTALL_LOCALE=1
>
> When you do that (and
Marko Rauhamaa writes:
> David Kastrup :
>
>> Marko Rauhamaa writes:
>>> l...@gnu.org (Ludovic Courtès):
>>>> Guile assumes its command-line arguments are UTF-8-encoded and
>>>> decodes them accordingly.
>>>
>>> I'm afraid that choice (which
Hey Dave!
David Kastrup skribis:
> l...@gnu.org (Ludovic Courtès) writes:
[...]
>>> ERROR: In procedure open-file: No such file or directory:
>>> "/home/hermann/Desktop/filename_\u540d\u5b57.scm"
>>
>> In C, argv is just an array of byte sequences, but in Guile,
>>
David Kastrup :
> Marko Rauhamaa writes:
>> l...@gnu.org (Ludovic Courtès):
>>> Guile assumes its command-line arguments are UTF-8-encoded and
>>> decodes them accordingly.
>>
>> I'm afraid that choice (which Python made, as well) was a bad one
>> because Linux
Marko Rauhamaa writes:
> l...@gnu.org (Ludovic Courtès):
>
>> In C, argv is just an array of byte sequences, but in Guile,
>> (command-line) returns a list of strings, not a list of bytevectors.
>>
>> Guile decodes its arguments according to the encoding of the current
>>
l...@gnu.org (Ludovic Courtès):
> In C, argv is just an array of byte sequences, but in Guile,
> (command-line) returns a list of strings, not a list of bytevectors.
>
> Guile decodes its arguments according to the encoding of the current
> locale. So if you’re in a UTF-8 locale (say, zh_CN.utf8
l...@gnu.org (Ludovic Courtès) writes:
> Hi!
>
> Thomas Morley skribis:
>
>> guile filename_名字.scm
>> ;;; Stat of /home/hermann/Desktop/filename_??.scm failed:
>> ;;; ERROR: In procedure stat: No such file or directory:
>>
Hi!
Thomas Morley skribis:
> guile filename_名字.scm
> ;;; Stat of /home/hermann/Desktop/filename_??.scm failed:
> ;;; ERROR: In procedure stat: No such file or directory:
> "/home/hermann/Desktop/filename_\u540d\u5b57.scm"
> Backtrace:
> In ice-9/boot-9.scm:
> 160: 8
It's a bug. There have been bugs on and off with guile utf8 handling.
One of the guile-2.0 versions does almost everything right, but utf8
is semi-broken, again in 2.2 -- some things work, but various things
that used to work great are now broken (again). I'm guessing that
guile has a
2016-11-27 13:16 GMT+01:00 Chaos Eternal :
> Seems that UTF-8 encoded string has been converted to unicode before calling
> `open',
> but on filesystem the filename is utf8 string
Your analysis is surely correct, but what to do?
I expected
guile filename_名字.scm
to work out
Seems that UTF-8 encoded string has been converted to unicode before
calling `open',
but on filesystem the filename is utf8 string
On Sun, Nov 27, 2016 at 7:58 PM Thomas Morley
wrote:
> Hi all,
>
> a Chinese user came up with a weird problem.
>
> He wants to process