Re: Chinese input causes mess in the REPL

2018-02-08 Thread Enwei Zhang

Dear Danilo,

I've tried your method just now, and it works fine!
Thanks!

Zhang Enwei

From: picolisp@software-lab.de <picolisp@software-lab.de> on behalf of Danilo 
Kordic <danilo.kor...@gmail.com>
Sent: Sunday, January 28, 2018 3:54 PM
To: picolisp@software-lab.de
Subject: Re: Chinese input causes mess in the REPL

  GNU Emacs can be used as a line editor.  Evaluate the elisp expression
``(term "/absolute/path/to/pil")'' then activate `term-line-mode' with
``C-c C-j''.

  Is this of any help?  How much does it count :) ?

--
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe

Re: Chinese input causes mess in the REPL

2018-02-04 Thread Thorsten Jolitz

Danilo Kordic writes:

>   GNU Emacs can be used as a line editor.  Evaluate the elisp expression
> ``(term "/absolute/path/to/pil")'' then activate `term-line-mode' with
> ``C-c C-j''.
>
>   Is this of any help?  How much does it count :) ?

Cool! I did not know this ...

-- 
cheers,
Thorsten



Re: Chinese input causes mess in the REPL

2018-01-29 Thread Enwei Zhang

Hi all,

Thank you for your information.

Last weekend I tried 'picolisp', 'pil', 'pil +' and the Java version. There is
also an issue when deleting a Chinese character that has been typed in:

input '你好' (nihao in Pinyin) first,

: 你好

and then press backspace once; the line becomes:

: 你

There is a space after '你', so only half of '好' has been deleted.
After deleting that space, the editor considers the start of the line reached,
so '你' cannot be deleted at all.

So I'm now reading main.c in the 32-bit PicoLisp sources (the function load()).

BTW, I found that using Chinese in source code (UTF-8) works without any
problem, like this:


# 星座

(client "xxx.com.cn" 80
   (pack "lcsservice/constellation/getFortune?"
      (urlencode "key=15531CFAC1E4E9F2FF1DB09A5DEB=射手座=tomorrow") )
   (out NIL (echo)) )

Zhang Enwei



Re: Chinese input causes mess in the REPL

2018-01-28 Thread Alexander Burger

On Sun, Jan 28, 2018 at 02:13:41PM -0800, Michel Pelletier wrote:
> Ah yeah, thanks for clarifying that.  Does this "header only" C library
> seem useful?  There's a lot of confusing stuff out there about unicode
> "width"
> 
> https://github.com/joshuarubin/wcwidth9

I think it will not be very useful, as PicoLisp uses the UTF-8 representation
directly. We would need only a single function from that lib, the one returning
the print-width of a given char after converting it to wchar.

Perhaps the easiest solution for the Chinese problem would be a simple heuristic
range check, something like (and (> C SomeLimit) (prin "^H"))

♪♫ Alex
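
The range check suggested above can be sketched as follows. This is Python rather than PicoLisp, and only a rough illustration: `WIDE_LIMIT` is a hypothetical stand-in for "SomeLimit" (0x2E80, the start of the CJK Radicals Supplement block, is an assumed value, not one given in this thread), and `extra_backspaces` is an invented name.

```python
# Assumed threshold standing in for "SomeLimit"; not a value from the thread.
WIDE_LIMIT = 0x2E80


def extra_backspaces(ch: str) -> int:
    """Return 1 if one additional backspace should be sent for ch, else 0."""
    # Mirrors (and (> C SomeLimit) (prin "^H")): one extra backspace
    # for every code point above the limit.
    return 1 if ord(ch) > WIDE_LIMIT else 0


if __name__ == "__main__":
    for ch in "a", "é", "你", "好":
        print(repr(ch), extra_backspaces(ch))
```

A single threshold is cheap but imprecise: halfwidth katakana such as U+FF76 lie above the limit although they normally take one column, so this sketch would send one backspace too many for them.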


Re: Chinese input causes mess in the REPL

2018-01-28 Thread Michel Pelletier

Ah yeah, thanks for clarifying that.  Does this "header only" C library
seem useful?  There's a lot of confusing stuff out there about unicode
"width"

https://github.com/joshuarubin/wcwidth9

On Sun, Jan 28, 2018 at 10:55 AM, Alexander Burger wrote:

> Hi Michel,
>
> > "by looking at the first byte of a multibyte character, we can determine
> > the length of the character: If the first byte is between 0xC0 and 0xDF,
>
> This is correct, and this is handled in many places inside the PicoLisp
> interpreter. Without this, the names of symbols could not be handled at
> all.
>
> The problem we have here, however, is how many positions a character takes
> up on the *screen* when printed. There is no easy rule, I think. Even Kana
> exist in two width variations.
>
> ♪♫ Alex

Re: Chinese input causes mess in the REPL

2018-01-28 Thread Alexander Burger

Hi Michel,

> "by looking at the first byte of a multibyte character, we can determine
> the length of the character: If the first byte is between 0xC0 and 0xDF,

This is correct, and this is handled in many places inside the PicoLisp
interpreter. Without this, the names of symbols could not be handled at all.

The problem we have here, however, is how many positions a character takes up
on the *screen* when printed. There is no easy rule, I think. Even Kana exist in
two width variations.

♪♫ Alex


Re: Chinese input causes mess in the REPL

2018-01-28 Thread Michel Pelletier

I'm not an expert on this, but doing a little digging I see some links on
calculating utf-8 character sizes and it seems like a lookup table is not
necessary:

http://www.daemonology.net/blog/2008-06-05-faster-utf8-strlen.html

"by looking at the first byte of a multibyte character, we can determine
the length of the character: If the first byte is between 0xC0 and 0xDF,
the UTF-8 character has two bytes; if it is between 0xE0 and 0xEF, the
UTF-8 character has 3 bytes; and if it is between 0xF0 and 0xFF, the UTF-8
character has 4 bytes."
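
The rule quoted above can be written down as a small Python sketch. It follows the quoted ranges literally; the function name `utf8_length` is made up for illustration. Note that it yields the number of *bytes* in the encoding, which is a different quantity from the number of screen columns a character occupies.

```python
def utf8_length(first_byte: int) -> int:
    """Byte length of a UTF-8 sequence, judged from its first byte."""
    if first_byte < 0x80:
        return 1  # plain ASCII
    if 0xC0 <= first_byte <= 0xDF:
        return 2
    if 0xE0 <= first_byte <= 0xEF:
        return 3
    if 0xF0 <= first_byte <= 0xFF:
        # Range as given in the quoted text; well-formed UTF-8 only uses
        # lead bytes up to 0xF4.
        return 4
    # 0x80..0xBF are continuation bytes and cannot start a character.
    raise ValueError(f"not a lead byte: {first_byte:#x}")


if __name__ == "__main__":
    for ch in "a", "é", "你":
        encoded = ch.encode("utf-8")
        print(repr(ch), utf8_length(encoded[0]), len(encoded))
```

For '你' this gives 3 bytes, while a terminal typically draws it two columns wide, so the byte length alone does not tell a line editor how many backspaces to send.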

On Sat, Jan 27, 2018 at 11:54 PM, Danilo Kordic wrote:
>
>   GNU Emacs can be used as a line editor.  Evaluate the elisp expression
> ``(term "/absolute/path/to/pil")'' then activate `term-line-mode' with
> ``C-c C-j''.
>
>   Is this of any help?  How much does it count :) ?

Re: Chinese input causes mess in the REPL

2018-01-28 Thread Danilo Kordic

  GNU Emacs can be used as a line editor.  Evaluate the elisp expression
``(term "/absolute/path/to/pil")'' then activate `term-line-mode' with
``C-c C-j''.

  Is this of any help?  How much does it count :) ?


Re: Chinese input causes mess in the REPL

2018-01-26 Thread Mike

> There is no easy fix. PicoLisp lacks the functionality to calculate the width of
> unicode characters. I suppose it needs some extra character tables and lookup
> mechanisms in the base system, or perhaps call an external C function via
> 'native'. Any suggestions?

If it relies on 'native', then pil32 will be left out.
Generic tables and lookup are enough for the next 20 years.

(mike)
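
A table-and-lookup approach along these lines could look like the Python sketch below. The ranges are an abbreviated, illustrative list of code-point intervals that are commonly drawn two columns wide; they are an assumption for this example, not a complete or authoritative table, and `WIDE_RANGES` and `is_wide` are hypothetical names.

```python
from bisect import bisect_right

# Illustrative, incomplete table of (first, last) code points assumed wide.
# Must stay sorted and non-overlapping for the binary search to work.
WIDE_RANGES = [
    (0x1100, 0x115F),    # Hangul Jamo initial consonants
    (0x2E80, 0x303E),    # CJK radicals, CJK symbols and punctuation
    (0x3040, 0xA4CF),    # Hiragana, Katakana, CJK ideographs, Yi
    (0xAC00, 0xD7A3),    # Hangul syllables
    (0xF900, 0xFAFF),    # CJK compatibility ideographs
    (0xFE30, 0xFE6F),    # CJK compatibility forms
    (0xFF00, 0xFF60),    # fullwidth forms
    (0xFFE0, 0xFFE6),    # fullwidth signs
    (0x20000, 0x2FFFD),  # supplementary ideographic plane
    (0x30000, 0x3FFFD),  # tertiary ideographic plane
]
_STARTS = [first for first, _ in WIDE_RANGES]


def is_wide(code_point: int) -> bool:
    """Binary-search the table for the interval containing code_point."""
    i = bisect_right(_STARTS, code_point) - 1
    return i >= 0 and code_point <= WIDE_RANGES[i][1]


if __name__ == "__main__":
    for ch in "a", "你", "\uff76":
        print(repr(ch), is_wide(ord(ch)))
```

The lookup costs a handful of comparisons per character and needs no external library, which is the appeal of this option; the price is that the table has to be maintained by hand.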


Re: Chinese input causes mess in the REPL

2018-01-25 Thread Alexander Burger

Hello Zhang Enwei,

> I'm studying and using PicoLisp, and trying to make it a regular dev tool in
> my daily development. It's somewhat difficult but quite interesting for me.

Glad to hear that!


> I found a defect in Chinese language support, under Mac OS X 32-bit and the
> Arm 64-bit version (in Termux), but the Java version is OK.

Yes, I know :( I've observed it with Japanese (Kanji and kana) input too.


> For example, when I input (setq x "你好"), the REPL displays this:
> : ((s(s(s(s(s(setq x "你好")
> -> "你好"

Right. It is the line editor in @lib/led.l, which cannot handle characters
taking up two places on the screen. To do it correctly, it would need to output
*two* backspaces, but the line editor doesn't know the width of these
characters.

ErsatzLisp, the Java version, does not have this problem, because it uses no
line editor ;)

In normal PicoLisp it is only in debug mode, because this loads the line editor.
Production mode PicoLisp (i.e. started without '+') should be clean in this
regard, but not very useful.


> The result is OK, but the display is messy.
> 
> I'd like to try to solve this issue. Could anyone tell me where the relevant
> code is?

It is in several places, most importantly those in @lib/led.l where
backspaces are output, in forms like

   (do D (prin "^H"))

There is no easy fix. PicoLisp lacks the functionality to calculate the width of
unicode characters. I suppose it needs some extra character tables and lookup
mechanisms in the base system, or perhaps call an external C function via
'native'. Any suggestions?

♪♫ Alex
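
To make the point about backspaces concrete, here is a minimal Python sketch (not PicoLisp, and not code from @lib/led.l) of a line editor counting screen columns instead of characters. It assumes that the classes 'W' and 'F' reported by Python's `unicodedata.east_asian_width` mean two columns and everything else one; combining characters and terminal-specific behaviour are ignored. The names `display_width` and `backspaces_for` are hypothetical.

```python
import unicodedata


def display_width(ch: str) -> int:
    """Approximate number of terminal columns taken up by one character."""
    # 'W' (wide) and 'F' (fullwidth) are assumed to occupy two columns.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def backspaces_for(text: str) -> str:
    """Backspace string that moves the cursor back over the whole of text."""
    return "\b" * sum(display_width(ch) for ch in text)


if __name__ == "__main__":
    for sample in "ab", "你好":
        print(repr(sample), len(backspaces_for(sample)))
```

Under these assumptions '你好' needs four backspaces rather than two, which is the case where a count based on characters falls short.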


Chinese input causes mess in the REPL

2018-01-25 Thread Enwei Zhang

Hello,

I'm studying and using PicoLisp, and trying to make it a regular dev tool in my
daily development. It's somewhat difficult but quite interesting for me.

I found a defect in Chinese language support, under Mac OS X 32-bit and the
Arm 64-bit version (in Termux), but the Java version is OK.

For example, when I input (setq x "你好"), the REPL displays this:
: ((s(s(s(s(s(setq x "你好")
-> "你好"
:
The result is OK, but the display is messy.

I'd like to try to solve this issue. Could anyone tell me where the relevant
code is?

Thanks a lot!

Zhang Enwei
