Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-14 Thread Martin

On 14/12/2012 21:15, patspiper wrote:

On 14/12/12 21:33, Martin wrote:

On 13/12/2012 14:39, Martin wrote:


Ok, bad news, I did some more testing myself, and it turns out that 
Windows allocates the extra space (long connecting line) just 
anywhere in the word, or at the end of it, but not always where the 
ligature is.



.

So what's next.
The only way to support this, is to make SynEdit aware of the 
ligatures. And that is a lot of work, so it will take more time. And 
I don't yet know when I will schedule it


Ok, I found a quick way to get a use-able behaviour.

:)


*  it is currently WINDOWS ONLY

:(


For GTK, I believe this function
http://developer.gnome.org/pango/stable/pango-Text-Processing.html#pango-get-log-attrs
can deliver the information needed.
But in addition there may be a need for further fixes/improvements in 
ExtTextOut (widgetset).

Other WidgetSets: No idea

If anyone wants to give GTK a try, I will mail what needs to be done.



- ligatures are handled as follows.
   There is  no middle caret
   Depending on the caret being before or after (and accordingly 
backspace or delete being used) the first or 2nd char is deleted
  # So it is 2 chars, but any caret move will just be translated into 
skipping the middle pos


When you mention ligatures, do you mean any 2 connected characters 
(ex: ???), or characters that combine such as Lam Alef (لا)?

Lam Alef

Combining codepoints (such as dots or accents added to chars) are 
handled differently (SynEdit knows them for Arabic only):

- Backspace (from behind the char) deletes one codepoint
- Delete (from before the char) deletes the char + all its combining 
codepoints (leaving the dot without the char makes little sense?)


Combining codepoints not known to SynEdit (non-Arabic) will act like 
ligatures.
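
A minimal sketch of the two editing rules above, written in Python rather than Pascal for brevity. It is not SynEdit's code: the function names are hypothetical, and Python's `unicodedata.combining` stands in for SynEdit's own list of Arabic combining codepoints.

```python
import unicodedata


def is_combining(ch):
    # Assumption: a Unicode combining class > 0 stands in for SynEdit's own list.
    return unicodedata.combining(ch) != 0


def backspace(text, caret):
    """Backspace: remove exactly one codepoint before the caret."""
    if caret == 0:
        return text, 0
    return text[:caret - 1] + text[caret:], caret - 1


def delete(text, caret):
    """Delete: remove the char at the caret plus all combining codepoints on it."""
    if caret >= len(text):
        return text, caret
    end = caret + 1
    while end < len(text) and is_combining(text[end]):
        end += 1
    return text[:caret] + text[end:], caret


# beh + shadda + damma: one visible char made of three codepoints
word = "\u0628\u0651\u064F"
after_backspace, _ = backspace(word, len(word))  # only the damma is removed
after_delete, _ = delete(word, 0)                # base char and both marks are removed
```

Backspace peels off marks one at a time, so a mistyped damma can be corrected without retyping the letter, while Delete never leaves an orphaned mark behind.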




I tested the behaviour of Libre Office Writer (Ubuntu):

- Damma or shadda or similar will combine with the affected character 
and form one character as far as the cursor movement is concerned. 
Backspace after the combined character will remove the damma. Delete 
before the character will remove both.

See above


- Lam Alef produced by pressing a single keyboard key (ﻻ) acts as a 
single character in all aspects.


- Lam followed by Alef will combine visually into one character (لا) 
but acts as 2 characters in all aspects.


I cannot test this (I don't know how to write Arabic, except for hitting 
random keys on an Arabic layout). If the single keystroke produces a 
single UTF-8 codepoint, then yes.


SynEdit works on UTF-8 codepoints, except where it finds combining 
codepoints that are in SynEdit's own list of combining codepoints.


--
In addition, it asks Windows whether any sequence of codepoints is drawn 
as a single glyph. This is used to calculate the position of the caret in 
relation to each glyph, as well as the length of the line, and where to 
apply highlights.
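
A rough illustration of how such glyph information could be turned into caret positions, in Python for brevity. The cluster list is hard-coded here; in the real editor it would come from the OS. The `(codepoint_count, pixel_width)` layout and the function name are assumptions made for this sketch only.

```python
def caret_stops(clusters):
    """Return (codepoint_index, x_pixel) for every position the caret may take.

    clusters: list of (codepoint_count, pixel_width), one entry per drawn glyph.
    A glyph made of several codepoints yields no stop in its middle.
    """
    stops = [(0, 0)]
    index = 0
    x = 0
    for count, width in clusters:
        index += count
        x += width
        stops.append((index, x))
    return stops


# Four codepoints, where the 2nd and 3rd are drawn as one 8-pixel glyph.
stops = caret_stops([(1, 8), (2, 8), (1, 8)])
line_width = stops[-1][1]
```

The last stop gives the painted length of the line, which is the value the editor and the OS have to agree on.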



--
___
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-14 Thread patspiper

On 14/12/12 21:33, Martin wrote:

On 13/12/2012 14:39, Martin wrote:


Ok, bad news, I did some more testing myself, and it turns out that 
Windows allocates the extra space (long connecting line) just 
anywhere in the word, or at the end of it, but not always where the 
ligature is.



.

So what's next.
The only way to support this, is to make SynEdit aware of the 
ligatures. And that is a lot of work, so it will take more time. And 
I don't yet know when I will schedule it


Ok, I found a quick way to get a use-able behaviour.

:)


*  it is currently WINDOWS ONLY

:(
* It will NOT be enabled by default in the IDE (but if anyone needs 
it, you can add it for your own projects, or even the IDE)


- define WithSynExperimentalCharWidth  and it should work (need to 
recompile SynEdit package)

- define SynSystemWidthChars for log messages (should there be problems)

- Adds a small slowdown, but hardly noticeable
- not limited to Arabic, should do all languages, if windows does
- according to my tests it works for lines up to 32001 chars. After 
that the OS does not handle the line (SynEdit would need to split it)
  In this case SynEdit behaves (for that line) as if the define was 
not present.

- ligatures are handled as follows.
   There is  no middle caret
   Depending on the caret being before or after (and accordingly 
backspace or delete being used) the first or 2nd char is deleted
  # So it is 2 chars, but any caret move will just be translated into 
skipping the middle pos


When you mention ligatures, do you mean any 2 connected characters (ex: 
???), or characters that combine such as Lam Alef (لا)?


I tested the behaviour of Libre Office Writer (Ubuntu):

- Damma or shadda or similar will combine with the affected character 
and form one character as far as the cursor movement is concerned. 
Backspace after the combined character will remove the damma. Delete 
before the character will remove both.


- Lam Alef produced by pressing a single keyboard key (ﻻ) acts as a 
single character in all aspects.


- Lam followed by Alef will combine visually into one character (لا) but 
acts as 2 characters in all aspects.


Stephano


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-14 Thread Martin

On 13/12/2012 14:39, Martin wrote:


Ok, bad news, I did some more testing myself, and it turns out that 
Windows allocates the extra space (long connecting line) just anywhere 
in the word, or at the end of it, but not always where the ligature is.



.

So what's next.
The only way to support this, is to make SynEdit aware of the 
ligatures. And that is a lot of work, so it will take more time. And I 
don't yet know when I will schedule it


Ok, I found a quick way to get a use-able behaviour.

*  it is currently WINDOWS ONLY
* It will NOT be enabled by default in the IDE (but if anyone needs it, 
you can add it for your own projects, or even the IDE)


- define WithSynExperimentalCharWidth  and it should work (need to 
recompile SynEdit package)

- define SynSystemWidthChars for log messages (should there be problems)

- Adds a small slowdown, but hardly noticeable
- not limited to Arabic, should do all languages, if windows does
- according to my tests it works for lines up to 32001 chars. After that 
the OS does not handle the line (SynEdit would need to split it)
  In this case SynEdit behaves (for that line) as if the define was not 
present.

- ligatures are handled as follows:
   There is no middle caret position.
   If the caret is before the ligature, Delete removes the 1st char; if 
it is after the ligature, Backspace removes the 2nd char.
   So it is still 2 chars, but any caret move is translated into 
skipping the middle pos.
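
The caret rule for ligatures can be sketched like this (Python for brevity, hypothetical names). As a simplifying assumption the sketch recognises only Lam followed by an Alef form, whereas the editor asks Windows which sequences are drawn as one glyph.

```python
LAM = "\u0644"
# Plain alef plus the madda / hamza variants that also ligate with lam.
ALEF_FORMS = "\u0622\u0623\u0625\u0627"


def is_ligature_middle(text, pos):
    """True if pos lies between a lam and the alef it ligates with."""
    return 0 < pos < len(text) and text[pos - 1] == LAM and text[pos] in ALEF_FORMS


def move_caret(text, pos, step):
    """Move the caret by step (+1 or -1) in logical order, skipping ligature middles."""
    new_pos = max(0, min(len(text), pos + step))
    if is_ligature_middle(text, new_pos):
        new_pos += step
    return new_pos


def delete_at(text, pos):
    """Delete key: removes the codepoint after the caret (1st char of a ligature)."""
    return text[:pos] + text[pos + 1:]


def backspace_at(text, pos):
    """Backspace key: removes the codepoint before the caret (2nd char of a ligature)."""
    if pos == 0:
        return text
    return text[:pos - 1] + text[pos:]


# beh, lam, alef, meem: the lam+alef pair is drawn as one glyph
word = "\u0628" + LAM + "\u0627" + "\u0645"
```

The text still holds two codepoints for the ligature; only the caret stop between them is suppressed, so Delete and Backspace each remove one half.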






Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-13 Thread Martin

  
  
On 08/12/2012 18:02, Martin wrote:

On 08/12/2012 16:27, patspiper wrote:

On 07/12/12 16:58, Martin wrote:

Ok, but can you test them on windows, with "Extra Char Spacing" = 1

I use Ubuntu, and tested only monospaced fonts (source editor).
I understand extra char spacing is for Windows only.

On GTK, it will break either way.

The whole thing can currently only be tested on windows.
  


Ok, bad news, I did some more testing myself, and it turns out that
Windows allocates the extra space (long connecting line) just
anywhere in the word, or at the end of it, but not always where the
ligature is.



The problem is that SynEdit expects the chars (3+1 ligate = 8 chars)
*evenly* distributed (every n pixels). So if the caret is placed at
the beginning of the line, SynEdit expects a char there, and then n
pixels further the next...
If you edit (insert, delete) at those points, SynEdit will act as if
the char it expected had been there. And that is (as far as I can
see) not really usable.
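
To make the mismatch concrete, here is a small sketch (Python for brevity) comparing the evenly spaced grid the editor assumes with the positions the OS actually paints. The advance widths are invented numbers for illustration, not measurements, and the function names are hypothetical.

```python
def grid_positions(n_chars, cell_width):
    """Caret x positions the editor assumes: one char every cell_width pixels."""
    return [i * cell_width for i in range(n_chars + 1)]


def painted_positions(advances):
    """Caret x positions implied by the per-char advance widths the OS used."""
    xs = [0]
    for advance in advances:
        xs.append(xs[-1] + advance)
    return xs


def mismatches(advances, cell_width):
    """Indices of caret positions where the assumed grid and the painting disagree."""
    grid = grid_positions(len(advances), cell_width)
    painted = painted_positions(advances)
    return [i for i, (g, p) in enumerate(zip(grid, painted)) if g != p]


# Four chars in 8-pixel cells. The OS puts the extra space on the first char
# instead of where the ligature is; the total width still matches the grid.
bad = mismatches([16, 8, 8, 0], 8)
```

Even though both layouts end at the same x, every interior caret position is off, so an edit lands on a different char than the one shown under the caret.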

Anyway on windows this is now the current behaviour. (*NO* "Extra
char spacing" needed any more).  
(Note: instead of the long line, you may get empty spaces, at word
borders (including letter to digit changes))

On GTK, QT, and carbon, the widgetset drawing support does not deal
with that, so results are even worse. 
If windows had worked better, the plan was to try and fix them
(Though I am not sure about doing QT or carbon myself). Now I am not
sure, how much sense this makes...

As for other RTL behaviour (non-drawing related):
- I fixed backspace for combining
- If you find any combining chars that do not work, maybe they got
missed (SynEdit has its own list). Let me know, I will check, and
add them (adding is easy, finding takes the time).
  This is Arabic ONLY, other combining chars have not yet been added
(planned, no need to report, I do know already)
- As for the treatment of weak chars (e.g. digits being part of the
RTL or LTR run): this is still beta. It has flaws, I know, no need to
report.
- Tabs in RTL are probably not handled very smartly. If you know a
good algorithm... However, this will be very low priority.
- Column mode selection is *not* implemented at all (normal
selection will work)
- The caret, if between an RTL and an LTR run, is always placed next to
the LTR char. I know there are many smarter ways to do this, but low
priority.
- Anything else that needs improvement or fixing: let me know.


So what's next. 
The only way to support this, is to make SynEdit aware of the
ligatures. And that is a lot of work, so it will take more time. And
I don't yet know when I will schedule it


  



Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread Martin

On 08/12/2012 16:27, patspiper wrote:

On 07/12/12 16:58, Martin wrote:

Ok, but can you test them on windows, with "Extra Char Spacing" = 1
I use Ubuntu, and tested only monospaced fonts (source editor). I 
understand extra char spacing is for Windows only.

On GTK, it will break either way.

The whole thing can currently only be tested on windows.




Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread patspiper

On 07/12/12 16:58, Martin wrote:
Did you use "Extra Char Spacing" = 1 ? This is what happens, if not! 
(This and a few other real oddities)


And also, it can only be tested on windows, because on GTK, QT, Carbon 
"Extra Char Spacing" is faulty in another way: it splits the 
combining chars into individuals, but since SynEdit does not know.


The problem is that, by current design, SynEdit has to calculate the 
pixel pos of each char on its own. If it does not calculate the same 
as the OS did when painting (SynEdit gives the OS tokens, fragments of 
the line or the whole line), then obviously things will be odd afterwards.


- Long connecting lines are not what I would like, but this is a 
monospaced font after all.

Ok, but can you test them on windows, with "Extra Char Spacing" = 1
I use Ubuntu, and tested only monospaced fonts (source editor). I 
understand extra char spacing is for Windows only.


To test proportional fonts, I put a TSynEdit on a form, set the font to 
Arial, but the characters were all detached. Did I miss something here?




See that the caret pos is treated correctly, that backspace, delete and 
insert work on them (on the correct char), and that copying a selection 
will copy the highlighted part (except column mode selection, which is 
not done yet)
Column selection is fantastic for coding, but is not a prime feature for 
Arabic at this stage.


About editing (backspace, delete, insert):
- combining chars: see below.
- ligatures: caret and selection-wise, the ligature and the 
long-connecting-line are both treated as one char. One is the 1st, 
the other the 2nd char of the ligature (in the order they occur in 
text). The behaviour for editing should reflect this. Does it?




- The 456 should have come to the left of the Arabic words.


Ok, that could be fixed. It depends on treating digits as weak or strong 
LTR. (Actually, in this case, it depends on treating the line end as such.)


If the 456 were embedded in the middle of Arabic text, it would have 
worked. But they border the EOL, and SynEdit treats the EOL as strong LTR 
(and the bordering weak 456 follows). This gives better results for 
Pascal, where Arabic occurs in strings: "a:='arab';" The '; at the end 
will and should be LTR due to bordering the EOL.
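
The rule described here can be sketched as follows (Python for brevity). This is a deliberate simplification and not the Unicode Bidirectional Algorithm: a run of weak characters becomes RTL only when the strong characters on both sides are RTL, and line start and EOL count as strong LTR. The function names are hypothetical.

```python
import unicodedata


def strength(ch):
    """Classify one character as strong RTL, strong LTR, or weak."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in ("R", "AL"):
        return "RTL"
    if bidi == "L":
        return "LTR"
    return "weak"  # digits, spaces, punctuation, ...


def resolve(line):
    """Give every character a direction; line start and EOL act as strong LTR."""
    dirs = [strength(ch) for ch in line]
    i = 0
    while i < len(dirs):
        if dirs[i] != "weak":
            i += 1
            continue
        j = i
        while j < len(dirs) and dirs[j] == "weak":
            j += 1
        left = dirs[i - 1] if i > 0 else "LTR"
        right = dirs[j] if j < len(dirs) else "LTR"
        fill = "RTL" if left == "RTL" and right == "RTL" else "LTR"
        dirs[i:j] = [fill] * (j - i)
        i = j
    return dirs


arab = "\u0639\u0631\u0628"            # three Arabic letters
at_eol = resolve(arab + " 456")        # digits border the EOL, so they become LTR
embedded = resolve(arab + " 456 " + arab)  # digits between Arabic stay in the RTL run
```

With this rule the trailing `';` of a Pascal statement stays LTR, at the cost of digits at the end of a purely Arabic line ending up on the wrong side.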

OK


This will be fixed eventually, when weak handling is made 
highlighter-dependent.


- If you put a shaddah or damma on a character, it gets displayed on 
top of the character (correct behaviour). Pressing backspace at this 
stage should only delete that addition, and not the character.

Ok, Also simple to fix. Not a painting issue so.

Those are combining codepoints. So backspace must act on codepoints.

The editor understands the diff between "Char" and "codepoint". It is 
a question of assigning the right choice to each action (and that is a 
question of writing testcases too)



--
About "Long connecting lines are not what I would like."...
Longer connecting lines for monospaced fonts is acceptable especially 
that the characters have a fixed width. But it is less tolerable with 
proportional fonts.


I understand. And it would not be the final solution. But if all else 
works (as described above) then this is a solution, that I believe, I 
can reach without too much extra work from where I am now (Will still 
be next year...).


And then we would have something at least usable.

The rest will be on my todo list, and has to await its time, between 
other features and the debugger.

Excellent work!

Stephano



Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread Martin

On 08/12/2012 15:25, Zaher Dirkey wrote:

I see there is a unit, SynEditTextBidiChars, and you are using it.

So what is USE_UTF8BIDI_LCL for? And these units, FreeBidi and utf8bidi?


Those 2 units seem way incomplete.

SynEdit currently needs 2 things:

1) combining chars (that may move to LazUtils... Let's see)

2) strong and weak RTL/LTR detection.
That is definitely not complete in the above 2 units.
And SynEdit needs to integrate the highlighter in that.

',+; are all weak

but in Pascal:
a:='arab1'+'arab2+''more arab';

the '+' must be LTR
the +'' must be RTL (all in string)
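
One way the highlighter could feed into this is to mark which characters lie inside a Pascal string literal. A weak char outside a literal (such as the `+` between two strings) would then be forced LTR, while one inside may join a neighbouring RTL run. A sketch in Python for brevity; the function name is hypothetical and only single-quoted Pascal literals with doubled-quote escapes are handled.

```python
def inside_string_mask(line):
    """Return one bool per character: True if it is inside a Pascal string literal.

    The delimiting quotes count as outside; a doubled quote inside a literal
    is an escaped quote and counts as inside.
    """
    mask = []
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if not in_string:
            mask.append(False)
            if ch == "'":
                in_string = True  # opening delimiter
            i += 1
        elif ch == "'" and i + 1 < len(line) and line[i + 1] == "'":
            mask.extend([True, True])  # escaped quote, still inside the literal
            i += 2
        elif ch == "'":
            mask.append(False)  # closing delimiter
            in_string = False
            i += 1
        else:
            mask.append(True)
            i += 1
    return mask


# Same shape as the example above, with x, y, z standing in for Arabic text.
line = "a:='x'+'y+''z';"
mask = inside_string_mask(line)
```

Here the first `+` (index 6) is code and stays LTR, while the second `+` and the doubled quote (indices 9 to 11) belong to the literal.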



Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread Zaher Dirkey
I see there is a unit, SynEditTextBidiChars, and you are using it.

So what is USE_UTF8BIDI_LCL for? And these units, FreeBidi and utf8bidi?




-- 
I am using last revision of Lazarus, FPC 2.6 on Windows XP SP3

Best Regards
Zaher Dirkey


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread Zaher Dirkey
On Sat, Dec 8, 2012 at 4:49 PM, Martin  wrote:

>
> 1) Acceptable?
>
>
Yes, and it is wonderful :)

> 2) Behaves as described: the editor treats the long-line as the 2nd char
> in the ligature (if you delete it, it will delete the correct half of the
> ligature)
> (At least windows, with extra-char-spacing=1)


I tested it and it works fine so far.

Best Regards
Zaher Dirkey


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread Martin

On 08/12/2012 13:35, Zaher Dirkey wrote:

Hi,
Good feature for me, but my question (Off Topic), why you interested 
in this feature while there is no many Arabic/RTL Lazarus users?


For me, I will try to test it, and i like to look at the code too.



Someone did ask for it... No personal interest


(No, I don't automatically do all I am asked for.)


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread Martin

On 08/12/2012 14:06, Zaher Dirkey wrote:

From the first trying, Wow it works :D, but i need more tests.


Main question at current are the ligatures, with the long line.

1) Acceptable?

2) Behaves as described: the editor treats the long-line as the 2nd 
char in the ligature (if you delete it, it will delete the correct half 
of the ligature)

(At least windows, with extra-char-spacing=1)







Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread Zaher Dirkey
From the first try, wow, it works :D, but I need more tests.




-- 
I am using last revision of Lazarus, FPC 2.6 on Windows XP SP3

Best Regards
Zaher Dirkey


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-08 Thread Zaher Dirkey
Hi,
Good feature for me, but my question (off topic): why are you interested in
this feature when there are not many Arabic/RTL Lazarus users?

For me, I will try to test it, and I would like to look at the code too.


On Fri, Dec 7, 2012 at 12:53 AM, Martin wrote:

> A while ago, I started adding support for mixed LTR/RTL text in SynEdit.
>
> The actual display of RTL text now works (that is, if you have some Arabic
> chars in the text, they display RTL, and the caret moves accordingly /
> caret between RTL and LTR always means caret at LTR).
> UTF-8 LTR/RTL markers are not supported. This is absolute basics only.
>
> Unfortunately with RTL came other Unicode features that so far no one had
> missed. Those are at the very least:
> - combining codepoints
> - ligatures
> - maybe reordering of codepoints.
> - other?
> They are tasks of different extent. And I need to find out what is
> mandatory, and what optional. So I can then decide what fits into my
> schedule.
>
> The current state is:
> - combining: Only Arabic has been done (but they should be complete). So
> non-Arabic RTL will not work.
> - ligatures: see below
> - reordering: not researched, hopefully optional.
>
> "work"
> means that the text is stable (except ligatures, only with workaround),
> and does not expand/shrink when selecting text or moving the caret. Also
> that the caret will be at the correct pos. A newly inserted char will be
> where the caret was. Can be tested by hitting the "end" key, and seeing if
> the caret is at the end of the visual text. If SynEdit thinks the text is
> shorter/longer than the actual painted display, then there is an issue.
>
> ligatures:
> The editor does not handle ligatures yet. So it calculates 2 screen cells,
> when only one is needed. However a stable "workaround" exists (currently
> depends on config)
>
> On windows and windows only (others will be done, if that turns out to be
> any good). In Options / Editor / Display / set "Extra CHAR spacing" to 1
> This will slightly widen the script, ignore that, it's temporary.
> Requires a proper monospaced font (DejaVu Sans Mono).
>
> What it will do: It will tell windows, that the ligature is expected to
> cover 2 display cells.
> Display: Arabic text is a script, glyphs are connected by a continuous
> line. The ligature will be in one cell, the next cell will be empty, except
> for the connecting line.
> Editing: The caret can be at either cell. Each cell stands for one of the
> 2 chars in the ligature. So the 2nd char can be edited, if the caret is at
> the empty cell
>
> --
> I need feedback from people who actually speak (or at least read and
> write) Arabic. I need to know, if the above situation is "useable".
>
> If so, then:
> - it can be fixed to work without the extra char spacing
> - on gtk, carbon, qt (well at least I hope)
> - combining can be added for other languages.
>
> If not, well I don't know yet.
>
>
Best Regards
Zaher Dirkey


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-07 Thread Martin

On 07/12/2012 10:32, Mattias Gaertner wrote:

I know how it should be shown ideally.

Then you are ahead of most scientists. ;)



Well ok, that was not 100% true.

The utf8 docs are a lot of text, and no I did not read most of it.

But I looked at other apps (LibreOffice, Notepad++, and what windows 
does, if SynEdit does not give extra instruction)


Also I mainly referred to the handling of ligatures (like the one 
discussed), which ideally should occupy one cell in the text.




Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-07 Thread Martin

On 07/12/2012 06:46, patspiper wrote:

On 07/12/12 00:53, Martin wrote:
A while ago, I started adding support for mixed LTR/RTL  text in 
SynEdit.


The actual display of RTL text now works (that is, if you have some 
arabic chars in the text, they display RTL, and the caret moves 
accordingly / caret between RTL and LTR always means caret at LTR).

uf8 LTR/RTL markers are not supported. This is absolute basics only.
This is ok in cases like the IDE editor where the document is mainly 
English. I suppose it will be somehow odd for documents with mainly 
RTL languages. Formatting (like indentation, bullets), where 
implemented, will suffer.


Unfortunately with RTL came other unicode features, that so far no one 
had missed. Those are at the very least

- combining codepoints
- ligatures
- maybe reordering of codepoints.
- other?
They are tasks of different extent. And I need to find out what is 
mandatory, and what optional. So I can then decide, what does fit 
into my schedule.


The current state is:
- combining: Only Arabic has been done (but they should be complete). 
So non-Arabic RTL will not work.

- ligatures: see below
- reordering: not researched, hopefully optional.

I am not aware of any need for reordering.

"work"
means, that the text is stable (except ligatures, only with 
workaround), and does not expand/shrink, when selecting text, or 
moving the caret. Also that the caret will be at the correct pos. A 
newly inserted char will be where the caret was. Can be tested by 
hitting the "end" key, and see if the caret is at the end of visual 
text. If SynEdit thinks the text is shorter/longer than the actual 
painted display, then there is an issue.


ligatures:
The editor does not handle ligatures yet. So it calculates 2 screen 
cells, when only one is needed. However a stable "workaround" exists 
(currently depends on config)


On windows and windows only (others will be done, if that turns out 
to be any good). In Options / Editor / Display / set "Extra CHAR 
spacing" to 1

This will slightly widen the script, ignore that, it's temporary.
Requires a proper monospaced font (DejaVu Sans Mono).

What it will do: It will tell windows, that the ligature is expected 
to cover 2 display cells.
Display: Arabic text is a script, glyphs are connected by a 
continuous line. The ligature will be in one cell, the next cell will 
be empty, except for the connecting line.
Editing: The caret can be at either cell. Each cell stands for one of 
the 2 chars in the ligature. So the 2nd char can be edited, if the 
caret is at the empty cell


--
I need feedback from people who actually speak (or at least read and 
write) Arabic. I need to know, if the above situation is "useable".


If so, then:
- it can be fixed to work without the extra char spacing
- on gtk, carbon, qt (well at least I hope)
- combining can be added for other languages.

If not, well I don't know yet.

I have tested on Linux/gtk2 (ubuntu 11.04), and courier new only:
- The attached snapshot (lines 29 and 30) shows an extra space before 
the 456.
Did you use "Extra Char Spacing" = 1 ? This is what happens, if not! 
(This and a few other real oddities)


And also, it can only be tested on windows. Because on GTK, QT, Carbon 
"Extra Char Spacing" is faulty in another way: It splits the combining 
chars into individual chars, and SynEdit does not know about that.


The problem is, that by current design, SynEdit has to calculate the 
pixel pos of each char on its own. If it does not calculate the same as 
the OS did when painting (SynEdit gives the OS tokens, fragments of the 
line or the whole line) then obviously things will be odd afterwards.


- Long connecting lines are not what I would like, but this is a 
monospaced font afterall.

Ok, but can you test them on windows, with "Extra Char Spacing" = 1

See that the caret pos is treated correctly, that backspace, delete and 
insert work (on the correct char) on them, and that copying a selection 
will copy the highlighted part (except column mode selection, which is 
not done yet).


About editing (backspace, delete and insert):
- combining chars: see below.
- ligatures: Caret and selection-wise the ligature and the 
long-connecting-line are each treated as one char. One is the 1st, the 
other the 2nd char of the ligature (in the order they occur in text). 
The behaviour for editing should reflect this. Does it?




- The 456 should have come to the left of the Arabic words.


Ok, that could be fixed. It depends on treating digits as weak or strong 
LTR. (Actually, in this case, it depends on treating the line end as such.)


If the 456 were embedded in the middle of Arabic text, it would have 
worked. But they border the EOL, and SynEdit treats the EOL as strong LTR 
(and the bordering weak 456 follows). This gives a better result for 
Pascal, where Arabic occurs in strings: "a:='arab';" The '; at the end 
will and should be LTR due to bordering the EOL.
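
To illustrate the weak/strong rule described above, here is a minimal Python sketch. It is not SynEdit code and not the full Unicode bidi algorithm. The function names are made up for illustration, and two simplifications are assumptions of this sketch: a run of weak chars becomes RTL only if both neighbours are strong RTL, and both line boundaries count as strong LTR.

```python
import unicodedata

# Bidi classes treated as strong in this sketch; everything else
# (digits, spaces, punctuation) is treated as weak.
STRONG_RTL = {"R", "AL"}
STRONG_LTR = {"L"}


def strength(ch):
    """Classify one char as 'RTL', 'LTR' or 'weak'."""
    cls = unicodedata.bidirectional(ch)
    if cls in STRONG_RTL:
        return "RTL"
    if cls in STRONG_LTR:
        return "LTR"
    return "weak"


def resolve(line):
    """Return one direction ('RTL' or 'LTR') per char of the line.

    Assumption: a weak run is RTL only when enclosed by strong RTL on
    both sides; the line start and the EOL act as strong LTR.
    """
    kinds = [strength(ch) for ch in line]
    result = []
    i, n = 0, len(kinds)
    while i < n:
        if kinds[i] != "weak":
            result.append(kinds[i])
            i += 1
            continue
        j = i
        while j < n and kinds[j] == "weak":
            j += 1
        left = kinds[i - 1] if i > 0 else "LTR"   # line start
        right = kinds[j] if j < n else "LTR"      # EOL
        direction = "RTL" if left == right == "RTL" else "LTR"
        result.extend([direction] * (j - i))
        i = j
    return result


arab = "\u0639\u0631\u0628"
print(resolve(arab + " 456"))          # digits border the EOL
print(resolve(arab + " 456 " + arab))  # digits embedded in Arabic
```

With this rule the trailing " 456" comes out LTR, while the embedded " 456 " comes out RTL, which matches the two cases discussed above.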


This will be fixed eventually, when weak handling is made highlighter 
de

Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-07 Thread Mattias Gaertner
On Fri, 07 Dec 2012 03:25:14 +
Martin  wrote:

> On 07/12/2012 02:38, Mattias Gaertner wrote:
> > On Thu, 06 Dec 2012 22:53:12 +
> > Martin  wrote:
> >> [...]
> >> I need feedback from people who actually speak (or at least read and
> >> write) Arabic. I need to know, if the above situation is "useable".
> > Attached is a small example text and a screenshot showing, how Firefox
> > with font "serif" renders it, how synedit with "Courier New" renders it
> > and how synedit with "monospace" renders it.
> >
> > Firefox and the "monospace" synedit show only one
> > ligature (لإ), combined of the two codepoints alif إ and lam ل.
> > Firefox allows to select the two independently.
> >
> > The "courier new" font shows another ligature:
> > The يجا are three letters in "monospace", while with "courier new" it
> > shows the two rightmost combined as one.
> >
> 
> Was the synedit picture taken with extra char spacing? Because without 
> that, you get the wrong behaviour (select line, char by char)

It was taken under Linux/gtk2 with ExtraCharSpace=0. Setting
ExtraCharSpace to 1 shows the characters separated.

 
> I know how it should be shown ideally.

Then you are ahead of most scientists. ;)

 
> The question is how far away from ideal would still be useful? And that 
> I can not tell for a script, that I do not use at all.
> Someone once said, some editors display the ligatures, as 2 chars. And 
> that would be accepted by many people. But I can not judge that.

The ligatures are not important at the beginning. Later at least
the alif+lam ligature is needed.
It is crucial that the connected characters are shown connected.


> The proper ligature handling is a lot more work. And it depends on the 
> font (or it may at least). On windows there is an API, that would allow 
> to get that info. However if not carefully done, it may have an effect on 
> the speed of synedit (quite possibly noticeable).
> On others, I wouldn't even know where to look for such an API. But 
> little point, until SynEdit is ready to use an external API in this 
> place. (That is it does not call it excessively)

Maybe it can be done by a plugin?
Then the LCL+synedit do not need to implement it fully, but users can
implement the subset they need.

 
> As for placing the caret in the middle of a char, that would require 90% 
> of the work to render proportional fonts (on the list, but not now)

IMO the placing in the middle is somewhat strange. Perhaps showing
a special caret or icon is doable. That has low priority.


> Below is with extra char spacing, and you can see a stretched bit of 
> horizontal line

I see a long connection between the second and third from the left
(the alif and the jim).
And I see the alif-lam ligature.
See the attachment.

 
> Without the extra char spacing, there is a gap, between the RTL and the 
> LTR spaces at the end (use visible spaces, or type real latin text at the 
> end).
> Delete half the ligature, and it will go away, causing the LTR text to 
> move, even though the RTL has still the same length (ligature replaced by 
> remaining normal char)
> 
> So at the moment for synEdit the ligature is 2 chars; they must take two 
> cells on screen.
> 
> The point is: It might be some time until I can add proper ligature 
> handling.
> But I can try to get the current behaviour without the extra char 
> spacing (if that current behaviour is of any use).
> It might be possible to force the splitting of the ligature, I do not 
> know that for sure. Also GTK2 will probably split it, others I do not 
> know. (GTK2 will have to paint every char on its own.)


Mattias


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-07 Thread Reinier Olislagers
On 6-12-2012 23:53, Martin wrote:
> A while ago, I started adding support for mixed LTR/RTL  text in SynEdit.

Thanks a lot, Martin.

I took the liberty of posting to the forum in an "Arabic issues" thread:
http://lazarus.freepascal.org/index.php/topic,17903.msg108631.html#msg108631

Thanks,
Reinier




Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-06 Thread patspiper

On 07/12/12 00:53, Martin wrote:

A while ago, I started adding support for mixed LTR/RTL  text in SynEdit.

The actual display of RTL text now works (that is, if you have some 
arabic chars in the text, they display RTL, and the caret moves 
accordingly / caret between RTL and LTR always means caret at LTR).

utf8 LTR/RTL markers are not supported. This is absolute basics only.
This is ok in cases like the IDE editor where the document is mainly 
English. I suppose it will be somehow odd for documents with mainly RTL 
languages. Formatting (like indentation, bullets), where implemented, 
will suffer.


Unfortunately with RTL came other unicode features, that so far no one 
had missed. Those are at the very least

- combining codepoints
- ligatures
- maybe reordering of codepoints.
- other?
They are tasks of different extent. And I need to find out what is 
mandatory, and what optional. So I can then decide, what does fit into 
my schedule.


The current state is:
- combining: Only Arabic has been done (but they should be complete). 
So non-Arabic RTL will not work.

- ligatures: see below
- reordering: not researched, hopefully optional.

I am not aware of any need for reordering.

"work"
means, that the text is stable (except ligatures, only with 
workaround), and does not expand/shrink, when selecting text, or 
moving the caret. Also that the caret will be at the correct pos. A 
newly inserted char will be where the caret was. Can be tested by 
hitting the "end" key, and see if the caret is at the end of visual 
text. If SynEdit thinks the text is shorter/longer than the actual 
painted display, then there is an issue.


ligatures:
The editor does not handle ligatures yet. So it calculates 2 screen 
cells, when only one is needed. However a stable "workaround" exists 
(currently depends on config)


On windows and windows only (others will be done, if that turns out to 
be any good). In Options / Editor / Display / set "Extra CHAR spacing" 
to 1

This will slightly widen the script, ignore that, it's temporary.
Requires a proper monospaced font (DejaVu Sans Mono).

What it will do: It will tell windows, that the ligature is expected 
to cover 2 display cells.
Display: Arabic text is a script, glyphs are connected by a continuous 
line. The ligature will be in one cell, the next cell will be empty, 
except for the connecting line.
Editing: The caret can be at either cell. Each cell stands for one of 
the 2 chars in the ligature. So the 2nd char can be edited, if the 
caret is at the empty cell


--
I need feedback from people who actually speak (or at least read and 
write) Arabic. I need to know, if the above situation is "useable".


If so, then:
- it can be fixed to work without the extra char spacing
- on gtk, carbon, qt (well at least I hope)
- combining can be added for other languages.

If not, well I don't know yet.

I have tested on Linux/gtk2 (ubuntu 11.04), and courier new only:
- The attached snapshot (lines 29 and 30) shows an extra space before 
the 456.
- Long connecting lines are not what I would like, but this is a 
monospaced font afterall.

- The 456 should have come to the left of the Arabic words.
- If you put a shaddah or damma on a character, it gets displayed on top 
of the character (correct behaviour). Pressing backspace at this stage 
should only delete that addition, and not the character.
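
A minimal Python sketch of the editing behaviour asked for in the last point, together with the Delete behaviour described elsewhere in this thread (Delete removes the char plus all its combining marks). It is only an illustration, not SynEdit code: the function names are made up, the caret is a plain codepoint index, and ligatures are ignored.

```python
import unicodedata


def is_combining(ch):
    """True for combining marks such as shadda (U+0651) or damma (U+064F)."""
    return unicodedata.combining(ch) != 0


def backspace(text, caret):
    """Delete the single codepoint before the caret.

    Right after a mark was typed this removes only the mark, not the
    base char it sits on.
    """
    if caret == 0:
        return text, 0
    return text[:caret - 1] + text[caret:], caret - 1


def delete(text, caret):
    """Delete the char at the caret together with all following marks."""
    if caret >= len(text):
        return text, caret
    end = caret + 1
    while end < len(text) and is_combining(text[end]):
        end += 1
    return text[:caret] + text[end:], caret


# beh + shadda + alef
sample = "\u0628\u0651\u0627"
print(backspace(sample, 2))  # caret just behind the shadda
print(delete(sample, 0))     # caret in front of the beh
```

Backspace with the caret behind the shadda leaves beh + alef, while Delete in front of the beh removes beh and shadda together and leaves only the alef.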


Stephano


Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-06 Thread Martin

  
  
On 07/12/2012 02:38, Mattias Gaertner wrote:

On Thu, 06 Dec 2012 22:53:12 +
Martin  wrote:

[...]
I need feedback from people who actually speak (or at least read and 
write) Arabic. I need to know, if the above situation is "useable".

  
  
Attached is a small example text and a screenshot showing, how Firefox
with font "serif" renders it, how synedit with "Courier New" renders it
and how synedit with "monospace" renders it.

Firefox and the "monospace" synedit shows only one
ligature (لإ), combined of the two codepoints alif إ and lam ل.
Firefox allows to select the two independently.

The "courier new" font shows another ligature: 
The يجا are three letters in "monospace", while with "courier new" it
shows the two rightmost combined as one.




Was the synedit picture taken with extra char spacing? Because
without that, you get the wrong behaviour (select line, char by
char)

I know how it should be shown ideally.

The question is how far away from ideal would still be useful? And
that I can not tell for a script, that I do not use at all.
Someone once said, some editors display the ligatures, as 2 chars.
And that would be accepted by many people. But I can not judge that.

The proper ligature handling is a lot more work. And it depends on
the font (or it may at least). On windows there is an API, that would
allow to get that info. However if not carefully done, it may have an
effect on the speed of synedit (quite possibly noticeable).
On others, I wouldn't even know where to look for such an API. But
little point, until SynEdit is ready to use an external API in this
place. (That is it does not call it excessively)

As for placing the caret in the middle of a char, that would require
90% of the work to render proportional fonts (on the list, but not
now)

Below is with extra char spacing, and you can see a stretched bit of
horizontal line



Without the extra char spacing, there is a gap, between the RTL and
the LTR spaces at the end (use visible spaces, or type real latin text
at the end).
Delete half the ligature, and it will go away, causing the LTR text
to move, even though the RTL has still the same length (ligature
replaced by remaining normal char)

So at the moment for synEdit the ligature is 2 chars; they must
take two cells on screen.

The point is: It might be some time until I can add proper ligature
handling.
But I can try to get the current behaviour without the extra char
spacing (if that current behaviour is of any use).
It might be possible to force the splitting of the ligature, I do
not know that for sure. Also GTK2 will probably split it, others I
do not know. (GTK2 will have to paint every char on its own.)
  



Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-06 Thread Mattias Gaertner
On Thu, 06 Dec 2012 22:53:12 +
Martin  wrote:

> A while ago, I started adding support for mixed LTR/RTL  text in SynEdit.

Great!

 
>[...]
> I need feedback from people who actually speak (or at least read and 
> write) Arabic. I need to know, if the above situation is "useable".

Attached is a small example text and a screenshot showing, how Firefox
with font "serif" renders it, how synedit with "Courier New" renders it
and how synedit with "monospace" renders it.

Firefox and the "monospace" synedit shows only one
ligature (لإ), combined of the two codepoints alif إ and lam ل.
Firefox allows to select the two independently.

The "courier new" font shows another ligature: 
The يجا are three letters in "monospace", while with "courier new" it
shows the two rightmost combined as one.

Mattias
الإيجاب  .



Re: [Lazarus] Arabic beta tester for SynEdit needed

2012-12-06 Thread Martin

On 06/12/2012 22:53, Martin wrote:

A while ago, I started adding support for mixed LTR/RTL  text in SynEdit.


I forgot: trunk 1.1 only



[Lazarus] Arabic beta tester for SynEdit needed

2012-12-06 Thread Martin

A while ago, I started adding support for mixed LTR/RTL  text in SynEdit.

The actual display of RTL text now works (that is, if you have some 
arabic chars in the text, they display RTL, and the caret moves 
accordingly / caret between RTL and LTR always means caret at LTR).

utf8 LTR/RTL markers are not supported. This is absolute basics only.

Unfortunately with RTL came other unicode features, that so far no one 
had missed. Those are at the very least

- combining codepoints
- ligatures
- maybe reordering of codepoints.
- other?
They are tasks of different extent. And I need to find out what is 
mandatory, and what optional. So I can then decide, what does fit into 
my schedule.


The current state is:
- combining: Only Arabic has been done (but they should be complete). So 
non-Arabic RTL will not work.

- ligatures: see below
- reordering: not researched, hopefully optional.

"work"
means, that the text is stable (except ligatures, only with workaround), 
and does not expand/shrink, when selecting text, or moving the caret. 
Also that the caret will be at the correct pos. A newly inserted char 
will be where the caret was. Can be tested by hitting the "end" key, and 
see if the caret is at the end of visual text. If SynEdit thinks the 
text is shorter/longer than the actual painted display, then there is an 
issue.
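
The "end" key test can be written down as a small check. This is only a Python sketch under simplifying assumptions, not how SynEdit computes positions: every base char takes one fixed-width cell, combining marks take none, ligatures are ignored, and the function names and pixel values are made up for illustration.

```python
import unicodedata


def expected_cells(text):
    """Cells the editor model assigns to the text.

    Assumption: one cell per base char, zero cells per combining mark.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


def end_key_is_consistent(text, painted_width_px, cell_width_px):
    """True if the caret at line end would sit at the end of the painted text."""
    return expected_cells(text) * cell_width_px == painted_width_px


# beh + shadda + alef: three codepoints, two cells
line = "\u0628\u0651\u0627"
print(expected_cells(line))
print(end_key_is_consistent(line, painted_width_px=20, cell_width_px=10))
```

If the painted width differs from what the model expects, the caret lands inside or beyond the visible text, which is the issue described above.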


ligatures:
The editor does not handle ligatures yet. So it calculates 2 screen 
cells, when only one is needed. However a stable "workaround" exists 
(currently depends on config)


On windows and windows only (others will be done, if that turns out to 
be any good). In Options / Editor / Display / set "Extra CHAR spacing" to 1

This will slightly widen the script, ignore that, it's temporary.
Requires a proper monospaced font (DejaVu Sans Mono).

What it will do: It will tell windows, that the ligature is expected to 
cover 2 display cells.
Display: Arabic text is a script, glyphs are connected by a continuous 
line. The ligature will be in one cell, the next cell will be empty, 
except for the connecting line.
Editing: The caret can be at either cell. Each cell stands for one of 
the 2 chars in the ligature. So the 2nd char can be edited, if the caret 
is at the empty cell
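
As an illustration of the workaround, the following Python sketch maps each codepoint of a line to one cell and marks the two cells that belong to a lam + alef ligature. It is not SynEdit code. The function name and the role labels are made up, only the lam + alef pairs are considered, and combining marks are ignored. Which of the two cells shows the glyph and which shows the connecting line is deliberately left open.

```python
LAM = "\u0644"
# Alef forms that join with a preceding lam into one ligature glyph.
ALEF_FORMS = {"\u0622", "\u0623", "\u0625", "\u0627"}


def cell_roles(text):
    """Return one (codepoint, role) pair per cell, in text order.

    A lam + alef pair still occupies two cells, one per codepoint,
    so the caret can address each codepoint separately.
    """
    roles = []
    i = 0
    while i < len(text):
        if text[i] == LAM and i + 1 < len(text) and text[i + 1] in ALEF_FORMS:
            roles.append((text[i], "ligature cell 1 of 2"))
            roles.append((text[i + 1], "ligature cell 2 of 2"))
            i += 2
        else:
            roles.append((text[i], "normal"))
            i += 1
    return roles


# alef, lam, alef with hamza below, yeh, jeem, alef, beh
word = "\u0627\u0644\u0625\u064a\u062c\u0627\u0628"
for ch, role in cell_roles(word):
    print("U+%04X" % ord(ch), role)
```

The word has 7 codepoints and is given 7 cells, so the cell count stays the same when half of the ligature is deleted and the remaining char falls back to a normal cell.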


--
I need feedback from people who actually speak (or at least read and 
write) Arabic. I need to know, if the above situation is "useable".


If so, then:
- it can be fixed to work without the extra char spacing
- on gtk, carbon, qt (well at least I hope)
- combining can be added for other languages.

If not, well I don't know yet.


