Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-10 Thread Sietse Brouwer
 But what really puzzled me this time was the following. I tried
 commenting out the U0xfb35 table in char-def.lua to prove that this
 solution works at all. However, to my surprise, it had no effect. Just
 in case, I even purged my Ubuntu PPA packaged version of ConTeXt to
 make sure that the modification I made to the standalone version is
 really used. No effect. I also checked that my TeX file has the correct
 characters, not the already combined ones. No error there either.

After changing one of ConTeXt's source files, you should remake the formats with

context --make cont-en

(or context --make, if you want to make the Dutch/German/etc. interfaces, too.)

Can you try that, and let us know if it works?
Kind regards,
Sietse
___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___


Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-10 Thread Simo Ojala



OK, thanks. After running 'context --make cont-en' it works. So at least
the concept is now proven on my setup.

Now that I have learned how to make modifications to ConTeXt take
effect, I made the following observation. If I comment out lines 485
and 486 of char-utf.lua (these should be the only two lines of the
function characters.filters.utf.enable()), I am able to turn collapsing
on and off with the following command-line calls:

context --directives=filters.utf.collapse=true testcase.tex
context --directives=filters.utf.collapse=false testcase.tex

However, this also turns collapsing off by default. With an unmodified
ConTeXt, I still don't know how to toggle it on and off.

So maybe char-utf.lua is still flawed, or I am still missing something.

Thank you for your help,

Simo




Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-09 Thread Simo Ojala


 Does it work when you add

   \enabledirectives[filters.utf.collapse]

 at the beginning of your document?

 Wolfgang

That did not work for me. However, the discussion suggested that the
directive should be given on the command line.


But what really puzzled me this time was the following. I tried
commenting out the U0xfb35 table in char-def.lua to prove that this
solution works at all. However, to my surprise, it had no effect. Just
in case, I even purged my Ubuntu PPA packaged version of ConTeXt to
make sure that the modification I made to the standalone version is
really used. No effect. I also checked that my TeX file has the correct
characters, not the already combined ones. No error there either.


Unfortunately, the internals of ConTeXt are too complex for my level of
programming skill, so this time I could not think of anything more to try.


Simo

Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-08 Thread Simo Ojala

On 1-10-2012 19:25, Philipp Gesang wrote:


utilities.sequencers.disableaction(resolvers.openers.helpers.textfileactions,"characters.filters.utf.collapse")


Doesn’t work. What helps is to comment out the “appendaction” in
char-utf.lua or the corresponding table for U0xfb35 in
char-def.lua. My guess is that this is the case because the .tex
file is processed *before* you can disable it.


so we need a directive (as they can be given on the command line)

local textfileactions = resolvers.openers.helpers.textfileactions

directives.register("filters.utf.collapse", function(v)
    utilities.sequencers[v and "enableaction" or "disableaction"](textfileactions,"characters.filters.utf.collapse")
end)



Hans


Sorry to still bother you with this. I just could not get it working.
Hopefully it is just that I could not figure out the right command-line
syntax (I tried several different ways). Could somebody tell me how it
should be run?


My guess is something like:

context --directives=filters.utf.collapse=what_should_i_put_here? testcase.tex


Thanks,

Simo

PS: Both ConTeXt setups I tried this on (Ubuntu PPA and standalone)
should have had the updated code, so that should not have been the
problem.



Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-08 Thread Wolfgang Schuster


Does it work when you add

  \enabledirectives[filters.utf.collapse]

at the beginning of your document?

Wolfgang

Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-01 Thread Simo Ojala

On 09/29/2012 02:35 PM, Hans Hagen wrote:

On 29-9-2012 01:41, Simo Ojala wrote:

Hans Hagen pra...@wxs.nl

On 09/28/2012 11:46 AM, Hans Hagen wrote:

On 27-9-2012 21:27, Simo Ojala wrote:

This problem was originally posted on TeX StackExchange. However, since
I have not had any luck finding a solution there, I am posting it here
too. I am confident that somebody here knows the answer.

http://tex.stackexchange.com/questions/73970/problem-with-context-mkiv-hebrew-and-ligatures

Since I last played with ConTeXt MkIV, a new feature seems to have been
introduced: it now automatically combines Hebrew characters into
ligatures where possible. For example, if I have a word with the
following two characters:

U+05D5 (HEBREW LETTER VAV)
U+05BC (HEBREW POINT DAGESH OR MAPIQ)

ConTeXt will combine these into:

U+FB35 (HEBREW LETTER VAV WITH DAGESH)

However, I need to disable this feature for a number of reasons. For
example, it breaks my little database query, because the query key is
changed before(?) the macro gets it.

So I would like to know how to turn this off, and ideally also what has
changed.


It depends on the font ... normally you can disable this by *not* using
the mark and mkmk features

Hans



OK, I have now tried turning off all kinds of features without luck, so
I put together a minimal test case. I suspect that more is needed than
just turning off some font features. However, my ConTeXt skills are very
limited, so I may be wrong.

The goal is that the word passed from the ConTeXt file remains as it is
written and yields the Unicode characters U+5e1, U+5d5, U+5bc and U+5e1.
This is what already happens when the word is in the Lua file.

Simo

PS: In case this matters, my ConTeXt MkIV version is 2012.09.23 12:40.
It should be the latest for Ubuntu 12.04 LTS Precise Pangolin in Adam
Reviczky's PPA.


%% testcase.tex

\definefontfeature[hebrew][arabic][script=hebr]
\definefont[dejavusans][name:dejavusans*hebrew at 26pt]
\setupdirections[bidi=global]

\starttext
\dejavusans

\def\Macro#1{\directlua{
dofile(resolvers.findfile("testcase.lua"))
userdata.testfunction("#1")
}}

\Macro{סוּס}

\blank[1cm]however, we can still color these independently\blank[0.5cm]

\color[red]{ס}\color[green]{ו}\color[blue]{ּ}\color[yellow]{ס}

\stoptext


-- testcase.lua

userdata = userdata or {}

function userdata.testfunction(word)

    tex.sprint("\\blank[1cm]word passed by macro\\blank[0.5cm]")

    -- list each code point of the word as it arrives from TeX
    for i = 1, unicode.utf8.len(word) do
        tex.sprint("U+" .. string.format("%x",unicode.utf8.byte(word,i)) ..
            ": " .. unicode.utf8.sub(word,i,i) .. "\\par ")
    end

    tex.sprint("\\blank[1cm]word written in lua file\\blank[0.5cm]")

    -- the same word as a literal in this file, for comparison
    word = "סוּס"

    for i = 1, unicode.utf8.len(word) do
        tex.sprint("U+" .. string.format("%x",unicode.utf8.byte(word,i)) ..
            ": " .. unicode.utf8.sub(word,i,i) .. "\\par ")
    end
end


I see three characters next to each other so what exactly is the problem?

(BTW, take a look at goodies-002.tex in the test suite ... you can
define colored glyphs as a feature)

Hans



Sorry for being unclear; I will try to clarify. The problem is:

1. I have a TeX file that calls a macro with an argument containing the
characters U+5d5 and U+5bc.
2. The macro passes the argument on to Lua code. By the time it gets
there, the characters have turned into U+fb35.
3. When the Lua code then compares U+fb35 with an XML file that has the
original forms U+5d5 and U+5bc, the comparison of course fails.


So the problem is step 2, which earlier versions did not perform. If
possible, I would like to turn it off somehow. Of course, I could try to
write some workaround code to counteract this substitution (or whatever
it should be called), but that could be complicated and lead to more
problems.
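
For illustration only, here is a minimal sketch of the kind of workaround mentioned above: mapping the composed form back to the original pair before comparing. It is plain Lua and does not use any ConTeXt API. The names `decompositions` and `decompose` are hypothetical, and the table covers only the single character discussed in this thread; other precomposed characters would need their own entries.

```lua
-- Hypothetical helper, not part of ConTeXt: undo the VAV + DAGESH
-- composition so that a key can be compared with decomposed data.
-- Strings are written as UTF-8 byte escapes so the file encoding
-- does not matter.
local decompositions = {
    ["\239\172\181"] = "\215\149\214\188", -- U+FB35 -> U+05D5 U+05BC
}

local function decompose(str)
    for composed, parts in pairs(decompositions) do
        -- the keys contain no Lua pattern magic characters
        str = str:gsub(composed, parts)
    end
    return str
end

-- samekh, U+FB35, samekh: the form that arrives after collapsing
local key    = "\215\161\239\172\181\215\161"
-- samekh, vav, dagesh, samekh: the form stored in the data file
local stored = "\215\161\215\149\214\188\215\161"

print(decompose(key) == stored)
```

This only hides the symptom on the Lua side; it does not stop the input from being collapsed in the first place.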



Simo


PS: I have attached my output from the test case, in case this is a
problem with my setup. Compiled with ConTeXt MkIV 2012.09.25 21:44.


testcase.pdf
Description: Adobe PDF document

Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-01 Thread Philipp Gesang
···date: 2012-10-01, Monday···from: Simo Ojala···

 Sorry for being unclear; I will try to clarify. The problem is:
 
 1. I have a TeX file that calls a macro with an argument containing the
 characters U+5d5 and U+5bc.
 2. The macro passes the argument on to Lua code. By the time it gets
 there, the characters have turned into U+fb35.

Hi,

I don’t have a clue about Hebrew, but isn’t this a correct
normalization[0], not a ligature? If so, the behavior of LuaTeX
is perfectly fine. Lua, on the other hand, treats the string as a
sequence of bytes, which is just how it treats strings everywhere.

[0] http://www.unicode.org/charts/normalization/chart_Hebrew.html
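
To make the byte-level point concrete, here is a small plain-Lua sketch (no ConTeXt API involved; `hexbytes` is a hypothetical helper introduced only for this illustration). It shows that the two spellings are different byte sequences, so a plain string comparison cannot treat them as equal.

```lua
-- The two spellings of "vav with dagesh" as UTF-8 byte escapes.
local decomposed = "\215\149\214\188" -- U+05D5 U+05BC
local composed   = "\239\172\181"     -- U+FB35

-- Hypothetical helper: render a string as space-separated hex bytes.
local function hexbytes(s)
    local t = {}
    for i = 1, #s do
        t[#t + 1] = string.format("%02X", s:byte(i))
    end
    return table.concat(t, " ")
end

print(hexbytes(decomposed))   -- D7 95 D6 BC
print(hexbytes(composed))     -- EF AC B5
print(decomposed == composed) -- false
```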

Regards
Philipp





Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-01 Thread Hans Hagen


In that case you can try

utilities.sequencers.disableaction(resolvers.openers.helpers.textfileactions,"characters.filters.utf.collapse")

If this is needed, I can provide a directive for it.

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-

Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-01 Thread Philipp Gesang
···date: 2012-10-01, Monday···from: Hans Hagen···

 In that case you can try
 
 utilities.sequencers.disableaction(resolvers.openers.helpers.textfileactions,"characters.filters.utf.collapse")

Doesn’t work. What helps is to comment out the “appendaction” in
char-utf.lua or the corresponding table for U0xfb35 in
char-def.lua. My guess is that this is the case because the .tex
file is processed *before* you can disable it.

Philipp


 

Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-01 Thread Hans Hagen

On 1-10-2012 19:25, Philipp Gesang wrote:


utilities.sequencers.disableaction(resolvers.openers.helpers.textfileactions,"characters.filters.utf.collapse")


Doesn’t work. What helps is to comment out the “appendaction” in
char-utf.lua or the corresponding table for U0xfb35 in
char-def.lua. My guess is that this is the case because the .tex
file is processed *before* you can disable it.


So we need a directive (as they can be given on the command line):

local textfileactions = resolvers.openers.helpers.textfileactions

directives.register("filters.utf.collapse", function(v)
    utilities.sequencers[v and "enableaction" or "disableaction"](textfileactions,"characters.filters.utf.collapse")
end)



Hans



Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-01 Thread Philipp Gesang
···date: 2012-10-01, Monday···from: Hans Hagen···

 so we need a directive (as they can be given on the commandline)

Yes, I think so, too. Btw. according to this faq:

  http://www.unicode.org/faq/ligature_digraph.html#Pf1

these are not normalizations but in fact some kind of
second-class ligatures that the unicode people seem to grudgingly
keep around for compatibility reasons (like with precombined
Greek). Is it really wise to have them enabled by default?

Philipp



 

Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-10-01 Thread Hans Hagen


Most of this is controlled by char-def.lua, and arbitrarily disabling
some of it would be confusing, I guess. (By the way, for special
purposes one could add non-joiners in between.)


Hans



Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-09-29 Thread Hans Hagen


I see three characters next to each other so what exactly is the problem?

(BTW, take a look at goodies-002.tex in the test suite ... you can 
define colored glyphs as a feature)


Hans



Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-09-28 Thread Hans Hagen


It depends on the font ... normally you can disable this by *not* using 
the mark and mkmk features


Hans



Re: [NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-09-28 Thread Simo Ojala


OK, I have now tried turning off all kinds of features without luck, so 
I put together a minimal test case. I suspect that something more is 
needed than just turning off some font features. However, my ConTeXt 
skills are very limited, so I may be wrong.


The goal is that the word passed from the ConTeXt file remains as it is 
written and yields the Unicode characters U+05E1, U+05D5, U+05BC and 
U+05E1. This is what already happens when the word is in the Lua file.


Simo

PS: In case it matters, my ConTeXt MkIV version is 2012.09.23 12:40. 
It should be the latest version for Ubuntu 12.04 LTS (Precise Pangolin) 
in Adam Reviczky's PPA.



%% testcase.tex

\definefontfeature[hebrew][arabic][script=hebr]
\definefont[dejavusans][name:dejavusans*hebrew at 26pt]
\setupdirections[bidi=global]

\starttext
\dejavusans

\def\Macro#1{\directlua{
dofile(resolvers.findfile("testcase.lua"))
userdata.testfunction("#1")
}}

\Macro{סוּס}

\blank[1cm]however, we can still color these independently\blank[0.5cm]

\color[red]{ס}\color[green]{ו}\color[blue]{ּ}\color[yellow]{ס}

\stoptext


-- testcase.lua

userdata = userdata or {}

function userdata.testfunction(word)

    tex.sprint("\\blank[1cm]word passed by macro\\blank[0.5cm]")

    -- print each code point of the argument as it arrives from the macro
    for i = 1, unicode.utf8.len(word) do
        tex.sprint("U+" .. string.format("%x",unicode.utf8.byte(word,i)) .. ": "
            .. unicode.utf8.sub(word,i,i) .. "\\par ")
    end

    tex.sprint("\\blank[1cm]word written in lua file\\blank[0.5cm]")

    -- the same word, this time as a literal in this Lua file
    word = "סוּס"

    for i = 1, unicode.utf8.len(word) do
        tex.sprint("U+" .. string.format("%x",unicode.utf8.byte(word,i)) .. ": "
            .. unicode.utf8.sub(word,i,i) .. "\\par ")
    end
end
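
The test case above relies on the `unicode.utf8` library that LuaTeX provided at the time. As a minimal sketch for checking the two spellings outside TeX, assuming a standalone Lua 5.3 or later with the standard `utf8` library, the same code point dump could look like this (`codepoints` is a hypothetical helper name):

```lua
-- Return the code points of a UTF-8 string as a list of "U+XXXX" labels.
local function codepoints(word)
    local result = {}
    for _, cp in utf8.codes(word) do
        result[#result + 1] = string.format("U+%04X", cp)
    end
    return result
end

-- The word as written in the source file (decomposed) ...
local decomposed = utf8.char(0x05E1, 0x05D5, 0x05BC, 0x05E1)
-- ... and as it looks after VAV + DAGESH has been collapsed to U+FB35.
local collapsed  = utf8.char(0x05E1, 0xFB35, 0x05E1)

print(table.concat(codepoints(decomposed), " "))
--> U+05E1 U+05D5 U+05BC U+05E1
print(table.concat(codepoints(collapsed), " "))
--> U+05E1 U+FB35 U+05E1
```

The two strings render alike but differ in length and content, which is why a key comparison between them fails.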


[NTG-context] Problem with ConTeXt (MkIV), Hebrew and ligatures

2012-09-27 Thread Simo Ojala
This problem was originally posted on TeX Stack Exchange. However, since 
I have not had any luck finding a solution, I am posting it here too. I am 
confident that somebody here will know the answer.



http://tex.stackexchange.com/questions/73970/problem-with-context-mkiv-hebrew-and-ligatures

Since I last played with ConTeXt MkIV, a new feature has been 
introduced: it now seems to combine Hebrew characters into ligatures 
automatically whenever possible. For example, if I have a word with the 
following two characters:


U+05D5 (HEBREW LETTER VAV)
U+05BC (HEBREW POINT DAGESH OR MAPIQ)

ConTeXt will combine these to:

U+FB35 (HEBREW LETTER VAV WITH DAGESH)

However, I need to disable this feature for a number of reasons.
For example, it breaks my little database query, because the query key 
is changed before(?) the macro gets it.


So I would be grateful if somebody could tell me how to turn this off, 
and perhaps also what has changed.




Sincerely,

Simo Ojala
smsoj...@gmail.com
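
As a stopgap on the query side, the key could be decomposed again before the lookup. The following is a minimal sketch in plain Lua, assuming Lua 5.3 or later for the `utf8` library. The `decompositions` table, the `decompose` helper and the one-entry `database` are hypothetical, and only the single mapping mentioned in the message above is listed.

```lua
-- Map precomposed presentation forms back to base letter + point.
-- Only the mapping discussed in this thread is included.
local decompositions = {
    [0xFB35] = { 0x05D5, 0x05BC }, -- VAV WITH DAGESH -> VAV + DAGESH
}

-- Rebuild the string code point by code point, expanding known entries.
local function decompose(text)
    local out = {}
    for _, cp in utf8.codes(text) do
        local parts = decompositions[cp]
        if parts then
            for _, part in ipairs(parts) do
                out[#out + 1] = utf8.char(part)
            end
        else
            out[#out + 1] = utf8.char(cp)
        end
    end
    return table.concat(out)
end

-- Hypothetical database keyed by the decomposed spelling.
local database = {
    [utf8.char(0x05E1, 0x05D5, 0x05BC, 0x05E1)] = "match",
}

-- The key as it arrives after the automatic combination.
local key = utf8.char(0x05E1, 0xFB35, 0x05E1)

print(database[key], database[decompose(key)])
```

The raw key misses and the decomposed key hits. This only treats the symptom; it does not stop the input from being combined in the first place.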