[NTG-context] Index sorting for other languages than English (2)

2006-05-30 Thread R. Ermers
Hi all,

I have a document in Dutch (\mainlanguage[nl]) in which I quote Turkish 
items, which I want to collect in a separate index, like this:

Enkele voorbeelden zijn: \quote{oudere zus} \turkish{abla}, 
\quote{jongere broer of zus} \turkish{karde\c{s}}, de \quote{zus van 
vader} (\quote{tante}) \turkish{hala}, \quote{de zus van moeder} 
\turkish{teyze}. Voor aangetrouwde familieleden gelden soms juist vagere 
termen dan in het Nederlands, bijv. \quote{aangetrouwde tante} en 
\quote{schoonzuster}, \turkish{yenge}.

The index, however, is based on Dutch (mainlanguage). This causes two 
problems:

1. Words with accents, like s\"oz, are not sorted according to any standard:
S
söz kesmek 76
saygı 14
şeref 3, 14, 24, 27

2. Letters with diacritics, like \c{s} (under which \c{s}eref is to be 
placed), are not included in the alphabetical listing in the index, which 
of course follows the Dutch alphabet.

Does anyone have a solution?

Regards,

Robert


___
ntg-context mailing list
ntg-context@ntg.nl
http://www.ntg.nl/mailman/listinfo/ntg-context


Re: [NTG-context] Index sorting for other languages than English (2)

2006-05-30 Thread Richard Gabriel




I'd suggest using the extended variant of the \index macro. There you can 
specify an ASCII equivalent of the word, which will be used for sorting:

\index[soz kesmek]{s\"oz kesmek}
\index[seref]{\c seref}

-Richard
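
The principle behind the optional argument can be sketched outside ConTeXt: each entry carries a plain ASCII key that decides its position, while the accented text is what gets printed. The Python below is only an illustration of that principle, not ConTeXt's own code; the names `entries` and `sorted_index` and the sample data are assumptions made for the example.

```python
# Each index entry is a pair (sort key, text to print), mirroring
# \index[soz kesmek]{s\"oz kesmek}. The data below is illustrative only.
entries = [
    ("soz kesmek", "söz kesmek"),
    ("seref", "şeref"),
    ("saygi", "saygı"),
]


def sorted_index(pairs):
    """Order entries by their ASCII key and return the printable texts."""
    return [text for key, text in sorted(pairs, key=lambda pair: pair[0])]


print(sorted_index(entries))
```

Note that an ASCII key only addresses the ordering. An entry keyed as "seref" is still filed among the S entries, so the second problem in the original question (a separate heading for \c{s}) is not covered by this workaround.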


Re: [NTG-context] Index sorting for other languages than English (2)

2006-05-30 Thread Hans Hagen
R. Ermers wrote:
> Hi all,
>
> I have a document in Dutch (\mainlanguage[nl]) in which I quote Turkish 
> items, which I want to collect in a separate index, like this:
>
> Enkele voorbeelden zijn: \quote{oudere zus} \turkish{abla}, 
> \quote{jongere broer of zus} \turkish{karde\c{s}}, de \quote{zus van 
> vader} (\quote{tante}) \turkish{hala}, \quote{de zus van moeder} 
> \turkish{teyze}. Voor aangetrouwde familieleden gelden soms juist vagere 
> termen dan in het Nederlands, bijv. \quote{aangetrouwde tante} en 
> \quote{schoonzuster}, \turkish{yenge}.
>
> The index, however, is based on Dutch (mainlanguage). This causes two 
> problems:
>
> 1. Words with accents, like s\"oz, are not sorted according to any standard:
> S
> söz kesmek 76
> saygı 14
> şeref 3, 14, 24, 27
>
> 2. Letters with diacritics, like \c{s} (under which \c{s}eref is to be 
> placed), are not included in the alphabetical listing in the index, which 
> of course follows the Dutch alphabet.
>
> Does anyone have a solution?

Hm, so we need a mixed sorting mechanism.

(In sort-lan.tex you can define a sort order for Turkish, but it then still 
applies to the whole document.)

Hans


Re: [NTG-context] Index sorting for other languages than English (2)

2006-05-30 Thread Hans Hagen
Richard Gabriel wrote:
> I'd suggest using the extended variant of the \index macro. There 
> you can specify an ASCII equivalent of the word, which will be used 
> for sorting:
>
> \index[soz kesmek]{s\"oz kesmek}
> \index[seref]{\c seref}

Actually, support for multiple indexes with their own sort order was 
prepared at some point but never completed, so I'll have a look at it.

Hans

-- 

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
 tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-



Re: [NTG-context] Index sorting for other languages than English

2006-05-30 Thread Richard Gabriel




Hello Hans,

I'm sorry, but when you were adding my sorting rules for Czech, you 
(probably by accident) deleted the definition of \czsortdivisionch, which 
leads to errors when trying to sort a word under "ch".

I've also made some minor corrections. Here is the updated version:

% ---
\def\czsortdivisionch{ch}
\def\czsortdivisionCh{Ch}

\startmode[sortorder-cz]
  \exportsortexpansion {aacute} {a+1}
  \exportsortexpansion {Aacute} {A+1}
  \exportsortexpansion {ccaron} {c+1}
  \exportsortexpansion {Ccaron} {C+1}
  \exportsortdivision  {c+1}    {ccaron}
  \exportsortexpansion {dcaron} {d+1}
  \exportsortexpansion {Dcaron} {D+1}
  \exportsortdivision  {d+1}    {dcaron}
  \exportsortexpansion {eacute} {e+1}
  \exportsortexpansion {Eacute} {E+1}
  \exportsortexpansion {ecaron} {e+2}
  \exportsortexpansion {Ecaron} {E+2}
  \exportsortreduction {ch}     {h+1}
  \exportsortexpansion {ch}     {h+1}
  \exportsortreduction {Ch}     {H+1}
  \exportsortexpansion {Ch}     {H+1}
  \exportsortdivision  {h+1}    {czsortdivisionch}
  \exportsortexpansion {iacute} {i+1}
  \exportsortexpansion {Iacute} {I+1}
  \exportsortexpansion {ncaron} {n+1}
  \exportsortexpansion {Ncaron} {N+1}
  \exportsortdivision  {n+1}    {ncaron}
  \exportsortexpansion {oacute} {o+1}
  \exportsortexpansion {Oacute} {O+1}
  \exportsortexpansion {rcaron} {r+1}
  \exportsortexpansion {Rcaron} {R+1}
  \exportsortdivision  {r+1}    {rcaron}
  \exportsortexpansion {scaron} {s+1}
  \exportsortexpansion {Scaron} {S+1}
  \exportsortdivision  {s+1}    {scaron}
  \exportsortexpansion {tcaron} {t+1}
  \exportsortexpansion {Tcaron} {T+1}
  \exportsortdivision  {t+1}    {tcaron}
  \exportsortexpansion {uacute} {u+1}
  \exportsortexpansion {Uacute} {U+1}
  \exportsortexpansion {uring}  {u+2}
  \exportsortexpansion {Uring}  {U+2}
  \exportsortexpansion {yacute} {y+1}
  \exportsortexpansion {Yacute} {Y+1}
  \exportsortexpansion {zcaron} {z+1}
  \exportsortexpansion {Zcaron} {Z+1}
  \exportsortdivision  {z+1}    {zcaron}
\stopmode
% ---
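
One way to read rules of the form {ccaron} {c+1} is "replace this letter by its base letter plus an offset", so that the letter sorts directly after the base letter, with "ch" treated as a single letter after "h". The Python sketch below models only that reading. It is not the texutil algorithm, and the names `EXPANSION` and `czech_key` as well as the sample words are assumptions made for illustration.

```python
# Hypothetical model of rules such as
#   \exportsortexpansion {ccaron} {c+1}
#   \exportsortreduction {ch}     {h+1}
# read as: "replace this letter by the pair (base letter, offset)".
EXPANSION = {
    "á": ("a", 1), "č": ("c", 1), "ď": ("d", 1), "é": ("e", 1),
    "ě": ("e", 2), "í": ("i", 1), "ň": ("n", 1), "ó": ("o", 1),
    "ř": ("r", 1), "š": ("s", 1), "ť": ("t", 1), "ú": ("u", 1),
    "ů": ("u", 2), "ý": ("y", 1), "ž": ("z", 1),
}


def czech_key(word):
    """Turn a word into a list of (base, offset) pairs for comparison."""
    key, i, lowered = [], 0, word.lower()
    while i < len(lowered):
        if lowered.startswith("ch", i):
            # The digraph counts as one letter that sorts right after h.
            key.append(("h", 1))
            i += 2
        else:
            ch = lowered[i]
            key.append(EXPANSION.get(ch, (ch, 0)))
            i += 1
    return key


words = ["chata", "hora", "cesta", "čaj", "ihned", "dům"]
print(sorted(words, key=czech_key))
```

With this key, "chata" lands between "hora" and "ihned" instead of among the C entries, which is the behaviour the `ch` rules are meant to produce.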


Re: [NTG-context] Index sorting for other languages than English

2006-05-30 Thread Hans Hagen
Richard Gabriel wrote:
> Hello Hans,
>
> I'm sorry, but when you were adding my sorting rules for Czech, you 
> (probably by accident) deleted the definition of \czsortdivisionch, 
> which leads to errors when trying to sort a word under "ch".
> I've also made some minor corrections. Here is the updated version:

I've (more or less) finished the texutil sorting so that it can also sort 
different registers according to their own language.

I'll post an alpha (generating now) that can do:

\defineregister[one]
\defineregister[two] \setupregister[two][language=cz]

\starttext

test \one{one} test \one{two} test \one {\aacute} test \one{alpha} test 
\one{chow}
test \two{one} test \two{two} test \two {\aacute} test \two{alpha} test 
\two{chow}

\blank[3*big] \placeregister[one]
\blank[3*big] \placeregister[two]

\stoptext
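
What a per-register language setting amounts to can be sketched as follows: the same entries are sorted twice, each time with the key function that belongs to the register's language. This is only an illustrative Python model, not the texutil implementation. The language tag "en" for the register without a language setting, the extra entry "hotel", and all function names are assumptions added for the example.

```python
import unicodedata


def plain_key(word):
    """Assumed default: ignore accents and compare letter by letter."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return [c for c in decomposed if not unicodedata.combining(c)]


def czech_key(word):
    """Czech-like: 'ch' follows 'h'; an accented letter follows its base."""
    key, i = [], 0
    text = unicodedata.normalize("NFC", word.lower())
    while i < len(text):
        if text.startswith("ch", i):
            key.append(("h", 1))
            i += 2
        else:
            base = unicodedata.normalize("NFD", text[i])[0]
            key.append((base, 0 if base == text[i] else 1))
            i += 1
    return key


KEYS = {"en": plain_key, "cz": czech_key}
REGISTERS = {"one": "en", "two": "cz"}   # like \setupregister[two][language=cz]
ENTRIES = ["one", "two", "á", "alpha", "chow", "hotel"]


def sort_register(name):
    """Sort the shared entries with the key of the register's language."""
    return sorted(ENTRIES, key=KEYS[REGISTERS[name]])


for name in REGISTERS:
    print(name, sort_register(name))
```

In this model register "one" keeps "chow" among the C entries, while register "two" moves it behind "hotel".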





Re: [NTG-context] Index sorting for other languages than English

2006-05-26 Thread Hans Hagen
Richard Gabriel wrote:
> Thanks Hans, it works with my test file,
> unless I set up:
>
> \setupregister[index][expansion=xml]
>
> which I need for correct processing of the XML files.
> If I simply add this command to the test TeX file (no XML), the 
> Czech sorting stops working and all accented characters are placed 
> under A.

Can you send a test file?

> Regarding the sorting itself (sort-lan.tex):
> I found the definition of the sorting quite strange or, let's say, 
> incomplete.
> It makes no sense to separate ccaron while all other accented letters 
> are placed under the unaccented ones.
> I'll update the definitions, test them and send them to you.

OK.

Hans



Re: [NTG-context] Index sorting for other languages than English

2006-05-26 Thread Richard Gabriel
Here is the test file. If you remove the \setupregister command, or simply 
set expansion=no, the sorting works perfectly.
With expansion=yes or expansion=xml, the accented letters are sorted 
under "A".

Below are my updated sorting rules again...

-Richard

---
\def\czsortdivisionch{ch}
\def\czsortdivisionCh{Ch}

\startmode[sortorder-cz]
  \exportsortexpansion{aacute}{a+1}
  \exportsortexpansion{Aacute}{A+1}
  \exportsortexpansion{ccaron}{c+1}
  \exportsortexpansion{Ccaron}{C+1}
  \exportsortdivision{c+1}{ccaron}
  \exportsortexpansion{dcaron}{d+1}
  \exportsortexpansion{Dcaron}{D+1}
  \exportsortdivision{d+1}{dcaron}
  \exportsortexpansion{eacute}{e+1}
  \exportsortexpansion{Eacute}{E+1}
  \exportsortexpansion{ecaron}{e+2}
  \exportsortexpansion{Ecaron}{E+2}
  \exportsortreduction{ch}{h+1}
  \exportsortexpansion{ch}{h+1}
  \exportsortreduction{Ch}{H+1}
  \exportsortexpansion{Ch}{H+1}
  \exportsortdivision{h+1}{czsortdivisionch}
  \exportsortexpansion{iacute}{i+1}
  \exportsortexpansion{Iacute}{I+1}
  \exportsortexpansion{ncaron}{n+1}
  \exportsortexpansion{Ncaron}{N+1}
  \exportsortdivision{n+1}{ncaron}
  \exportsortexpansion{oacute}{o+1}
  \exportsortexpansion{Oacute}{O+1}
  \exportsortexpansion{rcaron}{r+1}
  \exportsortexpansion{Rcaron}{R+1}
  \exportsortdivision{r+1}{rcaron}
  \exportsortexpansion{scaron}{s+1}
  \exportsortexpansion{Scaron}{S+1}
  \exportsortdivision{s+1}{scaron}
  \exportsortexpansion{tcaron}{t+1}
  \exportsortexpansion{Tcaron}{T+1}
  \exportsortdivision{t+1}{tcaron}
  \exportsortexpansion{uacute}{u+1}
  \exportsortexpansion{Uacute}{U+1}
  \exportsortexpansion{uring}{u+2}
  \exportsortexpansion{Uring}{U+2}
  \exportsortexpansion{yacute}{y+1}
  \exportsortexpansion{Yacute}{Y+1}
  \exportsortexpansion{zcaron}{z+1}
  \exportsortexpansion{Zcaron}{Z+1}
  \exportsortdivision{z+1}{zcaron}
\stopmode

test.tex
Description: TeX document


Re: [NTG-context] Index sorting for other languages than English

2006-05-24 Thread Richard Gabriel
Thanks Hans, it works with my test file, unless I set up:

\setupregister[index][expansion=xml]

which I need for correct processing of the XML files.
If I simply add this command to the test TeX file (no XML), the Czech 
sorting stops working and all accented characters are placed under "A".

Regarding the sorting itself (sort-lan.tex): I found the definition of the 
sorting quite strange or, let's say, incomplete. It makes no sense to 
separate ccaron while all other accented letters are placed under the 
unaccented ones.
I'll update the definitions, test them and send them to you.

-Richard


[NTG-context] Index sorting for other languages than English

2006-05-23 Thread Richard Gabriel
Hello Hans,

after an upgrade I noticed that the index sorting works even worse than 
before (tested on Czech, Chinese and Japanese, but probably related to 
non-ASCII characters in general).

With TeXExec 5.4.3, all words beginning with national (accented) characters 
were put into a separate ("symbols") group and placed before "A". This was 
not good but more or less acceptable.
With TeXExec 6.2.0, words beginning with accented characters are placed 
under a certain unaccented letter. My colleague found out that these words 
are sorted according to their first unaccented letter. This is unacceptable 
and unusable.

As a workaround, we try to avoid indexing words beginning with accented 
characters, but that is impossible in many cases.
I'd like to ask you to improve the index sorting. Could I help or 
contribute in some way?

Attached is a test file, which creates 2 indexes from various Czech words 
(covering the Czech alphabet). The index should be sorted in exactly the 
order in which the terms are written in the file.

Thanks,
Richard

test.tex
Description: TeX document


Re: [NTG-context] Index sorting for other languages than English

2006-05-23 Thread John R. Culleton
On Tuesday 23 May 2006 06:22, Richard Gabriel wrote:
> Hello Hans,
>
> after an upgrade I noticed that the index sorting works even worse than
> before (tested on Czech, Chinese and Japanese, but probably related to
> non-ASCII characters in general).
>
> With TeXExec 5.4.3, all words beginning with national (accented) characters
> were put into a separate ("symbols") group and placed before "A". This was
> not good but more or less acceptable. With TeXExec 6.2.0, words beginning
> with accented characters are placed under a certain unaccented letter. My
> colleague found out that these words are sorted according to their first
> unaccented letter. This is unacceptable and unusable.
>
> As a workaround, we try to avoid indexing words beginning with accented
> characters, but that is impossible in many cases. I'd like to ask you
> to improve the index sorting. Could I help or contribute in some way?
>
> Attached is a test file, which creates 2 indexes from various Czech
> words (covering the Czech alphabet). The index should be sorted in exactly
> the order in which the terms are written in the file.
>
> Thanks,
> Richard

Try Xindy. It has facilities for sorting according to arbitrary
alphabetic orders including Czech. It fits in the workflow much
as does makeindex, but perhaps it could be adapted to a Context
runstream. 

-- 
John Culleton
Books with answers to marketing and publishing questions:
http://wexfordpress.com/tex/shortlist.pdf

Book coaches, consultants and packagers:
http://wexfordpress.com/tex/packagers.pdf



Re: [NTG-context] Index sorting for other languages than English

2006-05-23 Thread Hans Hagen
Richard Gabriel wrote:
> Hello Hans,
>
> after an upgrade I noticed that the index sorting works even worse 
> than before (tested on Czech, Chinese and Japanese, but probably 
> related to non-ASCII characters in general).
>
> With TeXExec 5.4.3, all words beginning with national (accented) 
> characters were put into a separate ("symbols") group and placed 
> before "A". This was not good but more or less acceptable.
> With TeXExec 6.2.0, words beginning with accented characters are 
> placed under a certain unaccented letter. My colleague found out that 
> these words are sorted according to their first unaccented letter. This 
> is unacceptable and unusable.
>
> As a workaround, we try to avoid indexing words beginning with 
> accented characters, but that is impossible in many cases.
> I'd like to ask you to improve the index sorting. Could I help or 
> contribute in some way?
>
> Attached is a test file, which creates 2 indexes from various Czech 
> words (covering the Czech alphabet). The index should be sorted 
> in exactly the order in which the terms are written in the file.

Actually, the new texexec implementation does Czech sorting, but it is not 
enabled yet in ConTeXt itself (it was experimental until now).

- download the latest version (I uploaded a version that enables it) 
- don't forget \mainlanguage[cz] at the top of your document 
- in sort-lan.tex you can see how Czech sorting is defined 

(ConTeXt adds a lot of info to the tui file in order to get sorting done.) 
