On 6/5/15 14:14, Joseph Wright wrote:
Based on the current files, we have a block to set \XeTeXcharclass,
which only applies to XeTeX. The logic followed in that code is that
characters in the file LineBreak.txt which have class ID (ideographs)
not only set the \XeTeXcharclass class to 1 but
Hello all,
As some people will have seen, the LaTeX team have recently integrated
setting of codes (\catcode, \lccode, etc.) for the entire Unicode range
into the kernel when XeTeX/LuaTeX are in use. This is not a functional
change for end users but does mean that the team now have some control
While working on these bugs, we also discussed how surrogate
characters were handled in XeTeX. Surrogate characters are the 2048
code points that are used in UTF-16 to encode characters with code
points above 65535 (U+FFFF): a pair of them makes up one Unicode character;
however they're not meant to be
On 6 May 2015 at 23:04, Arthur Reutenauer
arthur.reutena...@normalesup.org wrote:
While working on these bugs, we also discussed how surrogate
characters were handled in XeTeX. Surrogate characters are the 2048
code points that are used in UTF-16 to encode characters with code
points above
The character itself, as bytes that is, is not wrong and users should be able
to create these.
But preferably through macros that ensure that they come correctly paired.
Placing two character tokens representing a surrogate pair should not,
though, magically turn into a single character.
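For readers who want the pairing rule spelled out, here is a minimal sketch of the UTF-16 arithmetic under discussion. The function names to_surrogates and from_surrogates are made up for illustration; this is not how XeTeX handles surrogates internally, only the standard encoding rule.

```python
# Sketch of UTF-16 surrogate pairing (function names are hypothetical).

def to_surrogates(cp):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    v = cp - 0x10000                 # 20-bit offset
    high = 0xD800 + (v >> 10)        # top 10 bits
    low = 0xDC00 + (v & 0x3FF)       # bottom 10 bits
    return high, low

def from_surrogates(high, low):
    """Combine a correctly ordered surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

A lone surrogate, or a pair in the wrong order, is rejected here, which is the "correctly paired" condition mentioned above.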
The only mark that remains when making all capitals is the diaeresis
(dialytika). All others vanish. This is common knowledge for people who speak
and write Greek.
AS
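To make the difference concrete, here is a rough sketch comparing the Unicode case mapping with the convention described above. The helper greek_upper_stripped is hypothetical and is not taken from xgreek.sty or the kernel files; it assumes that "all others vanish" means dropping every combining mark except U+0308 after canonical decomposition.

```python
import unicodedata

DIALYTIKA = "\u0308"  # COMBINING DIAERESIS

def greek_upper_stripped(text):
    """Uppercase text, dropping all combining marks except the dialytika.

    Hypothetical helper for illustration only.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(
        ch for ch in decomposed
        if not unicodedata.combining(ch) or ch == DIALYTIKA
    )
    return unicodedata.normalize("NFC", kept.upper())

# Unicode mapping: U+1F10 uppercases to U+1F18 (psili is kept).
unicode_upper = "\u1f10".upper()
# Convention described above: U+1F10 uppercases to plain U+0395.
stripped_upper = greek_upper_stripped("\u1f10")
```

With this sketch, unicode_upper is U+1F18 while stripped_upper is U+0395, which matches the \uccode value quoted from xgreek.sty.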
Hi Arthur,
On 07/05/2015, at 8:04, Arthur Reutenauer arthur.reutena...@normalesup.org
wrote:
While working on these bugs, we also discussed how surrogate
characters were handled in XeTeX. Surrogate characters are the 2048
code points that are used in UTF-16 to encode characters with code
On 06/05/2015 21:06, David Carlisle wrote:
On 6 May 2015 at 20:15, Philip Taylor p.tay...@rhul.ac.uk wrote:
Apostolos Syropoulos wrote:
It seems to me that most people have no idea what Unicode is and what is
really
involved.
OK, so if we restrict the Universe of Discourse to the set of
On 2015-05-06, Apostolos Syropoulos asyropou...@yahoo.com wrote:
I checked a bit the file and I have noticed that
\L 1F10 1F18 1F10 %
while xgreek.sty defines
\global\lccode"1F10="1F10 \global\uccode"1F10="0395
You see the uppercase of 'GREEK SMALL LETTER EPSILON WITH PSILI'
is 'GREEK LETTER
On 6 May 2015 at 20:15, Philip Taylor p.tay...@rhul.ac.uk wrote:
Apostolos Syropoulos wrote:
It seems to me that most people have no idea what Unicode is and what is
really
involved.
OK, so if we restrict the Universe of Discourse to the set of native
Hellenic speakers who know what
On 06/05/2015 15:09, Jonathan Kew wrote:
On 6/5/15 14:14, Joseph Wright wrote:
Based on the current files, we have a block to set \XeTeXcharclass,
which only applies to XeTeX. The logic followed in that code is that
characters in the file LineBreak.txt which have class ID (ideographs)
not
David Carlisle wrote:
I don't think that's the right question. Even if everyone, including
the Unicode technical committee, agreed some properties are
incorrect for some characters, it isn't clear we should change them
at this level.
You are (inadvertently) conflating my question with
On 06/05/2015 16:04, Apostolos Syropoulos wrote:
Hello,
I checked a bit the file and I have noticed that
\L 1F10 1F18 1F10 %
while xgreek.sty defines
\global\lccode"1F10="1F10 \global\uccode"1F10="0395
You see the uppercase of 'GREEK SMALL LETTER EPSILON WITH PSILI'
is 'GREEK
Apostolos Syropoulos wrote:
It seems to me that most people have no idea what Unicode is and what is
really
involved.
OK, so if we restrict the Universe of Discourse to the set of native
Hellenic speakers who know what Unicode is, know the importance of being
able to use it to identify
Apostolos Syropoulos wrote:
I'd suggest that the basic (Xe|Lua)TeX formats should simply follow
Unicode properties.
In addition, I would suggest that somewhere it is explained why this
is not correct. Otherwise, people would see strange things and might
wonder why they see them.
How
How united is the Hellenic-speaking world about this, Apostolos ? Is it
a universal truth, universally accepted, or are there some (even just a
few) who maintain that Unicode is right and everyone else is wrong ?
It seems to me that most people have no idea what Unicode is and what is
On 6/5/15 16:29, Philip Taylor wrote:
Apostolos Syropoulos wrote:
the uppercase of 'GREEK SMALL LETTER EPSILON WITH PSILI'
is 'GREEK LETTER EPSILON' and not 'GREEK LETTER EPSILON WITH PSILI'.
Some time ago I reported this to the Unicode people and they told me
something like we cannot
Hi David,
On 07/05/2015, at 9:26 AM, David Carlisle wrote:
The character itself, as bytes that is, is not wrong and users should be
able to create these.
But preferably through macros that ensure that they come correctly paired.
placing two character tokens representing a surrogate pair
Apostolos Syropoulos wrote:
the uppercase of 'GREEK SMALL LETTER EPSILON WITH PSILI'
is 'GREEK LETTER EPSILON' and not 'GREEK LETTER EPSILON WITH PSILI'.
Some time ago I reported this to the Unicode people and they told me
something like we cannot change it now (I do not remember the
Hello,
I checked a bit the file and I have noticed that
\L 1F10 1F18 1F10 %
while xgreek.sty defines
\global\lccode"1F10="1F10 \global\uccode"1F10="0395
You see the uppercase of 'GREEK SMALL LETTER EPSILON WITH PSILI'
is 'GREEK LETTER EPSILON' and not 'GREEK LETTER EPSILON WITH PSILI'.
Some