Hi Zdenek,
I do not think anybody disputes the fact that characters are not glyphs.
The confusion arises that a character in CS is well defined and has a
history.
To be more exact it is just one byte in size so that there can be only
256 characters.
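A side note on the one-byte claim: under Unicode a character is a code point, and how many bytes it occupies depends on the encoding chosen. The following is only a minimal Python sketch; the sample characters and the helper name `byte_lengths` are illustrative assumptions, not something taken from the thread.

```python
# Illustration only: the sample characters are arbitrary choices
# (Latin A, Czech r-caron, Devanagari RA, and a cuneiform sign at U+12000).
samples = ["A", "\u0159", "\u0930", "\U00012000"]

def byte_lengths(ch):
    """Return (code point, UTF-8 length, UTF-16 length) for one character."""
    return (ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-16-be")))

for ch in samples:
    cp, n8, n16 = byte_lengths(ch)
    print(f"U+{cp:04X}: {n8} byte(s) in UTF-8, {n16} byte(s) in UTF-16")
```

Only the first sample fits into a single byte in UTF-8; the others need two, three and four bytes respectively.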
Keith J. Schultz wrote:
I do not think anybody disputes the fact that characters are not glyphs.
The confusion arises that a character in CS is well defined and has a
history.
To be more exact it is just one byte in size so that there can be only
256 characters.
2011/11/19 Ross Moore ross.mo...@mq.edu.au:
Hi Zdenek,
On 19/11/2011, at 10:30 AM, Zdenek Wagner wrote:
/ActualText is your friend here.
You tag the content and provide the string that you want to appear
with Copy/Paste as the value associated to a dictionary key.
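To make the mechanism a little more concrete, here is a rough Python sketch that builds the kind of marked-content fragment a PDF content stream could carry. The helper names `pdf_text_string` and `actual_text_span` are hypothetical, the shown operators `( ) Tj` are only a placeholder assumed to sit inside a text object, and this is not how any particular TeX driver does it.

```python
def pdf_text_string(s):
    """Encode s as a PDF hex string in UTF-16BE, preceded by a byte-order mark."""
    return "<" + ("\ufeff" + s).encode("utf-16-be").hex().upper() + ">"

def actual_text_span(replacement, content_ops):
    """Wrap content-stream operators in a /Span whose /ActualText is `replacement`."""
    return (
        f"/Span << /ActualText {pdf_text_string(replacement)} >> BDC\n"
        f"{content_ops}\n"
        "EMC"
    )

# Hypothetical case: a plain space is painted, but U+00A0 should be copied.
print(actual_text_span("\u00a0", "( ) Tj"))
```

The dictionary key is /ActualText and its value is the replacement string, here `<FEFF00A0>`; whether a given viewer honours it on Copy/Paste is a separate question.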
I do not know whether
2011/11/19 Keith J. Schultz keithjschu...@web.de:
Hi Zdenek,
I do not think anybody disputes the fact that characters are not
glyphs.
The confusion arises that a character in CS is well defined and has a
history.
To be more exact it is just one byte in size so that
Am Sat, 19 Nov 2011 00:30:58 +0100 schrieb Zdenek Wagner:
/ActualText is your friend here.
You tag the content and provide the string that you want to appear
with Copy/Paste as the value associated to a dictionary key.
I do not know whether the PDF specification has evolved since I read
it
2011/11/19 Ulrike Fischer ne...@nililand.de:
Am Sat, 19 Nov 2011 00:30:58 +0100 schrieb Zdenek Wagner:
/ActualText is your friend here.
You tag the content and provide the string that you want to appear
with Copy/Paste as the value associated to a dictionary key.
I do not know whether the
Karljürgen G. Feuerherm, PhD
Undergraduate Advisor
Department of Archaeology and Classical Studies
Wilfrid Laurier University
75 University Avenue West
Waterloo, Ontario N2L 3C5
Tel. (519) 884-1970 x3193
Fax (519) 883-0991 (ATTN Arch. Classics)
On Sat, Nov 19, 2011 at 3:39 AM, in message
OUCH! I have been hit by a veteran truck driver's truck. ;-))
I concede!
I am curious if many still know what a XX-bit word is. Is that term even still
used?
Unicode, in turn, needs to be cleaned up; it has become too fragmented.
regards
Keith.
Am 19.11.2011 um 09:39 schrieb Philip TAYLOR:
Am 19.11.2011 um 13:51 schrieb Zdenek Wagner:
2011/11/19 Keith J. Schultz keithjschu...@web.de:
As for getting junk when copying Unicode, just copy between two texts
using different fonts, where one font does
not contain the glyph.
When performing copy/paste or text search
On 2011-11-19 14:25, Keith J. Schultz wrote:
Perhaps this can be of use:
https://github.com/wspr/fontspec/issues/121
Am 19.11.2011 um 13:51 schrieb Zdenek Wagner:
2011/11/19 Keith J. Schultz keithjschu...@web.de
mailto:keithjschu...@web.de:
As for getting junk when copying
2011/11/19 Pander pan...@users.sourceforge.net:
On 2011-11-19 14:25, Keith J. Schultz wrote:
Perhaps this can be of use:
https://github.com/wspr/fontspec/issues/121
As Khaled wrote, it belongs to the engine. ZWJ and ZWNJ are used in
Indic scripts and they work fine since I started to use
On Sat, Nov 19, 2011 at 5:19 AM, Keith J. Schultz keithjschu...@web.de wrote:
OUCH! I have been hit by a veteran truck driver's truck. ;-))
I concede!
I am curious if many still know what a XX-bit word is. Is that term even
still used?
It will fade out of use until someone decides we need
Hi Philip,
Throughout my programming life and experience I have learned
that internal structure means nothing, as long as the result is correct
when it comes out.
As you rightfully point out the problem lies inside how TeX internally
handles space characters when adding them to its internal
2011/11/18 Keith J. Schultz keithjschu...@web.de:
Hi Philip,
Throughout my programming life and experience I have learned
that internal structure means nothing, as long as the result is correct
when it comes out.
As you rightfully point out the problem lies inside how TeX internally
Zdenek Wagner wrote:
I admit that things could be done better than in today's TeX, but its
complete revamping seems to me a bad investment. I would rather think
of an FO processor.
And I agree with Zdeněk : this discussion will be productive only
if we focus on what can be accomplished
Am Fri, 18 Nov 2011 08:31:28 +1100 schrieb Ross Moore:
Yes, that's the point. The goal of TeX is nice typographical
appearance. The goal of XML is easy data exchange. If I want to send
structured data, I send XML, not PDF.
These days people want both.
One question which pops up regularly
Is it safe to assume that these code listings
are restricted to the ASCII character set ? If
so, yes, spaces are likely to be a problem, but
if the code listing can also include ligature-
digraphs, then these are likely to prove even
more problematic.
** Phil.
Ulrike Fischer wrote:
2011/11/18 Philip TAYLOR p.tay...@rhul.ac.uk:
Is it safe to assume that these code listings
are restricted to the ASCII character set ? If
so, yes, spaces are likely to be a problem, but
if the code listing can also include ligature-
digraphs, then these are likely to prove even
more
On Fri, 18 Nov 2011 13:52:56 +0100, Zdenek Wagner
zdenek.wag...@gmail.com
wrote:
2011/11/18 Philip TAYLOR p.tay...@rhul.ac.uk:
Is it safe to assume that these code listings
are restricted to the ASCII character set ? If
so, yes, spaces are likely to be a problem, but
if the code listing can
2011/11/18 maxwell maxw...@umiacs.umd.edu:
On Fri, 18 Nov 2011 13:52:56 +0100, Zdenek Wagner
zdenek.wag...@gmail.com
wrote:
2011/11/18 Philip TAYLOR p.tay...@rhul.ac.uk:
Is it safe to assume that these code listings
are restricted to the ASCII character set ? If
so, yes, spaces are likely
Hi Zdenek,
On 19/11/2011, at 9:51 AM, Zdenek Wagner wrote:
This is a demonstration that glyphs are not the same as characters. I
will start with a simpler case and will not put Devanagari into the
mail message. If you wish to write a syllable RU, you have to add a
dependent vowel (matra) U to
2011/11/19 Ross Moore ross.mo...@mq.edu.au:
Hi Zdenek,
On 19/11/2011, at 9:51 AM, Zdenek Wagner wrote:
This is a demonstration that glyphs are not the same as characters. I
will start with a simpler case and will not put Devanagari into the
mail message. If you wish to write a syllable RU,
Hi Zdenek,
On 19/11/2011, at 10:30 AM, Zdenek Wagner wrote:
/ActualText is your friend here.
You tag the content and provide the string that you want to appear
with Copy/Paste as the value associated to a dictionary key.
I do not know whether the PDF specification has evolved since I read
O.K.
You mention in a later post that you do consider a space as a printable
character.
I do disagree, in the sense that, even though you cannot actually see how many
spaces are in a run,
a space does have a size and thereby a fixed visual effect.
I do agree with you, that a space
Am 17.11.2011 um 11:26 schrieb Keith J. Schultz:
O.K.
You mention in a later post that you do consider a space as a printable
character.
This line should read as:
You mention in a later post that you consider a space as a
non-printable character.
I do disagree, in the
Keith J. Schultz wrote:
Am 17.11.2011 um 11:26 schrieb Keith J. Schultz:
O.K.
You mention in a later post that you do consider a space as a printable
character.
This line should read as:
You mention in a later post that you consider a space as a
non-printable
Hi Phil,
On 17/11/2011, at 23:53, Philip TAYLOR p.tay...@rhul.ac.uk wrote:
Keith J. Schultz wrote:
You mention in a later post that you do consider a space as a printable
character.
This line should read as:
You mention in a later post that you consider a space as a
Ross, I do not dispute your arguments : I was answering
Keith's question in an honest way. I (personally) do not
think of a space in TeX output as a character at all,
because I am steeped in TeX philosophy; but I am quite
willing to accept that /if/ the objective is not to
produce output for the
2011/11/17 Ross Moore ross.mo...@mq.edu.au:
Hi Phil,
On 17/11/2011, at 23:53, Philip TAYLOR p.tay...@rhul.ac.uk wrote:
Keith J. Schultz wrote:
You mention in a later post that you do consider a space as a printable
character.
This line should read as:
You mention in a later
Hello Zdenek,
On 18/11/2011, at 7:49 AM, Zdenek Wagner wrote:
But a formatting instruction for one program cannot serve as reliable input
for another.
A heuristic is then needed, to attempt to infer that a programming
instruction must have been used, and guess what kind of instruction it
Hi Phil,
On 18/11/2011, at 6:56 AM, Philip TAYLOR wrote:
Ross, I do not dispute your arguments : I was answering
Keith's question in an honest way. I (personally) do not
think of a space in TeX output as a character at all,
because I am steeped in TeX philosophy; but I am quite
willing to
2011/11/17 Ross Moore ross.mo...@mq.edu.au:
Hello Zdenek,
On 18/11/2011, at 7:49 AM, Zdenek Wagner wrote:
But a formatting instruction for one program cannot serve as reliable input
for another.
A heuristic is then needed, to attempt to infer that a programming
instruction must have been
Hi Philip,
We are basically following the same lines.
TeX is foremost a layout program based on standard printers'
methodology, where the space character is white space and not a glyph.
We actually do have to differentiate between the two in discussions.
The crux of the problem is in
Keith J. Schultz wrote:
The crux of the problem is in (Xe)TeX's parsing algorithm. I never liked it
and personally I have many problems with it.
Is this XeTeX-specific, Keith, or do you also dislike
TeX's parsing algorithm ? And what is it that you
dislike, and how would you propose that it
Hi Tobias,
Am 14.11.2011 um 18:42 schrieb Tobias Schoel:
Am 14.11.2011 18:30, schrieb msk...@ansuz.sooke.bc.ca:
[snip, snip]
Now we come to the trouble of Unicode specifying a line-breaking algorithm (
http://www.unicode.org/reports/tr14/tr14-26.html ), which probably isn't
On Tue, Nov 15, 2011 at 2:27 AM, Keith J. Schultz keithjschu...@web.de wrote:
Hi all,
I agree that XeTeX should support all printable characters.
Given your definition I would say all visible printed characters.
Invisible characters are a problem in a programming language.
A non-breaking
Keith J. Schultz wrote:
A non-breaking space is to me a printable character, in so far that
it is important and must be used to distinguish between word space, et al.
If, for you, [a] non-breaking space is a printable character, then
presumably that character must be taken from some font.
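For reference, the Unicode character database itself can be queried on how U+00A0 relates to the ordinary space. The following is a small Python sketch using the standard `unicodedata` module; the helper name `describe` is made up for the illustration, and nothing here says how XeTeX or a font treats the character.

```python
import unicodedata

def describe(ch):
    """Return (name, general category, decomposition) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),
    )

# Compare the ordinary space with the no-break space.
for ch in (" ", "\u00a0"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

Both characters fall in the general category Zs (space separator); U+00A0 additionally carries a `<noBreak>` compatibility decomposition to U+0020, which is the database's way of saying it is a space with a line-breaking restriction attached.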
On 11/15/2011 5:39 AM, Chris Travers wrote:
My recommendation is:
1) Default to handling all white space as it exists now.
2) Provide some sort of switch, whether to the execution of XeTeX or
to the document itself, to turn on handling of special unicode
characters.
3) If that switch is
2011/11/15 Mike Maxwell maxw...@umiacs.umd.edu:
On 11/15/2011 5:39 AM, Chris Travers wrote:
My recommendation is:
1) Default to handling all white space as it exists now.
2) Provide some sort of switch, whether to the execution of XeTeX or
to the document itself, to turn on handling of
2011/11/15 Zdenek Wagner zdenek.wag...@gmail.com:
2011/11/15 Mike Maxwell maxw...@umiacs.umd.edu:
On 11/15/2011 5:39 AM, Chris Travers wrote:
My recommendation is:
1) Default to handling all white space as it exists now.
2) Provide some sort of switch, whether to the execution of XeTeX or
Zdenek Wagner wrote:
The only reasonable solution seems to be the one suggested by Phil Taylor, to
extend \catcode up to 255 and assign special categories to other types
of characters. Thus we could say that normal space is 10, nonbreakable
space is 16, thin space is 17 etc. XeTeX will then
Chris Travers wrote:
But we are talking two different things here. The first is user
interface, and the second is mechanism.
What I am saying is special handling of this sort should be required
to be enabled somehow by the user. I don't really care how. It could
be by a commandline switch
2011/11/15 Chris Travers chris.trav...@gmail.com:
2011/11/15 Zdenek Wagner zdenek.wag...@gmail.com:
2011/11/15 Mike Maxwell maxw...@umiacs.umd.edu:
On 11/15/2011 5:39 AM, Chris Travers wrote:
My recommendation is:
1) Default to handling all white space as it exists now.
2) Provide some
2011/11/15 Philip TAYLOR p.tay...@rhul.ac.uk:
Zdenek Wagner wrote:
The only reasonable solution seems to be the one suggested by Phil
Taylor, to
extend \catcode up to 255 and assign special categories to other types
of characters. Thus we could say that normal space is 10, nonbreakable
2011/11/15 Philip TAYLOR p.tay...@rhul.ac.uk:
Chris Travers wrote:
But we are talking two different things here. The first is user
interface, and the second is mechanism.
What I am saying is special handling of this sort should be required
to be enabled somehow by the user. I don't
Zdenek Wagner wrote:
If you know what such characters are (and it will certainly be
documented), you just set their categories back to 12 in order to get
the old behaviour.
No ! A catcode is for life, not just for Christmas ! Once a
character has been read, and bound into a
On Tue, Nov 15, 2011 at 02:20:17PM +, Philip TAYLOR wrote:
No ! A catcode is for life, not just for Christmas ! Once a
character has been read, and bound into a character/catcode pair,
that catcode remains immutable.
Do you mean that as a general good practice in TeX programming, or as
2011/11/15 Philip TAYLOR p.tay...@rhul.ac.uk:
Zdenek Wagner wrote:
If you know what such characters are (and it will certainly be
documented), you just set their categories back to 12 in order to get
the old behaviour.
No ! A catcode is for life, not just for Christmas ! Once a
Arthur Reutenauer wrote:
On Tue, Nov 15, 2011 at 02:20:17PM +, Philip TAYLOR wrote:
No ! A catcode is for life, not just for Christmas ! Once a
character has been read, and bound into a character/catcode pair,
that catcode remains immutable.
Do you mean that as a general good
On Nov 15, 2011, at 8:52 AM, Philip TAYLOR wrote:
Arthur Reutenauer wrote:
On Tue, Nov 15, 2011 at 02:20:17PM +, Philip TAYLOR wrote:
No ! A catcode is for life, not just for Christmas ! Once a
character has been read, and bound into a character/catcode pair,
that catcode remains
Zdenek Wagner wrote:
Of course, I know it. What I meant was that you could set \catcode of
all these extended characters to 12 at the beginning of your
document. Thus you get the same behaviour as now.
Ah yes : with that, I have no problem.
** Phil.
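The point the two agree on can be pictured with a toy model in Python. This is emphatically not TeX: the table, the function names `catcode_of` and `tokenize`, and the category number 16 (standing in for a hypothetical extended category) are all assumptions made for the sketch. It only shows that a catcode is attached when a character is read, so a later change to the table affects later input but not tokens that already exist.

```python
# Toy model, not TeX itself: catcodes are looked up when a character is read.
catcodes = {" ": 10, "\u00a0": 16}  # 16 is a hypothetical extended category

def catcode_of(ch, table):
    """Look up a catcode; default to 11 for letters and 12 ("other") otherwise."""
    if ch in table:
        return table[ch]
    return 11 if ch.isalpha() else 12

def tokenize(text, table):
    """Pair each character with the catcode in force at the time of reading."""
    return [(ch, catcode_of(ch, table)) for ch in text]

first = tokenize("a\u00a0b", catcodes)   # read under the extended table
catcodes["\u00a0"] = 12                  # reset the category, as suggested above
second = tokenize("a\u00a0b", catcodes)  # read after the reset

print(first[1])   # token created before the reset keeps its catcode
print(second[1])  # token created after the reset gets catcode 12
```

In this model, resetting the table at the very beginning of the input means every token is read with catcode 12, which corresponds to the "same behaviour as now" case.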
2011/11/15 Herbert Schulz he...@wideopenwest.com:
On Nov 15, 2011, at 8:52 AM, Philip TAYLOR wrote:
Arthur Reutenauer wrote:
On Tue, Nov 15, 2011 at 02:20:17PM +, Philip TAYLOR wrote:
No ! A catcode is for life, not just for Christmas ! Once a
character has been read, and bound into
The latter is what the TeXbook says (P.~39) : Once a category code
has been attached to a character token, the attachment is permanent.
Yes, because you meant individual tokens (which I understood in
retrospect). But in the context of the discussion, you really seemed to
be saying that you
Arthur Reutenauer wrote:
The latter is what the TeXbook says (P.~39) : Once a category code
has been attached to a character token, the attachment is permanent.
Yes, because you meant individual tokens (which I understood in
retrospect). But in the context of the discussion, you really
Herbert Schulz wrote:
The latter is what the TeXbook says (P.~39) : Once a category code
has been attached to a character token, the attachment is permanent.
** Phil.
What happens in a verbatim environment?
The verbatim environment sets up an environment within
which characters that have
On Nov 15, 2011, at 11:19 AM, Philip TAYLOR wrote:
Herbert Schulz wrote:
The latter is what the TeXbook says (P.~39) : Once a category code
has been attached to a character token, the attachment is permanent.
** Phil.
What happens in a verbatim environment?
The verbatim
On Nov 15, 2011, at 11:11 AM, Herbert Schulz wrote:
On Nov 15, 2011, at 11:19 AM, Philip TAYLOR wrote:
Herbert Schulz wrote:
The latter is what the TeXbook says (P.~39) : Once a category code
has been attached to a character token, the attachment is permanent.
** Phil.
What
I think it made more sense with can't, Herb,
but that could be a trans-Atlantic difference
of usage -- you would, I think, say I could care
less where I would say I couldn't care less.
** Phil.
Herbert Schulz wrote:
What I meant to say was...
So what you are saying is not that you
On Nov 15, 2011, at 2:43 PM, Ross Moore wrote:
On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
Given that TeX (and XeTeX too) deals with a non-breakable space already (where
we usually use the ~ to represent that space) it seems to me that XeTeX
should treat that the same way.
No, I
2011/11/15 Ross Moore ross.mo...@mq.edu.au:
On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
Given that TeX (and XeTeX too) deals with a non-breakable space already (where
we usually use the ~ to represent that space) it seems to me that XeTeX
should treat that the same way.
No, I disagree
Hi Zdenek,
On 16/11/2011, at 8:58 AM, Zdenek Wagner wrote:
2011/11/15 Ross Moore ross.mo...@mq.edu.au:
On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
Given that TeX (and XeTeX too) deals with a non-breakable space already (where
we usually use the ~ to represent that space) it seems to
2011/11/15 Ross Moore ross.mo...@mq.edu.au:
Hi Zdenek,
On 16/11/2011, at 8:58 AM, Zdenek Wagner wrote:
2011/11/15 Ross Moore ross.mo...@mq.edu.au:
On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
Given that TeX (and XeTeX too) deals with a non-breakable space already
(where we usually use
Hi Phil,
On 16/11/2011, at 8:45 AM, Philip TAYLOR wrote:
Ross Moore wrote:
On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
Given that TeX (and XeTeX too) deals with a non-breakable space already (where
we usually use the ~ to represent that space) it seems to me that XeTeX
should treat
2011/11/15 Ross Moore ross.mo...@mq.edu.au:
Hi Phil,
On 16/11/2011, at 8:45 AM, Philip TAYLOR wrote:
Ross Moore wrote:
On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
Given that TeX (and XeTeX too) deals with a non-breakable space already
(where we usually use the ~ to represent that
I was going to make the following point earlier--maybe in light of
Phil's conclusion I should do it now.
There seems to be a tendency not to distinguish between a(n original)
character in the sense of character of a writing system, and a computer
character.
The former are visible symbols on a
Hi Phil,
On 16/11/2011, at 10:08 AM, Zdenek Wagner wrote:
How do you explain to somebody the need to do something really,
really special to get a character that they can type, or copy/paste?
There is no special role for this character in other vital aspects
of how TeX works, such as there
2011/11/16 Ross Moore ross.mo...@mq.edu.au:
On 16/11/2011, at 9:45 AM, Zdenek Wagner wrote:
2011/11/15 Ross Moore ross.mo...@mq.edu.au:
What if you really want the U+00A0 character to be in the PDF?
That is, when you copy/paste from the PDF, you want that character
to come along for the
Ross Moore wrote:
Hi Phil,
On 16/11/2011, at 10:08 AM, Zdenek Wagner wrote:
Not I, Sir : Zdeněk !
** Phil.
--
Subscriptions, Archive, and List information, etc.:
http://tug.org/mailman/listinfo/xetex
Hi Zdenek,
On 16/11/2011, at 11:19 AM, Zdenek Wagner wrote:
Just like any other Unicode character, if you want it then
you should be able to put it in there.
You ARE able to do it. Choose a font with that glyph, set \catcode to
11 or 12 and that's it. What else do you wish to do?
The
msk...@ansuz.sooke.bc.ca wrote:
various points with which I have no reason to disagree at this time, followed
by
2. Inevitably, people will include invalid characters in TeX input; and
U+00A0 is an invalid character for TeX input.
Firstly (as is clear from the list on which we are
2011/11/14 Philip TAYLOR p.tay...@rhul.ac.uk:
msk...@ansuz.sooke.bc.ca wrote:
various points with which I have no reason to disagree at this time,
followed by
2. Inevitably, people will include invalid characters in TeX input; and
U+00A0 is an invalid character for TeX input.
Firstly
On Mon, 14 Nov 2011, Philip TAYLOR wrote:
2. Inevitably, people will include invalid characters in TeX input; and
U+00A0 is an invalid character for TeX input.
Firstly (as is clear from the list on which we are discussing
this), we are not discussing TeX but XeTeX. Secondly, even
XeTeX is
msk...@ansuz.sooke.bc.ca wrote:
XeTeX is a TeX engine. Obviously, it is free to define its own input
format, and that format already differs from other TeX engines by (for
instance) allowing some Unicode code points outside the 7-bit range.
I think (with respect) that some Unicode code
On Mon, 14 Nov 2011, Philip TAYLOR wrote:
I think (with respect) that some Unicode code points outside the 7-bit range
is a gross understatement. As far as I am aware, XeTeX permits a very
considerable
subset of Unicode (perhaps even all of it; I do not know) as input.
My point is that it
Am 14.11.2011 18:30, schrieb msk...@ansuz.sooke.bc.ca:
1. No. That is not what Unicode is for. Unicode's goal is to subsume
all reasonable pre-existing encodings.
Unicode is even more. Look at all the Annexes to Unicode 6.0
Some reasonable pre-existing
encodings include a non-breaking
msk...@ansuz.sooke.bc.ca wrote:
On Mon, 14 Nov 2011, Philip TAYLOR wrote:
I think (with respect) that some Unicode code points outside the 7-bit range
is a gross understatement. As far as I am aware, XeTeX permits a very
considerable
subset of Unicode (perhaps even all of it; I do not know)
On Mon, Nov 14, 2011 at 12:15 PM, in message
4ec14cb5.7000...@rhul.ac.uk,
Philip TAYLOR p.tay...@rhul.ac.uk wrote:
XeTeX is a TeX engine. Obviously, it is free to define its own
input
format, and that format already differs from other TeX engines by
(for
instance) allowing some Unicode code
On Mon, 14 Nov 2011, Karljurgen Feuerherm wrote:
I use U+12000 and above regularly, as a case in point...
Do you think that basic formatting control functions should be bound to
code points in that range, as the preferred way of accessing those
functions? Let's not lose track of what this
I didn't say anything about U+00A0 one way or the other
Keeping in mind that the purpose of this software is to get work done,
and not to fulfil anyone's philosophical notions of software, my general
feeling is that:
* Xe(La)TeX should support plain text characters--for *my* present
purpose,
80 matches