Re: [XeTeX] nbsp; in XeTeX
On Mon, Nov 14, 2011 at 02:27:03AM -0800, Chris Travers wrote:
> On Mon, Nov 14, 2011 at 2:24 AM, Petr Tomasek toma...@etf.cuni.cz wrote:
>> Using different color.
>
> Do we really want to tie XeTeX users to a small number of editors?
>
> Chris Travers

Do we really want to make XeTeX incompatible with the rest of the (Unicode) world?

P.T.
--
Petr Tomasek http://www.etf.cuni.cz/~tomasek
Jabber: but...@jabbim.cz
EA 355:001 DU DU DU DU
EA 355:002 TU TU TU TU
EA 355:003 NU NU NU NU NU NU NU
EA 355:004 NA NA NA NA NA
--
Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] Whitespace in input
Hi Tobias,

On 14.11.2011 at 18:42, Tobias Schoel wrote:
> On 14.11.2011 at 18:30, msk...@ansuz.sooke.bc.ca wrote:
> [snip, snip]
>
> Now we come to the trouble of Unicode specifying a line-breaking algorithm (http://www.unicode.org/reports/tr14/tr14-26.html), which probably isn't exactly TeX's. I'm not into these algorithms, so I can't compare. But I would ask some Master of this Art to speak up about this conflict.

I went and briefly looked at the annex. At the beginning it states that the annexes are not necessarily a requirement unless mentioned in the standard! I did not check the standard, but as you read on it becomes clear that the description of the LBA is not mandatory at all. Furthermore, it more or less describes which characters are directly involved in line breaking (top of Table 1). The rest is just a suggestion of how one might go about achieving line breaking. This is not a standard at all. Since TeX has its own line-breaking algorithms, we need not be concerned with the content of this annex as far as Unicode is concerned.

What you should be aware of is that the LBA is intended as an aid for a preprocessor to a more elaborate line-breaking algorithm. It has been approved for printing, but nowhere does it state that it must be followed, nor that it is complete. In other words, it is merely a suggestion.

There is no conflict per se. It is just another way of dealing with line breaking. There is no real standard for line breaking. It is more or less a matter of taste, style and aesthetics. (Yes, there are many conventions that should be observed, and many are grammatical in nature.)

regards
Keith.
Re: [XeTeX] Whitespace in input
On Tue, Nov 15, 2011 at 2:27 AM, Keith J. Schultz keithjschu...@web.de wrote:
> Hi all,
>
> I agree that XeTeX should support all printable characters. Given your definition I would say all visible printed characters. Invisible characters are a problem in a programming language. A non-breaking space is to me a printable character, insofar as it is important and must be used to distinguish between word spaces, et al.

As long as this is an option which defaults to off, again I have no problem with this. I mean, by this definition carriage returns and line feeds are also printable characters, and these are supported by options which have to be turned on rather than being on by default.

> To go back in history, one of my pet peeves in LaTeX was that I had to enter the German characters öäüß as \"o, \"a, etc., and later the shortcut forms "s, "u, etc. Later, with inputenc, I could finally just enter öäüß. But I had trouble with my files (actually, I just needed to convert them) when going to and from Apple and Windows (so that editing was possible on Windows). Yet I still had trouble with quoting, so I was forced to use \quote, et al. to have a simple method of quoting properly in English, German and French in one document! I even modified them to suit some requirements I had, so that I had one command. Unicode has thankfully changed all this. I can forget about using all those TeX commands for the characters I need. I just type away. The only problem now is the keyboard equivalents and how the editor of choice displays them.

But here you have a problem. An editor can display a non-breaking space by its semantic value (i.e. with a special glyph), but this is not without problems. For example, we could also display line feeds as the paragraph symbol, but that is also U+00B6, so now you have ambiguity issues: is it a Unicode character or is it a line feed? Or you can color-code, but this is problematic for a large number of other reasons. So I am not sure these are simple problems that admit of simple solutions.

My recommendation is:
1) Default to handling all white space as it exists now.
2) Provide some sort of switch, whether to the execution of XeTeX or to the document itself, to turn on handling of special Unicode characters.
3) If that switch is enabled, then treat the whitespace characters according to their Unicode meanings. If not, treat them as standard whitespace.

The advantage of this approach is that people who don't want to worry about what sort of whitespace is in the text files they are inputting don't have to worry about it, and that those who do have an easy way of determining whether a layout issue is caused by non-breaking spaces.

Best Wishes,
Chris Travers
Re: [XeTeX] Whitespace in input
Keith J. Schultz wrote:
> A non-breaking space is to me a printable character, insofar as it is important and must be used to distinguish between word spaces, et al.

If, for you, [a] non-breaking space is a printable character, then presumably that character must be taken from some font. If you take a character from a font, it will have a size, and although it can be combined with kerning rules to adjust its position w.r.t. adjacent characters, the logic for this is fairly restricted. In particular, it cannot take into account the amount by which TeX is seeking to expand or contract spaces on the current line in order to achieve optimal paragraphs. So in your model of the ideal universe, non-breaking Unicode spaces would not behave as do conventional TeX non-breaking spaces (which /do/ expand and contract to assist in TeX's line-breaking), nor would they conform to their Unicode definition, where their decomposition is defined as: <noBreak> SPACE (U+0020).

I wonder if you would like to discuss these points?

Philip Taylor
Re: [XeTeX] nbsp; in XeTeX
2011/11/14 Mike Maxwell maxw...@umiacs.umd.edu:
> On 11/14/2011 4:56 PM, Zdenek Wagner wrote:
>> 2011/11/14 Mike Maxwell maxw...@umiacs.umd.edu:
>>> We are not (at least I am not) suggesting that everyone must use the Unicode non-breaking space character, or etc. What we *are* suggesting is that in Xe(La)TeX, we be *allowed* to use those characters, and that they have their
>>
>> You are allowed to use them, nothing prevents you.
>
> At least one participant in this thread (or actually the related thread "Whitespace in input"; the person in question is msk...@ansuz.sooke.bc.ca) has said: "U+00A0 is an invalid character for TeX input". That sounds pretty much like prevention (although maybe you don't agree with him).

I strongly disagree. From the TeX point of view a character is invalid if its \catcode is equal to 15, which is not the case for U+00A0. If an invalid character is found in the input, an error message appears in the log. That does not happen with U+00A0, because its \catcode is 12, which means "other character". When talking about \catcode I have in mind the value defined in the format. Even if a character is declared invalid in the format, a user can assign another \catcode if the character can be rendered.

> But in fact, the last time I tried this, the NBSP character was interpreted in the same way as an ASCII space, which is not what I want. What I want (repeating myself again) is for such characters to--

NBSP's \catcode is 12, so it is just a glyph in the font; it is not treated specially by XeTeX. A line can be broken at glue (if it does not follow another discardable element), at a penalty, or at a \discretionary, but not at a glyph. That is why this space is non-breakable in XeTeX's eyes. Since it is a glyph, its width is fixed. You can do a few things with it:

- Change its \catcode to 10. Then it will be a normal stretchable/shrinkable space, but it will no longer be non-breakable.
- Change its \catcode to 13 and define it as \nobreak\space. In that case it will have the same meaning as ~.

>>> have their Unicode-defined semantics, to the extent that makes sense in XeTeX.
>
> --just the same as I would expect XeTeX (or xdvipdfmx) to correctly handle the visual re-ordering behavior of U+09C7 through U+09CC, or U+093F (Devanagari vowel sign I).

OpenOffice has some intelligence and recognizes the Devanagari script automatically. This is not the case with XeTeX. When loading a Devanagari font you have to switch the script to Devanagari too. Then XeTeX properly handles U+093F and U+094D (other characters are handled properly even without setting the script). Similarly, you have to set the Arabic script in order to connect the characters properly; without setting the script only isolated forms will be typeset. Everything is done in XeTeX; xdvipdfmx just renders the properly reordered and composed glyphs into PDF. The Velthuis Devanagari package even contains samples for XeLaTeX; some support files have recently been moved to the xetex-devanagari package.

>> However, I would not like to have to wonder why I have overfull/underfull boxes and open a hex editor to see what kind of space is written between words.
>
> A number of alternatives to a hex editor have been pointed out:
> 1) color coding
> 2) using a font that has a representation of these code points
> 3) using any text editor that allows you to see the Unicode code point of a character (I use jEdit this way, I'm sure many other editors offer this support)
>
> Again, this is not about _forcing_ anyone to use NBSP etc., it is about _allowing_ their use *with the expected Unicode behavior.*
> --
> Mike Maxwell maxw...@umiacs.umd.edu
> My definition of an interesting universe is one that has the capacity to study itself. --Stephen Eastmond

--
Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
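The catcode claims in this message are easy to check in a small plain XeTeX file. The following is only a minimal sketch, assuming a stock plain XeTeX format; the file name is illustrative, and the value reported on the terminal depends on the format actually in use. It also tries the first option from the list above (catcode 10).

```tex
% nbsp-catcode.tex -- run with: xetex nbsp-catcode.tex
% Report the category code currently assigned to U+00A0.
\message{catcode of U+00A0: \the\catcode"00A0}

% First option from the message: treat U+00A0 as an ordinary space.
% It then becomes stretchable glue, but a line may break there.
\catcode"00A0=10
one^^^^00a0two
\bye
```

The `^^^^00a0` notation is used instead of a literal U+00A0 so that the character stays visible in any editor.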
Re: [XeTeX] Whitespace in input
On 11/15/2011 5:39 AM, Chris Travers wrote:
> My recommendation is:
> 1) Default to handling all white space as it exists now.
> 2) Provide some sort of switch, whether to the execution of XeTeX or to the document itself, to turn on handling of special unicode characters.
> 3) If that switch is enabled, then treat the whitespaces according to unicode meanings. If not, treat them as standard whitespace.

I think you asked me earlier whether that would satisfy me, and I failed to answer. Yes, it would.
--
Mike Maxwell maxw...@umiacs.umd.edu
My definition of an interesting universe is one that has the capacity to study itself. --Stephen Eastmond
Re: [XeTeX] Whitespace in input
2011/11/15 Mike Maxwell maxw...@umiacs.umd.edu:
> On 11/15/2011 5:39 AM, Chris Travers wrote:
>> My recommendation is:
>> 1) Default to handling all white space as it exists now.
>> 2) Provide some sort of switch, whether to the execution of XeTeX or to the document itself, to turn on handling of special unicode characters.
>> 3) If that switch is enabled, then treat the whitespaces according to unicode meanings. If not, treat them as standard whitespace.
>
> I think you asked me earlier whether that would satisfy me, and I failed to answer. Yes, it would.

But such a solution is not clean: you cannot plug such logic into TeX's mouth when the input is being read, nor into the output stage when TECkit maps are in effect. I wrote the reasons earlier. The only reasonable solution seems to be the one suggested by Phil Taylor, to extend \catcode up to 255 and assign special categories to other types of characters. Thus we could say that normal space is 10, non-breakable space is 16, thin space is 17, etc. XeTeX will then be able to treat them properly.

> --
> Mike Maxwell maxw...@umiacs.umd.edu
> My definition of an interesting universe is one that has the capacity to study itself. --Stephen Eastmond

--
Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
Re: [XeTeX] Whitespace in input
2011/11/15 Zdenek Wagner zdenek.wag...@gmail.com:
> 2011/11/15 Mike Maxwell maxw...@umiacs.umd.edu:
>> On 11/15/2011 5:39 AM, Chris Travers wrote:
>>> My recommendation is:
>>> 1) Default to handling all white space as it exists now.
>>> 2) Provide some sort of switch, whether to the execution of XeTeX or to the document itself, to turn on handling of special unicode characters.
>>> 3) If that switch is enabled, then treat the whitespaces according to unicode meanings. If not, treat them as standard whitespace.
>>
>> I think you asked me earlier whether that would satisfy me, and I failed to answer. Yes, it would.
>
> But such a solution is not clean: you cannot plug such logic into TeX's mouth when the input is being read, nor into the output stage when TECkit maps are in effect. I wrote the reasons earlier. The only reasonable solution seems to be the one suggested by Phil Taylor, to extend \catcode up to 255 and assign special categories to other types of characters. Thus we could say that normal space is 10, non-breakable space is 16, thin space is 17, etc. XeTeX will then be able to treat them properly.

But we are talking about two different things here. The first is user interface, and the second is mechanism. What I am saying is that special handling of this sort should have to be enabled somehow by the user. I don't really care how. It could be by a command-line switch to xelatex. It could be by a call in the document, if that's possible. It should be optional, and disabled by default, given that the characters involved are not intended to be displayed with glyphs.

Best Wishes,
Chris Travers
Re: [XeTeX] Whitespace in input
Zdenek Wagner wrote:
> The only reasonable solution seems to be the one suggested by Phil Taylor, to extend \catcode up to 255 and assign special categories to other types of characters. Thus we could say that normal space is 10, non-breakable space is 16, thin space is 17, etc. XeTeX will then be able to treat them properly.

which may, unfortunately, then require new types of node in TeX's internal list structures ... (may, not will).

** Phil.
Re: [XeTeX] Whitespace in input
Chris Travers wrote:
> But we are talking two different things here. The first is user interface, and the second is mechanism. What I am saying is special handling of this sort should be required to be enabled somehow by the user. I don't really care how. It could be by a commandline switch to xelatex. It could be by a call in the document if that's possible. It should be optional, and disabled by default, given that the characters involved are not intended to be displayed with glyphs.

But /if/ it requires a change to the number of category codes (and/or the creation of one or more classes of internal node), then this is not something that should be capable of being turned on or off within a document. I don't have any problem with the idea of turning the functionality on or off either within a format file or from a command-line qualifier.

** Phil.
Re: [XeTeX] Whitespace in input
2011/11/15 Chris Travers chris.trav...@gmail.com:
> 2011/11/15 Zdenek Wagner zdenek.wag...@gmail.com:
>> 2011/11/15 Mike Maxwell maxw...@umiacs.umd.edu:
>>> On 11/15/2011 5:39 AM, Chris Travers wrote:
>>>> My recommendation is:
>>>> 1) Default to handling all white space as it exists now.
>>>> 2) Provide some sort of switch, whether to the execution of XeTeX or to the document itself, to turn on handling of special unicode characters.
>>>> 3) If that switch is enabled, then treat the whitespaces according to unicode meanings. If not, treat them as standard whitespace.
>>>
>>> I think you asked me earlier whether that would satisfy me, and I failed to answer. Yes, it would.
>>
>> But such a solution is not clean: you cannot plug such logic into TeX's mouth when the input is being read, nor into the output stage when TECkit maps are in effect. I wrote the reasons earlier. The only reasonable solution seems to be the one suggested by Phil Taylor, to extend \catcode up to 255 and assign special categories to other types of characters. Thus we could say that normal space is 10, non-breakable space is 16, thin space is 17, etc. XeTeX will then be able to treat them properly.
>
> But we are talking two different things here. The first is user interface, and the second is mechanism. What I am saying is special handling of this sort should be required to be enabled somehow by the user. I don't really care how. It could be by a commandline switch to xelatex. It could be by a call in the document if that's possible. It should be optional, and disabled by default, given that the characters involved are not intended to be displayed with glyphs.

The mechanism is simple: set this \catcode to 13 and define the character as \nobreak\space. If you wish to make it work reliably in all corners of XeLaTeX, find one of my previous posts to see what has to be taken into account. It may be put into a package called nbsp.sty or so. No change in XeTeX is needed if you do it this way.

> Best Wishes,
> Chris Travers

--
Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
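The mechanism described in this message, combined with the opt-in behaviour asked for earlier in the thread, can be sketched in plain XeTeX roughly as follows. This is an illustration under stated assumptions, not the contents of any existing nbsp.sty: the macro name `\EnableUnicodeNBSP` is hypothetical, and `\leavevmode` is added (as LaTeX does for ~) so that the character also works at the start of a paragraph.

```tex
% nbsp-active.tex -- run with: xetex nbsp-active.tex
% Define the meaning of an *active* U+00A0 once, inside a group, so the
% catcode change itself does not leak out. \gdef makes the meaning global.
\begingroup
  \catcode"00A0=\active
  \gdef^^^^00a0{\leavevmode\nobreak\ }
\endgroup

% Hypothetical opt-in switch: nothing changes until a document calls it.
\def\EnableUnicodeNBSP{\catcode"00A0=\active}

\EnableUnicodeNBSP
Donald^^^^00a0Knuth % no break between the words, but the space can stretch
\bye
```

Documents that never call the switch keep the current behaviour, where U+00A0 is just a fixed-width glyph taken from the font.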
Re: [XeTeX] Whitespace in input
2011/11/15 Philip TAYLOR p.tay...@rhul.ac.uk:
> Zdenek Wagner wrote:
>> The only reasonable solution seems to be the one suggested by Phil Taylor, to extend \catcode up to 255 and assign special categories to other types of characters. Thus we could say that normal space is 10, non-breakable space is 16, thin space is 17, etc. XeTeX will then be able to treat them properly.
>
> which may, unfortunately, then require new types of node in TeX's internal list structures ... (may, not will).

Sure, the change will not be trivial. I do not know how the category codes are stored internally, but extending them from 16 possible values to 256 may require a dramatic change in the internal structures.

> ** Phil.

--
Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
Re: [XeTeX] Whitespace in input
2011/11/15 Philip TAYLOR p.tay...@rhul.ac.uk:
> Chris Travers wrote:
>> But we are talking two different things here. The first is user interface, and the second is mechanism. What I am saying is special handling of this sort should be required to be enabled somehow by the user. I don't really care how. It could be by a commandline switch to xelatex. It could be by a call in the document if that's possible. It should be optional, and disabled by default, given that the characters involved are not intended to be displayed with glyphs.
>
> But /if/ it requires a change to the number of category codes (and/or the creation of one or more classes of internal node), then this is not something that should be capable of being turned on or off within a document. I don't have any problem with the idea of turning the functionality on or off either within a format file or from a command-line qualifier.

If you know what such characters are (and it will certainly be documented), you just set their categories back to 12 in order to get the old behaviour.

> ** Phil.

--
Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
Re: [XeTeX] Whitespace in input
Zdenek Wagner wrote:
> If you know what such characters are (and it will certainly be documented), you just set their categories back to 12 in order to get the old behaviour.

No ! A catcode is for life, not just for Christmas ! Once a character has been read, and bound into a character/catcode pair, that catcode remains immutable. That means that code that is /not/ expecting to have to deal with non-standard catcodes could none the less be passed token lists containing such entities if it is possible, within a document, to turn such a feature on and off again.

** Phil.
Re: [XeTeX] Whitespace in input
On Tue, Nov 15, 2011 at 02:20:17PM +, Philip TAYLOR wrote:
> No ! A catcode is for life, not just for Christmas ! Once a character has been read, and bound into a character/catcode pair, that catcode remains immutable.

Do you mean that as a general good practice in TeX programming, or as a description of how TeX works? The latter is obviously wrong.

Arthur
Re: [XeTeX] Whitespace in input
2011/11/15 Philip TAYLOR p.tay...@rhul.ac.uk:
> Zdenek Wagner wrote:
>> If you know what such characters are (and it will certainly be documented), you just set their categories back to 12 in order to get the old behaviour.
>
> No ! A catcode is for life, not just for Christmas ! Once a character has been read, and bound into a character/catcode pair, that catcode remains immutable. That means that code that is /not/ expecting to have to deal with non-standard catcodes could none the less be passed token lists containing such entities if it is possible, within a document, to turn such a feature on and off again.

Of course, I know that. What I meant was that you could set the \catcode of all these extended characters to 12 at the beginning of your document. Thus you get the same behaviour as now.

> ** Phil.

--
Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
Re: [XeTeX] Whitespace in input
Arthur Reutenauer wrote:
> On Tue, Nov 15, 2011 at 02:20:17PM +, Philip TAYLOR wrote:
>> No ! A catcode is for life, not just for Christmas ! Once a character has been read, and bound into a character/catcode pair, that catcode remains immutable.
>
> Do you mean that as a general good practice in TeX programming, or as a description of how TeX works? The latter is obviously wrong.

The latter is what the TeXbook says (P.~39) : "Once a category code has been attached to a character token, the attachment is permanent."

** Phil.
Re: [XeTeX] Whitespace in input
On Nov 15, 2011, at 8:52 AM, Philip TAYLOR wrote:
> Arthur Reutenauer wrote:
>> On Tue, Nov 15, 2011 at 02:20:17PM +, Philip TAYLOR wrote:
>>> No ! A catcode is for life, not just for Christmas ! Once a character has been read, and bound into a character/catcode pair, that catcode remains immutable.
>>
>> Do you mean that as a general good practice in TeX programming, or as a description of how TeX works? The latter is obviously wrong.
>
> The latter is what the TeXbook says (P.~39) : "Once a category code has been attached to a character token, the attachment is permanent."
>
> ** Phil.

Howdy,

What happens in a verbatim environment?

Good Luck,
Herb Schulz (herbs at wideopenwest dot com)
Re: [XeTeX] Whitespace in input
Zdenek Wagner wrote:
> Of course, I know it. What I meant was that you could set \catcode of all these extended characters to 12 at the beginning of your document. Thus you get the same behaviour as now.

Ah yes : with that, I have no problem.

** Phil.
Re: [XeTeX] Whitespace in input
2011/11/15 Herbert Schulz he...@wideopenwest.com:
> On Nov 15, 2011, at 8:52 AM, Philip TAYLOR wrote:
>> Arthur Reutenauer wrote:
>>> On Tue, Nov 15, 2011 at 02:20:17PM +, Philip TAYLOR wrote:
>>>> No ! A catcode is for life, not just for Christmas ! Once a character has been read, and bound into a character/catcode pair, that catcode remains immutable.
>>>
>>> Do you mean that as a general good practice in TeX programming, or as a description of how TeX works? The latter is obviously wrong.
>>
>> The latter is what the TeXbook says (P.~39) : "Once a category code has been attached to a character token, the attachment is permanent."
>>
>> ** Phil.
>
> Howdy,
>
> What happens in a verbatim environment?

It will have to be redefined; there will just be additional special characters that have to be handled. \XeTeXrevision will tell you whether the extended \catcode is implemented.

> Good Luck,
> Herb Schulz (herbs at wideopenwest dot com)

--
Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
Re: [XeTeX] Whitespace in input
> The latter is what the TeXbook says (P.~39) : "Once a category code has been attached to a character token, the attachment is permanent."

Yes, because you meant individual tokens (which I understood in retrospect). But in the context of the discussion, you really seemed to be saying that you could not change the \catcodes of characters yet to be read, which was the point (not that there is much point left to the whole thread any more...)

Arthur
Re: [XeTeX] Whitespace in input
Arthur Reutenauer wrote:
>> The latter is what the TeXbook says (P.~39) : "Once a category code has been attached to a character token, the attachment is permanent."
>
> Yes, because you meant individual tokens (which I understood in retrospect). But in the context of the discussion, you really seemed to be saying that you could not change the \catcodes of characters yet to be read, which was the point (not that there is much point left to the whole thread any more...)

No no : changing catcodes on the fly is standard TeX programming; what we should not contemplate is changing the /number/ of catcodes on the fly ...

** Phil.
Re: [XeTeX] Printing OTF glyph features
Stephan,

you can print font glyphs in XeTeX using \XeTeXglyph followed by the glyph's decimal index. You'd need to use a different tool to do the parsing of the OpenType Layout tables, though. The Python package FontTools/TTX, or FontForge compiled as a Python module, can be used to extract this information. You'd need to do some coding, though: going through the GSUB lookups and compiling a list of the glyphs that are being output.

A.

On 11-11-15 06:46, Stephan wrote:
> Good day,
>
> I have been trying to print out the glyphs of a font (in my case Minion) that are used in a stylistic variant. But I have not been able to do that... Is there a way of printing, let's say, all the glyphs that would be used if a feature in a font is turned on ? For example, the k in this stylistic variant is different from the regular k in the Minion font, however, I would like to know what other glyphs may be affected.
>
> Thanks,
> -Stephan

--
May success attend your efforts,
--
Adam Twardoch
(Remove list. from e-mail address to contact me directly.)
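As a rough illustration of the \XeTeXglyph approach, the plain XeTeX sketch below prints every glyph of a font next to its index, using the XeTeX primitive \XeTeXcountglyphs to find the upper bound. The font name "Minion Pro" is an assumption (substitute any installed OpenType font), and the file name is illustrative.

```tex
% glyphs.tex -- run with: xetex glyphs.tex
% "Minion Pro" is an assumed font name; use any installed OpenType font.
\font\target="Minion Pro" at 12pt
\newcount\glyphindex
\glyphindex=0
\target
% Walk over all glyph indices and print "index:glyph" pairs.
\loop\ifnum\glyphindex<\XeTeXcountglyphs\target
  \number\glyphindex:\XeTeXglyph\glyphindex\quad
  \advance\glyphindex by 1
\repeat
\bye
```

This lists all glyphs, not only those reached by a particular feature; narrowing the list to one stylistic variant still needs the GSUB parsing with an external tool described in the message.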
Re: [XeTeX] Whitespace in input
Herbert Schulz wrote:
>> The latter is what the TeXbook says (P.~39) : "Once a category code has been attached to a character token, the attachment is permanent."
>>
>> ** Phil.
>
> What happens in a verbatim environment?

The verbatim environment sets up an environment within which characters that have not yet been seen by TeX's mouth receive category codes that potentially differ from the category code that would normally be associated with that character. Once the category code has been bound to a particular instance of that character, that instance never changes its catcode.

** Phil.
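The distinction drawn here (the \catcode table can change at any time, but a token that has already been read keeps the catcode it was given) can be seen in a few lines of plain TeX. This is a minimal sketch; the macro name `\stored` and the file name are illustrative only.

```tex
% catcode-binding.tex -- run with: xetex catcode-binding.tex
\catcode`\*=12            % "other", which is already the plain TeX default
\def\stored{*}            % this * is tokenized now, with catcode 12
\catcode`\*=\active       % from here on, newly read * are active
\def*{[active star]}
\stored\ versus *         % typesets: * versus [active star]
\bye
```

The asterisk stored inside the macro keeps catcode 12 even though every asterisk read afterwards is active, which is the same effect a verbatim environment relies on for characters it has not yet read.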
Re: [XeTeX] Whitespace in input
On Nov 15, 2011, at 11:19 AM, Philip TAYLOR wrote:
> Herbert Schulz wrote:
>>> The latter is what the TeXbook says (P.~39) : "Once a category code has been attached to a character token, the attachment is permanent."
>>>
>>> ** Phil.
>>
>> What happens in a verbatim environment?
>
> The verbatim environment sets up an environment within which characters that have not yet been seen by TeX's mouth receive category codes that potentially differ from the category code that would normally be associated with that character. Once the category code has been bound to a particular instance of that character, that instance never changes its catcode.
>
> ** Phil.

Howdy,

So what you are saying is not that you can't control the catcode of a particular character but that you can't change it after it is set and in TeX's ``stomach.'' That I can agree with.

Good Luck,
Herb Schulz (herbs at wideopenwest dot com)
[XeTeX] XePersian in Persian version of Wikipedia
http://fa.wikipedia.org/wiki/%D8%B2%DB%8C%E2%80%8C%D9%BE%D8%B1%D8%B4%DB%8C%D9%86
Re: [XeTeX] Whitespace in input
On Nov 15, 2011, at 11:11 AM, Herbert Schulz wrote:
> On Nov 15, 2011, at 11:19 AM, Philip TAYLOR wrote:
>> Herbert Schulz wrote:
>>>> The latter is what the TeXbook says (P.~39) : "Once a category code has been attached to a character token, the attachment is permanent."
>>>>
>>>> ** Phil.
>>>
>>> What happens in a verbatim environment?
>>
>> The verbatim environment sets up an environment within which characters that have not yet been seen by TeX's mouth receive category codes that potentially differ from the category code that would normally be associated with that character. Once the category code has been bound to a particular instance of that character, that instance never changes its catcode.
>>
>> ** Phil.
>
> Howdy,
>
> So what you are saying is not that you can't control the catcode of a particular character but that you can't change it after it is set and in TeX's ``stomach.'' That I can agree with.
>
> Good Luck,
> Herb Schulz (herbs at wideopenwest dot com)

Howdy,

What I meant to say was...

So what you are saying is not that you can control the catcode of a particular character but that you can't change it after it is set and in TeX's ``stomach.'' That I can agree with.

(notice the can't control --- can control)

Good Luck,
Herb Schulz (herbs at wideopenwest dot com)
Re: [XeTeX] Printing OTF glyph features
On Tue, Nov 15, 2011 at 06:46, Stephan wrote:
> Good day,
>
> I have been trying to print out the glyphs of a font (in my case Minion) that are used in a stylistic variant. But I have not been able to do that... Is there a way of printing, let's say, all the glyphs that would be used if a feature in a font is turned on ? For example, the k in this stylistic variant is different from the regular k in the Minion font, however, I would like to know what other glyphs may be affected.

I think that Hans Hagen sent me an example for that, written in ConTeXt MkIV (based on LuaTeX). (I'm not sure whether this was included or not; there was definitely a document showing alternatives for OpenType Math, and there was definitely some document showing different numbers with different features turned on.)

I need to remember where I have those documents, or you can try asking the same question on the ConTeXt mailing list (maybe Hans will find it faster than I will).

I assume that you want to inspect the font and that the exact engine used to get the job done doesn't matter so much to you?

Mojca
Re: [XeTeX] XePersian in Persian version of Wikipedia
2011/11/15 Vafa Khalighi vafa...@gmail.com: http://fa.wikipedia.org/wiki/%D8%B2%DB%8C%E2%80%8C%D9%BE%D8%B1%D8%B4%DB%8C%D9%86 Good. -- Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
Re: [XeTeX] Whitespace in input
I think it made more sense with "can't", Herb, but that could be a trans-Atlantic difference of usage -- you would, I think, say "I could care less" where I would say "I couldn't care less". ** Phil. Herbert Schulz wrote: What I meant to say was... So what you are saying is not that you can control the catcode of a particular character but that you can't change it after it is set and in TeX's ``stomach.'' That I can agree with. (notice the can't control --- can control)
Re: [XeTeX] Whitespace in input
On Nov 15, 2011, at 2:43 PM, Ross Moore wrote: On 16/11/2011, at 5:56 AM, Herbert Schulz wrote: Given that TeX (and XeTeX too) deal with a non-breakable space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way. No, I disagree completely. What if you really want the Ux00A0 character to be in the PDF? That is, when you copy/paste from the PDF, you want that character to come along for the ride. In TeX ~ *simulates* a non-breaking space visually, but there is no actual character inserted. If you want the character you have to ensure that it gets there, and what more natural way is there than to put it in explicitly. This is how XeTeX treats it currently, according to my experiments, using just fontspec and Charis SIL font. Anyone who has a different experience should check what other packages and fonts are being loaded, and whether there is something that specifically changes how that character is handled. Howdy, But isn't that also true about a regular space character? Doesn't (Xe)TeX insert some glue rather than a Space Character? The big puzzle will happen when someone, not using an editor capable of displaying invisibles, can't understand why they can't get XeTeX to break between the two words. That is an editor problem, not one that XeTeX itself should be concerned with. Agreed. But I'll bet you end up with lots of questions on ctt/texhax/etc. about line breaking; assuming that the non-breaking space actually does its ``job.'' Now having Ux00A0 between two words may change the way hyphenation works for those words. But surely if you are wanting to inhibit a line-break between words, you probably also don't want either word to be hyphenated. So this could really be the correct thing. or not. :-) Good Luck, Herb Schulz (herbs at wideopenwest dot com)
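To see what "no actual character inserted" means in practice, one can look at the box TeX builds for a tie. The following plain TeX sketch is an illustration added here, not part of the original exchange. In plain TeX the definition of ~ amounts to \penalty10000 followed by a control space, so the box should contain a penalty and glue between the two letters rather than a character node.

```tex
% Plain TeX sketch: inspect what ~ puts into a horizontal list.
\scrollmode                          % do not stop for input at \showbox
\tracingonline=1                     % echo the box contents to the terminal too
\showboxbreadth=100 \showboxdepth=100
\setbox0=\hbox{a~b}
\showbox0                            % expect: a, \penalty 10000, \glue ..., b
\bye
```

The same inspection with an ordinary space in place of ~ shows glue only, which is the point Herb raises: neither kind of space reaches the output as a glyph unless something puts one there explicitly.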
Re: [XeTeX] Whitespace in input
2011/11/15 Ross Moore ross.mo...@mq.edu.au: On 16/11/2011, at 5:56 AM, Herbert Schulz wrote: Given that TeX (and XeTeX too) deal with a non-breakable space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way. No, I disagree completely. What if you really want the Ux00A0 character to be in the PDF? That is, when you copy/paste from the PDF, you want that character to come along for the ride. From the typographical point of view it is the worst of all possible methods. If you really wish it, then do not use TeX but M$ Word or OpenOffice. M$ Word automatically inserts nonbreakable spaces at some points in the text written in Czech. As far as grammar is concerned, it is correct. However, U+00a0 is fixed width. If you look at the output, the nonbreakable spaces are too wide on some lines and too thin on other lines. I cannot imagine anything uglier. -- Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz
Re: [XeTeX] Whitespace in input
Hi Zdenek, On 16/11/2011, at 8:58 AM, Zdenek Wagner wrote: 2011/11/15 Ross Moore ross.mo...@mq.edu.au: On 16/11/2011, at 5:56 AM, Herbert Schulz wrote: Given that TeX (and XeTeX too) deal wit a non-breakble space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way. No, I disagree completely. What if you really want the Ux00A0 character to be in the PDF? That is, when you copy/paste from the PDF, you want that character to come along for the ride. From the typographical point of view it is the worst of all possible methods. If you really wish it, The *really wish it* is the choice of the author, not the software. then do not use TeX but M$ Word or OpenOffice. M$ Word automatically inserts nonbreakable spaces at some points in the text written in Czech. As far as grammer is concerned, it is correct. However, U+00a0 is fixed width. If you look at the output, the nonbreakable spaces are too wide on some lines and too thin on other lines. I cannot imagine anything uglier. I do not disagree with you that this could be ugly. But that is not the point. If you want superior aesthetic typesetting, with nice choices for hyphenation, then don't use Ux00A0. Of course! Whatever the reason for wanting to use this character, there should be a straight-forward way to do it. Using the character itself is: a. the most understandable b. currently works c. requires no special explanation. -- Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz Cheers, Ross Ross Moore ross.mo...@mq.edu.au Mathematics Department office: E7A-419 Macquarie University tel: +61 (0)2 9850 8955 Sydney, Australia 2109 fax: +61 (0)2 9850 8114 -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] Whitespace in input
2011/11/15 Ross Moore ross.mo...@mq.edu.au: Hi Zdenek, On 16/11/2011, at 8:58 AM, Zdenek Wagner wrote: 2011/11/15 Ross Moore ross.mo...@mq.edu.au: On 16/11/2011, at 5:56 AM, Herbert Schulz wrote: Given that TeX (and XeTeX too) deal wit a non-breakble space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way. No, I disagree completely. What if you really want the Ux00A0 character to be in the PDF? That is, when you copy/paste from the PDF, you want that character to come along for the ride. From the typographical point of view it is the worst of all possible methods. If you really wish it, The *really wish it* is the choice of the author, not the software. then do not use TeX but M$ Word or OpenOffice. M$ Word automatically inserts nonbreakable spaces at some points in the text written in Czech. As far as grammer is concerned, it is correct. However, U+00a0 is fixed width. If you look at the output, the nonbreakable spaces are too wide on some lines and too thin on other lines. I cannot imagine anything uglier. I do not disagree with you that this could be ugly. But that is not the point. If you want superior aesthetic typesetting, with nice choices for hyphenation, then don't use Ux00A0. Of course! Whatever the reason for wanting to use this character, there should be a straight-forward way to do it. Using the character itself is: a. the most understandable b. currently works c. requires no special explanation. These are reasons why people might wish it in the source files, not in PDF. If you wish to take a [part of] PDF and include it in another PDF as is, you can take the PDF directly without the need of grabbing the text. If you are interested in the text that will be retypeset, you have to verify a lot of other things. If the text contained hyphenated words, you have to join the parts manually. You will have a lot of other work and the time saved by U+00a0 will be negligible. 
There are tools that may help you to insert nonbreakable spaces. I have even my own special tools written in perl to handle one class of input files that are really plain texts and the result is (almost) correctly marked LaTeX source. -- Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz Cheers, Ross Ross Moore ross.mo...@mq.edu.au Mathematics Department office: E7A-419 Macquarie University tel: +61 (0)2 9850 8955 Sydney, Australia 2109 fax: +61 (0)2 9850 8114 -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex -- Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] Whitespace in input
Hi Phil, On 16/11/2011, at 8:45 AM, Philip TAYLOR wrote: Ross Moore wrote: On 16/11/2011, at 5:56 AM, Herbert Schulz wrote: Given that TeX (and XeTeX too) deal wit a non-breakble space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way. No, I disagree completely. What if you really want the Ux00A0 character to be in the PDF? That is, when you copy/paste from the PDF, you want that character to come along for the ride. I'm not sure I entirely go along with this argument, Ross. What if you really want the \ character to be in the PDF, or the ^ character, or the $ character, or any character that TeX currently treats specially ? TeX already provides \$ \_ \# etc. for (most of) the other special characters it uses, but does not for ^^A0 --- but it does not need to if you can generate it yourself on the keyboard. Whilst I can agree that there is considerable merit in extending XeTeX such that it treats all of these new, special characters specially (by creating new catcodes, new node types and so on), in the short term I can see no fundamental problem with treating U+00A0 in such a way that it behaves indistinguishably from the normal expansion of ~. How do you explain to somebody the need to do something really, really special to get a character that they can type, or copy/paste? There is no special role for this character in other vital aspects of how TeX works, such as there is for $ _ # etc. In TeX ~ *simulates* a non-breaking space visually, but there is no actual character inserted. And I don't agree that a space is a character, non-breaking or not ! In this view you are against most of the rest of the world. If the output is intended to be PDF, as it really has to be with XeTeX, then the specifications for the modern variants of PDF need to be consulted. 
With PDF/A and PDF/UA and anything based on ISO-32000 (PDF 1.7) there is a requirement that the included content should explicitly provide word boundaries. Having a space character inserted is by far the most natural way to meet this specification. (This does not mean that having such a character in the output need affect TeX's view of typesetting.) Before replying to anything in the above paragraph, please watch the video of my recent talk at TUG-2011. http://river-valley.tv/further-advances-toward-tagged-pdf-for-mathematics/ or similar from earlier years where I also talk a bit about such things. ** Phil. Hope this helps, Ross Ross Moore ross.mo...@mq.edu.au Mathematics Department office: E7A-419 Macquarie University tel: +61 (0)2 9850 8955 Sydney, Australia 2109 fax: +61 (0)2 9850 8114 -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] Whitespace in input
2011/11/15 Ross Moore ross.mo...@mq.edu.au: Hi Phil, On 16/11/2011, at 8:45 AM, Philip TAYLOR wrote: Ross Moore wrote: On 16/11/2011, at 5:56 AM, Herbert Schulz wrote: Given that TeX (and XeTeX too) deal wit a non-breakble space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way. No, I disagree completely. What if you really want the Ux00A0 character to be in the PDF? That is, when you copy/paste from the PDF, you want that character to come along for the ride. I'm not sure I entirely go along with this argument, Ross. What if you really want the \ character to be in the PDF, or the ^ character, or the $ character, or any character that TeX currently treats specially ? TeX already provides \$ \_ \# etc. for (most of) the other special characters it uses, but does not for ^^A0 --- but it does not need to if you can generate it yourself on the keyboard. 00a0 Whilst I can agree that there is considerable merit in extending XeTeX such that it treats all of these new, special characters specially (by creating new catcodes, new node types and so on), in the short term I can see no fundamental problem with treating U+00A0 in such a way that it behaves indistinguishably from the normal expansion of ~. How do you explain to somebody the need to do something really, really special to get a character that they can type, or copy/paste? There is no special role for this character in other vital aspects of how TeX works, such as there is for $ _ # etc. In TeX ~ *simulates* a non-breaking space visually, but there is no actual character inserted. And I don't agree that a space is a character, non-breaking or not ! In this view you are against most of the rest of the world. TeX NEVER outputs a space as a glyph. Text extraction tools usually interpret horizontal spaces of sufficient size as U+0020. (The exception to the above mentioned never is the verbatim mode.) 
If the output is intended to be PDF, as it really has to be with XeTeX, then the specifications for the modern variants of PDF need to be consulted. With PDF/A and PDF/UA and anything based on ISO-32000 (PDF 1.7) there is a requirement that the included content should explicitly provide word boundaries. Having a space character inserted is by far the most natural way to meet this specification. A space character is a fixed-width glyph. If you insist in it, you will never be able to typeset justified paragraphs, you will move back to the era of mechanical typewriters. (This does not mean that having such a character in the output need affect TeX's view of typesetting.) Before replying to anything in the above paragraph, please watch the video of my recent talk at TUG-2011. http://river-valley.tv/further-advances-toward-tagged-pdf-for-mathematics/ or similar from earlier years where I also talk a bit about such things. ** Phil. Hope this helps, Ross Ross Moore ross.mo...@mq.edu.au Mathematics Department office: E7A-419 Macquarie University tel: +61 (0)2 9850 8955 Sydney, Australia 2109 fax: +61 (0)2 9850 8114 -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex -- Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] Whitespace in input
I was going to make the following point earlier--maybe in light of Phil's conclusion I should do it now. There seems to be a tendency not to distinguish between a(n original) character in the sense of character of a writing system, and a computer character. The former are visible symbols on a background medium. The latter are an entirely different set of symbols which to some extent parallel the former, and to some extent do not. Space, control codes, etc. don't exist in the former, but exist in the latter because it was a convenient way to encode certain functions one wished to apply to the encoded other characters--the ones that correspond more or less to original writing system characters. These encoding sets have developed over time, and have consequently inherited all sorts of legacy issues, not all of which need supporting. Unicode provides tools. No one says one has to use them all. Specifically, the purpose of XeTeX and other such engines is to allow for the nice typographical formatting of visual representations of script characters against some other defined background. From that point of view, so long as it does it, once it does it, it has achieved its goal. Transparency of all sorts of other things, providing input via PDF to other software isn't and shouldn't be a *primary* goal. That being said, no doubt it might be helpful to some to have this or that control character passed along. But that's not the essence of the exercise, and should only be done if it can be done cheaply, i.e. without a lot of risk to the primary objective. I guess the real question is that latter part. K On Tue, Nov 15, 2011 at 4:45 PM, in message 4ec2dd63.3040...@rhul.ac.uk, Philip TAYLOR p.tay...@rhul.ac.uk wrote: Ross Moore wrote: On 16/11/2011, at 5:56 AM, Herbert Schulz wrote: Given that TeX (and XeTeX too) deal with a non-breakable space already (where we usually use the ~ to represent that space) it seems to me that XeTeX should treat that the same way. No, I disagree completely.
What if you really want the Ux00A0 character to be in the PDF? That is, when you copy/paste from the PDF, you want that character to come along for the ride. I'm not sure I entirely go along with this argument, Ross. What if you really want the \ character to be in the PDF, or the ^ character, or the $ character, or any character that TeX currently treats specially ? Whilst I can agree that there is considerable merit in extending XeTeX such that it treats all of these new, special characters specially (by creating new catcodes, new node types and so on), in the short term I can see no fundamental problem with treating U+00A0 in such a way that it behaves indistinguishably from the normal expansion of ~. In TeX ~ *simulates* a non-breaking space visually, but there is no actual character inserted. And I don't agree that a space is a character, non-breaking or not ! ** Phil. -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] Whitespace in input
Hi Phil, On 16/11/2011, at 10:08 AM, Zdenek Wagner wrote: How do you explain to somebody the need to do something really, really special to get a character that they can type, or copy/paste? There is no special role for this character in other vital aspects of how TeX works, such as there is for $ _ # etc. In TeX ~ *simulates* a non-breaking space visually, but there is no actual character inserted. And I don't agree that a space is a character, non-breaking or not ! In this view you are against most of the rest of the world. TeX NEVER outputs a space as a glyph. Text extraction tools usually interpret horizontal spaces of sufficient size as U+0020. I never said that it did, nor that it was necessary to do so. Those text extraction tools do a pretty reasonable job, but don't always get it right. Besides, there is reliance on a heuristic, which can be fallible, especially if there is content typeset in a very small font size. And what about at line-ends? They can get that wrong too. Such a reliance is rather against the TeX way of doing things, don't you think? Better is for TeX itself to apply the heuristic, since it knows the current font size and the separation between bits of words. (The exception to the above mentioned never is the verbatim mode.) That isn't good enough for TeX to produce PDF/A. Go and watch the videos that I pointed you to. Lower down I give a run-down of how a variant of TeX handles this problem, to very good effect. If the output is intended to be PDF, as it really has to be with XeTeX, then the specifications for the modern variants of PDF need to be consulted. With PDF/A and PDF/UA and anything based on ISO-32000 (PDF 1.7) there is a requirement that the included content should explicitly provide word boundaries. Having a space character inserted is by far the most natural way to meet this specification. A space character is a fixed-width glyph. 
If you insist in it, you will never be able to typeset justified paragraphs, you will move back to the era of mechanical typewriters. Absolutely wrong! I'm not insisting on it being included as the natural way to separate words within the PDF, though it certainly is a possible way that is used by other software. (This does not mean that having such a character in the output need affect TeX's view of typesetting.) Clearly you never even read this parenthetical statement ... Before replying to anything in the above paragraph, please watch the video of my recent talk at TUG-2011. ... and certainly you don't seem to have followed up on this piece of advice, to get a better perspective of what I'm talking about. http://river-valley.tv/further-advances-toward-tagged-pdf-for-mathematics/ or similar from earlier years where I also talk a bit about such things. Here is how you get *both* TeX-quality typesetting and explicit spaces as word-boundaries inside the PDF, with no loss of quality. What the experimental tagged-pdfTeX does is to use a font (called dummy-space) that contains just a single character at code Ux0020, at a size that is almost zero -- it cannot be exactly zero, else PDF browsers may not select it for copy/paste, or other text-extraction. These extra spaces are inserted into the PDF content stream, *after* TeX has determined the correct positioning for high-quality typesetting. That is, it is *not* done by macros or widgets or suchlike, but is done internally by the pdfTeX engine at shipout time. The almost-zero size has no perceptible effect on the visual output. But the existence of these extra space characters means that all text-extraction methods work much more reliably. There *are* extra primitives that can be used to turn this off and on in places where such extra spaces are not wanted; e.g. in math. And there is a primitive to insert such a space, in case it is required manually, for whatever reason. 
All of these primitives are used extensively when generating tagged PDF of mathematical expressions, and are thus available for other usage too. ** Phil. Hope this helps, Ross Ross Moore ross.mo...@mq.edu.au Mathematics Department office: E7A-419 Macquarie University tel: +61 (0)2 9850 8955 Sydney, Australia 2109 fax: +61 (0)2 9850 8114 -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
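For readers who want to try the idea Ross describes, primitives of this kind later appeared in mainstream pdfTeX under the names \pdfinterwordspaceon, \pdfinterwordspaceoff and \pdffakespace. Whether they match the experimental tagged-pdfTeX build discussed in this message is an assumption, and so is the need for the map line below, which depends on how the dummy-space font is installed. A rough pdfLaTeX sketch:

```tex
% pdfLaTeX sketch; needs a pdfTeX recent enough to know these primitives.
\documentclass{article}
\input{glyphtounicode}        % glyph-name to Unicode table shipped with pdfTeX
\pdfgentounicode=1            % write ToUnicode maps so extracted text is Unicode
% Assumption: the dummy-space font is installed; some setups need this map line.
\pdfmapline{+dummy-space <dummy-space.pfb}
\begin{document}
\pdfinterwordspaceon          % add a tiny real space character at word gaps
Two words, with an explicit space between them in the PDF.
\pdfinterwordspaceoff         % switch it off where such spaces are unwanted
$a+b$
A\pdffakespace B              % force one such space by hand
\end{document}
```

The typeset positions are still decided by TeX's glue; the extra space characters are added to the content stream only so that copy/paste and other text extraction see explicit word boundaries.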
Re: [XeTeX] Whitespace in input
2011/11/16 Ross Moore ross.mo...@mq.edu.au: On 16/11/2011, at 9:45 AM, Zdenek Wagner wrote: 2011/11/15 Ross Moore ross.mo...@mq.edu.au: What if you really want the Ux00A0 character to be in the PDF? That is, when you copy/paste from the PDF, you want that character to come along for the ride. From the typographical point of view it is the worst of all possible methods. If you really wish it, Maybe you misunderstood what I meant here. I'm not saying that you might want Ux00A0 for *every* place where there is a word-breaking space. Just that there may be individual instance(s) where you have a reason to want it. Just like any other Unicode character, if you want it then you should be able to put it in there. You ARE able to do it. Choose a font with that glyph, set \catcode to 11 or 12 and that's it. What else do you wish to do? That's what XeTeX currently does (with the TeX-wise familiar ASCII exceptions) for any code-point supported by the chosen font. The *really wish it* is the choice of the author, not the software. then do not use TeX but M$ Word or OpenOffice. M$ Word automatically inserts nonbreakable spaces at some points in the text written in Czech. As far as grammer is concerned, it is correct. However, U+00a0 is fixed width. If you look at the output, the nonbreakable spaces are too wide on some lines and too thin on other lines. I cannot imagine anything uglier. I do not disagree with you that this could be ugly. But that is not the point. If you want superior aesthetic typesetting, with nice choices for hyphenation, then don't use Ux00A0. Of course! Whatever the reason for wanting to use this character, there should be a straight-forward way to do it. Using the character itself is: a. the most understandable b. currently works c. requires no special explanation. These are reasons why people might wish it in the source files, not in PDF. Yes. 
In the source, to have the occasional such character included within the PDF, for whatever reason appropriate to the material being typeset -- whether verbatim, or not. If you wish to take a [part of] PDF and include it in another PDF as is, you can take the PDF directly without the need of grabbing the text. If you are interested in the text that will be retypeset, you have to verify a lot of other things. How is any of this relevant to the current discussion? It was you who came with the argument that you wish to have nonbreakable spaces when copying the text from PDF. If the text contained hyphenated words, you have to join the parts manually. You will have a lot of other work and the time saved by U+00a0 will be negligible. There are tools that may help you to insert nonbreakable spaces. I have even my own special tools written in perl to handle one class of input files that are really plain texts and the result is (almost) correctly marked LaTeX source. All well and good. But how is that relevant to anything I said? See above. -- Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz Cheers, Ross Ross Moore ross.mo...@mq.edu.au Mathematics Department office: E7A-419 Macquarie University tel: +61 (0)2 9850 8955 Sydney, Australia 2109 fax: +61 (0)2 9850 8114 -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex -- Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
Re: [XeTeX] Whitespace in input
Ross Moore wrote: Hi Phil, On 16/11/2011, at 10:08 AM, Zdenek Wagner wrote: Not I, Sir : Zdeněk ! ** Phil.
[XeTeX] aligning characters at their centers
Is there a way to align characters at their centers instead of at their baselines? Take for example {\scshape Ee}. This will produce one big uppercase E and one little uppercase E; and their lower horizontal bars will be aligned. But is there any way I can make them aligned at their centers (center horizontal bars aligned) without using \raisebox? This has application to book publishing when placing rotated text on the spine of a book. Many thanks in advance, Dan
Re: [XeTeX] aligning characters at their centers
On Wed, Nov 16, 2011 at 11:28:33AM +0800, Daniel Greenhoe wrote: Is there a way to align characters at their centers instead of at their baselines? Take for example {\scshape Ee}. This will produce one big uppercase E and one little uppercase E; and their lower horizontal bars will be aligned. But is there any way I can make them aligned at their centers (center horizontal bars aligned) without using \raisebox?

    \documentclass{article}
    \begin{document}
    \scshape
    $\vcenter{\hbox{E}}\vcenter{\hbox{e}}$
    or
    \valign{\vfill\hbox{#}\vfill\cr E\cr e\cr}
    \end{document}

Yours sincerely Heiko Oberdiek
Re: [XeTeX] Whitespace in input
Hi Zdenek, On 16/11/2011, at 11:19 AM, Zdenek Wagner wrote: Just like any other Unicode character, if you want it then you should be able to put it in there. You ARE able to do it. Choose a font with that glyph, set \catcode to 11 or 12 and that's it. What else do you wish to do? The *default* behaviour should stay as this. Any other behaviour needs to change the catcode and make perhaps a definition. These are reasons why people might wish it in the source files, not in PDF. Yes. In the source, to have the occasional such character included within the PDF, for whatever reason appropriate to the material being typeset -- whether verbatim, or not. If you wish to take a [part of] PDF and include it in another PDF as is, you can take the PDF directly without the need of grabbing the text. If you are interested in the text that will be retypeset, you have to verify a lot of other things. How is any of this relevant to the current discussion? It was you who came with the argument that you wish to have nonbreakable spaces when copying the text from PDF. No. I said that if you put one in, then you should be expecting to get one out. This should be the default behaviour, as it is now. I certainly suggested nothing like getting out non-breaking spaces as a replacement for anything else. Zdeněk Wagner http://hroch486.icpf.cas.cz/wagner/ http://icebearsoft.euweb.cz Hope this helps, Ross Ross Moore ross.mo...@mq.edu.au Mathematics Department office: E7A-419 Macquarie University tel: +61 (0)2 9850 8955 Sydney, Australia 2109 fax: +61 (0)2 9850 8114 -- Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex
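The two behaviours argued over in this thread can be put side by side in a small XeLaTeX file. This is a sketch for illustration only: it assumes the format's default catcode for U+00A0 is "other" (12) and that the chosen font has a glyph for it; Charis SIL is used because Ross mentions it. The ^^^^00a0 notation is used so that the invisible character is visible in the source.

```tex
% XeLaTeX sketch; assumes Charis SIL is installed (any font with U+00A0 will do).
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Charis SIL}
\begin{document}
% Default: U+00A0 is an ordinary character, typeset from the font,
% so it is what comes back when copying text out of the PDF.
A^^^^00a0B

% Opt-in alternative: make U+00A0 active and give it the meaning of ~.
\catcode"00A0=\active
\protected\def^^^^00a0{\nobreakspace}
A^^^^00a0B
\end{document}
```

With the first form the author gets the character they typed; with the second, the character in the source behaves like a tie and no U+00A0 reaches the output. Keeping the first as the default and leaving the second to a catcode change plus a definition is the arrangement Ross argues for.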