Re: [XeTeX] strange case of overlapping graphics

2021-08-09 Thread Ross Moore
Hello Sej.

This message came through empty to me.

Was it intentional?
How can we help you resolve the difficulty?


All the best.

Ross

On 8 Aug 2021, at 6:26 am, Sej Lyn Jimenez <sejlynjime...@gmail.com> wrote:






Re: [XeTeX] strange case of overlapping graphics

2021-06-10 Thread Ross Moore
Hi Zdenek,

On 11 Jun 2021, at 6:49 am, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:

Yes, it is what I saw in pdflatex as well. Now I see the reason, I have not 
read my exiftool output carefully. It says:

X Resolution: 25
Y Resolution: 7

Yes, that does look a bit weird, doesn’t it.
One would expect it to be the same in both directions, unless specifically
set to be different for a special effect.

Notice that it also says  Resolution Unit: inches .
Surely not! Unless it needs ~100 units for the inch.
Further down there are Pixel units of meters.
Very strange.

Thanks in advance for any lessons on how to properly read/interpret this info.

So the culprit need not be XeTeX after all.
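For anyone wanting to poke at the image themselves, here is one way to inspect
(and, if needed, normalise) the embedded resolution. A sketch only, assuming
exiftool and ImageMagick 7 are installed, with a hypothetical file name:

exiftool image.png
  # look for the X/Y resolution and pixel-unit fields quoted above
magick image.png -units PixelsPerInch -density 72 image-fixed.png
  # rewrites the PNG with a uniform density in both directions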


All the best.

Ross


So, pdftex as well as the viewers used by Phil honour the resolution in both
directions and the output is tall. Gwenview, GIMP and Okular just honour the
size in pixels. XeLaTeX probably calculates the dimensions correctly, taking
into account the X/Y resolutions, and generates the commands for PNG inclusion,
but xdvipdfmx honours the pixels only, not the resolutions.

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml


On Thu 10 Jun 2021 at 22:37, Ross Moore <ross.mo...@mq.edu.au> wrote:
Hi Janusz, and others

For me, using  pdftex, the only issue is that the aspect ratio isn’t correct.

XeTeX gets it wrong:


Notice the apparent lack of centering with respect to the caption.
It’s as if the width has been read incorrectly by XeTeX, for positioning
the 2 instances of the image.



Explicitly specifying a more realistic width fixes this, with both engines.
Examining the information in GraphicConverter  (on MacOS)
I cannot see anything amiss.  (see image)




So to me, this looks like a XeTeX issue after all.


Hope this helps.

Ross


On 11 Jun 2021, at 2:12 am, Janusz S. Bień <jsb...@mimuw.edu.pl> wrote:

On Thu, Jun 10 2021 at 11:04 -05, Herbert Schulz wrote:
>> On Jun 10, 2021, at 10:58 AM, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:
>>
>> On Thu 10 Jun 2021 at 17:09, Philip Taylor
>> <p.tay...@hellenic-institute.uk> wrote:
>>>
>>> In both Windows Preview and in Adobe Illustrator CC, the PNG file
>>> is roughly twice as tall (relative to its width) as it appears in
>>> the PDF.
>>> --
>> So this means that the PNG contains something strange which is
>> interpreted by some programs and ignored by other programs. Maybe
>> different vertical and horizontal resolutions? I do not have a tool to
>> analyze it further.

Tomorrow I will make more tests and submit the problem to ddjvu
author(s).

> Howdy,
>
> And
>
> \includegraphics[width=1.3cm]{Zaborowski_MBC_page19y42_a}
> \includegraphics[width=1.3cm]{Zaborowski_MBC_page19y42_a}
>
> works fine (not exactly the same size but that can be adjusted).

Yes. It's great you have found a way to circumvent the problem!

On the other hand this makes the problem more mysterious...

Best regards

Janusz

--
,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien





Re: [XeTeX] Coloured fonts

2021-03-18 Thread Ross Moore
Hi David, Philip.


On 19 Mar 2021, at 7:17 am, David Carlisle <d.p.carli...@gmail.com> wrote:

Not sure if xetex can do colour fonts currently,

According to here:

 https://www.colorfonts.wtf

there are not many applications that support this new technology.

The colour doesn’t show in Phil’s example PDF, neither in Adobe’s Illustrator
nor Acrobat Pro,
despite Adobe being one of the instigators of this font format.
So presumably the font isn’t installed correctly into the PDF.

Presumably the  /Style  dictionary here:

9 0 obj
<<
/Descent -173
/StemV 87
/Ascent 631
/FontName /THVNSG+BabelStoneXiangqiColour
/ItalicAngle 0
/Style
<<
/Panose <080002020604010101010101>
>>
/AvgWidth 734
/FontBBox [-14 -232 1014 795]
/Type /FontDescriptor
/CIDSet 16 0 R
/CapHeight 631
/Flags 4
/FontFile2 17 0 R
>>
endobj

is where the colour is specified, by that /Panose  entry.
But there must be something else that is missing.


Unfortunately the link to get the font doesn’t work for me.


So David, could you possibly send the PDF of the example you posted below, 
please?


You can always experiment with luatex which gets this if using harfbuzz



\documentclass{article}

\usepackage{fontspec}

\newfontfamily\chess[Renderer=HarfBuzz]{BabelStoneXiangqiColour.ttf}
\begin{document}

testing {\chess ^^^^^^01fa64}

\end{document}



On Thu, 18 Mar 2021 at 18:39, Philip Taylor <p.tay...@rhul.ac.uk> wrote:
Seeking to re-typeset a long out-of-print classic on Xiang-Qi ("Chinese 
Chess"), but with the pieces shewn as they really are rather than as upper-case 
Latin letters requiring a gloss (the presentation chosen by the original 
author), I downloaded and installed Andrew West's BabelStone Xiangqi Colour 
font<https://www.babelstone.co.uk/Fonts/Xiangqi.html>.  I then wrote a short 
piece of XeTeX code to check that the glyphs/pieces appear in the PDF as they 
should, and very sadly they do not, coming out as monochrome rather than in 
colour (see attached PDF).

The red pieces are described by Andrew as red Chinese characters on a sandy 
yellow background, and the black pieces as black Chinese characters on a sandy 
yellow background.  In the resulting PDF, however, they appear as white Hanzi 
on a black ground and black Hanzi on a white ground.  Does XeTeX support 
coloured fonts, and if so, how do I persuade it to render these glyphs as 
intended rather than in monochrome ?

I can, of course, load \font \redpieces = "BabelStone Xiangqi 
Colour":color=FF scaled \magstep 5 (see code below), but that still does 
not give me the sandy yellow ground that each glyph was designed to have.

'opentype-info.tex', when run against BabelStone Xiangqi Colour, tells me that 
the font does not provide any Opentype layout features, so it does not look as 
if XeTeX's "/ICU:+abcd" convention would allow me to indicate that I require 
colour support.

% !TeX Program=XeTeX

\font \pieces = "BabelStone Xiangqi Colour" scaled \magstep 5
\font \redpieces = "BabelStone Xiangqi Colour":color=FF scaled \magstep 5
\font \blackpieces = "BabelStone Xiangqi Colour" scaled \magstep 5
\pieces
\centerline {\char "1FA60\relax \ \char "1FA61\relax \ \char "1FA62\relax \ 
\char "1FA63\relax \ \char "1FA64\relax \ \char "1FA65\relax \ \char 
"1FA66\relax}
\centerline {\strut}
\centerline {\char "1FA67\relax \ \char "1FA68\relax \ \char "1FA69\relax \ 
\char "1FA6A\relax \ \char "1FA6B\relax \ \char "1FA6C\relax \ \char 
"1FA6D\relax}
\centerline {\strut}
\centerline {\strut}
\centerline {\redpieces \char "1FA60\relax \ \char "1FA61\relax \ \char 
"1FA62\relax \ \char "1FA63\relax \ \char "1FA64\relax \ \char "1FA65\relax \ 
\char "1FA66\relax}
\centerline {\strut}
\centerline {\blackpieces \char "1FA67\relax \ \char "1FA68\relax \ \char 
"1FA69\relax \ \char "1FA6A\relax \ \char "1FA6B\relax \ \char "1FA6C\relax \ 
\char "1FA6D\relax}
\end
--
Philip Taylor



Cheers.

Ross





Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi Jonathan, and others.

On 22 Feb 2021, at 10:39 am, Jonathan Kew <jfkth...@gmail.com> wrote:

On 21/02/2021 22:55, Ross Moore wrote:
The file reading has failed  before any tex accessible processing has happened 
(see the ebcdic example in the TeXBook)
OK.

Also pdfTeX has no trouble with an xstring example.
It just seems pretty crazy that the comments need to be altered
for that package to be used with XeTeX.

Well as long as the Latin-1 accented characters are only in comments, it 
arguably doesn't "really" matter; xetex logs a warning that it can't interpret 
them, but if you know that part of the line is going to be ignored anyway, you 
can ignore the warning.

There’s actually a pretty easy fix, at least for XeLaTeX.
The package contains 2 files only:   xstring.sty  and  xstring.tex .
The .sty is just a 1-liner to load the .tex .

It could be beefed up with:

\RequirePackage{ifxetex} % is this still the best package for \ifxetex ?
\ifxetex
  \XeTeXdefaultencoding "iso-8859-1"
  \input{xstring.tex}
  \XeTeXdefaultencoding "utf8"
\else
  \input{xstring.tex}
\fi

(ignore if straight quotes have become curly ones in my email editor!)
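(A further note, as I understand it: \XeTeXdefaultencoding affects only files
opened after it is set, so the file currently being read is untouched, and the
switch back to "utf8" can safely come after the \input.)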



Even nicer would be to beef it up further by:
 1. record the current default encoding – is this possible ?
 then restore from this.
 2. use a grouping while deciding what to do,
 expanding the required commands before ending the group.

\showthe\XeTeXdefaultencoding  doesn’t work,
so is there another container that tells what is the default encoding?
Or should we always assume it is UTF-8 and revert to that afterwards?

e.g. something like:

\RequirePackage{ifxetex}
\begingroup
 \def\next{\endgroup \input{xstring.tex}}%
 \ifxetex
  \XeTeXdefaultencoding "iso-8859-1"
  \def\next{\endgroup
   \input{xstring.tex}%
   \XeTeXdefaultencoding "utf8"}%
 \fi
\next




(pdfTeX doesn't care because it simply reads the bytes from the file; any 
interpretation of bytes as one encoding or another is handled at the TeX macro 
level.)

Right.
Which is why I do my PDF development work in pdfTeX before
testing whether it can be adapted also to XeTeX and/or LuaTeX.


JK



Cheers.

Ross





Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi David,

On 22 Feb 2021, at 8:43 am, David Carlisle <d.p.carli...@gmail.com> wrote:

Surely the line-end characters are already known, and the bits
have been read up to that point *before* tokenisation.

This is not a pdflatex inputenc style utf-8 error failing to map a stream of 
tokens.

It is at the file reading stage and if you have the file encoding wrong you do 
not know reliably what are the ends of lines and you haven't interpreted it as 
tex at all, so the comment character really can't have an effect here.

Ummm. Is that really how XeTeX does it?
How then does Jonathan’s
   \XeTeXdefaultencoding "iso-8859-1"
ever work ?
Just a rhetorical question; don’t bother answering.   :-)

This mapping is invisible to the tex macro layer just as you can change the 
internal character code mapping in classic tex to take an ebcdic stream, if you 
do that then read an ascii file you get rubbish with no hope to recover.



So I don't think such a switch should be automatic to avoid reporting encoding 
errors.

I reported the issue at xstring here
https://framagit.org/unbonpetit/xstring/-/issues/4


I looked at what you said here, and some of it doesn’t seem to be in accord with
my TeXLive installations.

viz.

/usr/local/texlive/2016/.../xstring.tex:\expandafter\ifx\csname @latexerr\endcsname\relax% on n'utilise pas LaTeX ?
/usr/local/texlive/2016/.../xstring.tex:\fi% fin des d\'efinitions LaTeX
/usr/local/texlive/2016/.../xstring.tex:%   - Le package ne n\'ecessite plus LaTeX et est d\'esormais utilisable sous
/usr/local/texlive/2016/.../xstring.tex:% Plain eTeX.
/usr/local/texlive/2017/.../xstring.tex:% conditions of the LaTeX Project Public License, either version 1.3
/usr/local/texlive/2017/.../xstring.tex:% and version 1.3 or later is part of all distributions of LaTeX
/usr/local/texlive/2017/.../xstring.tex:\expandafter\ifx\csname @latexerr\endcsname\relax% on n'utilise pas LaTeX ?
/usr/local/texlive/2017/.../xstring.tex:\fi% fin des d\'efinitions LaTeX
/usr/local/texlive/2017/.../xstring.tex:%   - Le package ne n\'ecessite plus LaTeX et est d\'esormais utilisable sous
/usr/local/texlive/2017/.../xstring.tex:% Plain eTeX.
/usr/local/texlive/2018/.../xstring.tex:% !TeX encoding = ISO-8859-1
/usr/local/texlive/2018/.../xstring.tex:% Licence: Released under the LaTeX Project Public License v1.3c %
/usr/local/texlive/2018/.../xstring.tex:% Plain eTeX.
/usr/local/texlive/2019/.../xstring.tex:% !TeX encoding = ISO-8859-1
/usr/local/texlive/2019/.../xstring.tex:% Licence: Released under the LaTeX Project Public License v1.3c %
/usr/local/texlive/2019/.../xstring.tex: Plain eTeX.

Prior to 2018, the accents in comments used ASCII escapes (\'e), and were
therefore valid UTF-8, though not intentionally so.

In 2018, the accents in comments became latin-1 chars.
A 1st line was added:  % !TeX encoding = ISO-8859-1
to indicate this.

Such directive comments are useless, except at the beginning of the main 
document source.
They are for Front-End software, not TeX processing, right?

Jonathan, David,
so far as I can tell, it was *never* in UTF-8 with preformed accents.



David


that says what follows next is to be interpreted in a different way to what 
came previously?
Until the next switch that returns to UTF-8 or whatever?


If XeTeX is based on eTeX, then this should be possible in that setting.


Even replacing by U+FFFD
is being lenient.

Why has the mouth not realised that this information is to be discarded?
Then no replacement is required at all.

The file reading has failed  before any tex accessible processing has happened 
(see the ebcdic example in the TeXBook)

OK.
But that’s changing the meaning of bit-order, yes?
Surely we can be past that.



\danger \TeX\ always uses the internal character code of Appendix~C
for the standard ASCII characters,
regardless of what external coding scheme actually appears in the files
being read.  Thus, |b| is 98 inside of \TeX\ even when your computer
normally deals with ^{EBCDIC} or some other non-ASCII scheme; the \TeX\
software has been set up to convert text files to internal code, and to
convert back to the external code when writing text files.


the file encoding is failing at the  "convert text files to internal code" 
stage which is before the line buffer of characters is consulted to produce the 
stream of tokens based on catcodes.

Yes, OK; so my model isn’t up to it, as Bruno said.
 … And Jonathan has commented.

Also pdfTeX has no trouble with an xstring example.
It just seems pretty crazy that the comments need to be altered
for that package to be used with XeTeX.





David




Cheers, and thanks for this discussion.


Ross


Dr Ross Moore
Department of Mathematics and Statistics
12 Wally’s Walk, Level 7, Room 734
Macquarie University, NSW 2109, Australia
T: +61 2 9850 8955  |  F: +61 2 9850 8114
M:+61 407 288 255  |  E: ross.mo...@mq.edu.au<mailto:ross.mo...@mq.edu.

Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi Ulrike,

On 22 Feb 2021, at 7:52 am, Ulrike Fischer <ne...@nililand.de> wrote:

On Sun, 21 Feb 2021 20:26:04 +, Ross Moore wrote:

> Once you have encountered the (correct) comment character,
> what follows on the rest of the line is going to be discarded,
> so its encoding is surely irrelevant.
>
> Why should the whole line need to be fully tokenised,
> before the decision is taken as to what part of it is retained?

Well you need to find the end of the line to know where to stop with
the discarding don't you? So you need to inspect the part after the
comment char until you find something that says "newline".

My understanding is that this *is* done first.
Similarly to TeX's  \read <number> to <control sequence> , which grabs a line
of input from a file, before doing the tokenisation and storing the result in
the <control sequence>.
   (page 217 of The TeXbook)
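For concreteness, a minimal plain-TeX sketch of that pattern (file name
hypothetical):

\newread\myfile
\openin\myfile=data.txt    % a hypothetical input file
\read\myfile to\myline     % grab one line; it is tokenised into \myline
\closein\myfile
\message{First line was: \myline}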

If I’m wrong with this, for high-speed input, then yes you need to know where 
to stop.
But that’s just as easy, since you stop when a byte is to be tokenised
as an end-of-line character, and these are known.
You need this anyway, even when you have tokenised every byte.


So all we are saying is that when handling the bytes between
a comment and its end-of-line, just be a bit more careful.

It’s not necessary for each byte to be tokenised as valid for UTF-8.
Maybe change the (Warning) message when you know that you are within
such a comment, to say so.  That would be more meaningful to a package-writer,
and to an author who uses the package, looks in the .log file, and sees the 
message.

None of this is changing how the file is ultimately processed;
it’s just about being friendlier in the human interface.




--
Ulrike Fischer
https://www.troubleshooting-tex.de/


All the best.

Ross





Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi David,

On 21 Feb 2021, at 11:02 pm, David Carlisle <d.p.carli...@gmail.com> wrote:


I don't think there is any reasonable way to say you can comment out parts of a 
file in a different encoding.

I’m not convinced that this ought to be correct for TeX-based software.

TeX (not necessarily XeTeX) has always operated as a finite-state machine.
It *should* be possible to say that this part is encoded as such-and-such,
and a later part encoded differently.

I fully understand that editor software external to TeX might well have 
difficulties
with files that mix encodings this way, but TeX itself has always been 
byte-based
and should remain that way.

A comment character is meant to be viewed as saying that:
 *everything else on this line is to be ignored*
– that’s the impression given by TeX documentation.


But you only know it is a comment character if you can interpret the incoming 
byte stream
If there are encoding errors in that byte stream then everything else is
guesswork.

Who said anything about errors in the byte stream?
Once you have encountered the (correct) comment character,
what follows on the rest of the line is going to be discarded,
so its encoding is surely irrelevant.

Why should the whole line need to be fully tokenised,
before the decision is taken as to what part of it is retained?

In the case of a package file, rather than author input for typesetting,
the intention of the coding is completely unknown,
is probably all ASCII anyway, except (as in this case) for comments intended
for human eyes only, following a properly declared comment-character.


In this particular case with mostly ascii text and a few latin-1 characters it 
may be that you can guess that
the invalid utf-8 is in fact valid latin1 and interpret it that way,

You don’t need to interpret it as anything; that part is to be discarded.

and the guess would be right for this file
but what if the non-utf8 file were utf-16 or latin-2  or

Surely the line-end characters are already known, and the bits
have been read up to that point *before* tokenisation.
So provided the tokenisation of the comment character has occurred before
tackling what comes after it, why would there be a problem?

... just guessing the encoding (which means guessing where the line and so the 
comment ends)
is just guesswork.

No guesswork intended.


The file encoding specifies the byte stream interpretation before any tex 
tokenization
If the file can not be interpreted as utf-8 then it can't be interpreted at all.

Why not?
Why can you not have a macro — presumably best on a single line by itself –

there is an xetex   primitive that switches the encoding as Jonathan showed, 
but  guessing a different encoding
if a file fails to decode properly against a specified encoding is a dangerous 
game to play.

I don’t think anyone is asking for that.

I can imagine situations where coding for packages that used to work well
without UTF-8 may well be commented involving non-UTF-8 characters.
(Indeed, there could even be binary bit-mapped images within comment sections;
having bytes not intended to represent any characters at all, in any encoding.)

If such files are now subjected to constraints that formerly did not exist,
then this is surely not a good thing.


Besides, not all the information required to build PDFs need be related to
putting characters onscreen, through the typesetting engine.

For example, when building fully-tagged PDFs, there can easily be more 
information
overall within the tagging (both structure and content) than in the visual 
content itself.
Thank goodness for Heiko’s packages that allow for re-encoding strings between
different formats that are valid for inclusion within parts of a PDF.

I’m thinking here about how a section-title appears in:
 bookmarks, ToC entries, tag-titles, /Alt strings, annotation text for 
hyperlinking, etc.
as well as visually typeset for on-screen.
These different representations need to be either derivable from a common 
source,
or passed in as extra information, encoded appropriately (and not necessarily 
UTF-8).


So I don't think such a switch should be automatic to avoid reporting encoding 
errors.

I reported the issue at xstring here
https://framagit.org/unbonpetit/xstring/-/issues/4


David


that says what follows next is to be interpreted in a different way to what 
came previously?
Until the next switch that returns to UTF-8 or whatever?


If XeTeX is based on eTeX, then this should be possible in that setting.


Even replacing by U+FFFD
is being lenient.

Why has the mouth not realised that this information is to be discarded?
Then no replacement is required at all.


David






Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi David.

On 21 Feb 2021, at 10:12 pm, David Carlisle <d.p.carli...@gmail.com> wrote:

I think that should be taken up with the xstring maintainers.

Is  xstring  intended for use with XeTeX ?
I suspect not.
But anyway, there are still issues with this.

(BTW, I wrote this before Jonathan Kew’s response.)


I don't think there is any reasonable way to say you can comment out parts of a 
file in a different encoding.

I’m not convinced that this ought to be correct for TeX-based software.

TeX (not necessarily XeTeX) has always operated as a finite-state machine.
It *should* be possible to say that this part is encoded as such-and-such,
and a later part encoded differently.

I fully understand that editor software external to TeX might well have 
difficulties
with files that mix encodings this way, but TeX itself has always been 
byte-based
and should remain that way.

A comment character is meant to be viewed as saying that:
 *everything else on this line is to be ignored*
– that’s the impression given by TeX documentation.

If it is the documentation that is incorrect, then it should certainly be 
clarified.

For XeTeX and this particular example, it’s probably just a matter of checking
that the non-UTF8 characters occur *after* a UTF-8  ‘%' , and not issuing
an error message under these conditions.
A warning, maybe, but not an error.


The file encoding specifies the byte stream interpretation before any tex 
tokenization
If the file can not be interpreted as utf-8 then it can't be interpreted at all.

Why not?
Why can you not have a macro — presumably best on a single line by itself –
that says what follows next is to be interpreted in a different way to what 
came previously?
Until the next switch that returns to UTF-8 or whatever?


If XeTeX is based on eTeX, then this should be possible in that setting.


Even replacing by U+FFFD
is being lenient.

David




On Sun, 21 Feb 2021 at 11:04, jfbu <j...@free.fr> wrote:
Hi,

consider this

\documentclass{article}
\usepackage{xstring}
\begin{document}
\end{document}

and call it xexstring.tex

Then xelatex xexstring triggers 136 warnings of the type

Invalid UTF-8 byte or sequence at line 35 replaced by U+FFFD.

Looking at file

/usr/local/texlive/2020/texmf-dist/tex/generic/xstring/xstring.tex

I see that this matches with use of latin-1 encoded characters in comments.

Notice that it is not a user decision here to use a latin-1
encoded file.

In fact I encountered this in a file I was given where
xstring package was loaded by another package.

Regards,

Jean-François


Cheers.

Ross





Re: [XeTeX] [tex-implementors] Proposal : that TeX engines generating PDF directly should be able to close the output file without terminating.

2020-07-03 Thread Ross Moore
Hi Jonathan,

On 4 Jul 2020, at 8:55 am, Jonathan Kew <jfkth...@gmail.com> wrote:

On 03/07/2020 20:13, Bruno Le Floch wrote:
On 7/3/20 6:50 PM, Jonathan Kew wrote:
On 03/07/2020 16:26, Philip Taylor wrote:
Jonathan Kew wrote:

Many potential use-cases, I think, can be equally well addressed by multiple TeX
invocations under the control of a higher-level script or tool of some kind.
Perhaps there are compelling examples where this would not be the case, but I'm
not aware of them at the moment.

JK

A major use case could be for AucTeX preview of equations, or other wysiwyg-like
interfaces where one wants to compile chunks of TeX code always with the same
preamble, and with no relevant changes in macros: one could have an ongoing TeX
run producing pdfs when provided with further input.

This raises the question of what state the TeX engine should return to when the 
hypothetical \nextpdf primitive is executed. Does it return to a pristine 
"initex" state, or a "freshly-loaded .fmt file" state, or is the current state 
completely unchanged (does \jobname reflect the new output name?), or what? 
Should the \count variables that are by convention used to record page numbers 
get reset?

Does a new .log file also get started? What about \write output files -- are 
they flushed and new files started?

There’s currently a thread called “startup time” about \dump’ing a new format 
file.
Similar kinds of consideration exist there, but at a different level (of 
course),
in choosing the best place to make that  \dump  call.

At least there it is clear what is the purpose for making the new format.
It is much less clear here what would be the new use-cases made possible
if such a  \nextpdf  primitive were made available.

  A currently-working
variant of this is the following (in bash), which ships out a first page, then
waits 10 seconds, then ships out another one.

$ (echo '\relax Hello world!\vfill\break' && sleep 10 && echo '\relax Another
pdf.\bye') | xetex

One could imagine a primitive \nextpdf that would make xetex produce 2 separate
pdfs (in the present case texput.pdf and secondfile.pdf)

$ (echo '\relax Hello world!\nextpdf{secondfile}' && sleep 10 && echo '\relax
Another pdf.\bye') | xetex

This looks equivalent to (xetex '\relax Hello world!\bye' && sleep 10 && xetex 
--jobname secondfile '\relax Another pdf.\bye'), right?

It's true there would be a difference if there are macros etc. defined while 
processing the first file, and then used while generating the second. But I'm 
not sure this is really a commonly-required use case.

Consider me not yet persuaded……

I’m on the fence too.

Of course another possibility for Phil is to put all the pages of *both* his 
desired PDFs into the same “master PDF”,
then use a command-line tool like  pdftk  to extract the relevant pages for one 
or other into a new PDF.
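For example (page ranges hypothetical):

pdftk master.pdf cat 1-4 output first.pdf
pdftk master.pdf cat 5-8 output second.pdf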

This would likely not work with Tagged PDF documents.
There the extraction into separate files would need to be done by Acrobat Pro,
so as to preserve the tagging structures and the page-based relationships of 
marked content.
But that’s an issue for the future.


JK



All the best.
Stay safe.

Ross





Re: [XeTeX] startup time

2020-07-03 Thread Ross Moore
Hi Michael, and others.

On 04/07/2020, at 5:44, "Michael Maxwell"  wrote:

> 
> 
> On 7/3/2020 2:28 PM, Zdenek Wagner wrote:
>> There are several options:
>> 1. Dump your own format with your styles. You will have to regenerate the 
>> format after update of any of these style files and you will have to take 
>> care of dependencies

> I already have 2 and 3, although afaict 3 has little if any effect, because
> the processing time appears to be taken up with macro expansion (or whatever
> it is that tex does while processing the preamble).
> 
> 1 is what I was looking for, but how do you do it?  I tried
>   xelatex -ini 

You will need to also have
   \input latex.ltx
first, as you are building an extended LaTeX from scratch.

Place the  \dump  command after all your packages and definitions,
but before  \begin{document} .

But there are tricks and traps that you will have to explore for your own
documents and choice of packages.
For example, if any package defines its own auxiliary file for passing 
information to later LaTeX runs, then find exactly when this file is read, and 
opened for writing.
A file-pointer cannot be preserved in a dumped format file.
So you'll probably have to move that package to after the  \dump  rather than
before it.
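If you'd rather not manage the \dump by hand, the  mylatexformat  package (in
TeX Live) automates this, dumping everything up to \begin{document}.
A sketch along the lines of its documentation (jobname hypothetical, and
untested here with XeTeX):

xelatex -ini -jobname=mydoc "&xelatex" mylatexformat.ltx mydoc.tex
xelatex -fmt=mydoc mydoc.tex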


> but it chokes on the \documentclass; or when I try it on my style sheet 
> alone, it chokes on the \ProvidesPackage.  Apparently it works with plain 
> TeX, but not with LaTeX?  And I'm not sure this would do what I want anyway; 
> what it means to "be xeinitix"; a web search for that term was unproductive 
> (unless you're looking for some kind of gas warning light).
> 
> Can you give me a lead on how to do #1?

I used to do this kind of thing a lot, almost 20 years ago when computers were 
much slower than they are now. I even wrote a package called  ldump  to capture 
some of the definitions that packages delay using \AtBeginDocument .
Moore's Law (no relation) has made it rather unnecessary now, though we are 
putting more and more into our documents, so it may still have a use.

Hope this helps.

   Ross


Re: [XeTeX] Proposal : that TeX engines generating PDF directly should be able to close the output file without terminating.

2020-07-03 Thread Ross Moore
Hi Philip.

On 3 Jul 2020, at 9:46 pm, Philip Taylor <p.tay...@hellenic-institute.uk> wrote:

Zdenek Wagner wrote:

I have never tried but could not you use \write18 to run XeTeX from XeTeX? I am 
not sure whether it is supported.

It would seem that adding the command-line qualifier "--shell-escape" is 
required in order that \write 18 be permitted, but then we run into the problem 
that (with the benefit of hindsight) we should have foreseen at the outset :  
the PDF file is incomplete at this point —


Then the main file should be the TeX job that *requires* something else to be 
available,
not the one that is required.
If the other PDF is not yet up-to-date then you use a  \write 18  call to 
process it
at the *beginning* of the job that requires it, not at the end.

Alternatively, use a  makefile  and run  ‘make'  from the command-line.
This is like a ‘batch’ command that runs all dependent jobs before doing
the final one, with everything up-to-date.
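A minimal sketch of such a makefile (file names shortened and hypothetical,
since the real ones contain spaces and parentheses, which make handles badly;
recipe lines must begin with a TAB):

combined.pdf: combined.tex separate.pdf
	xetex combined.tex

separate.pdf: separate.tex
	xetex separate.tex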


XeTeX --shell-escape "Hoi-An TA menu (separate pages).tex"
This is XeTeX, Version 3.14159265-2.6-0.92 (TeX Live 2020/W32TeX) (preloaded
 format=xetex)
 \write18 enabled.
entering extended mode
(./Hoi-An TA menu (separate pages).tex (./Hoi-An TA menu.xlsx.tex omnivore
omnivore omnivore omnivore omnivore omnivore vegan vegan vegan vegan vegan
omnivore omnivore omnivore vegan vegan vegan omnivore omnivore omnivore
omnivore omnivore vegan vegan vegan vegan vegan omnivore omnivore vegan
omnivore vegan omnivore omnivore omnivore omnivore vegan vegan vegan vegan
vegan [1] omnivore omnivore omnivore omnivore vegan vegan omnivore omnivore
vegan vegan vegan omnivore [2]) [3] [4] [5] [6]This is XeTeX, Version 3.14159265
-2.6-0.92 (TeX Live 2020/W32TeX) (preloaded format=xetex)
 restricted \write18 enabled.
entering extended mode
(./Hoi-An TA menu (combine pages).tex
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table

! Unable to load picture or PDF file '"Hoi-An TA menu (separate pages).pdf"'.

   }
l.28 ... - 0,666 \rulewidth \relax height \vsize }
  \relax
?

Philip Taylor


Hope this helps.
Stay safe.

Ross





Re: [XeTeX] \font "":color=FFFFFF produces black, not white glyphs, re-visited

2020-05-26 Thread Ross Moore
Hi Phil,

On 26 May 2020, at 5:12 pm, Philip Taylor <p.tay...@hellenic-institute.uk> wrote:

Ross Moore wrote:

I’m sorry, but this just doesn’t make any sense to me — but see further below.

No, the fourth couplet is TT, where T is "Transparency".  Unfortunately , it is 
a misnomer, since 00 = completely transparent and FE is almost opaque, which is 
why I spoke of "opacity" rather than transparency.  Unfortunately FF is not 
opaque when preceded by FF, because the driver treats FF [FF] specially.

As I said, it didn’t make sense to me.  :-)
Thanks for the clarification, and sorry for my added noise.
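(So, spelling out Phil's description of the RRGGBBTT spec: an almost-opaque
white would be requested as below. A sketch only, with a hypothetical font
name, since FFFFFFFF is special-cased by the driver:

\font\whitetext="Lato":color=FFFFFFFE at 12pt
\whitetext This should print as near-100\% white.
)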


First it is important to realise that both flattening and conversion to CMYK
will take place (the document is for digital printing).  When flattening takes 
place, RGB FF text will completely obscure the ground, and after conversion 
to CMYK there will then be no ink where the text occurs.  Unfortunately as 
things are at the moment, there will be 1/256 bleed-through of the ground 
because the RGB white was not perfectly opaque.

"knockout", tho' interesting, should not be needed.  The example earlier sent 
shews that one can get very close to 100% white (and of course there are no 
white inks involved) but not to 100% and this is what I would like to achieve 
(and which should, IMHO, be achievable).  Were it not for the fact that the 
driver treats FF and  specially, there would be no problem at all 
in achieving my aim.

It looks like Akira has done what you wanted, so the exercise was a success. :-)


** Phil.

Cheers.
Stay safe.

Ross





Re: [XeTeX] \font "":color=FFFFFF produces black, not white glyphs, re-visited

2020-05-25 Thread Ross Moore
Hi Zdenek.

We have very aggressive Mail protection software.
Sorry, it has blocked your  .tar.gz  file.

Is there a website that I can download it from?

Sorry for the hassle.

Ross


On 26 May 2020, at 10:35 am, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:

Hi Ross,

I have packed the whole directory with the PDF/X tests, it includes
the generated files as well.

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

From: IT Mail Admin <postmas...@mq.edu.au>
Subject: pdfx-tests_tar_gz
Date: 26 May 2020 at 10:35:43 am AEST
To: Ross Moore <ross.mo...@mq.edu.au>


Attachment Notice

The following attachment was removed from the associated email message.

File Name: pdfx-tests.tar.gz
File Size: 1033141 Bytes

Attachment management policies limit the types of attachments that are allowed 
to pass through the email infrastructure.

Attachments are monitored and audited for security reasons.





Re: [XeTeX] \font "":color=FFFFFF produces black, not white glyphs, re-visited

2020-05-25 Thread Ross Moore
Hi Zdenek,

On 26 May 2020, at 9:31 am, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:

On Tue 26 May 2020 at 0:59, Ross Moore <ross.mo...@mq.edu.au> wrote:
and at the time, that appeared to solve my problem. However, it would appear 
that since then "xdvipdfmx" has been enhanced to support transparency, as a 
result of which Khaled's suggested FF00 no longer works (the text is 
invisible, see attached).  Could anyone tell me how, short of using \specials, 
I can achieve 100% white with 100% opacity (= 0% ink) in XeTeX ?

I’m sorry, but this just doesn’t make any sense to me — but see further below.
Surely 100% opacity means that the blend between background and foreground is 
100% background, 0% foreground.
Thus your text will be invisible, whatever colour has been specified; that this 
is white becomes irrelevant.

The only way to get 100% white, over a coloured background, would be with 100% 
ink, so 0% opacity.
Any other opacity level will allow some of the background colour to be mixed in.
At least that is how I understand what colour mixing is all about.

Sorry, correct me if my English is wrong but I would expect 100% ink = 100% 
opacity = 0% transparency

You’re absolutely correct, my mistake.
Certainly I meant 0% *transparency*, the opposite of what Phil was trying to do.
It was he who said   100% opacity (= 0% ink)  which is where the error lies.
Surely 100% ink = 100% opacity = 0% transparency,  as you say.


However, there is another PDF parameter called “knockout”.
See this link for a brief description of the issue:

   https://www.enfocus.com/en/products/pitstop-pro/pdf-editing/knockout-white-text-in-a-pdf

This is another topic. This addresses "black overprint" in printing.

Sure, but it is about knocking out the colour in the background.
Using black text is probably its most common usage.
But if you want the natural paper to come through, then surely this is the only 
way to do it conveniently.

Otherwise you would have to manually define regions outlining the letters, and 
make these
boundary curves for your background. Totally impractical.
PDF does this for you, if you have used the correct way to ask for it.


The idea is that process colours are printed in the following order:
cyan-magenta-yellow-black. If you want to print a yellow text on a cyan
background, the RIP must erase the cyan plate to white where the characters
will later appear on the yellow plate, otherwise the text would not be yellow.
If the offset films are not precisely adjusted, you will see colour artifacts
at the borders of the characters. If you want to print a dark text (usually
black) on a light background, it can just be overprinted. In order to make it
work, both colours must be defined in CMYK (not RGB, not grayscale).
Professional Acrobat since version 9 contains a prepress checking function
which can verify whether overprint was really used. Black overprint is
implemented in my zwpagelayout package. It was tested in xelatex, pdflatex,
and latex + dvips. The package does not include my test files. If you like,
I can send them.

Sure; I’d love to see these.
I’m sure that this would most closely approach what Phil seems to want to do.



How to achieve knockout using TeX+gs or pdfTeX or XeTeX?
I’m not at all sure. It must have a graphics state parameter.
The next image shows what I think is the relevant portion of the PDF specs.



There’s a whole section on “Transparency Groups”, but mostly it is about how 
different transparent objects
must combine to produce the final colour where objects overlap.

Transparency should not be used for prepress. It works fine on office printers
but often comes out as black on phototypesetters and CTP.

Phil hasn’t said what is his application.

After a cursory look, I think you need to use a Form X Object, which can be 
done in pdftex using the  \pdfxform  primitive,
with appropriate attributes specified.
For XeTeX you would need to be using  \special  commands.
Someone here must have some experience with this.
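An untested sketch for pdfTeX, with the /Group dictionary taken from the
spec excerpt above (and assuming the LaTeX color package for the white text):

% wrap the text in a form XObject whose /Group dictionary
% requests a knockout transparency group (/K true)
\setbox0=\hbox{\color{white}KNOCKOUT TEXT}%
\immediate\pdfxform attr {/Group << /S /Transparency /K true >>} 0
\pdfrefxform\pdflastxform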


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz




Philip Taylor



Sorry for my error adding to confusion.

Cheers.
Stay safe.

Ross



Re: [XeTeX] Translation commands for package dramatist to put in the marathi.ldf file

2020-04-12 Thread Ross Moore
Hi Zdenek,

On 13/04/2020, at 0:00, "Zdenek Wagner" <zdenek.wag...@gmail.com> wrote:

Hi all,

This can be done with \let and \def, not with \newcommand:

\let\savedLabel\label
\def\label#1{Do something with #1 \savedLabel{#1}}

The second line could be replaced with
\renewcommand*\label[1]{Do something with #1 \savedLabel{#1}}

This idea can be extended somewhat.

\makeatletter
\def\MY@label#1{Do something with #1 \LTX@label{#1}}
\AtBeginDocument{
 \let\LTX@label\label
 \let\label\MY@label
}
\makeatother

This way you retain pointers to both your own version and the original,
so can change between them if necessary, within different parts of your
document, or within specially tailored environments.

Delaying the \let rebindings using \AtBeginDocument means that it will still 
work
if some other package (such as  hyperref ) makes its own changes,
which you also want to incorporate.


\providecommand is useful if you assume that a definition exist but you want to 
provide a default definition.

Sure.
It is particularly useful when devising templates that will be filled with 
information
provided by a database, and command-line shell software that automates
calls to TeX or LaTeX or other engine.

latex '\def\name{client name} ... \input{mytemplate.tex}'

where  mytemplate.tex  is the main LaTeX source, having
  \providecommand{\name}{some default}
and similarly for all the variable data fields.

I use this kind of setup to automate personalised assignment cover-sheets,
generated online in response to student requests from a web page.
Sometimes the full question sheet is done this way,
with versions personalized, or randomized, based upon student ID numbers.


The newcommand family is useful because it offers a default first argument,
but if you use arguments with the newcommand family, always use the star
version so that the macro is not \long. If you forget a right brace after an
argument, you will get an error message at the end of the paragraph; without
the star you get an error message at the end of the file, which makes it
difficult to locate the source of the error.

The construct \csname scenename\endcsname expands to the contents of
\scenename if already defined, or is defined to be identical with \relax if
not yet defined. When checking existence of a definition, LaTeX does the
following:

\expandafter\ifx\csname scenename\endcsname\relax
  code for \scenename not yet defined
\else
  code for \scenename already defined
\fi

With \csname  you can test for all kinds of things,
and even adjust macros like  \begin  and  \end  to patch in extra coding
for specific environments, whether a package is loaded or not.

The possibilities are endless.
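For instance, here is a sketch of that \begin adjustment, with invented hook
names; recent LaTeX kernels offer \AddToHook{env/.../begin} for the same effect:

\makeatletter
\let\ltx@@begin\begin
\def\begin#1{%
  \expandafter\ifx\csname myhook@#1\endcsname\relax\else
    \csname myhook@#1\endcsname
  \fi
  \ltx@@begin{#1}}
\makeatother
% run something extra at the start of every `quote' environment:
\expandafter\def\csname myhook@quote\endcsname{\typeout{Entering quote}}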


Cheers.
Stay safe.

  Ross


Of course, the whole \else part can be omitted if you have nothing to put there.

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz



Re: [XeTeX] Translation commands for package dramatist to put in the marathi.ldf file

2020-04-12 Thread Ross Moore
Hi Phil, Zdeněk and others.

On 12 Apr 2020, at 7:46 pm, Philip Taylor <p.tay...@hellenic-institute.uk> wrote:

Zdeněk Wagner wrote:


I would not do it. Of course, you cannot use \renewcommand because
\scenename is not used in any *.ldf. You could use \def\scenename{दरश्य} […]


LaTeX has  \providecommand  with the same syntax as \newcommand  and  
\renewcommand .

It makes the definition *only* if the c-s is not already known.
This means that you can always use:

  \providecommand\mycs{}
  \renewcommand\mycs{ what I really want }

to get around such issues.


A thought — if \scenename is not known at the point that the last line of
[gloss-]marathi.ldf is read, would there be any point in using \def \scenename
{दरश्य}, since such a definition would either get over-ridden by whatever
subsequent definition of \scenename is causing the problem (\def,
\renewcommand), or would prevent a subsequent \newcommand from working as
\scenename would already be defined.  Is this not the case (he asked, as
someone who barely understands anything that LaTeX does ...) ?

There is always a way to get what you want,
whether using Plain TeX or LaTeX or whatever other high-level macro structures.

Thus the important thing is how to make it resistant to updates, as Zdeněk said.


Philip Taylor

Hope this helps.
Stay safe.

Ross





Re: [XeTeX] Math class initialization in Unicde-aware engine

2019-11-28 Thread Ross Moore
Hi Joseph.

On 28 Nov 2019, at 6:29 pm, Joseph Wright <joseph.wri...@morningstar2.co.uk> wrote:

On 28/11/2019 00:16, Ross Moore wrote:
If by ignoring you mean removing the character entirely, then that is surely 
not best at all.
Most  N Class (Normal) characters would be simply of the default  \mathord  
class.

That is already the case: it's where IniTeX starts off, chars are mathord. So 
'nothing to do here'. Also note that some of this information is already set 
from the main Unicode file: it tells us which chars are letters.

OK. That’s what I’d expect.

I’d expect others to be mapped instead into a macro that corresponds to 
something that TeX does support.
e.g.
 space characters for  thinspace, 2-em space, etc.  in  U+2000 – U+200A
can expand into things like:   \, \; \> \quad \qquad  etc.  ( even to 
constructions like  \mskip1mu )

That's not a generic IniTeX thing, I'm afraid.

Yeah, well there are so many of these extra space characters.
I really don’t know where they are all used in practice by other (non-TeX) apps.

The Unicode data loaders are explicitly about setting up the basic data in 
Unicode TeX engines that's held in (primitive) tables.

Creating macros is the job of the 'rest' of the format. Here, presumably you 
are thinking of making chars math-active: that's well out-of-scope for the 
loader.

Fair enough; especially if this is all happening before processing any textual 
input intended for the typeset page.


After all, this is essentially what happens when pdfTeX reads raw Unicode input.

pdfTeX reads bytes, there's not really much comparison. In IniTeX mode, there 
is not much happening with UTF-8 and pdfTeX: perhaps you are thinking of with 
LaTeX?

Yes, sure I’m thinking of LaTeX; at least now that UTF-8 input has become the 
default.
Previously there would be (inputenc) package and  .def  file loading.
But, as you say above, this comes later.

One has to wonder then, how much of the Unicode range needs to be (or can be) 
handled earlier;
e.g., when there is only one sensible interpretation for the use of specific
characters?
Conversely, how much can, or should, be left to later when there may be a 
better idea of which
(classes of) characters are present within the input source?

I suppose that is the kind of question you are dealing with; so I’ll now butt 
out of this conversation,
but still watch it if there’s further continuation.


Joseph



Cheers,

Ross





Re: [XeTeX] Math class initialization in Unicode-aware engine

2019-11-27 Thread Ross Moore
Hi Joe, Doug

On 28 Nov 2019, at 10:27 am, Joseph Wright <joseph.wri...@morningstar2.co.uk> wrote:

> # N - Normal - includes all digits and symbols requiring only one form

> # D - Diacritic

> # F - Fence - unpaired delimiter (often used as opening or closing)

> # G - Glyph_Part - piece of large operator

> # S - Space
> # U - Unary - operators that are only unary

> # X - Special - characters not covered by other classes


> Unfortunately, the documentation/comments don't say what happens to entries 
> having these other Unicode math codes (N, D, F, G, S, U, and X). Are they 
> completely ignored, or are they mapped to one of the other eight codes that 
> matches what TeX is interested in or only capable of handing?
>
> I can imagine that the space character, given Unicode math class 'S' in 
> MathClass.txt, is ignored during this parse. But what happens to the '¬' 
> character (U+00AC) ("NOT SIGN"), which is assigned 'U' (Unary Operator). 
> Surely the logical not sign is not being ignored during initialization of a 
> Unicode-aware engine, yet the comments in load-unicode-math-classes.tex don't 
> say one way or the other, and it appears to me that the parsing code is 
> ignoring it.

The other Unicode math classes don't really map directly to TeX ones, so
they are currently ignored. Suggestions for improvements here are of
course welcome.

If by ignoring you mean removing the character entirely, then that is surely 
not best at all.

Most  N Class (Normal) characters would be simply of the default  \mathord  
class.

I’d expect others to be mapped instead into a macro that corresponds to 
something that TeX does support.
e.g.
 space characters for  thinspace, 2-em space, etc.  in  U+2000 – U+200A
can expand into things like:   \, \; \> \quad \qquad  etc.  ( even to 
constructions like  \mskip1mu )
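Concretely, the kind of mapping I have in mind would look like this (a
macro-level sketch, using e-TeX's \protected; certainly not something for
the loader itself):

  % make U+2009 THIN SPACE active, and map it onto TeX spacing
  \catcode"2009=13
  \protected\def^^^^2009{\ifmmode\mskip3mu \else\thinspace\fi}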

After all, this is essentially what happens when pdfTeX reads raw Unicode input.

The G class (Glyph_Part) is a lot harder, as those glyph parts don’t correspond 
to any single
TeX macro. Think about a very large opening brace spanning 3+ ordinary line 
widths, say,
as may be generated by  \left\{ ... \right\}  surrounding some (inner-) 
displayed math alignment.
On input, the whole grouping would need to be identified and mapped to 
appropriate TeX coding.

Basically there is a lot here that needs to be looked at more or less individually.

I’ve been through this kind of exercise, in reverse, to decide what to specify 
as /Alt  and /ActualText
replacements (for accessibility) for what TeX produces with various math 
constructions.
I don’t have definitive answers for everything, but have tried some 
possibilities for many things.


Joseph


Hope this helps.

Ross





Re: [XeTeX] Erroneous \lastkern

2019-06-28 Thread Ross Moore
Hi Ulrike, Jonathan,

On 28 Jun 2019, at 6:59 pm, Jonathan Kew <jfkth...@gmail.com> wrote:

On 28/06/2019 08:32, Ulrike Fischer wrote:
> On Thu, 27 Jun 2019 23:03:09 +, Ross Moore wrote:
>
>>> there is probably something wrong with XeLaTeX, but I cannot find what.
>>
>> The difference between xetex and xelatex is the font:
>>
>> I’m sorry but I don’t understand this as an answer.
>
> It wasn't meant as an answer. I only explained why you get different
> results with plain and latex: because they use different default
> fonts.

OK. But that just emphasises that there is a bug lurking here.

> xetex has a different typesetting engine: it doesn't handle chars
> but words as units.
>
> See page 31 here 
> http://xml.web.cern.ch/XML/lgc2/xetexmain.pdf .
>
> So I'm not really surprised that you get the y, I was more suprised
> that it doesn't happen with legacy fonts - there it seem to switch
> back to the "handle characters" mode.

Yes, the bug arises because of how xetex collects a series of characters
to be "shaped" by an opentype font, rather than the core tex engine
handling each character individually. So at the point when \lastkern is
encountered, the letter A has not yet been appended to the current node
list being built; it is "pending" in the buffer of characters that will
become a whole-word node.

OK, I understand that now.
In fact I was about to surmise that such a mechanism could be in play.

But surely this means that XeTeX's “current list” should be that buffer, not 
TeX’s usual horizontal list.
So  \lastkern , \lastpenalty  and  \lastskip  should be getting their values 
from there.

Alternatively, and perhaps equivalently, just set all these to zero whenever we 
are adding characters
to this “pending” buffer. Only when the words are ready to be put back into the 
horizontal list,
should these be set back to what is there, if those values are still relevant 
to any typesetting tasks
at the beginning of that resulting word.


Still, I would regard this as a bug that we ought to fix. I imagine
similar primitives like \lastpenalty or \lastskip probably share the
same buggy behavior.

Yes, I would think so.
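
A minimal probe (an untested sketch) of this behaviour:

  % plain XeTeX: with an OpenType font the `A' is still pending in the
  % shaping buffer, so \lastkern reports 5.0pt rather than 0.0pt
  \font\otf="Latin Modern Roman" \otf
  \setbox0=\hbox{\kern5pt A\xdef\probed{\the\lastkern}}
  \message{lastkern after `A' = \probed}
  \bye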

BTW, this is relevant to my Tagged PDF work, as it must insert extra literal 
material via \special  commands.
To ensure the same typesetting as without those tags, it is frequently 
necessary to transfer that previous
\skip , \kern  or \penalty  to come *after* the \special  and nullify the one 
before it.

So far this is only developed for pdfTeX, but in future we’ll want it for XeTeX 
too.
Discovering this difference now, and fixing it, will surely avoid a headache 
when that time comes.


JK


Cheers,

Ross





Re: Google Disk (was: XeLaTeX to Word/OpenOffice - the state of the art?)

2019-03-17 Thread Ross Moore
Hi Andrew,

On 18/03/2019, at 0:18, "Andrew Cunningham" <lang.supp...@gmail.com> wrote:

Ross,

It is also dependent in the fonts themselves and the scripts the language is 
written in.

Absolutely.

Depending on the language and script the only way to ensure accessibility is to 
include the ActualText attributes for each relevant tag.

Indeed, provided you have supplied tagging at all, as of course should be done.

Considering how complex opentype fonts  can become for some scripts the 
simplistic To Unicode mappings in a PDF can be insufficient.

Yes, but it is better for the CMaps to at least be appropriate, rather than 
inaccurate or missing altogether, as can be the case. Different software tools 
get information from different places, so ideally one needs to provide the best 
values for all those possible places.

And text in a PDF may by WCAG definition be non-textual content.

Presumably you mean, adding descriptive text to graphics that convey meaningful 
information; e.g. a company logo, and most illustrations.
Of course this should be done too. But this can only be useful if the alternate 
descriptive text can be found via the structure tagging; hence the need for 
fully tagged PDF, navigable via that tagging.

And Zdenek's comment emphasises how what might work well in one language 
setting can be quite insufficient for others. We need to be able to accommodate 
all things that are helpful.
That is surely what the U (for Universal) means in PDF/UA.


Cheers,

  Ross



On Sunday, 17 March 2019, Ross Moore <ross.mo...@mq.edu.au> wrote:
Hi Karljürgen,

On 17/03/2019, at 1:42, "Karljürgen Feuerherm" <kfeuerh...@kfeuerherm.ca> wrote:

> Ross,
>
> Your reply caught my eye, and I am now looking at the pdfx package 
> documentation.
>
> May I ask, if accessibility is a concern, why a-2b/-2u rather than -ua-1, 
> which seems directly targeted at this?

PDF/UA and PDF/A-1a,2a,3a  require a fully tagged PDF.
This is a highly non-trivial task, which requires adding much extra to the 
document, done almost entirely through \special commands. The pdfx package does 
not provide this, but is useful for meeting the Metadata and other requirements 
of these formats.

Abstractly, accessibility is about having sufficient information stored in the 
PDF for software tools to be able to build and present a description of the 
content and structure, other than the visual one. The same can be said of 
software for converting into a different format.

A significant part of this is being able to correctly identify each character 
in the fonts used within the TeX/produced PDF. Even this is a non-trivial 
problem, due to TeX's non-standard font encodings, and virtual font technique.

>
> Many thanks,
>
> K
>
>> You should use the  pdfx  package and prepare for  PDF/A-2b or -2u.
>> This fixes many of these things that affect conversions, as well as 
>> Accessibility and Archivability.
>>
>> It's not fully tagged PDF, but handles many other technical issues.
>>


Hope this helps.

Ross



--
Andrew Cunningham
lang.supp...@gmail.com








Re: Google Disk (was: XeLaTeX to Word/OpenOffice - the state of the art?)

2019-03-16 Thread Ross Moore
Hi Janusz,

On 16/03/2019, at 17:51, "Janusz S. Bień" <jsb...@mimuw.edu.pl> wrote:

Sorry for the previous mail, sent by mistake (the shortcuts in Gnus are
sometimes confusing...)


On Fri, Mar 15 2019 at 13:34 +01, BPJ wrote:
> Den 2019-03-15 kl. 08:31, skrev Janusz S. Bień:
>> On Fri, Mar 15 2019 at 7:19 +01, BPJ wrote:
>>> I use, despite myself, Google Docs to convert PDF to DOCX,

For me the quality is similar to Acrobat 9, i.e. completely not
acceptable: spaces between words are often missing.

This is inherent in the way TeX was written.
But there are ways to tackle the issue.

You should use the  pdfx  package and prepare for  PDF/A-2b or -2u.
This fixes many of these things that affect conversions, as well as 
Accessibility and Archivability.

It's not fully tagged PDF, but handles many other technical issues.
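
In practice that is just (a minimal sketch; title/author metadata can go
into an optional  \jobname.xmpdata  file):

  \documentclass{article}
  \usepackage[a-2u]{pdfx}% or [a-2b]
  \begin{document}
  Some text.
  \end{document}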



Best regards

Janusz

--
,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien


Hope this helps.

   Ross


Re: make4ht problem

2019-03-12 Thread Ross Moore
Hi Janusz,

This is the kind of incompatibility problem that I like to try nailing down.
But like Phil, my recommendation is to try Acrobat saving to Word.
Adobe has put a lot of effort into making this better.

It will become easier when TeX supports creating Tagged PDF.
But that's still a fair way off.


Cheers.

   Ross


On 12/03/2019, at 21:10, "Janusz S. Bień" <jsb...@mimuw.edu.pl> wrote:

On Tue, Mar 12 2019 at 10:38 +01, Michal Hoftich wrote:
> Hi Janusz,
>
>> --8<---cut here---start->8---
>> (/usr/share/texlive/texmf-dist/tex/generic/tex4ht/biblatex.4ht
>> ! Undefined control sequence.
>>  ...docsvlist \expandafter {\bbl@loaded
>> }\fi
>> l.228 \fi}{}
>> --8<---cut here---end--->8---
>
> It is hard to say what is going on without a TeX example. This seems
> like an issue with BibLaTeX support, but without trying an actual TeX
> example it is hard to guess what the problem is.

I don't mind sending the source files to anybody interested. I can also
try to prepare a minimal example.

Best regards

Janusz

--
,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien


Re: [XeTeX] Fake italics for some characters only

2018-12-05 Thread Ross Moore
BPJ and John Was,

Please join this XeTeX list.
Otherwise I have to authorize each of your postings.
This delays them being sent out to everyone.


Cheers.

 Ross

On 05/12/2018, at 22:10, "BPJ" <b...@melroch.se> wrote:

@Zdenek, the point is that other characters inside `\textit` should be real 
italics. I at least have tried it using a macro around the "culprit" characters 
and I think it looks better than fake italics throughout, which looks really 
bad (shades of low-budget publications from the early eighties! :-). Anyway I'm 
working on a solution in my head which I'll try when I get back to my desktop. 
I think I'll try to use a boolean which I set/unset at the start/end of my
`\mytextit` and a single macro for the active characters which checks this
boolean. I have no idea yet if it will work, but it seems the semantically 
cleanest way to do it to my mind.

/bpj

On Wed, 5 Dec 2018 at 10:53, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:
Hi,

I am afraid that I do not understand why to make only 4 FakeSlant
characters instead of a FakeSlant font. Does it mean that other
characters will remain upright inside \textit?

Anyway, making a few characters active for \textit is quite simple.
Let's suppose that A and B should be active. You then define:

\def\mytextit{\begingroup \catcode`\A=13 \catcode`\B=13 \dotextit}
\def\dotextit#1{\textit{#1}\endgroup}

You will then call \mytextit{Test of A and B}
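
The active characters then need definitions; one possible completion (a
hedged sketch: the font name is only a placeholder, and fontspec's
FakeSlant does the slanting):

\newfontfamily\slantfont[FakeSlant=0.2]{Junicode}% placeholder font name
{\catcode`\A=13 \gdef A{{\slantfont\string A}}}% \string gives back the
{\catcode`\B=13 \gdef B{{\slantfont\string B}}}% printable character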

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

On Wed, 5 Dec 2018 at 5:51, Alan Munn <am...@gmx.com> wrote:
>
> Can you provide a bit more detail? Maybe a small example document?
>
> Alan
>
>
> Benct Philip Jonsson wrote:
> > I have a somewhat unusual problem. In a document produced using
> > XeLaTeX I need to use four Unicode letters with scarce font support in
> > italicized words and passages but the font which I have to use
> > supports these characters only in roman. The obvious solution is to
> > use the FakeSlant feature of fontspec but I don’t want to enclose
> > these characters in a command argument, in the hope that a future
> > version of the document can use an italic font which supports these
> > characters, but neither do I (perhaps needless to say) want to use
> > fake italics except for these four characters. In other words I would
> > like to perform some kind of “keyhole surgery” in the preamble and use
> > these characters normally in the body of the document, which I guess
> > means having to make them active and somehow detect when they are
> > inside the argument of `\textit`. (Note: it is appropriate to use
> > `\textit` rather than `\emph` here because the purpose of the
> > italicization is to mark text as being in an object language in a
> > linguistic text.) Is that at all possible? I guess I could wrap
> > `\textit` in a macro which locally redefines the active characters,
> > but I’m not sure how to do that, nor how to access the glyphs
> > corresponding to the characters once the characters are active. I am a
> > user who isn’t afraid of using and making the most of various packages
> > or of writing an occasional custom command to wrap up some repeatedly
> > needed operation, but I am no expert. I am aware of all the arguments
> > against fake italics — that is why I want to limit the damage as much
> > as possible! — but I have no choice here. Waiting for the/an
> > appropriate font to include italic versions of these characters is not
> > an option at the moment.
> >
> > /Benct
> >
> >
> >


Re: [XeTeX] Controlling font embedding in PDF output

2018-11-16 Thread Ross Moore
Hi Werner,

On 17/11/2018, at 1:36, "Werner LEMBERG"  wrote:

> 
> > > Is there a simple option to make XeTeX (or rather xdvipdfmx) not
> > > embed fonts in PDFs? I'm going to post-process the output, which
> > > will do the embedding.
> >
> > Perhaps it is easier to generate the PDF, then remove the embedded
> > fonts?
> 
> Not for my use case, which is to include many PDFs (generated by
> LilyPond) into a master PDF (generated by XeLaTeX). The
> post-processor (Ghostscript's ps2pdf script) should then compute
> subsetted fonts for the whole document, which can make the final PDF
> *a lot* smaller in comparison to the standard way because subsetted
> fonts usually can't be merged.

Are you sure that this is even feasible, in that the same characters are 
referred to in the same way, in each of the Lilypond PDFs?

If the fonts are all Type1, with the same encodings in each PDF, this would be 
OK.
But I've seen PDFs where the subsetting of Type0 or TTF fonts is as an array, 
which simply assigns a number to the used glyphs, perhaps in the order of first 
occurrence within the PDF. These certainly cannot be merged, without adjusting 
essentially every string in every embedded PDF.

> 
> In LilyPond I can control whether its output PDF gets generated
> (1) the usual way (using subsetted fonts), (2) with embedded but not
> subsetted fonts, or (3) without embedded fonts. Ideally, I want
> option (3) for XeTeX (and for pdfTeX and luatex also, BTW). If this
> isn't possible, I would like to enforce option (2) so that ps2pdf can
> still do a decent job (at the cost of larger intermediate PDFs).

If you can get this to work, I'd be very interested in the technique.
Otherwise, a possible alternative approach is to combine the PDFs into a single 
Portfolio, using Adobe's Acrobat Pro. However I'd doubt that this gives any 
saving in file size over inclusion as attachments.

> 
> 
> Werner
> 

Hope this helps.

Ross






Re: [XeTeX] Could Adobe Photoshop's "blending options" for text be supported in a future {Pdf|Xe}TeX variant

2018-04-04 Thread Ross Moore
Hi Phil.

On Apr 5, 2018, at 4:38 AM, Philip Taylor (RHUoL) <p.tay...@rhul.ac.uk> wrote:


I have been playing with Adobe Photoshop's "blending options" for text 
recently, adding a gold or metallic texture to otherwise plain text.  The 
results are visually very striking, and I therefore began to wonder whether 
similar functionality might one day be added to Pdf/XeTeX, in the former case 
natively and in the latter case via \specials and an extended (x)dvipdfm(x) 
driver.

Three examples of the sorts of effect I have in mind can be seen at :

  *   https://www.dropbox.com/s/b7a1383rb1dx2vp/Ao%20dais.pdf?dl=0
  *   
https://www.dropbox.com/s/7s6s7n9w8popiyg/MENU%20001%20new%20ellipse.pdf?dl=0
  *   
https://www.dropbox.com/s/smmcjy9zuuxa1nu/MENU%20001%20%28metallic%20gold%20text%20demo%29.pdf?dl=0

I would be interested in others' reactions to this.

These are using PDF’s concept of “Text Rendering” modes.
In particular  7 Tr , meaning mode 7, which uses the outlines of the
characters as the clipping path for an underlying graphic.
Thus the letter shapes restrict what parts of the graphic come shining through.

This is essentially already available with pdfTeX; viz.

   
https://tex.stackexchange.com/questions/250156/problem-with-pdfliteral/250162#250162
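
In outline, that trick amounts to this (a condensed, untested sketch;
texture.png is only a placeholder image):

  % pdfLaTeX: with rendering mode 7 the text paints nothing itself, but
  % adds its outlines to the clipping path once its text run ends; the
  % image drawn next then shows only through the letter shapes
  \pdfliteral{q 7 Tr}%
  \rlap{\Huge\bfseries GOLD}%
  \pdfliteral{0 Tr}%
  \includegraphics[width=8em]{texture}%
  \pdfliteral{Q}%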


There is one part missing:  how to make the underlying graphic correctly?
e.g., to have letters looking like they are embossed, or standing out in 3D, 
etc.

You need to construct the desired view in an image, and then place the actual 
characters,
with appropriate rendering mode, exactly over that image so that only the 
desired parts are shown.
This requires external image-processing software, which is what you paid Adobe 
to do with Photoshop.




Philip Taylor

Hope this helps.

Ross







Re: [XeTeX] xelatex to doc?

2018-01-30 Thread Ross Moore
Hi Robert and Michal,

On Jan 31, 2018, at 8:29 AM, Michal Hoftich <michal@gmail.com> wrote:

Hi Bob,

On Tue, Jan 30, 2018 at 6:57 PM, Hueckstedt, Robert A. (rah2k)
<ra...@virginia.edu> wrote:
With a publisher’s permission I used xelatex to provide them copy, not
camera-ready copy, for a long book that has Sanskrit in Devanagari  and an
English translation. Of course, the files I provided the publisher are pdfs.
Now, the publisher wants them in doc. When they try to cut and paste from
the pdf to doc, none of the conjunct consonants are recognized in the doc
file. I used the velthuis-sanskrit mapping, and I am wondering if using the
RomDev mapping would make a difference. I somehow doubt it. Suggestions?

You can try to compile your TeX file to HTML using tex4ht. The HTML
code can be then pasted to Word. Basic command to compile XeTeX file
to HTML is

  make4ht -ux filename.tex

This might work, but first I’d try using Acrobat Pro to save the PDF
directly into a Word document.

This *can* work really well, especially when the PDF is enriched with
some tagging and the correct ToUnicode CMap resources for the fonts.
Try it and see if the result is reasonable.

Alternatively, you can Export to HTML from Acrobat Pro; though I’d
expect that if the .doc export is no good, then the HTML export would
suffer from similar issues.

It may even be that Adobe Reader can do these exports now,
as it is the same code-base.


Development version of make4ht can compile also to the ODT format,
which can be opened directly in Word:

  make4ht -ux -f odt filename.tex

It is possible that you will need some additional configurations for
the correct compilation. It depends on used packages or custom macros
in the document.

Best regards,
Michal


Hope this helps.

Ross


PS. if you don’t have access to Acrobat Pro to try this,
can you send me a few pages. I’ll then try it for you.
If the result is good, that may be sufficient reason for you
to consider investing in a license.







Re: [XeTeX] metalogo and bidi packages

2017-06-19 Thread Ross Moore
Hi Adam,

On Jun 20, 2017, at 8:10 AM, maxwell <maxw...@umiacs.umd.edu> wrote:

I've installed the new TeXLive 2017.  There is a conflict between the metalogo 
and bidi packages.  I don't suppose this would be a biggie, except that the 
xltxtra package loads metalogo.  (And something else I'm using loads xltxtra...)

The conflict is shown in this minimal example:
--
\documentclass{report}
\usepackage{metalogo}
\usepackage{bidi}

It happens while processing the file
   latex-xetex-bidi.def

since metalogo has already defined macros:  \XeTeX  and  \XeLaTeX  .


\begin{document}
hi
\end{document}
--

The error msg is:

(/home/groups/tools/texlive/2017/texmf-dist/tex/xelatex/bidi/latex-xetex-bidi.d
ef

! LaTeX Error: Command \XeTeX already defined.
  Or name \end... illegal, see p.192 of the manual.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H   for immediate help.
...

l.122 ...di@reflect@box{E}}\kern-.1667em \TeX}}$}}


The reverse loading order (bidi, then metalogo) triggers an error msg from bidi 
about loading order, and probably wouldn't help anyway.

The following works smoothly, and allows access to both versions of the logo.
Notice that bidi’s version of \XeLaTeX is slightly narrower than the one from  
metalogo.

\documentclass{report}
\usepackage{graphicx}
\usepackage{fontspec}
\usepackage{bidi}
\usepackage{metalogo}
%\usepackage{bidi}

% comment or delete these lines, in practice
\makeatletter
\show\XeTeX
\show\original@XeTeX
\makeatother

\begin{document}
Hi, from
\XeTeX\ and \XeLaTeX!

\makeatletter
Hi, from
\original@XeTeX\ and \original@XeLaTeX!
\makeatother

\end{document}




For the time being, doing the following before bidi is loaded seems to solve 
the problem:
-
\let\XeTeX\relax
\let\XeLaTeX\relax
-

  Mike Maxwell
  University of Maryland









Re: [XeTeX] pst-fill boxfill failure when compiling with XeLaTeX

2017-06-13 Thread Ross Moore
Hello Daniel,

On 14/06/2017, at 7:45, "Daniel Greenhoe"  wrote:

> Probably the most important reason I would like the XeTeX environment
> is because of the unicode font handling and ease of font switching
> (when the graphic includes text). However, even in that case, I could
> render the graphic with dvips+ps2pdf (as you said) and then apply the
> text on top of that using XeTeX.

There are several environments that help with this kind of thing;
e.g.,  LaTeX’s {picture} environment,
       TikZ, and
       Xy-pic’s \xyimport function.

The latter is extremely versatile, as it sets up a coordinate system based on 
the size of the imported image, without needing to know explicit dimensions.
Then you can use it to go anywhere within the image and use any of Xy-pic's 
graphic elements to place text, draw lines and arrows in different styles, put 
frames around parts of the picture, and much more. All this in a coordinate 
independent way, in case you decide to rescale the imported image, but retain 
the same font sizes.
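
A rough sketch of the idea (the file name, grid size and coordinates are
all placeholders):

% preamble:
\usepackage{graphicx}
\usepackage[all,import]{xy}
% in the body: lay a 10x10 grid over the image, then work by grid coordinates
\xy
\xyimport(10,10){\includegraphics[width=6cm]{diagram}},
(2,8)*+{\hbox{a label}},
(3,7);(5,5)**\dir{-}
\endxy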

> 
> Thank you again,
> Daniel


Hope this helps.

   Ross




Re: [XeTeX] :letterspace oddities

2016-08-26 Thread Ross Moore
Hi Jonathan, Zdenek, Phil,

On 26/08/2016, at 19:42, "Jonathan Kew"  wrote:

> On 25/8/16 18:02, Philip Taylor wrote:
>> For some time now I have been partially aware of some oddities in the
>> XeTeX implementation of :letterspace, but it was only today that my
>> thoughts crystallised sufficiently for me to attempt to record them on-
>> list :
>> 
>> 1) Search functionality.
>> 
>> For :
>> 
>> \font \errorfont = "Copperplate Gothic Bold:letterspace=8,color=BF"
>> scaled 2260
>> 
>> \newbox \errorbox
>> 
>> \setbox \errorbox = \leftline {\errorfont +++ NOT AT TOP OF PAGE +++}


>> Adobe Acrobat 7.1 has no problem locating the string "+++" if the
>> contents of \errorbox end up in the PDF file; however, for
>> 
>> \font \errorfont = "Copperplate Gothic Bold:letterspace=16,color=BF"
>> scaled 2260
>> 
>> the same string cannot be found.


> 
> Remember that TeX doesn't treat spaces as "characters" but as glue, which 
> means they don't end up as part of the *text* in the resulting DVI or PDF 
> file; they are merely implied by the positioning of the visible glyphs.
> 
> As a result, consider what Acrobat must be doing: it can "see" the visible 
> glyphs and their positions, but it "sees" no  characters separating 
> words. It must be inferring which characters are adjacent in the text stream, 
> and which are separated by spaces, purely from their positions. So when you 
> add a substantial amount of letter-spacing, it seems likely that Acrobat will 
> view the text as being "+ + +" rather than "+++".

Yes, this is a very good way of explaining it.

TeX's failure to include actual spaces in the output text-strings within the 
PDF is a double-edged sword. 
  On one hand, by treating spaces as glue, it is what allows TeX to produce the 
high-quality visual appearance that it does;
  but on the other hand this is the reason why normal TeX-produced PDFs do not 
work with Acrobat's 'Reflow' feature, when this setting is requested in the 
viewer.
(Think about how text in a web browser readjusts to fit a window when the 
viewing font size is increased, or when the window size is reduced.) 
For many people, especially those with eyesight difficulties, Reflow is 
extremely important. 

With small screens, as on smartphones and tablets, the lack of reflow within
most PDF readers is one of the biggest objections to use of PDF as a file
format, as compared with HTML and XML-based formats, which do allow reflow.
As for the proliferation of PDF 2.0, PDF/UA and Tagged PDF formats generally
(e.g., as international standards), TeX will never properly be in the game
unless the output is adjusted to include spaces within the output strings,
in the font being used for the text.

Note that pdfTeX now has a mode that allows 'fake' spaces to be inserted, based 
upon the distance between letters, when sufficient for it to be reasonably 
inferred that a space must have been in the original input. But these are in a 
different font to the surrounding text, and as such are not regarded by Adobe 
Acrobat/Reader to be part of normal text strings, for the purpose of reflow.
Besides, the continual switching of fonts between text and fake spaces, adds 
quite a bit to the total size of the PDF file.

This is one direction that could be explored by the XeTeX, and dvipdfmx 
developers.
Develop a method to reinsert spaces into the PDF output, without altering the 
spacing in the non-reflowed view.


> It's possible that \XeTeXgenerateactualtext=1 would help,

How does this work?
Does it use a heuristic to infer that a space was originally present?
Or does it only work with syllables and special characters?
Can a user provide customized input to the actual-text strings, that will not 
affect typesetting?

> as I think it would annotate the letter-spaced "+++" as a unit with its 
> actual text, allowing Acrobat to find it correctly despite the intervening 
> spaces that *appear* to be present from just looking at the glyphs.

I'd certainly like to see the results of this kind of testing.

> 
> JK

Hope this helps.

Ross






Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-15 Thread Ross Moore
Hi Phil.

On Jul 15, 2016, at 5:28 PM, Philip Taylor <p.tay...@rhul.ac.uk> wrote:

 I am intrigued to know why a package intended to support colour would want to 
set page size, though, and wonder from where it gets its information regarding 
the intended page size,

One of my messages answered this.
It is so that  \setpagecolor  can work correctly.
A coloured rectangle is drawn, at the size of the full page.
 \shipout  is patched to do it on every page.

since by the time that package {color} is loaded I have set all possible page 
dimensions to my intended size (B5, in this case).

Try  \pagecolor{yellow}  or somesuch.
Enjoy.
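
A two-line check (minimal sketch) makes the effect plain:

  \documentclass{article}
  \usepackage{color}
  \pagecolor{yellow}
  \begin{document}
  The full-page colour rectangle is why the page size must be known.
  \end{document}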


** Phil.
--

Philip Taylor

Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-15 Thread Ross Moore
Hi David,

On Jul 15, 2016, at 4:47 PM, David Carlisle <d.p.carli...@gmail.com> wrote:

Is there a mechanism similar to  \hypersetup  that allows the options to be 
changed
*after* the package has been loaded?

Not really, although the actual setting is in \AtBeginDocument{\AtBeginDVI  so 
as long
as you set the relevant registers to the right value it'll get set up (but let 
me know if you need more hooks)

No, I don’t need any more hooks.
A previous message explained why.

There is a very minor issue, as follows.

PDF/X generally requires CMYK color space, whereas PDF/A and PDF/E usually use 
RGB colors.
If one specifies  \pagecolor{yellow}  say, then it will use whatever color 
space was stated
when ‘yellow’ was defined as a color.
This would lead to a validation problem if it wasn’t the same space as the PDF 
type requires.

The “fix” for this is to use the  xcolor  package, and force conversions into 
the right color space.
Thank you for writing this, so many years ago!

Currently  pdfx.sty  force loads  xcolor   for PDF/X.
For some reason the command with PDF/A was commented-out, so that  xcolor would 
be loaded
only if requested by the document author, as per usual.
(I must have been testing something and didn’t uncomment it before releasing 
the package;
or maybe I thought more tests were needed, and just forgot about it.)

I do, however, put in a check that prevents the author from using the wrong 
color space.

The next version of  pdfx.sty  will force loading of  xcolor  in all 
situations, since there’s
no easy way of knowing what or when the author might request colours.


What is the issue?
  xcolor  can be very noisy:
viz.

Package xcolor Warning: Incompatible color definition on input line 3993.

[48]][48

Package xcolor Warning: Incompatible color definition on input line 4027.


Package xcolor Warning: Incompatible color definition on input line 4101.


Package xcolor Warning: Incompatible color definition on input line 4115.

[49]][49

That's 3 warnings per page.
Fortunately the final result is fine, passing validation.

xcolor  has an option  hideerrors  but this doesn’t suppress these warnings.



Cheers

Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-14 Thread Ross Moore
Hi again David,

On Jul 15, 2016, at 9:19 AM, Ross Moore <ross.mo...@mq.edu.au> wrote:

Hi David,

There is a potential conflict with  pdfx.sty  setting the  /MediaBox .

OK, I see what is going on now.
You are allowing a colored rectangle to be drawn the size of the page,
to support coloured pages, yes?

Nothing to do with the PDF Boxes, except of course you want the sizes
to match; especially when a  \mag  is used.



what does \usepackage [nosetpagesize]{color} do?

Is there a mechanism similar to  \hypersetup  that allows the options to be 
changed
*after* the package has been loaded?

OK; it just sets a boolean flag.


Alternatively, can I detect whether the  pagesize  special has been done 
already?
Then not repeat specifying  /MediaBox  when setting the other boxes:  
Bleed/Crop/Trim
which are required for PDF/X validation.

If not loaded yet, I can do  \PassOptionsToPackage{nosetpagesize}{color} .
But I’ll want to catch the case also if it is loaded.

Looks like this won’t be necessary.
The question now will be how having such colored pages affects validation.
Hopefully not at all, for PDF/X and PDF/A.

Maybe PDF/UA, according to the actual colors, but that would be a visual check 
not automated.


Thanks for a new code-branch to try out.


Cheers

     Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-14 Thread Ross Moore
Hi David,

On Jul 15, 2016, at 8:49 AM, David Carlisle <d.p.carli...@gmail.com> wrote:

well that'll be the page size special change as mentioned earlier I assume.

Hmm. In which version of  color.sty  was this introduced?
Presumably later than:

Package: color 2016/05/09 v1.1c Standard LaTeX Color (DPC)

There is a potential conflict with  pdfx.sty  setting the  /MediaBox .


what does \usepackage [nosetpagesize]{color} do?

Is there a mechanism similar to  \hypersetup  that allows the options to be 
changed
*after* the package has been loaded?

Alternatively, can I detect whether the  pagesize  special has been done 
already?
Then not repeat specifying  /MediaBox  when setting the other boxes:  
Bleed/Crop/Trim
which are required for PDF/X validation.

If not loaded yet, I can do  \PassOptionsToPackage{nosetpagesize}{color} .
But I’ll want to catch the case also if it is loaded.


Cheers

Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-14 Thread Ross Moore
Hi Phil,

On Jul 15, 2016, at 6:02 AM, Philip Taylor <p.tay...@rhul.ac.uk> wrote:

Very happy to believe that, Ross (and will test later) but what I do not 
understand is why things have changed so dramatically in going from TeX Live 
2014 to 2016.  In 2014, adding the PDF/X-1A specials made no difference 
whatsoever to the position of size of the page; in 2016, they cause a vertical 
displacement.

There have been significant changes, at least in  xdvipdfmx  since 2014.

And the interaction with the color package is also inexplicable.  Anyhow, I 
first have to come up with a truly MWE and then I will be in a better position 
to investigate and report back.

Please send me such MWE when you have it.

I’ve been trying a real-world example (Serbian version of TeXLive documentation)
that compiles fine (with one small hiccup that doesn’t seem to affect the 
result)
using 2016’s  xdvipdfmx , but which crashes out almost immediately with 2014 
saying:

SCI:TL-SR16 ross$ /usr/local/texlive/2014/bin/universal-darwin/xdvipdfmx -z 0 
-vv --kpathsea-debug 4095 texlive-sr.xdv
texlive-sr.xdv
 -> texlive-sr.pdf
kdebug:fopen(texlive-sr.xdv, rb) => 0xa0e103ec
DVI ID = 7

xdvipdfmx:fatal: Something is wrong. Are you sure this is a DVI file?

Output file removed.
SCI:TL-SR16 ross$

Note that I have maximum verbosity turned on, as well as a lot of  kpathsea  
tracing.
Yet still it isn’t clear where it is going wrong.

So I’d appreciate your cut-down MWE to test with both versions.
Then we can play with /MediaBox and /CropBox values, to see whether that
is the cause of what you are getting. Or whether it is something else.



** Phil.

Cheers

Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-14 Thread Ross Moore
Hi Phil,

On Jul 14, 2016, at 10:02 PM, Philip Taylor <p.tay...@rhul.ac.uk> wrote:


Hallo David, and thank you for your suggestions.  I have now found the 
following :

1) The primary cause of the displacement is the PDF/X-1A:2003 specials -- 
remove these, and the page returns to its normal vertical position, but shifted 
outwards (leftwards) somewhat;
2) The use of \usepackage {color} within e-plain's \beginpackages ... 
\endpackages; in TeX Live 2016, this makes the page size considerably wider, 
IFF the PDF/X-1A specials are not emitted.

The displacement is visible both in TeXworks viewer and in Adobe Acrobat.


I think what you are seeing is due to the  /CropBox settings.
A PDF viewer shows the contents of the cropped area,
scaling to fill either the width or height or both.

There are 2 kinds of test that you can do.

1.  Print a few pages of your document, once with the \specials, once without.
Is there any difference in the position of your content on the printed page?

2. Vary the numbers in  /CropBox [ a b c d ] ;
such that (a,b) is bottom-left  and  (c,d)  is top-right corners of a 
rectangle.
Values such as  [ 200 200 300 300 ] should crop to a small-ish portion of a 
page,
which is then scaled up to fit your window-size.
   Observe the value of your browser’s scaling factor.

   With different values of [ a b c d ] you can simulate vertical and 
horizontal shifts,
to a small extent, according to how much smaller the /CropBox is, compared to
the /MediaBox.
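
   For instance (a one-line sketch, using the same  pdf: put  syntax as
elsewhere in this thread):

   \special{pdf: put @thispage << /CropBox [ 200 200 300 300 ] >>}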

Does this interpretation agree with what you observe?



A4 is indeed the default in both, and the difference between A4 and letter is 
only 3/4", whereas it appears to require a 1" correction ...

** Phil.


Cheers,

Ross







Re: [XeTeX] Σχετ: Plain XeTeX, pdftitle, pdfinfo

2016-07-13 Thread Ross Moore

On Jul 11, 2016, at 7:59 PM, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:

Especially this one, it depends on a lot of files. I wanted to extract ideas 
how to build the XMP, how to include the ICC but I gave up.

XMP is done via a template file; e.g.  pdfx.xmp  or  pdfa.xmp .
There are many places where information can be supplied, via macros
such as  \xmp@Subject  and  \xmp@Author .
Much of the  pdfx  package is about supplying values for these, in UTF8 
encoding.


I know, writing XMP is easy, I do not know how to include it. I do not like to
use hyperref for a document that will only be printed and never will be online.

Using pdfTeX it is done like this in  pdfx.sty :

   \def\pdfx@numcoords{/N 4}% for CMYK colors
   \immediate\pdfobj stream attr{\pdfx@numcoords} file %
 {\pdfx@CMYKcolorprofiledir\pdfx@cmyk@profile}%
   \edef\OBJ@CMYK{\the\pdflastobj\space 0 R}%

Then that object reference in \OBJ@CMYK  is required for the  OutputIntent .
viz.

  \def\pdfx@outintent@dict{%
/Type/OutputIntent
/S/GTS_PDFX^^J
/OutputCondition (\pdfx@cmyk@intent)^^J
/OutputConditionIdentifier (\pdfx@cmyk@identifier)^^J
/Info(\pdfx@cmyk@intent)^^J
/RegistryName(\pdfx@cmyk@registry)
/DestOutputProfile \OBJ@CMYK
   }%

which is linked to the PDF Catalog via:

 \immediate\pdfobj{<<\pdfx@outintent@dict>>}%
  \edef\pdfx@outintents{[\the\pdflastobj\space 0 R]}%
 \def\pdfx@outcatalog@dict{%
  /ViewerPreferences <>
  /OutputIntents \pdfx@outintents % needs appropriate expansion
 }%
 \pdfcatalog{\pdfx@outcatalog@dict}%


Of course you need to supply all the information for the macros:
  \pdfx@cmyk@….
and  \pdfx@CMYKcolorprofiledir  (possibly empty).


Using XeTeX there is similar coding using  \special s,
including symbolic names for object references.
e.g.

\def\OBJ@CMYK{@colorprofile}%
\special{pdf:fstream @colorprofile %
  (\pdfx@CMYKcolorprofiledir\pdfx@cmyk@profile) <<\pdfx@numcoords >>}
   \def\pdfx@outintents{ @outintentsarray }%
   \def\pdfx@outintentref{ @outintent@dict }%
   \immediate\special{pdf:obj \pdfx@outintentref << \pdfx@outintent@dict >>}
   \immediate\special{pdf:obj \pdfx@outintents [ ]}%
   \immediate\special{pdf:put \pdfx@outintents \pdfx@outintent@dict}%

with \pdfcatalog defined appropriately:

 \def\pdfx@catalog@xetex#1{\special{pdf:put @catalog <<#1>>}}


You should be able to put all the pieces together now.
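
For instance, assembled into a single plain-XeTeX fragment (a sketch only;
the profile file name and identifier are placeholders):

% attach an RGB ICC profile and register it as the OutputIntent
\special{pdf:fstream @colorprofile (sRGB.icc) << /N 3 >>}
\special{pdf:obj @outintent << /Type /OutputIntent /S /GTS_PDFA1
  /OutputConditionIdentifier (sRGB) /Info (sRGB)
  /DestOutputProfile @colorprofile >>}
\special{pdf:put @catalog << /OutputIntents [ @outintent ] >>}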

Cheers,

Ross


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz









Re: [XeTeX] Σχετ: Plain XeTeX, pdftitle, pdfinfo

2016-07-10 Thread Ross Moore
1.  a hook will need to be inserted into \shipout to insert the bounding boxes on each page;

  pdfx.sty  uses  \RequirePackage{everyshi}  and  \EveryShipout  for this.
It’s perhaps a bit of overkill, but a standard way to patch  \shipout .
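In use that amounts to just the following sketch, with  \setboundingboxes
as discussed further below:

  \RequirePackage{everyshi}
  \EveryShipout{\setboundingboxes}% executed for every page being shipped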

2.  the colours will need to be converted to the desired output profile using Adobe Acrobat;

pdfx.sty  uses the  xcolor  package to handle this.
   Once a Color Profile is declared (either CMYK or RGB) the appropriate
   options are prepared for  xcolor , then the package is loaded with these
   options. Internal macros are rigged to stop changes being made, if the
   author tries to load the package separately. Similarly, if  color  was
   loaded before  pdfx , then appropriate coding imposes the correct color space.

   The upshot of this is that whenever a color is requested by name
   (‘blue’, ‘red’, ‘green’, ‘magenta’, etc.) then the correct color space
   coordinates are used. Also, if a new color is declared (say as RGB)
   but the color model is CMYK, then a conversion is done on the declaration,
   giving CMYK coords when that new color is used.


3.  the file will need to be reduced in size with Acrobat 4+ compatibility, but with no image compression, in order to convert it to PDF 1.3;

 Not sure of the specifics of this.
 Can anyone provide example documents?
 If this is really an issue, does  xdvipdfmx  have command-line options
 which allow specifying what can be compressed and what not?
 I don’t think so.

Such control is needed also to have uncompressed XMP Metadata
but compressed content streams, in all flavors/levels of PDF/A.
This is something that is highly desirable.

  pdftex and luatex already do this right, as also will Ghostscript when
v9.20 emerges from pre-release status.
The next version (1.5.9) of  pdfx.sty  will fully support  latex+dvips+GS
using this.


4.  the dimensions of the bounding boxes are for B5 in so-called "big points" (Postscript points) and will need to be amended for other page sizes;

 Setting these as a constant for all pages figures to be OK for most documents.
 Even better might be to reset to the size of each box being shipped-out.

 Since this can actually be done bypassing the \output  routine, then it
 requires patching  \shipout  rather than \makeheader  or similar.
 This is certainly an issue for further discussion.


5.  \setboundingboxes will have to be called explicitly for the first page only.

\shipout can be hooked as follows:

\def \setboundingboxes
{%
  \special {pdf: put @thispage << /ArtBox [0 0 498.89641 708.65968] >>}%
  \special {pdf: put @thispage << /BleedBox [0 0 498.89641 708.65968] >>}%
  \special {pdf: put @thispage << /CropBox [0 0 498.89641 708.65968] >>}%
  \special {pdf: put @thispage << /MediaBox [0 0 498.89641 708.65968] >>}%
  \special {pdf: put @thispage << /TrimBox [0 0 498.89641 708.65968] >>}%
}

Yes (w/o /ArtBox); but if you are hooking into  \shipout ,
why not measure the size of the box being shipped?
Do the conversion into actual points.
Will the bottom-left corner always be at  [0 0] ?
Probably need to look also at  \hoffset  and  \voffset .
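A sketch of how the box could be captured for such measuring (plain TeX,
with @ temporarily a letter; the macro names are mine):

\newbox\ship@box
\let\Shipout\shipout
\def\shipout{\afterassignment\measured@ship \setbox\ship@box=}
\def\measured@ship{%
  % here \wd\ship@box and \ht\ship@box + \dp\ship@box hold the real page size,
  % in pt; divide by 1.00375 for bp, and allow for \hoffset, \voffset and the
  % 1in origin, before issuing the  pdf: put @thispage  specials shown above
  \Shipout\box\ship@box}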



\newcount \maxpage
\maxpage = 
\let \Shipout = \shipout
\def \shipout {\ifnum \pageno < \maxpage \setboundingboxes \fi \Shipout}
--

Philip Taylor


Hope this helps,

Ross







Re: [XeTeX] xelatex, hyperref, and new TeXLive

2016-06-15 Thread Ross Moore

On Jun 16, 2016, at 9:45 AM, David Carlisle <d.p.carli...@gmail.com> wrote:

The result is that when you subsequently request   [dvipdfmx]  or  any other 
driver,
hyperref thinks that we are in non-dvi mode, so  *incorrectly* throws the error.

So it’s surely an omission in  hyperref.sty .

But you don’t actually need to specify a driver option,
and everything works OK anyway.

It only works with no option if you are not using a hyperref.cfg that specifies 
incompatible options:-)

OK. So [xetex] is the correct option to use, if any is needed.
Besides, the actual driver binary is   xdvipdfmx   not  dvipdfmx .




  Mike Maxwell

David


Cheers,

Ross







Re: [XeTeX] xelatex, hyperref, and new TeXLive

2016-06-15 Thread Ross Moore
Hi Mike, David, Herb,

On Jun 16, 2016, at 8:46 AM, maxwell <maxw...@umiacs.umd.edu> wrote:

With the help of David Carlisle and Herbert Schulz, I've found part of the 
problem.  For some reason, in the (our?) 2016 version, kpsewhich points to this 
hyperref.cfg file:
  ...texlive/2016/texmf-dist/doc/latex/listings-ext/hyperref.cfg

I’m seeing the same behaviour, but for me the packages are as follows:

(/usr/local/texlive/2016/texmf-dist/tex/latex/latexconfig/hyperref.cfg)

/usr/local/texlive/2016/texmf-dist/tex/latex/hyperref/hyperref.sty:4322:
Package hyperref Error: Wrong DVI mode driver option `dvipdfmx',
(hyperref)                because XeTeX is running.




This .cfg file contains a \hypersetup{...} command that specifies 'ps2pdf'.  
Changing that to 'xetex' fixes the problem, at least for xelatex (I'm not sure 
what would happen with other flavors of latex).  (Update: removing the line 
entirely, so it specifies neither xetex nor ps2pdf, works too, and presumably 
won't cause trouble for other latices.)

But:
1) Why does kpsewhich find that file, instead of this one:
  ...texlive/2016/texmf-dist/tex/latex/latexconfig/hyperref.cfg
  which does not have any \hypersetup{} command, and which would
  presumably not cause the same problem?
2) Why did this change from 2015 to 2016?  We did a pretty vanilla
  install, I think the only non-default choice we made was to use
  'letter' instead of 'a4'.
3) Is this a bug? (meaning should I report it?)

Here is the relevant coding from  hyperref.sty  with annotations added by me.

\newif\ifHy@DviMode
This defines  \ifHy@DviMode  and its setter switches, leaving it as  \iffalse
\let\Hy@DviErrMsg\ltx@empty
\ifpdf
  \def\Hy@DviErrMsg{pdfTeX or LuaTeX is running in PDF mode}%
\else
  \ifxetex
This is already  \iftrue
\def\Hy@DviErrMsg{XeTeX is running}%
… but surely we should be setting  \Hy@DviModetrue  here !!!
  \else
\ifvtex
  \ifvtexdvi
\Hy@DviModetrue
  \else
\def\Hy@DviErrMsg{VTeX is running, but not in DVI mode}%
  \fi
\else
  \Hy@DviModetrue
\fi
  \fi
\fi

The result is that when you subsequently request   [dvipdfmx]  or  any other 
driver,
hyperref thinks that we are in non-dvi mode, so  *incorrectly* throws the error.

So it’s surely an omission in  hyperref.sty .

But you don’t actually need to specify a driver option,
and everything works OK anyway.


  Mike Maxwell


Hope this helps,

Ross







Re: [XeTeX] XeTeX - PDF-x1a

2016-04-17 Thread Ross Moore
Hi Zdenek, Bruno, Arthur,

Recall this conversation from several months ago?

I've now got 2 important contributions to make.

On 06/09/2015, at 5:47 PM, Zdenek Wagner wrote:

2015-09-06 8:10 GMT+02:00 Bruno Le Floch:
On 9/4/15, Arthur Reutenauer wrote:
> On Thu, Sep 03, 2015 at 10:44:19AM -0400, Adam Drissel wrote:
>> I need to be able to use XeTeX while still producing a PDF in x1a format
>> (PDF/A, PDF-x1a).  Do you have any idea how I can do this using XeTeX?
>
>   Unfortunately it's not really possible at the moment; the package pdfx
> aims at producing different standards of the PDF/A and PDF/X families
> but is aimed at pdfTeX.  To my knowledge there has been no serious
> effort to port it to XeTeX.

There has now!
I have successfully produced a validating PDF/A-2u document from
the Serbian version of the TeX-Live documentation for 2015.


As far as I remember from a talk at TUG 2015, the packages used to
produce PDF/A using pdftex were extended to work with LuaTeX (perhaps
using newer versions of the package) and one difficulty they faced
with XeTeX was the lack of a way to compute MD5 sums.  Now IIRC a
primitive was very recently added to XeTeX for MD5 sums.  So perhaps
it wouldn't be too much work to port the package to XeTeX.


#1
What is the name of this primitive please?
Was it really added?

One place where it was being used by  pdfTeX + pdfx.sty
is in generating a UUID; i.e., an identifier that can virtually
be guaranteed to be unique to the document being processed.

It isn't actually needed, for this purpose, but it would save
a significant amount of processing of macro evaluations.

Scott Pakin's coding in  hyperxmp  emulates a seeded RNG
(random number generator) to generate such a unique ID.
Currently I'm using Scott's coding with XeTeX + pdfx.sty
but the md5 sum primitive would shorten this considerably,
and most likely works much faster.
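For reference, the pdfTeX-based fragment is essentially just this
(the macro name is mine):

  \edef\my@uuid@seed{\pdfmdfivesum{\jobname-\pdfcreationdate}}%
  % 32 hex digits, then punctuated as 8-4-4-4-12 to form the UUID

so an equivalent XeTeX primitive could slot straight into that role.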



Luatex and pdftex have the \pdfminorversion primitive to set the required PDF
version. AFAIK there is no way for XeTeX to communicate such a requirement
to xdvipdfmx. The only way is to call xelatex -no-pdf ... and xdvipdfmx -V4 ...
(default is PDF 1.5 but PDF/X-1a:2003 requires PDF 1.4).

#2
There's more to it than just this.

The command-line needs to be (something like):

xelatex -output-driver="xdvipdfmx -z 0"  .tex

This "-z 0" is needed because the XMP Metadata packet must *not*
be compressed, but must remain readable as plain text, in UTF-8 encoding.
With "-z 1" or higher, the Metadata is compressed.

With  texlive-sr.pdf  the filesize difference is enormous:
 ~798 kB  with  "-z 1"
~10.4 MB  with  "-z 0"

I've been unable to find a way to specify that parts of the generated PDF
be uncompressed while other parts can be.




I really don't know tbh,
Bruno



If any of the XeTeX Development team can help with either of these
issues, I'd be most appreciative.
And we'd be getting a much better product for generating PDF/A files.


Cheers,

Ross






Re: [XeTeX] babel

2016-03-24 Thread Ross Moore
Hi Javier,

On Mar 24, 2016, at 5:59 PM, Javier Bezos <jbez...@gmail.com> wrote:

Apostolos,

preface = \textPi \textrho\acctonos \textomicron\textlambda 
\textomicron\textgamma

XeLaTeX is Unicode aware and can handle Unicode strings. Therefore, I fail to 
see
why you are doing things this way. The LGR font encoding is an ancient hack that
has no usage anymore.

Of course, in Unicode engines the default captions section
apply, not the captions.licr subsection.

I think that it is absolutely correct that you build in continuing support
for old encodings that may no longer be used with new documents.

The existence of old documents using such encodings certainly
warrants this — especially in the case of archives that process
old (La)TeX sources to create PDFs on the fly.

It is quite possible that in future these will be required to conform
to modern standards, rather than just reproduce exactly what those
sources did in past decades. Then there is the issue of old documents
being aggregated with newer ones, for “Collected Works”-like publications.

It is quite wrong to say that because we now have newer, better methods
that those older methods should be discarded entirely.


I’m facing exactly this problem, adapting  pdfx.sty  to be able to translate
Metadata provided in old encodings: KOI8-R, LGR, OT6 etc.
automatically into UTF-8, because the latter is required by XMP for
requirements to satisfy PDF/A, PDF/X and PDF/E standards.



Javier

Keep up the good work.

Cheers,

Ross







Re: [XeTeX] potential new feature: \XeTeXgenerateactualtext

2016-02-24 Thread Ross Moore
Hi Will,

> On Feb 25, 2016, at 5:19 PM, Will Robertson <w...@wspr.io> wrote:
> 
> Hi Ross,
> 
> Great to hear from you.
> I thought of you straight away when writing my email :)
> 
> 
>> On 25 Feb 2016, at 11:35 AM, Ross Moore <ross.mo...@mq.edu.au> wrote:
>> 
>> You have to be *very* careful with /ActualText, since it must be done using 
>> PDFdoc encoding, 
>> as it becomes part of the page contents stream.
>> Any errors will corrupt the PDF file completely — but that’s true of other 
>> things as well.
>> Heiko’s  \pdfstringdef  in the hyperref package is very good for handling 
>> this…
> 
> That’s good to know, thanks.
> I think there has been *some* work by one or two of the LaTeX3 members on 
> general methods for this sort of thing, but it’s been a while.

Send me their names.
I may have a bit more time this year.


>> Look at some of my papers associated with TUG conferences, to see various
>> options that can be used to make mathematics more accessible in PDFs; i.e.,
>> papers numbered as 5, 6, 7 on this page: 
>> 
>>  http://www.tug.org/twg/accessibility/
>> 
>> Although these were done using pdfTeX, some of these things should be able
>> to be implemented for XeTeX + xdvipdfmx  also.
> 
> This is exactly where I was going with all this (so we’re getting quite far 
> away from the new primitive).
> My understanding is that the extended pdfTeX you were using was included in 
> TeX Live 2015, is that right? Or will be in TL2016?

The later papers, which are not directly on “Tagged PDF”, don’t require
the special tagging features.

> How much work would it be to translate that work into something that will 
> also function in XeTeX?

That depends on how easy it is to create PDF objects and object references
between them.
Since I don’t know how  xdvipdfmx does it — using pdfmark ?  as does dvips ?
then it’s nowhere near as convenient as with pdfTeX.

Hopefully someone with the necessary experience can pick up on those ideas.
That’s why I’ve followed up your comment on this list.
Indeed, we need someone to get  pdfx.sty  working with XeLaTeX;
it’s for similar reasons that it doesn’t do so already.

Switch it to another thread, if you think that is appropriate.

> Cheers,
> Will

Cheers,

Ross








Re: [XeTeX] \(pdf)mdfivesum

2015-07-01 Thread Ross Moore
Hi Joseph,

On 01/07/2015, at 23:03, Joseph Wright joseph.wri...@morningstar2.co.uk wrote:

 Hello all,
 
 I have a request for a new primitive in XeTeX, not directly related to
 typesetting by I think useful. To understand why I'm asking, a bit of
 background would be useful.
 
 The LaTeX team have recently taken over looking after catcode/charcode
 info for the Unicode engines from the previous rather diffuse situation.
 As part of that, we were asked to ensure that the derived data was
 traceable and so have included the MD5 sum of the source files in the
 new unicode-letters.def file.

MD5 sums are also required pieces of data with some of the modern PDF 
standards, such as PDF/A, PDF/UA, and especially whenever attachments are 
included.
They are part of the bookkeeping data that can be used to ensure that embedded 
files are indeed  what was intended, and have not been intercepted and changed 
by Malware.

 We can happily generate that file using pdfTeX (\pdfmdfivesum primitive)
 or LuaTeX (using Lua code), but not using XeTeX. That's not a big issue
 but the need for an MD5 sum gives me an idea which would need support in
 XeTeX.
 
 LaTeX offers \listfiles to help us track down package version issues but
 this fails if files have been locally modified or don't have
 date/version info. It would therefore be useful to have a system that
 can ensure that files match, which is where MD5 sums come in. Once can
 imagine arranging that every file \input (or \read) has the MD5 sum
 calculated as part of document typesetting: this is not LaTeX-specific.
 This data could then be available as an additional file listing to help
 track problems. However, to be truly useful this would need to work with
 all three major engines, and currently XeTeX is out. I'd therefore like
 to ask that \pdfmdfivesum (or perhaps just \mdfivesum) is added to XeTeX.

I fully support this request.
Issues of guaranteeing fidelity and conformance to standards are actually quite 
important in areas other than academia.
It is time TeX caught up with regard to such issues.


 There are a small number of other 'utility' primitives in pdfTeX/LuaTeX
 (some in the latter as Lua code emulation) that might also be looked at
 at the same time (see
 http://chat.stackexchange.com/transcript/message/22496265#22496265):
 
 - \pdfcreationdate
 - \pdfescapestring
 - \pdfescapename
 - \pdfescapehex
 - \pdfunescapehex
 - \pdfuniformdeviate
 - \pdfnormaldeviate
 - \pdffilemoddate
 - \pdffilesize
 - \pdffiledump
 - \pdfrandomseed
 - \pdfsetrandomseed

Several of these are definitely needed when generating PDFs that conform to 
existing standards, particularly with regard to attached or embedded files.

- \pdffilemoddate
- \pdfcreationdate
- \pdffilesize

Of course it is not hard to get such information from command-line utilities, 
when the files to be included are pre-existing, prior to commencement of a 
typesetting job.
But in cases where TeX is used to itself write out the files before re-reading 
for inclusion, then it is much easier to code when such primitives are 
available within the engine. Otherwise one needs to encode a call-out to 
command-line utilities, then read back the output. This introduces OS system 
dependencies, which is something that we definitely want to avoid with TeX 
systems.
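For example, with pdfTeX all three are expandable primitives
(the file name here is hypothetical):

  \edef\att@moddate{\pdffilemoddate{data.csv}}% e.g.  D:20150701120000+10'00'
  \edef\att@size{\pdffilesize{data.csv}}%      byte count, as a decimal string
  \edef\att@sum{\pdfmdfivesum file {data.csv}}% MD5 of the file's bytes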

 
 most of which are not related to PDF output and which may have good use
 cases. I am specifically *not* asking for any of these to be added here
 but note this list as it *may* be that the work may be closely related.
 --
 Joseph Wright

Hope this helps,

Ross






Re: [XeTeX] Bug fixes and new features related to Unicode character codes, surrogates, etc

2015-05-06 Thread Ross Moore
Hi Arthur,

On 07/05/2015, at 8:04, Arthur Reutenauer arthur.reutena...@normalesup.org 
wrote:

  While working on these bugs, we also discussed how surrogate
 characters were handled in XeTeX.  Surrogate characters are the 2048
 code points that are used in UTF-16 to encode characters with code
 points above 65536: a pair of them makes up one Unicode character;
 however they're not meant to be used in isolation, even though they have
 code points like other characters (they're not just byte sequences).
 
  Right now, XeTeX allows isolated surrogate characters, and also
 combines sequences such as d835dc00 into one Unicode character.
 We want to flag the former case but are not sure how: should we make the
 characters invalid (with catcode 15)?  

That would definitely be wrong.
The character itself, as bytes that is, is not wrong and users should be able 
to create these.
But preferably through macros that ensure that they come correctly paired.

IMHO, this is a macro issue, not an engine issue.

The same kind of thing applies with combining accents and diacritics.
I've written macros that take an argument and follow it with a combining 
character.
This is useful for generating correct UTF8 bytes to put into XML packets, as 
needed for the XMP Metadata that is required in PDF files that must validate 
for ISO specifications.
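e.g., a sketch using XeTeX's ^^^^ notation (the macro name is mine):

  \def\withacute#1{#1^^^^0301}% append U+0301, COMBINING ACUTE ACCENT

so that  \withacute{e}  contributes  e  followed by the combining accent,
which emerge as correct UTF-8 bytes when written out.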

Similar macros could be used to construct upper-plane characters from 
surrogates, given only the math style and Latin letter. For these, single 
surrogate characters will be needed in the macro definitions, with the ultimate 
matching pair to be determined algorithmically, probably using an \ifcase  
instance. Single characters thus need to be able to be input, so as to create 
the macro definition.

OK, a clever macro programmer can change the catcodes to become valid local to 
the macro definition. But that is really complicating things.


 Or we could map them to the
 standard unknown character (U+FFFD).  The latter case is more nasty
 and should definitely be forbidden -- the ^^ notation should only be
 used for proper characters (so instead of the above, the Unicode code
 point of the resulting Unicode character should be used, in this case
 ^1d400).

I disagree. 
The ^^ notation can be used in macros to create the required bytes, for writing 
out into a file other than the  .dvi  or .pdf  output.
pdfTeX (or other engine) then can cause that file to become embedded as a file 
object stream in the final PDF.


 
  Any thoughts?
 
Best,
 
Arthur


Hope this helps,

Ross






Re: [XeTeX] Bug fixes and new features related to Unicode character codes, surrogates, etc

2015-05-06 Thread Ross Moore
Hi David,

On 07/05/2015, at 9:26 AM, David Carlisle wrote:

 The character itself, as bytes that is, is not wrong and users should be 
 able to create these.
 But preferably through macros that ensure that they come correctly paired.
 
 placing two character tokens representing a surrogate pair should not
 though magically turn itself
 into a single character.

Agreed.
You don't know whether you want a single character until 
you know what kind of output is being generated.
That need not be known on input.

 The UTF-8 or  encoding should refer to
 the unicode code point not
 to the UTF-16 encoding,

No disagreement to this.

 
 In the current versions d835dc00 is two characters in luatex
 and one character in xetex
 as the implementation detail that xetex's underlying storage is mostly
 UTF-16 is exposed.

This seems to be premature of XeTeX then.
It seems to be making an assumption on how those bytes 
will ultimately be used.

 If it is
 not possible to prevent ^^^ or utf8 encoded surrogate pairs combining
 then it is better to
 prevent them being formed.

Hmm. 
What if you have an entirely different purpose in mind for those bytes?
You still need to be able to create them and do further processing with them.

Maybe there should be a primitive that sets a flag controlling what
happens to surrogates' bytes on input?
It may well be that XeTeX's current behaviour is best for putting
content into PDF pages; but not best in other situations. So a macro
programmer should have a means to change this, when needed.

 
 this is no different to XML where  #xd835; #xdc00; always refers to
 two (invalid) characters not
 to  #x1d400;

Seems fine to me.
If application software wants/needs to combine them, it can do so.

 
 David


Cheers,

Ross







Re: [XeTeX] XeTeX maintenance

2015-04-27 Thread Ross Moore
Hi Joseph,

On 27/04/2015, at 4:19 PM, Joseph Wright wrote:

 On 27/04/2015 00:22, Ross Moore wrote:
 But of course that doesn't address the problem for LaTeX users until
 someone writes a suitable/comparable package (maybe someone did
 already, I didn't try to follow).
 
 I have coding for much of what is needed, using the modified pdfTeX.
 But there is a lot that still needs to be added; e.g. PDF’s table model,
 References, footnotes, etc. 
 
 Somewhat away from the original topic, but it strikes me that building a
 tagged PDF is going to be much more problematic at the macro layer than
 at the engine level: is that fair?

Certainly one needs help at the engine level, to build the tree
structures: what is a parent/child of what else.

But macros are needed to determine where new structure starts
and finishes.
Think  \section  and friends, list environments, \item  etc.

Indicators must go in at a high level, before these are decomposed
into the content:  letters, font-switches, etc.


In short, determining where structure is to be found is *much* harder 
at the engine level; but doing the book-keeping to preserve that 
structure, once known, is definitely easier when done at that level.



Philip Taylor is correct in thinking that such things can be 
better controlled in XML. But there the author has to put in 
the extra verbose markup for themselves --- hopefully with help
from some kind of interface.
However, that can involve a pretty steep learning curve anyway.

Word has had styles for decades, but how many authors actually
make proper use of them?  e.g. linking one style to another,
setting space before  after, rather than just using newlines,
and inserting space runs instead of setting tabs.
How many even know of the difference between return  and 
Shift-return  (or is it Option-return ) ?


The point of (La)TeX is surely to allow the human author
to not worry too much about detailed structure, but still allow
sufficient hints (via the choice of environments and macros used) 
that most things should be able to be worked out.


In particular, you need to hack into  \everypar  to determine
where the TeX mode switches from vertical to horizontal.
(LaTeX already does this, so it is delicate programming to mix
in what (La)TeX wants with what is needed for tagging.)
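In skeleton form, with pdfTeX literals (the MCID value is a stand-in; real
code must also issue the matching  EMC  at the paragraph's end, and must
cooperate with LaTeX's own management of \everypar):

  \newtoks\saved@everypar
  \saved@everypar=\everypar
  \everypar={\pdfliteral page {/P << /MCID 0 >> BDC}%
     \the\saved@everypar}% then whatever was there already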

Doing it this way keeps things well hidden from the author,
who most likely just doesn't want to know anyway.


 Deciding what elements of a document
 are 'structure' is hard, and in 'real' documents it's not unusual to see
 a lot of input that's more about appearance than structure. That of
 course isn't limited to TeX: I suspect anyone trying to generate tagged
 output has the same concern (users do odd things).

Absolutely, as in my Word examples above.

LaTeX wants you to use a \section-like command, rather than
switching to bold-face, perhaps after inserting vertical space.
But if a human can recognise this, it should also be possible
to program TeX to recognise it. A really friendly system would
pause and question the author, perhaps with several options
available on how to proceed --- TeX can do this.
And TeX has a  \nonstopmode  to override such stoppages.


 --
 Joseph Wright

Enough on this for now.  This is surely a topic for TUG-2015.
By then we should know when the revised ADA Section 508 
will come into effect
--- or if it has been delayed or watered down. :-)


Cheers,

Ross







Re: [XeTeX] XeTeX maintenance

2015-04-27 Thread Ross Moore
Hi all,

On 28/04/2015, at 0:40, Apostolos Syropoulos asyropou...@yahoo.com wrote:

 
 As to whether XML is a particularly good format not only here or for
 anything, all I can say is that in my experience we (humanity, that is)
 have not yet come up with anything better; LaTeX 2e, by explicitly
 permitting the conflation of form and content, fails abysmally in this
 respect (IMHO, of course).
 
 
 
 Well I think that JSON is currently the next hot thing in computing together 
 with big data.
 Many consider that JSON will eventually replace XML. Also, about the PDF 
 format I think that
 archivable PDF
 
 
 http://www.digitalpreservation.gov/formats/fdd/fdd000252.shtml
 
 is very important.

Agreed.
Now PDF/A-1a, PDF/A-2a, PDF/A-3a are all accessible tagged PDF
(whereas the 'b' and 'u' sub levels need not be tagged).
It isn't much more to get PDF/UA from PDF/A-1a, etc, and so have validation for 
both.
This should be a major aim of our community.

 
 A.S.
 --
 Apostolos Syropoulos
 Xanthi, Greece


Ross




Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Ross Moore
Hi Mojca,

On 27 Apr 2015, at 6:53 am, Mojca Miklavec <mojca.miklavec.li...@gmail.com> wrote:
 
 On Sun, Apr 26, 2015 at 10:26 PM, Ross Moore wrote:
 
 No standard TeX implementation currently comes close to producing Tagged PDF.
 
 ConTeXt MkIV does:
   https://www.tug.org/TUGboat/tb31-3/tb99hagen.pdf

Yes; I’m aware of what Hans can achieve, and hold him in awe. :-)
Besides, this uses LuaTeX.  viz. this quote from the end of Hans’ article.
“Also, it is yet another nice test case and torture test for LuaTEX and it 
helps us to find buglets and oversights.” 

That is precisely why I used the word “standard” qualifying “TeX installation” 
in my statement above.

 
 But of course that doesn't address the problem for LaTeX users until
 someone writes a suitable/comparable package (maybe someone did
 already, I didn't try to follow).

I have coding for much of what is needed, using the modified pdfTeX.
But there is a lot that still needs to be added; e.g. PDF’s table model,
References, footnotes, etc. 

 
 Mojca
 
 PS: Our government is still mainly depending on documents with a doc
 extension.

Right. Conversion to PDF requires Adobe’s converters.
There are known bugs — but this is doubtless being worked on.

The point is that, for people wishing to use TeX-based software to
produce PDFs, then extra converters or manual conversion techniques
(e.g., using Acrobat Pro) will be required to produce a valid PDF/UA document.
Unless, that is, our community takes this seriously and creates a major project.

Another quote from Hans’ article:
 “This is a typical case where more energy has to be spent on driving the voice 
of Acrobat but I will do that when we find a good reason.” 

That reason is getting much, much closer.


All the best,

Ross






Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Ross Moore
Hi all,

On 26/04/2015, at 20:51, Joseph Wright joseph.wri...@morningstar2.co.uk wrote:

 On 26/04/2015 11:47, Philip Taylor wrote:
 To my mind, XeTeX /is/ the future of TeX.  The days of entering
 français as fran\c cais are surely numbered, and it has never been
 possible to enter العربية, ελληνικά or עברית (etc) in an analogous
 way.  Therefore, is it not time to petition the TUG Board to adopt XeTeX
 as a formal TUG project, and to allocate adequate funding to ensure not
 only its continued existence but its continued development, at least
 until such time as a clearly superior alternative not only emerges but
 becomes adopted as the /de facto/ replacement for TeX ?
 
 Philip Taylor
 
 The problem as always is not so much money as people. [Also, you do know
 about LuaTeX, yes? ;-) More seriously, XeTeX isn't a drop-in replacement
 for TeX90/pdfTeX.]

There is an even bigger issue which is going to affect the future of TeX.

http://www.access-board.gov/guidelines-and-standards/communications-and-it/about-the-ict-refresh/proposed-rule

The laws (in the US, but this will propagate) are going to become much 
tougher about requiring Accessibility of electronic documents, both for 
websites and PDFs.
Basically all PDFs (produced by government agencies for public consumption) 
*must* satisfy the PDF/UA published standard. That is, they must be Tagged PDF, 
and satisfy all the extra recommendations for enhancing Accessibility of the 
document's content and structure.
Being a legal requirement for Government Agencies has a knock-on effect for 
everyone, so TeX software will need to be enhanced to meet such requirements, 
else extinction becomes a real possibility.

No standard TeX implementation currently comes close to producing Tagged PDF.
LuaTeX, with its extra scripting, has the potential to do so.
Extra primitives for pdfTeX go a long way, but require 1000s of extra lines of 
TeX/LaTeX coding to implement proper structure tagging without placing a burden 
on authors.
(Those primitives are not yet standard in pdfTeX, but are in a separate 
development branch.)

It may be possible to continue with a  .tex  — .dvi — .pdf  workflow, but I 
doubt it very much.
Structure tagging requires a completely separate tree-like view of a document’s
structure, which must be interleaved with the content within the page-tree
structure. Storing everything that will be required into the .dvi  file, on a
page-by-page basis for later processing by a separate program, is unlikely to give
a viable solution; at least not without substantial extension of  dvips ,
dvipdfmx, etc. and Ghostscript itself perhaps.

Direct production of the PDF by a single engine is surely the best approach.


 --
 Joseph Wright


Hope this helps,

Ross




Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Ross Moore
Hi Doug,

On 27/04/2015, at 10:05 AM, Douglas McKenna wrote:

 Given that the number of TeX input files using ^^u is likely miniscule, and 
 the number of those that follow the ^^u or ^^U with four or six hex digits is 
 even smaller, it seemed like a worthwhile benefit vs. cost, 
 compatibility-wise.  Maybe there's something I've not thought out well.

For user-input files, then yes it is probably very small.
But such constructions figure to be used a lot within package sources
--- precisely to create macros that shield users from the syntax.

For example, try in a terminal:

  grep \^\^\^\^ `kpsewhich unicode-math.dtx` | wc -l
 
There are 160 lines of input and/or macro definitions.
(four of these use ^ )
Doubtless packages supporting other languages are similar.

Of course, since these are in packages the coding can be changed,
if engines need to be changed.
(Except that old versions will still have to be retained for those 
people who do not update to newer versions of the engine.)

 
 This discussion I just found is both pertinent and frightening, I suppose:
 
 http://stackroulette.com/tex/62725/the-notation-in-various-engines

Yeah. Thanks for this link. It is from July 2012
--- so maybe some of that incompatibility is fixed now?

If not, then TUG-2015 in Germany this July may be a good 
place to discuss the status of all this?


 
 
 Doug McKenna
 

Cheers,

Ross







Re: [XeTeX] additional beginL endL nodes in math

2015-04-17 Thread Ross Moore
Hi David, Zdenek, and others

On 18/04/2015, at 1:05, Zdenek Wagner zdenek.wag...@gmail.com wrote:

 2015-04-17 16:31 GMT+02:00 David Carlisle d.p.carli...@gmail.com:
 
 package, anyway)
 
 Colour is a font attribute in XeTeX but AFAIK it allows RGB and RGBA only;
 CMYK is not supported. If I want to print the document by offset, I have to
 use colour specials, otherwise I risk unwanted results.

It is not just colour information that one may want to insert.

Here are some more instances in support of  boojums  (using  pdfTeX , not 
XeTeX).

A.
For tagged PDF you may need to tag individual characters with attributes for 
accessibility and Copy/Paste, such as to get the PDF page-contents stream 
looking like:

/Span << /Alt (...) /ActualText <FEFF...> >>  BDC
  ... normal TeX-placed material ...
EMC

\pdfliteral page { }  can provide the extra PDF coding lines,
but this is 2 extra  boojums  for each actual character in the math list.
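For a single symbol those two extra pieces are literally as follows;
the hex string is UTF-16BE, here for U+2211 (n-ary summation):

  \pdfliteral page {/Span << /ActualText <FEFF2211> >> BDC}%
  \sum  % the glyph itself, placed by TeX as usual (within math mode)
  \pdfliteral page {EMC}%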

B.
I'm currently writing a paper describing a method to attach tooltips to 
mathematical symbols and expressions. After setting the chunks in boxes for 
measuring, this ultimately puts
   \pdfannot 
into the math-stream.  
To not affect spacing, this would need to be a  boojum  surely.

I can supply instances where spacing has changed, by an extra thin space.
Sometimes placing extra {...} avoids this extra space, other cases require a \! 
to fix.



 
 
 
 
 David


Is there a good tool or TeXnique that lets one see the contents of a 
math-stream, after all macros are processed, and during the internal 
page-construction stage?
I'd like something a bit better than just examining box contents after using 
\tracingall .


Cheers,

 Ross





Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-22 Thread Ross Moore
Hi Axel, Mike and others.

On 23/09/2014, at 12:04 PM, Axel E. Retif wrote:

 On 09/22/2014 08:42 PM, Mike Maxwell wrote:
 
 
 I guess these jokers haven't heard of Unicode.  Are they stuck back in
 the 1990s?
 
 Are you and Philip Taylor even aware that you're replying directly to an 
 arXiv administrator?
 
 I think arXiv and Cornell University are doing a great service to the 
 scientific community and public in general and deserve more respect.

Yes, they are doing a great service.

But, having said that, there should still be an obligation 
to keep up with the times, and not *prevent* the archiving
of publications that have special typesetting requirements.

How else can we advance aspects of mathematical/scientific 
publishing, when the main repository refuses to accept works 
that ably demonstrate useful new ideas?


Not that XeTeX is really all that special any more.
It started in 2004, on the Mac only.
Support for Unicode math is a bit more recent, 
starting around 2006, similarly to when XeTeX went
multi-platform (I think).



We talked about this at TUG 2014, in one of the discussion
sessions, where someone had reported dissatisfaction with 
what could be submitted to arXiv. 
Other issues were raised as well, including the fact that
the current TeXLive version being used there is ~3 years
out of date.


Earlier this year I submitted a paper that was meant 
to demonstrate use of PDF/A-3u features for publishing 
*accessible* mathematical content. 
But because the version of pdfTeX was outdated at 2011, 

http://arxiv.org/help/faq/texlive

the PDFs produced on-the-fly by arXiv do not validate to 
the standard declared within them.

They would not accept the PDF that I myself can compile,
in which validation is 100%.

(The particular differences in the PDF output are due 
to a mistake in 2011 and later versions of pdfTeX itself.
This has now been fixed, but perhaps is available only
by download from the  pdftex  source repository.)



The upshot of this is that it is not possible to *lead by 
example* with PDFs that are meant to demonstrate the value 
of new and emerging standards.
This includes standards that are accepted elsewhere within 
the publishing industry, and are to some extent mandated 
by existing US accessibility laws, applicable to many 
government and academic institutions.


 
 It seems to me that if they start accepting Xe(La)TeX submissions they will 
 be receiving documents with strange fonts,
 the license of which they will have to investigate first to see if they can 
 post the articles.

Most fonts are allowed to be subsetted and included within PDFs.
The subsetting prevents sensible extraction of the font as 
a whole, so foundries do not object.
After all, how can the beauty and craftsmanship within a font 
be displayed, and its popularity increased to the benefit of
the designer and foundry, unless documents using it are allowed 
to be distributed?

So no, that is *not* the crux of the issue.

It is the insistence on being able to reproduce the PDF
*automatically from source* that is where the problem lies.


There should be more circumstances under which users' PDFs 
would be accepted *as-is*, and distributed from arXiv.

Sources should certainly be included in the arXiv, primarily 
for verification purposes, even when not able to be presently 
compiled to the desired satisfaction.
 

If font licensing is still deemed to be an issue, then surely
there is a difference between recreating the PDF from source 
using a purchased, fully-licensed copy of the font, and simply 
serving a copy of a document for which the author has used 
their own (presumably purchased or licensed) copy of that font.

By all means tell the author that full acceptance of the paper
may be delayed if some investigation needs to be carried out.
Tell them the real reason; but *do not* insult the author 
by saying that (s)he must submit in a completely different 
format to what is best for the content of the work that 
(s)he has already prepared. 



Hope this helps,

Ross Moore
Director, TeX Users Group










Re: [XeTeX] problem xelatex+tikz

2014-09-11 Thread Ross Moore
Hi Francois,

On 12/09/2014, at 7:39 AM, François Patte wrote:

 I wanted to get rid of these one inch to provide the publisher a
 camera ready file and the way I did was an easy way to give a
 centered text with the text width and height given by the publisher...

Then the way to do this in TeX is *not* with a single TeX run.
Typeset your book using LaTeX, using the publisher supplied dimensions
for the page content.

Then run a 2nd (La)TeX job that simply includes the pages of your
typeset book, as if images.
The  pdfpages.sty  package is very good for this. 
You lose annotations this way; but it is for printing, isn't it?

In that 2nd job you can reorder and reposition your included pages 
in whatever way you like; face-to-face for 2-up, another facing pair
upside down for 4-up folding to create a booklet,  whatever ...
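A minimal such 2nd job, using pdfpages (the file name is assumed):

  \documentclass{article}
  \usepackage{pdfpages}
  \begin{document}
  \includepdf[pages=-, nup=2x1, landscape]{mybook.pdf}% all pages, 2-up
  \end{document}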

Centering this is fine, so adjusting \hoffset and \voffset is OK
as it cannot affect the details contained in the pages of your book.


Changing those global settings is very risky when you are using
packages that manipulate raw PDF structures and coordinates, 
as does  tikz  via  pgfgraphics.
It makes good sense to me that different typesetting engines might 
well give you different results, as each has had to find its own 
way to implement how raw PDF graphics streams need to be handled.


So, I'd have to disagree with Ulrike that it is necessarily a  tikz
bug. I'd say that if you want to employ such effects, then beware
of how you are interacting with the graphics environments that
your ultimate PDF engine needs to work within.

There may not be any documentation to help you, so encapsulate
your tasks better to eliminate any unwanted effects.

 
 When I have done this, it was the first edition of the latex companion
 and in the second edition, the offset commands are still there!
 
 
 
 - -- 
 François Patte


Hope this helps,

Ross









Re: [XeTeX] rangecheck in --run--

2014-08-27 Thread Ross Moore
Hi Mike,

On 28/08/2014, at 7:27 AM, maxwell wrote:

 One of our people is getting a crash in xetex, which I can't reproduce.  It's 
 very odd, since afaik we're both using the same input files, the same 
 instance of xetex, the same TeXLive 2014 files, and so forth, and running on 
 the same machine.  Clearly s.t. is different, but I'm not sure what, and this 
 email is a query about what I should be looking for.
 
 The error msg is:
 --
 Error: /rangecheck in --run--
 Operand stack:
   --dict:11/20(L)--   TT0   1   FontObject   --dict:8/8(L)--   
 --dict:8/8(L)--   TimesNewRomanPSMT   --dict:13/13(L)--   Times-Roman   
 Times-Roman
 Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--   
 --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   
 --nostringval--   false   1   %stopped_push   1862   1   3   %oparray_pop   
 1861   1   3   %oparray_pop   1845   1   3   %oparray_pop   --nostringval--   
 --nostringval--   2   1   1   --nostringval--   %for_pos_int_continue   
 --nostringval--   --nostringval--   false   1   %stopped_push   
 --nostringval--   --nostringval--
 Dictionary stack:
   --dict:1155/1684(ro)(G)--   --dict:1/20(G)--   --dict:76/200(L)--   
 --dict:76/200(L)--   --dict:106/127(ro)(G)--   --dict:286/300(ro)(G)--   
 --dict:22/25(L)--   --dict:4/6(L)--   --dict:26/40(L)--
 Current allocation mode is local
 Command /groups/tools/texlive/2014/bin/x86_64-linux/xelatex   -halt-on-error 
 -output-directory=./LinguistInABox/output/latex 
 ./LinguistInABox/output/latex/linguistInABoxGrammar.xetex -no-pdf died with 
 signal 13, without coredump

The problem looks to be with Ghostscript.
You may be using different versions, so check that first.


 
 
 Signal 13 is Write on a pipe with no reader, Broken pipe.
 
 I believe the crash is happening at the point xelatex is trying to embed an 
 existing PDF.  

Yes. That PDF presumably has some text in it, using Times font as  TimesNewRomanPSMT .
Others used to using XeTeX under Linux may be able to offer a more detailed
understanding of the specific kind of error.


 If I'm right (we're going to verify it tomorrow), the command that crashes is
 --
 \imgexists{list_intonation.pdf}{{\imgevalsize{list_intonation.pdf}{\includegraphics[width=\imgwidth,height=\imgheight,keepaspectratio=true]{list_intonation.pdf}}}}{}
 -
 
 Googling this:
xetex OR xelatex rangecheck in --run--
 brings up about six msgs from 2011, which seem to be the same thread, and 
 afaict are irrelevant.
 
 We're running the version of xetex that came with TeXLive 2014 
 (3.14159265-2.6-0.1) on Linux.
 
 Any suggestions as to what I should be looking for?
 
   Mike Maxwell
   University of Maryland


Hope this helps,

Ross









Re: [XeTeX] FakeBold vs TikZ

2014-06-18 Thread Ross Moore
Hi Khaled,

On 18/06/2014, at 14:04, Khaled Hosny khaledho...@eglug.org wrote:

 On Wed, Jun 18, 2014 at 10:53:40AM +1000, Ross Moore wrote:
 It seems that one cannot break up the processing any more,
 or at least not in the simple-minded way.
 Would one of the developers please explain how to do this 
 kind of testing now.
 
 So called “native” fonts (AKA non-TFM fonts) are now stored in the XDV
 files using their full path not font name, so you can’t process XDV
 files using such fonts unless you have fonts in the exact same location
 (but whether this relates to the error you are seeing or not, I don’t
 know).

Yes, I discovered this.
First, to get XeTeX to work, I had to load various TeX fonts to become system 
accessible.
I did this by 
 1. making symbolic links from a subdirectory of /Library/Fonts  (on a Mac) to 
appropriate directories in the  texmf-dist/fonts/otf/  hierarchy;
 2. opening  FontBook.app   and choosing to load more fonts.

This was done on both systems, so the paths to the font names should indeed be 
the same...

  ...except that my account names are different on the 2 machines.
Symbolic links are again your friend here. I made symlinks in  /Users  so that 
the name used on one system becomes valid also on the other.

With these symlinks in place, XeTeX worked just fine to create the .xdv  files,
and  xdvipdfmx  no longer complained about not finding fonts by the full path 
in an .xdv file created on the other system. 
However, it does barf with the TFM error that I stated in the previous email.

That error occurs also when I split the job on the same system;
that is,
 xelatex -no-pdf  testfile.tex
 xdvipdfmx testfile.xdv

but all is fine with   xelatex testfile.tex  
so there must be some extra information that is being used when  xdvipdfmx  is 
called automatically. Any idea on what this could be? or how to run some 
tracing of the xdv processing?


My next step will be to install both TeXLive versions on one of the machines, 
and create TeXShop engine scripts to be able to easily choose which one to use.



 
 Regards,
 Khaled

Cheers,

Ross








Re: [XeTeX] FakeBold vs TikZ

2014-06-17 Thread Ross Moore
Hi Stefan, Marcin, and others,

On 17/06/2014, at 7:29 AM, Stefan Solbrig wrote:

 Hi,
 
 Just compiled the document under TeX Live 2013 (all updates till the 
 freeze) and still no dot.

Interesting.

I just tried with TeXLive 2013 and 2014 on different Macs.

2013 has the dot!
2014 does *not* have the dot.

For the 2013 installation we have xdvipdfmx-0.7.9
For the 2014 installation,  xdvipdfmx  is version  20140317

I tried using  xelatex -no-pdf  to keep the  .xdv  file.
Then renamed these and copied to the other machine, for processing
with the other version of  xdvipdfmx .
If this would work, then it could identify whether the problem
was in  xdvipdfmx  or due to what is put into the .xdv  file.

No joy came from this test.
Instead, all I get is a fatal error:   Invalid TFM ID: 0 .


It seems that one cannot break up the processing any more,
or at least not in the simple-minded way.
Would one of the developers please explain how to do this 
kind of testing now.



Earlier testing with TeX Live 2012, with  xdvipdfmx-0.7.8 
was just giving the similar error

 Output written on Tikz-test.xdv (1 page, 4856 bytes).
 Transcript written on Tikz-test.log.
 /usr/local/texlive/2012/bin/x86_64-darwin/xdvipdfmx
 Tikz-test.xdv - Tikz-test.pdf
 [1
 ** ERROR ** TFM: Invalid TFM ID: 0
 
 Output file removed.




Cheers,

Ross









Re: [XeTeX] XeTeX (/not/ XeLaTeX) : Marginal kerning, font protrusion, hyperlinks

2014-04-26 Thread Ross Moore
I'm finding this thread to be quite distressing.

On 26/04/2014, at 6:45, Philip Taylor p.tay...@rhul.ac.uk wrote:

  Wagner wrote:
 
 XeTeX can do via xdvipdfmx specials almost everything explained in the
 PDF reference and pdfmark reference. Will you insist on including these
 1000+ pages in the XeTeX manual?
 
 No, Zdeněk; I am asking for /important/ facts to be documented,
 not the entire known universe.

Phil, who is supposed to write the documentation that you desire?

You know that all TeX development is done voluntarily.
Most documentation is written by whomever wrote the code.
So you cannot expect any XeTeX developer to write beyond  \special  and its
initial keywords.

Anything about the arguments to \special  will necessarily be written by the 
authors of the driver applications, or someone else who has kindly donated 
their time to contribute their less than complete knowledge, based upon their 
own specific experience.

Asking for anything else is unreasonable, and insisting upon it is arrogance.

Yes, you do need to understand that there is a driver application, even if only 
one is currently supported with XeTeX, and that it currently — it wasn't always 
this way — is called automatically. The TeX world has always been this way, in 
that tasks are devolved to different application programs, and their 
documentation is typically written independently. 

 
 And how about the code below:
 
 \bf\special{pdf: code q 2 Tr 0.4 w 0 .5 .5 .05 k 1 1 0 .1 K}Hello
 world!\special{pdf:code}
 \bye
 
 The number of such productions is infinite; no documentation system,
 no no matter how complex and complete, can fully document an infinite
 universe of discourse, and therefore unless you can show that whatever
 your write-only code accomplishes is something that B L User is likely
 to want to accomplish, then I can see little point in documenting it.

Of course.
So please take the obvious hints, face reality, and desist from pursuing this 
thread.

Off list I have given you the advice of:
  1.  employing the   miniltx.tex   input file, to enable you to load important 
LaTeX internals, without having to submit to LaTeX's model of what a 
document is and how it might be structured; and
  2.  using  \tracingall  with LaTeX examples to see what is really happening, 
and being prepared to open the package files themselves, to see what other 
branches are possible with the package's internal coding.

Method 2. has always worked for me, as it reveals far more accurate information 
than any documentation can ever do. Yes, it can be difficult and daunting, but 
it is accurate.
I mean, if a computer can understand it, then surely so can I.

 
 ** Phil.
 -- 
 All duplicate recycled material deleted on principle.

A fine principle.

Please apply such empathetic principles also to developers, who supply their 
efforts entirely voluntarily, and respect the fact that they may have a 
different perception from yours of what is important, and what is not.


Best regards,

Ross






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeTeX (/not/ XeLaTeX) : Marginal kerning, font protrusion, hyperlinks

2014-04-15 Thread Ross Moore
Hi Phil,

Have you ever tried

\input miniltx.tex

This then allows a subset of LaTeX structural commands and internals to be used 
without the documentclass stuff — which is what I think you detest most.
Now many LaTeX packages can be loaded and used, without problems, in what are 
otherwise plain TeX documents. Just use, e.g.

  \usepackage{color}

as in a LaTeX document.

I'm pretty sure that  graphics.sty  and  graphicx.sty  are usable this way, and 
also  hyperref.sty ,
and its driver files, including  hdvipdfmx.def  — I think this is the correct 
name.
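
As a sketch (untested here, and the image name is just a placeholder),
the pattern in a plain XeTeX document is:

  \input miniltx.tex
  \usepackage{graphicx}

  Some text, then an image:
  \includegraphics[width=50mm]{example-figure.png}

  \bye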

Now all the documentation you want about hyperlinks is in the book The LaTeX 
Web Companion,  or in the PDFs built from the .dtx  documentation files for the 
LaTeX packages. 
Use the  texdoc  command to access these.

Beware that not all LaTeX packages work this way. That will depend upon what 
the authors of the packages have used internally. There can be inter-package 
dependencies, ultimately leading back to the parts of LaTeX that you have not 
loaded. You have to just try things out, and use what works, perhaps keeping 
records of what does or does not.


Hope this helps,

 Ross

On 15/04/2014, at 19:07, Philip Taylor p.tay...@rhul.ac.uk wrote:

 
 
 Khaled Hosny wrote:
 
 On Thu, Apr 10, 2014 at 12:58:23PM +0100, Philip Taylor wrote:
 
 Why are these key XeTeX primitives (\XeTeXprotrudechars, \rpcode, etc)
 not documented in /The XƎTEX reference guide/ ?   Will, Khaled,
 Jonathan :  can you comment on this, and will these (and any other
 currently undocumented primitives) be documented in the version of
 /The XƎTEX reference guide/ which accompanies TeX Live 2014 ?
 
 From me: simply because I know near nothing about them.
 
 Fully understood.  In that case, may I ask Jonathan where these primitives 
 are, in fact, documented, so that Khaled and Will can
 potentially make use of this information if they choose to
 prepare a TeX Live 2014 edition of /The XƎTEX reference guide/ ?
 
 Also, are there any other currently undocumented XeTeX primitives,
 and where can be found any information on embedding hyperlinks
 using XeTeX ?
 
 Philip Taylor
 
 
 
 
 --
 Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] turn off special characters in PDF

2014-01-06 Thread Ross Moore
Hi Joe,

On 04/01/2014, at 8:43 AM, Joe Corneli wrote:

 Hi All:
 
 I'm glad my message sparked some discussion.  My M[N]WE for my
 specific use case on tex.stackexchange.com has not gotten much
 attention - I recently attached a +200 bounty.
 
 http://tex.stackexchange.com/questions/151835/actualtext-in-small-cap-hyperlinks
 
 I figured I should put in a plug for that here.  I already got a reply
 from one of the main authors of hyperref, but patching \href at the
 necessary level is beyond me.  Finally, I realize a detailed
 discussion of this issue is probably not germane to this list, so if
 you feel that way, please direct further comments there, or to me off
 list.

No, it is quite germane for this list, and relates to
a very recent thread.

The attached PDF is a variant of your example.
Copy/Paste the text using Adobe Reader or Acrobat Pro.
You should get:

Old: Sexy tex: .
New: Sexy tex: sxe .

Apple's Preview (at least within TeXShop) doesn't seem to recognise
the /ActualText  tagging.




accsupp-href.pdf
Description: Adobe PDF document


To achieve this I had to do several things.
Here are the relevant definitions:

\newcommand*{\hrefnew}[2]{%
\hrefold{#1}{\BeginAccSupp{method=pdfstringdef,unicode,ActualText={#2}}#2\EndAccSupp{}}}
\AtBeginDocument{%
 \let\hrefold\href 
 \let\href\hrefnew
}

Notes:
  1. Use \BeginAccSupp and \EndAccSupp  as tightly
 as possible around the text needing to be tagged.

  2. You want the  method=pdfstringdef   option.
 (It is  pdfstringdef  not  pdfstring .)
 This results in appropriate strings for the /ActualText value;
 either ASCII if possible (as here) or UTF16 strings with BOM.

 3.  Delay the rebinding of \href  to \AtBeginDocument .
 This way you do not interfere with any other package making
 its own redefinition of what \href does.



What follows is highly technical and of no real concern to anyone
just wanting to use /ActualText tagging.
Rather it is about implementing this (and more general kinds of)
tagging in the most efficient way.


The result of the above coding is to adjust the PDF page stream 
to include:

  q 
  1 0 0 1 129.04 -82.56 cm 
  /Span<</ActualText(sxe)>>BDC
  Q BT /F1 11.955 Tf 129.04 -82.56 Td[<095e09630950>]TJ ET q 
  1 0 0 1 145.89 -82.56 cm EMC 
  Q

where you can see the /Span tagging of the content between BDC and EMC.
This works, but is excessive, to my mind, by duplicating some operations.

Now the xdvipdfmx processor allows an alternative form for
the \special  used to place the tagging.
It can be invoked with the following redefinition of internals
from the  accsupp.sty  package:

\makeatletter
 \def\ACCSUPP@bdc{\special {pdf:literal direct \ACCSUPP@span BDC}}
 \def\ACCSUPP@emc{\special {pdf:literal direct EMC}}
\makeatother


This gives a much more efficient PDF stream:

   ...60059001b>]TJ ET
   /Span<</ActualText(sxe)>>BDC 
   BT /F1 11.955 Tf 129.04 -82.56 Td[<095e09630950>]TJ ET 
   EMC
   BT /F1 11.955 Tf ...

in which the irrelevant coordinate/matrix changes (using 'cm')
no longer occur.


But even this could possibly be improved further to avoid the
extra BT ... ET :

   ...60059001b>]TJ 
   /Span<</ActualText(sxe)>>BDC 
   /F1 11.955 Tf 129.04 -82.56 Td[<095e09630950>]TJ 
   EMC
   /F1 11.955 Tf ...


In the experimental version of  pdfTeX  there is a
keyword 'noendtext' that can be used with the new 
 \pdfstartmarkedcontent  primitive:

  \pdfstartmarkedcontent attr{attributes} noendtext ...

which is designed with this aim in mind.
Use of this keyword sets a flag so that the matching  
 \pdfendmarkedcontent  can keep the BT/ET nesting consistent.


 
 Thank you!
 
 Joe


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] turn off special characters in PDF

2014-01-01 Thread Ross Moore
Hi Zdenek, and others,

On 01/01/2014, at 11:53, Zdenek Wagner zdenek.wag...@gmail.com wrote:

 The attached file (produced using pdfTeX, not XeTeX) is an example
 that I've used in TUG talks, and elsewhere.
 Try copy/paste of portions of the mathematics. Be aware that you can
 get different results depending upon the PDF viewer used when
 extracting the text.  (The file has uncompressed streams, so you
 can view it in a decent text editor to see the tagging structures
 used within the PDF content.)
 
 If I remember it well, /ActualText supports only bytes, not
 codepoints. Thus accented characters cannot be encoded, nor can Indic
 characters.

I don't know what you mean by this.
In my testing I can tag pretty-much any piece of content, and map it to any 
string using /ActualText .
Mostly I use Adobe's Acrobat Pro as the PDF reader, and this works fine with it,
modulo some bugs that have been reported when using very long replacement 
strings.

In the example PDF that I attached to my previous message, each mathematical 
character is mapped to a big-endian UTF-16 hexadecimal string, with Plane-1 
alphanumerics expressed using surrogate pairs. 

I see no reason why Indic character strings could not be done similarly.
You probably need some on-the-fly preprocessing to work out the required 
strings to use.
This is certainly possible, and is what I do with mathematical expressions.
It should be possible to do it entirely within TeX, but the programming can get 
very tricky, so I use Perl instead.

 ToUnicode supports one byte to many bytes, not many bytes
 to many bytes.

Exactly. This is why /ActualText  is the structure to use.


 Indic scripts use reordering, where a matra precedes the
 consonants, or some scripts contain two-piece matras. Unless the
 specification has been corrected, the ToUnicode map is unable to handle
 Indic scripts properly.

Agreed;  /ToUnicode  is not what is needed here.
This sounds like precisely the kind of situation where you want to tag an 
extended block of content and use /ActualText  to map it to a pre-constructed 
Unicode string.
I'm no expert in Indic languages, so cannot provide specific details or 
examples.


 


Happy New Year,


Ross

--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] turn off special characters in PDF

2014-01-01 Thread Ross Moore
Hi Zdeněk,

On 02/01/2014, at 2:14 AM, Zdenek Wagner wrote:

 2014/1/1 Ross Moore ross.mo...@mq.edu.au:

 In the example PDF that I attached to my previous message, each mathematical
 character is mapped to a big-endian UTF-16 hexadecimal string, with Plane-1
 alphanumerics expressed using surrogate pairs.
 
 Thank you, now I see it. The book where I read about /ActualText did
 not mention that I can use UTF16 if I start the string with BOM.

Fair enough; this I had to discover for myself.
The PDF Reference Manual (e.g. for ISO 32000) has no such examples,
so I had to experiment with different ways to specify strings requiring
non-ascii characters. UTF16 is the most elegant, and avoids the messiness
of using escape characters and octal codes, even for some non-letter
ASCII characters.
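
For instance, to tag a single accented letter by hand (a minimal sketch,
not output from my scripts; U+00E9 is the codepoint of é):

  \special{pdf:literal direct /Span<</ActualText<FEFF00E9>>> BDC}%
  \'e% the typeset content; its extracted text becomes U+00E9
  \special{pdf:literal direct EMC}

The <FEFF...> hex string is big-endian UTF-16, with the BOM first.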

 Can I
 see the source of the PDF? It could help me much to see how you do all
 these things.

Each piece of mathematics is captured, saved to a file, converted to MathML,
then run through my Perl script to create alternative (La)TeX source.
This is done to be able to create a fully-tagged PDF description of the 
mathematical content, using a special version of  pdftex  that Han The Thanh
created for me (and others) --- still in experimental stage.

You should not need all of this machinery, but I'm happy to answer
any questions you may have.

I've attached a couple of examples of the output from my Perl script, 
in which you can see how the /ActualText  replacement strings
are specified, using a macro \SMC — which ultimately expands to use
the  \pdfstartmarkedcontent  primitive.



2013-Assign2-soln-inline-2-tags.tex
Description: Binary data


2013-Assign2-soln-inline-1-tags.tex
Description: Binary data


Without the special primitives, you should be able to use  \pdfliteral 
to insert the tagging needed for just using  /ActualText .

 
 I see no reason why Indic character strings could not be done similarly.
 You probably need some on-the-fly preprocessing to work out the required
 strings to use.


I'm not sure whether there is a LaTeX package that allows you to get the
literal bits into the correct place without upsetting other fine
details of the typesetting with Indic characters.
This certainly should be possible, at least when using  pdfLaTeX .
Not sure of the details using XeTeX — but you work with the source code,
so can devise anything that is needed, right?

 
 -- 
 Zdeněk Wagner
 http://hroch486.icpf.cas.cz/wagner/
 http://icebearsoft.euweb.cz



Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] turn off special characters in PDF

2013-12-29 Thread Ross Moore
Hi Joe,

On 30/12/2013, at 8:12 AM, Joe Corneli wrote:

 This answer talks about how to turn off ligatures:
 http://tex.stackexchange.com/a/5419/4357
 
 Is there a way to turn off *all* special characters (e.g. small caps)
 and just get ASCII characters in the copy-and-paste level of the PDF?

In short, no!
 — because this is against the idea of making more use of Unicode,
across all computing platforms.

Certainly a ligature can have an /ActualText replacement consisting
of the separate characters, but this requires the PDF producer
to have supplied this within the PDF, as it is being generated.

I've played a lot with this kind of thing, and think that this
is the wrong approach. One should use /ActualText to provide
the correct Unicode replacement, when one exists. Thus one
can extract textual information reliably, even when the PDF
uses legacy fonts that may not contain a /ToUnicode resource,
or if that resource is inadequate in special situations.


Besides, do you really mean *all* special characters?
What about simple symbols like: ß∑∂√∫Ω  and all the other 
myriad foreign/accented letters and mathematical symbols?

If you want these to Copy/Paste as TeX coding (\ss, \sum, \partial,  
\int, etc.) within documents that you write yourself, then I wrote 
a package called  mmap  where this is an option for the original 
Computer Modern fonts.


Alternatively, a PDF reader might supply a filtering mode that
converts the ligatures back to separate characters. Then the
user ought to be able to choose whether or not to use this filter.
I don't know of any that actually do this.
(In any case, you would want such a tool to allow you to specify
which characters to replace, and which to preserve.)


Your best option is surely to (get someone else to) write such 
a filter that meets your needs, and use it to post-process the text 
extracted via Copy/Paste or with other text-extraction tools.

Of course this is no use if your aim is to create documents for
which others get the desired result via Copy/Paste.
For this, the /ActualText approach is what you need.



Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] wrong glyphs with FreeSerif Italic

2013-12-26 Thread Ross Moore

On 27/12/2013, at 9:39 AM, Zdenek Wagner wrote:

 Sorry, I cannot reproduce it, there must be something wrong in your
 installation. I tried both with TeX Live 2011 and TeX Live 2013 and I
 get the expected result.

Me too, with:

 This is XeTeX, Version 3.1415926-2.2-0.9997.4 (TeX Live 2010)
and
 This is XeTeX, Version 3.1415926-2.4-0.9998 (TeX Live 2012)


With 2010 the font versions, as encoded in the font itself, are
  FontForge 2.0 : Free Serif : 4-1-2009Version $Revision: 1.358 $
  FontForge 2.0 : Free Serif Italic : 4-1-2009   Version $Revision: 1.175 $

With 2012 the font versions, as encoded in the font itself, are 
  GNU: FreeSerif Normal: 2012   Version 0412.2263
  GNU: FreeSerif Italic: 2012   Version 0412.2268


With the 2012 font, I get a lot of warnings about unsupported features;
viz.

*
* fontspec warning: icu-feature-not-exist-in-font
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif',
* with script 'Latin', and language 'Default'.
*

(the same warning is repeated several times, and likewise for the
'FreeSerif/B', 'FreeSerif/I' and 'FreeSerif/BI' faces).



 
 2013/12/26 Julian Bradfield jcb+xe...@jcbradfield.org:
 This is probably a FAQ, but I haven't found it by searching...
 
 I'm a first-time user of xelatex (but 30-year user of TeX in general),
 and have used it to typeset a linguistic article with Charis SIL. I
 then wanted to switch to GNU Freefont, and encountered the weird
 symptom that all the glyphs are displaced by two codepoints in the
 Italic version.
 Here's a minimal example:
 
 \documentclass{article}
 \usepackage{mathspec}
 \setallmainfonts(Digits,Latin,Greek,Special)[Mapping=tex-text,Fractions=Off]{FreeSerif}
 \begin{document}
 ABCabc \it ABCabc
 \end{document}
 
 
 On processing, the PDF shows ABCabc CDEcde; the right character
 metrics appear to have been used, but the glyphs are wrong.
 
 My xelatex version is
 This is XeTeX, Version 3.1415926-2.3-0.9997.5 (TeX Live 2011) 
 (format=xelatex 2012.11.27)
 and the Freefont is the release of 20120503 (in either otf or ttf).

Sorry, I don't have TeX Live 2011 installed, nor 2013.
Though I'd suspect the font itself for such a result.


 -- 
 Zdeněk Wagner
 http://hroch486.icpf.cas.cz/wagner/
 http://icebearsoft.euweb.cz


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114



Re: [XeTeX] XeTeX : images as links

2013-09-20 Thread Ross Moore
Hi Phil,

On 21/09/2013, at 3:23 AM, Philip Taylor wrote:

 In a forthcoming PDF catalogue of Greek MSS, a number of thumbnail
 images of folia, bindings, etc., will appear, many if not all of
 which will be expected to function as hyperlinks to full-sized
 (or perhaps pannable/zoomable) versions of the same.  However,
 endeavouring to achieve this functionality using either Eplain's
 \hyperref or Eplain's \hlstart/\hlend fails to produce the desired
 effect -- whilst text can act as a clickable region for a hyperlink,
 an image included using \XeTeXpicfile seemingly cannot.   
 
 The following, a verbatim copy from the test file, demonstrates
 the problem --

Can you post a PDF, preferably uncompressed,
so we can look at how the hyperlink is specified.

 
 
   \catcode `\< = \catcode `\@
   \input eplain
   \catcode `\< = \active
   
   \enablehyperlinks
   
   \uselanguage {UKenglish}
   
   \hlstart {url}{}{http://example.org/fullsize}
   \hbox
   \bgroup
   \XeTeXpicfile Images/LPL-MS-1214/JPG/f1r.jpg height 0,25\vsize
   \egroup
   \hlend
   
   \vskip \baselineskip
   
   \hbox
   \bgroup
   \hlstart {url}{}{http://example.org/fullsize}
   \XeTeXpicfile Images/LPL-MS-1214/JPG/f1r.jpg height 0,25\vsize
   \hlend
   \egroup
   
   \vskip \baselineskip
   
   \href{http://example.org/fullsize}{\hbox {\XeTeXpicfile
 Images/LPL-MS-1214/JPG/f1r.jpg height 0,25\vsize}}
   
   \end

What is this  0,25\vsize  ?
Should it not be  0.25\vsize  for TeX to get the correct
vertical dimension?

Try also:  
  \setbox0=\hbox{\XeTeXpicfile Images/LPL-MS-1214/JPG/f1r.jpg height 0,25\vsize }
then
  \message{height=\the\ht0 + \the\dp0, width=\the\wd0}
to see what height you are really getting.

Then set the link with:
  \href{http://example.org/fullsize}{\box0}

 
 Needless to say, if any of the \XeTeXpicfiles are replaced by text,
 all works as expected.
 
 Can anyone please explain why this does not work, and how the
 problem can best be transcended ?  (NB.  UNIV = Plain XeTeX, not XeLaTeX)
 
 Philip Taylor
 
 P.S. The catcode stuff at the top is because the real project goes on
 to load an XML file.


I'd doubt that there's any problem caused by this, with < and >
**unless** these are used internally by packages which you load *after* the
catcode changes.

e.g. conditionals may fail
   \ifnum ... < ...
if the catcodes had not been set up robustly within the package.

But surely you would have noticed problems of this kind already,
if they were indeed going to occur with your larger document.


Cheers

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] On an ugly hack to mathbf only in the local style.

2012-10-30 Thread Ross Moore
Hi Michaël,

On 31/10/2012, at 1:39 AM, Michaël Cadilhac wrote:

 Howdy,
 
 \vec{v}_1 ?
 
 Herb,
 
 Thanks, but of course, I'd like to avoid going through hundreds of pages (ok,
 a script would be easy to write, but still...).  Also, I'd like to keep the
 semantics \vec{T} is for a vector T, whether T=v or T=v_1.

It's a pity that you chose to write your manuscripts this way.
You can see how difficult it gets when you write a macro
that represents just an abstract concept, without detailed thought
for all the different ways it may be used.

What I do for this kind of thing is:

 \newcommand{\TT}{\boldsymbol{T}}  
 \newcommand{\vv}{\boldsymbol{v}}  

% \boldsymbol (from the amsmath bundle) gives a bold-italic, rather than bold-upright

then use it in the body material as:

  \TT  or  \TT_1  or  \vv^{(1)}_2  etc.

When reading your own source coding, you see `\TT' and
think `vector T' or just `T' --- which are what you would 
say out loud if you were writing on a black/white-board
while giving a lecture.

The other advantage of doing it this way is that you do not
need to change the body of your document when you choose, in
future, to use a different kind of processor, creating a view 
of your document for a different format: HTML, XML, tagged-PDF, 
ePub, MathML, etc.

You'll only need to adjust the macro definitions to add whatever
is necessary for the required kind of enrichment.


 
 Thanks!
 
 M.



Cheers

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] how to do (better) searchable PDFs in xelatex?

2012-10-15 Thread Ross Moore
Hi Peter, Jonathan,

On 16/10/2012, at 2:02, Peter Baker ps...@virginia.edu wrote:

 On 10/15/12 10:59 AM, Jonathan Kew wrote:
 
 That's exactly the problem - these glyphs are encoded at PUA codepoints, so 
 that's what (most) tools will give you as the corresponding character data. 
 If they were unencoded, (some) tools would use the glyph names to infer the 
 relevant characters, which would work better.
 
 Small caps are named like a.sc and they are unencoded.
 And as they're unencoded, (some) tools will look at the glyph name and map 
 it to the appropriate character.
 
 I've been trying to explain this:  but Jonathan does it much better than I 
 did, and with more authority.

Yes, but why would the tools be designed this way?
Surely unencoded means that the code-point has not been assigned yet, and may 
be assigned in future. So using these is asking for trouble.
Was not the intention of PUA to be the place to put characters that you need 
now, but have no corresponding Unicode point? This is precisely where using the 
glyph name should work. Or am I missing something?

So why would the tool be designed to infer the right composition of characters 
when a ligature is properly named at an unencoded point, but that same 
algorithm is not used when it is at a PUA point?

 
 P.

Perplexed.

Ross

PS. Would not this particular issue with ligatures be resolved with a 
/ToUnicode  CMap for the font, which can do one–many assignments? 
Yes, this does not handle the many–one and many–many requirements of complex 
scripts, but that isn't what was being reported here, and it is a much harder 
recognition problem.
Besides, it isn't clear what copy-paste should best produce there, nor how to 
specify the desired search.
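
For instance, the relevant fragment of such a CMap, doing a one–many
assignment, might read (a sketch; the glyph code <0015> is invented):

  1 beginbfchar
  <0015> <00660074>
  endbfchar

which maps that glyph code (say, for an f_t ligature) to the two
characters 'f' and 't'.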


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] how to do (better) searchable PDFs in xelatex?

2012-10-14 Thread Ross Moore
Hi Peter,

On 15/10/2012, at 1:45 PM, Peter Baker wrote:

 On 10/14/12 7:47 PM, msk...@ansuz.sooke.bc.ca wrote:
 If font designers did that, and if PDF readers looked at the glyph names 
 according to Adobe's directions, then searches would work regardless of PUA 
 use. However, not all fonts and not all readers do this.
 My experience is that 'not all' = 'none'. I've tested my own font (Junicode) 
 in Adobe Reader, Preview, Evince and GoodReader (with PDFs generated by 
 XeTeX), and the result is the same in all. Standard ligatures (those encoded 
 at FB00 and following) work fine, but others do not. For example, Junicode 
 has an f_t ligature in the PUA, properly named, and when that is used you 
 cannot search for 'after' or 'often' in any of those PDF readers. But when I 
 move it out of the PUA into an unencoded slot, it works fine.

Any chance of providing example PDFs of this?
(preferably using uncompressed streams, to more easily
examine the raw PDF content)

Do the documents also have CMap resources for the fonts,
or is the sole means of identifying the meaning of the
ligature characters coming from their names only?

Have these difficulties been reported to Adobe recently?
If not, would you mind me doing so?

 
 Same with Libertine.
 
 Peter


Cheers

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] The future of XeTeX

2012-07-30 Thread Ross Moore
Hi Phil,

On 31/07/2012, at 7:35 AM, Philip TAYLOR wrote:

 
 OK, but how about things such as graphics ?  I already know
 that I cannot use (e.g.,) \pdfximage, \pdfrefximage and
 \pdflastximage in XeTeX, and have to use instead \XeTeXpicfile;
 if I were to migrate to LuaTeX, would I be correct in expecting
 to find a whole new set of graphics primitives ?

I know you don't like using LaTeX, but really people have
put a huge amount of effort into that, and its packages.

When I occasionally still use Plain TeX, almost always
I make use of:

   \input miniltx.tex 

This loads a *very limited* subset of LaTeX, but enough
to allow  \usepackage  to work.
Some LaTeX packages have been written in TeX, so as to not 
depend on other parts of LaTeX for loading and doing their stuff.
This includes graphics and graphicx, and perhaps color too.
(Xy-pic was written this way too!)

So you can make use of the  \includegraphics  command,
along with its options and driver support, from within
your Plain TeX documents.
This then allows you to just use them with all their
associated power and effects, without having to worry 
about how it all works ...

OR
   ... turn on some  \tracingall  to find out just what
primitives are being constructed, to fully satiate your 
ever-inquiring mind.


  And what of
 \XeTeX's extensions to the \font primitive ?

Isn't there some documentation about that?

 
 ** Phil.


Cheers,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Footnote rule line without any footnote text material

2012-07-12 Thread Ross Moore
Hi Andy, and Peter,

On 13/07/2012, at 1:38 AM, Andy Black wrote:

 On 7/12/2012 2:29 AM, Peter Dyballa wrote:
 Am 12.07.2012 um 02:02 schrieb Andy Black:
 

 Is it necessary, for an almost minimal test case, to set up fancyhdr and 
 hyperref?
 
 As you may have guessed, the TeX code is generated automatically from an XML 
 mark-up language

That explains all the unnecessary extra grouping,
and in-body commands where an environment would be more appropriate
 --- well, it would help to make the coding easier to read,
and debug, by making the structure clearer.
But none of this is actually wrong.


 for linguistic documents (see http://www.xlingpaper.org/). This generation 
 process is attempting to use many of the features that LaTeX and friends 
 provide (to avoid having to re-invent the wheel) while still allowing for 
 larger variations in layout parameters than basic LaTeX has.  What is here is 
 what I found to work.  I'm not recalling the exact reason why I used MainFont 
 but I do know that the
 
\font\MainFont=/font family name/ at /pointsize/pt
 
 was a way to allow for varying font families and point sizes, including 
 larger or smaller than the three I understand LaTeX provides (10, 11, and 12).
 
 I would not be shocked to learn that there is a better way to do this, but I 
 found this to work.

First off, dump the XML elements as environments.
Put the constant coding into the \newenvironment  definitions,
read from a separate package or document-class file.

If there are attribute values that need to be passed as parameters,
then that's OK. You can specify the number of parameters, via:

  \newenvironment{envname}[num]{%
 ... coding at the start of the environment ...
  }{%
 ... coding at the end of the environment ...
  }
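
For instance (environment and parameter names invented for illustration):

  \newenvironment{xlingexample}[1]{%
 \begingroup\itshape  % constant set-up the generator need not repeat
 \noindent Example #1:\par
  }{%
 \endgroup
  }

so that the generator only has to emit
  \begin{xlingexample}{...} ... \end{xlingexample}
around each element's content.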

 
 
 Why is
 
  \protect\footnote
 
 necessary?

Irrelevant. Tracing shows that  \protect = \relax  
at the time this is called.

 
 There are situations when a footnote is embedded within other constructions 
 (perhaps a table within a table -I'm not recalling the exact context) where 
 the \protect was necessary.  Rather than coding the TeX generator to have to 
 determine the set of contexts where the \protect was required, I opted to 
 just always use it.
 
 Could this provoke setting the footnotes line on every page?
 
 No, it doesn't.  

Correct.

 I'm using the \protect\footnote for every footnote and it is only in very, 
 very rare circumstances that we get the extra footnotes line.
 
 In addition, I also just removed the \protect in the .tex file I sent and 
 re-ran it using tl 2012.  The extra footnotes line is still there.
 
 Thanks again so much for exploring this with me.

The problem is definitely related to  {longtable} 
since it calls \output  and fiddles with the \pagegoal .

Your example has a  \begin{longtable} ... \end{longtable}
at the bottom of the page, prior to where the unwanted rule
occurs.

I can make the extra footnote rule go away, by including
some extra space at the bottom of the table; viz.
  
  ... table cell data ...
  \\ \noalign{\vspace{some amount}}
  \end{longtable}

Varying the some amount, one can either get the extra rule
or suppress it.  More space, beyond some limit, suppresses
the rule, and has no other effect.
What that limit is may vary according to the actual vertical
size of the tabular material, so this does *not* give an easy
way to solve the problem programmatically.


What I suspect is happening is that when  \LT@output  is called,
{longtable} is aware that a footnote is around, and splits its
contents, perhaps leaving only glue to go onto the next page.
On that next page, the {longtable} environment finishes, and still
thinks that there is a footnote to be placed, even though there
is no content remaining.
The glue at the top of the page gets discarded, but the apparent
presence of footnote material is retained.

When extra space is added at the end of the {longtable} the 
desirability of where to split the table changes, and perhaps
the whole table now gets placed on the first page. 
There is no carry-over of any knowledge of a footnote, so no
extra line is drawn on the next page.

This is all pretty-much speculation.
Someone more familiar with the inner workings of {longtable}
may be able to make more sense of what is happening.

 
 --Andy
 
 
 --
 Greetings
 
   Pete
 
 The human brain operates at only 10% of its capacity. The rest is overhead 
 for the operating system.

Needed more than 10% to study this weirdness!


Cheers,

Ross



Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114

Re: [XeTeX] Extraneous comma shows conflict between memoir and xecjk package?

2012-06-06 Thread Ross Moore
Hello Jon,

On 07/06/2012, at 5:15 AM, jon wrote:

 THE BAD
 The following does not produce expected results. There is an *extraneous 
 comma* (or superscripted comma) inserted  into the body text directly before 
 every footnote number.
 
 \documentclass[12pt]{memoir}
 
 \usepackage{fontspec}
 \setmainfont[Mapping=tex-text,Numbers=OldStyle]{Linux Libertine O}
 
 
 \usepackage{xeCJK}
 
 \setCJKmainfont[]{AR PL UKai TW}
 
 \begin{document}

{\tracingall
 But if I don't ``\emph{See through it and let it 
 go},\footnote{看得破,放得下。\emph{Kàn de pò, fàng de xià}.} the karma will carry 
 forward
 
 
 
}

 \end{document}

 The book is 99.5% typeset so it is not practical to switch away from memoir 
 at this point. If there is a way to use Chinese fonts without using xeCJK, I 
 could try that. Any ideas how to get rid of that extraneous comma?

First you need to find out where it is coming from.
Inserting \tracingall (within a group {...} to limit the scope)
will produce lots of output in the .log  or Console window.

You should be able to search this to find where the extra comma
is inserted, then follow backwards the various macro-expansions
that caused this.

Once the source is located, you should be able to make a single
macro re-definition in your document preamble, to prevent 
the behaviour that you do not want.


This may seem a rather tedious way to tackle the problem, 
but it is reliable and very instructive for solving such
delicate problems, if you are interested in programming. 
— Not everyone's cup of tea, though.

 
 Thanks!
 
 Jon
 --
 Jon Babcock


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeLaTeX and SIunitx

2012-05-13 Thread Ross Moore
Hi Ulrike, and Bruno,

On 13/05/2012, at 11:05 PM, Ulrike Fischer wrote:

 Am Fri, 11 May 2012 19:44:00 +0200 schrieb Bruno Le Floch:
 
 I'm really no expert, but the siunitx package could include, e.g., µ
 as ^^^^00b5.  This would not make pdftex choke when appearing in the
 false branch of an engine-dependent conditional.
 
 Using ^^..-notation is certainly a good idea in styles - regardless
 of the engine - as it avoids encoding confusion.

If by styles, you mean in a macro definition made within 
a separate style file, then I agree with you 100%.

But ...

 But it doesn't
 solve the problem here as pdftex chokes if it sees more than two ^^: 
 

  ... this is not a good example to support this view.

 \documentclass{article}
 \begin{document}
 ^^^^00b5
 \end{document}

The body of your document source should be engine independent,
so this should look more like:


\documentclass{article}
\usepackage{ifxetex}

\ifxetex
 \newcommand{\micronChar}{^^^^00b5}
  % handle other characters
  ...
\else
 \if ... 
  % handle other possibilities
  %  e.g.  ^^c2^^b5
  ...
 \fi
\fi

\begin{document}
\micronChar
\end{document}


Better still, of course is to have the conditional
definitions made in a separate file, so that similar things
can all be handled together and used in multiple documents.

You want to avoid having to find and replace multiple instances
of the special characters, when you share your work with colleagues
or need to reuse your own work in other contexts.
Instead you should only need to adjust the macro expansions,
and all that previous work will adapt automatically.

 
 
 ! Text line contains an invalid character.
 l.9  ^^^
^00b5
 ? x
 
 
 For pdftex you would have to code it as two 8-bit octets:  ^^c2^^b5
 But this naturally will assume that pdftex is expecting utf8-input.
 
 -- 
 Ulrike Fischer 

Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Producer entry in info dict

2012-02-28 Thread Ross Moore
Hi Heiko,

On 29/02/2012, at 8:44 AM, Heiko Oberdiek wrote:

 Hello,
 
 the entries in the information dictionary can be controlled
 at TeX macro level except for /Producer:
 
 % xetex --ini
 \catcode`\{=1
 \catcode`\}=2
 \shipout\hbox{%
  \special{pdf:docinfo%
/Producer(MyProducer)%
/Creator(MyCreator)%
/Author(MyAuthor)%
/Title(MyTitle)%
/Subject(MySubject)%
/Keywords(MyKeywords)%
/CreationDate(D:2012010100Z)%
/ModDate(D:2012010100Z)%
/MyKey(MyValue)%
 }%
 }
 \csname @@end\endcsname\end

Surely  /Creator  is (La)TeX, Xe(La)TeX, ConTeXt, etc.
while   /Producer  is the PDF engine:  
   Ghostscript, xdvipdfmx, pstopdf, Acrobat Distiller, etc.
and  /Author  is the person who wrote the bulk of
the document source.

Why should it be reasonable that an author can set the
 /Producer and /Creator  arbitrarily within the document 
source?

The author chooses his workflow, and should pass this
information on to the appropriate package ...

 
 The entry for /Producer gets overwritten by xdvipdfmx,
 e.g. xdvipdfmx (0.7.8). Result:
 
 * Bug-reports/hyperref: pdfproducer={XeTeX ...} does not work.
 * hyperxmp is at a loss, it *MUST* know the value of the
  /Producer, because the setting in the XMP part has to be
  the same.

  ... via options to  \usepackage[...]{hyperxmp}

and the package should be kept up-to-date with the exact strings
that will be produced by the different processing engines, in all 
their existing versions.
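
i.e. something along the lines of (a sketch only; whether these keys
survive xdvipdfmx's overwriting is the very point at issue here):

  \usepackage{hyperref}
  \usepackage{hyperxmp}
  \hypersetup{%
    pdfauthor   = {MyAuthor},
    pdftitle    = {MyTitle},
    pdfproducer = {xdvipdfmx (0.7.8)},% must match what the driver writes
  }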


I know that one processor cannot know in advance how its output
will be further processed, but that is not the point of XMP.

The person who is the author, or production editor, *does* know 
this information (at least in principle) and should ensure that 
this gets encoded properly within the final PDF --- if complete 
validation against an existing standard is of any importance.


 
 Please fix this issue in xdvipdfmx.

I'm not sure that it is  xdvipdfmx's duty to handle this
issue; though see my final words below.

My initial thoughts are as follows:

The nature and purpose of XMP  is such that an author
cannot just  \usepackage{hyperxmp}   with no extra options,
and expect the XMP information to be created automagically,
correctly in every detail.


The alternative is to have an auxiliary file that contains
macro definitions, to be used both in the  docinfo  and XMP.
This auxiliary file needs to be created either manually,
or automatically extracting the information from a PDF,
first time it is created.

With PDF/A and PDF/UA the XMP metadata stream is not supposed to be 
compressed, so automating this is not so hard --- though 
it may well be platform-dependent.
(Not sure about other flavours of PDF/??? .)


 
 Yours sincerely
  Heiko Oberdiek


BTW, what about the  /CreationDate  and  /ModDate ?
Surely these should be set automatically too ?
Doesn't  pdfTeX  have the means to do this?

Of course when it is a 2-engine process, such as
  XeTeX + xdvipdfmx 
then which time should be encoded here?
XeTeX cannot know the time at which  xdvipdfmx  will do 
its work.  Maybe it can extrapolate ahead, from information
saved from the previous run ?


So maybe what is really desirable is for  xdvipdfmx  to write
out an auxiliary file containing all relevant metadata, including
timings, that can then be used by the next run of  XeLaTeX .
A  \special{ ... }  command could be used to trigger the need
for such an action to be performed.

Is that what you had in mind?



Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] fontspec loading the wrong font?

2011-12-15 Thread Ross Moore
Hello Daniel,

On 16/12/2011, at 8:43 AM, Daniel Greenhoe wrote:

 I have run into a very strange problem when using fontspec and trying
 to test a new experimental version of GNU FreeSerif. In particular,
 suppose I try labeling the old FreeSerif as \fntFreeSerif and the new
 experimental FreeSerif as \fntFreeSerifx like this:

Try doing some detailed tracing, using  \tracingall :

{\tracingall % detailed trace of just the next 2 top-level commands
 \newfontfamily{\fntFreeSerif}[
   ExternalLocation,
   Path   = {/xfonts/gnuFreeFont/},
   Extension  = {.otf},
   UprightFont= {*},
   BoldFont   = {*Bold},
   ItalicFont = {*Italic},
   BoldItalicFont = {*BoldItalic},
   ]{FreeSerif}
 
 \newfontfamily{\fntFreeSerifx}[
  ExternalLocation,
  Path   = {/xfonts/gnuFreeFont/2011dec12/},
  Extension  = {.ttf},
  UprightFont= {*},
  BoldFont   = {*Bold},
  ItalicFont = {*Italic},
  BoldItalicFont = {*BoldItalic},
  ]{FreeSerif}
} % closing delimiter to restrict the scope of \tracingall


Then study the .log file output.
There will be *masses* of extra output lines, most of which
are quite irrelevant to your needs.
Nevertheless, you may be able to spot where something is obviously
not how you would like it to be.

 
 Then XeLaTeX seems to get confused and does not seem to find the new
 \fntFreeSerifx font, but is maybe using \fntFreeSerif or another
 version of FreeSerif, perhaps one in my Texlive setup.
 
 In the log file, both fonts are assigned the same label FreeSerif(0):
 
 . Font family 'FreeSerif(0)' created for font 'FreeSerif' with options [
 . ExternalLocation, Path = {/xfonts/gnuFreeFont/}, Extension = {.otf},
 . UprightFont = {*}, BoldFont = {*Bold}, ItalicFont = {*Italic},
 . BoldItalicFont = {*BoldItalic}, ].
 
 . Font family 'FreeSerif(0)' created for font 'FreeSerif' with options [
 . ExternalLocation, Path = {/xfonts/gnuFreeFont/2011dec12/}, Extension =
 . {.ttf}, UprightFont = {*}, BoldFont = {*Bold}, ItalicFont = {*Italic},
 . BoldItalicFont = {*BoldItalic}, ].
 
 
 But if I comment out any *one* (or all four) of the shape directive
 lines like this
 
 \newfontfamily{\fntFreeSerifx}[
  ExternalLocation,
  Path   = {/xfonts/gnuFreeFont/2011dec12/},
  Extension  = {.ttf},
   UprightFont= {*},
   BoldFont   = {*Bold},
   ItalicFont = {*Italic},
 %   BoldItalicFont = {*BoldItalic},
  ]{FreeSerif}
 
 then the problem goes away, and the two fonts are given different labels:
 
 . Font family 'FreeSerif(0)' created for font 'FreeSerif' with options [
 . ExternalLocation, Path = {/xfonts/gnuFreeFont/}, Extension = {.otf},
 . UprightFont = {*}, BoldFont = {*Bold}, ItalicFont = {*Italic},
 . BoldItalicFont = {*BoldItalic}, ].
 
 . Font family 'FreeSerif(1)' created for font 'FreeSerif' with options [
 . ExternalLocation, Path = {/xfonts/gnuFreeFont/2011dec12/}, Extension =
 . {.ttf}, UprightFont = {*}, BoldFont = {*Bold}, ItalicFont = {*Italic}, ].
 
 Is this something I am doing wrong, a fontspec bug, or a problem with
 FreeSerif and variants?

The .log output using  \tracingall  may offer some clues to help
someone to answer this question.

 
 Many thanks in advance,
 Dan


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] fontspec loading the wrong font?

2011-12-15 Thread Ross Moore
Hi Daniel,

On 16/12/2011, at 9:44 AM, Daniel Greenhoe wrote:

 On Fri, Dec 16, 2011 at 5:56 AM, Ross Moore ross.mo...@mq.edu.au wrote:
 try doing some detailed tracing, using  \tracingall
 
 Thank you for your help. I did try it with the \tracingall directive.
 However, the compilation crashes with message
 ! Undefined control sequence.
 l.85   GNU FreeSerif:\fntFreeSerif


Yes, because of the {...} delimiting.
This makes part of the \newfontfamily  stuff local to that group.

You can scan through that tracing to see whether all is as
it is supposed to be, with respect to filenames of the fonts
being loaded, or being set up for later loading.

The fact that the document fails is irrelevant to obtaining
that information.


If you remove those braces, then you'll trace an awful lot more
of the document, getting masses more output into the .log  file.

The document may now process to completion, but it may be a lot
harder to find the relevant parts to font-loading.


 
 Maybe the log file would still be helpful to someone; but it is huge
 (about 2.4MByte), so I won't attach it to this email. If anyone wants
 the file, I can email it or put it on a publicly accessible server.
 
 Dan


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] tabular in footnote

2011-12-06 Thread Ross Moore
Hello Daniel,

On 07/12/2011, at 8:16 AM, Daniel Greenhoe wrote:

 Thank you everyone for your help with this problem. I will regard it
 as a bug. I hope that someday it can be fully resolved.

Heiko explains why the table doesn't align as you want.

Try this variant of your example.


\documentclass[12pt]{book}
\usepackage[a4paper,noheadfoot,nomarginpar,margin=20mm,showframe]
 {geometry}


% adjust this value to suit
\def\foottableraise{2ex}

% define a new environment
\newenvironment{foottable}{%
 \raise\foottableraise\hbox\bgroup\space
 \begin{tabular}[t]%
 }{%
 \end{tabular}\egroup\vskip\foottableraise
 }

\begin{document}%
  xyz\footnote{%
\raisebox{\foottableraise}{ % inserts a space
\begin{tabular}[t]{|l|}
   \hline
abc\\
def\\
ghj\\
klm\\
\hline
  \end{tabular}%\\
  }%
  \vskip \foottableraise
}
  xyz\footnote{%
\begin{foottable}{|l|}
   \hline
abc\\
def\\
ghj\\
klm\\
\hline
  \end{foottable}%\\
   }
\end{document}%

Note that you need to use TeX's  \raise  and  \bgroup ... \egroup
in the environment definition.
This is because \raisebox reads its argument too soon, so the
start and end of the box cannot then be split between the
\begin and \end of the \newenvironment .

 
 Dan


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Diacritics in color

2011-12-02 Thread Ross Moore
Hi Tobias,


On 03/12/2011, at 6:06, Tobias Schoel liesdieda...@googlemail.com wrote:

 As a teacher I can think of some more Applications. Of course, these are 
 pedagogical:
 
 Teaching scripts to beginners (learning to write a primary school, learning 
 to write in a different script when learning another language (or even in the 
 same language: Mongol?):
 
 You might want to color single parts of a glyph in order to highlight them. 
 So, for example in a handwritten (see 
 http://de.wikipedia.org/wiki/Schulausgangsschrift or english equivalents I 
 haven't found in the time) a the beginning or end-strokes might be colored.

Yes, but do these examples really require parts of the same whole character 
to be coloured differently?

Presuming that the font does allow access to individual glyphs, as if separate 
characters, would not all meaningful aspects be equally well (if not 
better) encoded by an overlay?
That is, position a coloured version of the required glyph over the full 
character in monochrome.
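
In LaTeX terms, a crude form of such an overlay can be had with  \ooalign
(a sketch; the two arguments stand for whatever glyphs the font lets you
access, and they must share the same origin and advance width):

  \usepackage{xcolor}
  \newcommand{\strokedchar}[2]{% #1 = whole character, #2 = the stroke
    \ooalign{#1\cr\textcolor{red}{#2}\cr}}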

In the pedagogical setting, you are presumably talking about the single stroke 
as a sub-part of the whole character, so it deserves to be placed as an entity 
in itself.
This is quite different to a colored diacritical mark modifying the meaning of 
a character.

 
 Of course the font creator has to create sub-glyphs or other fancy stuff, 
 but XeTeX should allow (re)composition of the glyph with different colors.
 
 bye
 
 Toscho

Hope this helps,

   Ross


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Diacritics in color

2011-11-30 Thread Ross Moore
Hi Heiko, and others

On 30/11/2011, at 8:56 PM, Heiko Oberdiek wrote:

 The PDF stuff:
 
  % without color:
  0 -99.63 Td[<0024>]TJ
  54.19 15.57 Td[<0301>]TJ
 
  % with color:
  -54.19 -115.2 Td[<0024>]TJ
  ET 1 0 0 RG 1 0 0 rg BT /F1 99.626 Tf
  59.3 -278.84 Td[<0301>]TJ
 
  % with color via \special{pdf:code ...}:
  0 -378.46 Td[<0024>]TJ
  ET 1 0 0 rg 1 0 0 RG BT /F1 99.626 Tf
  59.3 -378.46 Td[<0301>]TJ
 
 It seems that in XeTeX the color cannot be inserted without
 breaking the PDF text sections (BT...ET). \special{pdf:literal direct ...}
 (or short \special{pdf:code ...}) is not equivalent to
 \pdfliteral direct{...} but rather to \pdfliteral page{...}
 that ends a text section in the PDF output.
 
 If someone want's this issue fixed, make a feature request
 for XeTeX/xdvipdfmx to provide a real \pdfliteral direct/
 direct mode for colors without breaking text sections.

In the experimental version of pdfTeX with support for Tagged PDF,
we need a similar kind of variant to \pdfliteral .
It is called 'enclose'.

Here is how, in the coding of the corresponding  pdftex.web , 
the various options are read, following an occurrence 
of  \pdfliteral :

@ @<Implement \.{\\pdfliteral}@>=
begin
check_pdfoutput(\pdfliteral, true);
new_whatsit(pdf_literal_node, write_node_size);
if scan_keyword("direct") then
    pdf_literal_mode(tail) := direct_always
else if scan_keyword("page") then
    pdf_literal_mode(tail) := direct_page
else if scan_keyword("enclose") then
    pdf_literal_mode(tail) := enclose
else
    pdf_literal_mode(tail) := set_origin;
scan_pdf_ext_toks;
pdf_literal_data(tail) := def_ref;
end

and here is the documentation on those options:

@# {data structure for \.{\\pdfliteral}; node size = 2}
@d pdf_literal_data(#) == link(#+1) {data}
@d pdf_literal_mode(#) == info(#+1) {mode of resetting the text matrix
  while writing data to the page stream}
@# {modes of setting the current transformation matrix (CTM)}
@d set_origin    == 0 {end text (ET) if needed, set CTM to current point}
@d direct_page   == 1 {end text (ET) if needed, but don't change the CTM}
@d direct_always == 2 {don't end text, don't change the CTM}
@d scan_special  == 3 {look into special text}
@d enclose       == 4 {like |direct_always|, but end current string and sync pos}


The 'enclose' option is used to position the 'BDC' and 'EMC'
as closely around the text snippets as can be achieved,
without breaking the BT ... ET.
This seems to be the same kind of requirement here.

It would be nice if any extension to literal specials in  xdvipdfmx
used the same keywords for similar functionality.

 
 Yours sincerely
  Heiko Oberdiek


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Diacritics in color (was Re: XETEX cannot access OpenType features in PUA?)

2011-11-28 Thread Ross Moore
Hi Aleks,

On 29/11/2011, at 6:18 AM, Aleksandr Andreev wrote:

 Jonathan Kew writes:
 
 Making this work in xetex would require a different approach to 
 implementing color.
 
 I have been able to get it to work (the base glyph in black and the
 diacritic in red) in LuaTeX using the luacolor package.
 
 Here's a minimal example:
 
 \documentclass{minimal}
 \usepackage{fontspec}
 \usepackage{xcolor}
 \usepackage{luacolor}
 
 \newfontface\moo{MezenetsUnicode}
 
 \begin{document}
 \moo
 \textcolor{red}{}
 \end{document}

Would you be so kind as to post the PDF from this?
And where does one obtain the font MezenetsUnicode ?
 --- Google gives nothing with this name.

Furthermore my LuaTeX gives a Segmentation Fault, so I cannot
just try with a different font!

 
 I'm not much of an expert in the inner workings of TeX and I know
 absolutely nothing about Lua (is that a derivative of LISP?) so I
 can't comment on whether the luacolor package could be ported to
 XeTeX.

I'd doubt that this could work currently.

My guess is that you would need to do some post-processing
of the PDF code snippet returned from the OS positioning
the glyphs. Once positioned, you would need to wrap the colour
commands around the part which places the diacritic.


 Any insights?

But XeTeX currently does not give you access to that PDF string,
and it is well past the place of macro-expansion in LaTeX, so 
there wouldn't be a mechanism for such late adjustments.

It can be done with LuaTeX, since it does have the appropriate
mechanism for such post-processing.

Others more familiar with how LuaTeX works can confirm this
explanation -- or shoot it down, as appropriate.


 Aleks


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Problems with lmroman10 fonts after Life TeX Update

2011-11-27 Thread Ross Moore
Hi Eckart,
On 28/11/2011, at 4:21 AM, Eckart Hasselbrink wrote:

 Hi,
 
 I need help: I let the TeX Live Utility do its thing last night (I have 
 BasicTeX 2011 installed). Since then I have problems with a document 
 which compiled fine before.
 
 I got the error that t3enc.def could not be found. I learned from the web 
 that I need to install the tipa package, as this is now required by XeTeX.

Xunicode  most certainly does *not* require the tipa package, 
just the file  t3enc.def  for the font encoding that  tipa.sty  uses.
This is so that constructions from  tipa.sty  can be replaced by
their Unicode equivalents, enabling older documents using TIPA
constructions to be processed correctly.

Not sure why TIPA isn't part of your TeX installation anyway.
If there is a good reason for this, I can adjust that part
of  xunicode.sty  to bypass this dependency and the
accompanying TIPA functionality.

Have you tried a version of TeX from TeX Live that is 
a bit more advanced than  BasicTeX ?


 Now, I am stuck with the following error messages:
 
 (/usr/local/texlive/2011basic/texmf-dist/tex/latex/euenc/eu1lmr.fd)kpathsea: 
 Invalid fontname `[lmroman10-regular]', contains '['

Parsing to find the font name has failed somehow here.
Someone else can comment.

 
 ! Font EU1/lmr/m/n/10=[lmroman10-regular]:mapping=tex-text at 10.0pt not
 loadable: Metric (TFM) file or installed font not found.
 to be read again 
   relax 
 l.100 \fontencoding\encodingdefault\selectfont
 
 ? 
 ) (/usr/local/texlive/2011basic/texmf-dist/tex/xelatex/xunicode/xunicode.sty
 (/usr/local/texlive/2011basic/texmf-dist/tex/latex/tipa/t3enc.defkpathsea: 
 Invalid fontname `[lmromanslant10-regular]', contains '['
 
 ! Font EU1/lmr/m/sl/10=[lmromanslant10-regular]:mapping=tex-text at 10.0pt
 not loadable: Metric (TFM) file or installed font not found.
 to be read again 
   relax 

 
 These do not make any sense to me. How do I remedy this situation?
 
 TIA,
 
 Eckart




Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] VIQR pre-processor written in (Xe)TeX ?

2011-11-24 Thread Ross Moore
Hi Andrew, and Phil,

On 25/11/2011, at 9:38 AM, Andrew Cunningham wrote:

 Word final vowels with punctuation following, e.g. full stop, question mark.
 
 the following sentence:
 
 Tôi yêu tiếng nước tôi từ khi mới ra đời.
 
 is represented in strict VIQR as:
 
 To^i ye^u tie^'ng nu+o+'c to^i tu+` khi mo+'i ra ddo+`i\.
 
 Notice the escaping of the full stop at the end of the sentence.
 Without the escaping the full stop would be converted to diacritic
 below the letter i.

What a pain for TeX, which already uses  '\.' to put 
a dot-below accent on a character.


At least here we have \.space  (since '.' has catcode 12, not 11).
So it should be possible to modify the definition of \.{ } within
the NFSS processing of \.  to assign a special meaning.
This is coding that could be easily included within Xunicode.sty , 
but applied only optionally, according to encoding or within a font-switch.

You would need the user's code  \.space  (and equivalently \.{ })
to expand to a sequence of:
   '\' + '.'  each with catcode 12,  followed by a space.
Then your TecKit .map patterns can be defined to respect this.

There will be problems when \. is meant to be used as a symbolic
separator for other purposes; e.g. as a decimal point,
or something like:  word\.word   (for whatever purpose).
Or at the end of a block of characters with no trailing space.

An alternative may be to define some other way to get the '\'
with catcode 12.

Or within environments that want to use VIQR data, the \catcode
of '\' is set to 12, with some other character ('|' say) becoming
TeX's category 0  escape-character.

This is probably best, as you'll need to define such environments
anyway. However, it makes it awkward to pass VIQR strings around 
within the arguments of macros. 
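
A rough sketch of such an environment (untested, and assuming the
actual VIQR conversion is done by a TECkit mapping set up elsewhere):

  \newenvironment{viqr}{%
    \catcode`\|=0 %   '|' becomes the escape character
    \catcode`\\=12 %  '\' becomes an ordinary printing character
  }{}
  %  NB: the body must be closed with  |end{viqr} , since '\'
  %  is no longer TeX's escape character inside the environment.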


 
 or another example:
 
 Anh ddi dda^u\?
 
 This escaping is part of VIQR and any input system that is based on VIQR.

Offhand I cannot think of any alternative meaning assigned to \? .
Is there one, in any special language, implemented in (La)TeX ?


 
 
 Andrew
 -- 
 Andrew Cunningham
 Senior Project Manager, Research and Development
 Vicnet
 State Library of Victoria
 Australia
 
 andr...@vicnet.net.au
 lang.supp...@gmail.com


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Whitespace in input

2011-11-18 Thread Ross Moore
Hi Zdenek,

On 19/11/2011, at 9:51 AM, Zdenek Wagner wrote:

 This is a demonstration that glyphs are not the same as characters. I
 will start with a simpler case and will not put Devanagari into the
 mail message. If you wish to write a syllable RU, you have to add a
 dependent vowel (matra) U to a consonant RA. There is a ligature RU,
 so in PDF you will not see RA consonant with U matra but a RU glyph.
 Similarly, TRA is a single glyph representing the following
 characters: TA+VIRAMA+RA. The toUnicode map supports 1:1 and 1:many
 mappings thus it is possible to handle these cases when copying text
 from a PDF or when searching. More difficult case is I matra (short
 dependent vowel I). As a character it must always follow a consonant
 (this is a general rule for all dependent vowels) but visually (as a
 glyph) it precedes the consonant group after which it is pronounced.
 The sample word was kitab (it means a book). In Unicode (as
 characters) the order is KA+I-matra+TA+A-matra(long)+BA. Visually
 I-matra precedes KA. XeTeX (knowing that it works with a Devanagari
 script) runs the character sequence through ICU and the result is the
 glyph sequence. The original sequence is lost so that when the text is
 copied from PDF, we get (not exactly) i*katab.

/ActualText is your friend here.
You tag the content and provide the string that you want to appear
with Copy/Paste as the value associated to a dictionary key.

There is a macro package that can do this with pdfTeX, and it is 
a vital part of my Tagged PDF work for mathematics.
Also, I have an example where the CJK.sty package is extended
to tag Chinese characters built from multiple glyphs so that
Copy/Paste works correctly (modulo PDF reader quirks).

Not sure about XeTeX.

I once tried to talk with Jonathan Kew about what would be needed 
to implement this properly, but he got totally the wrong idea 
concerning glyphs and characters, and what was needed to be done
internally and what by macros. The conversation went nowhere.

 Microsoft suggested
 what additional characters should appear in Indic OpenType fonts. One
 of them is a dotted ring which denotes a missing consonant. I-matra
 must always follow a consonant (in character order). If it is moved to
 the beginning of a word, it is wrong. If you paste it to a text
 editor, the OpenType rendering engine should display a missing
 consonant as a dotted ring (if it is present in the font). In
 character order the dotted ring will precede I-matra but in visual
 (glyph) order it will be just opposite. Thus the asterisk shows the
 place where you will see the dotted circle. This is just one simple
 case. I-matra may follow a consonant group, such as in word PRIY
 (dear) which is PA+VIRAMA+RA+I-matra+YA or STRIYOCIT (good for women)
 which is SA+VIRAMA+TA+VIRAMA+RA+I-matra+YA+O-matra+CA+I-matra+TA. Both
 words will start with the I-matra glyph. The latter will contain two
 ordering bugs after copy&paste. Consider also word MURTI (statue)
 which is a sequence of characters

This sounds like each word needs its own /ActualText .
So some intricate programming is certainly necessary.
But \XeTeXinterchartoks  (is that the right spelling?)
should make this possible.
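
For the record, the macro level of such tagging can look roughly like
this (a sketch only: the macro name is my own, and the  pdf:literal
special is an assumption to check against the special manual of your
xdvipdfmx version):

  \newcommand{\ActualText}[2]{%
    % #1 = UTF-16BE hex encoding (with BOM) of the real characters,
    % #2 = the typeset material whose extracted text it overrides
    \special{pdf:literal /Span << /ActualText <FEFF#1> >> BDC }%
    #2%
    \special{pdf:literal EMC }%
  }
  %  e.g.  \ActualText{0915093F}{<positioned glyphs for KA + I-matra>}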

 MA+U-matra(long)+RA+VIRAMA+TA+I-matra. Visually the long U-matra will
 appear as an accent below the MA glyph. The next glyph will be I-matra
 followed by TA followed by RA shown as an upper accent at the right
 edge of the syllable. Generally in RA+VIRAMA+consonant+matra the RA
 glyph appears at the end of the syllable although logically (in
 character order) it belongs to the beginning. These cases cannot be
 solved by toUnicode map because many-to-many mappings are not allowed.

Agreed.  /ToUnicode  is not the right PDF construction for this.

 Moreover, a huge amount of mappings will be needed. It would be better
 to do the reverse processing independent of toUnicode mappings, to use
 ICU or Pango or Uniscribe or whatever to analyze the glyphs and
 convert them to characters. The rules are unambiguous but AR does not
 do it.

Having an external pre-procesor is what I do for tagging mathematics.
It seems like a similarly intricate problem here.

 
 We discuss nonbreakable spaces while we are not yet able to properly
 convert printable glyphs to characters when doing copy&paste from
 PDF...

  :-)

 
 
 -- 
 Zdeněk Wagner
 http://hroch486.icpf.cas.cz/wagner/
 http://icebearsoft.euweb.cz

Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Whitespace in input

2011-11-17 Thread Ross Moore
Hi Phil,

On 17/11/2011, at 23:53, Philip TAYLOR p.tay...@rhul.ac.uk wrote:

 Keith J. Schultz wrote:
 
 You mention in a later post that you do consider a space as a printable 
 character.
This line should read as:
  You mention in a later post that you consider a space as a 
 non-printable character.
 
 No, I don't think of it as a character at all, when we are talking
 about typeset output (as opposed to ASCII (or Unicode) input).  

This is fine, when all that you require of your output is that it be visible on
a printed page. But modern communication media go well beyond that.
A machine needs to be able to tell where words and lines end, reflow 
paragraphs when appropriate, and produce a flat extraction of all the 
text, perhaps also with some indication of the purpose of that text (e.g. by 
structural tagging).

In short, what is output for one format should also be able to serve as input 
for another.

Thus the space certainly does play the role of an output character – though the 
presence of a gap in the positioning of visible letters may serve this role in 
many, but not all, circumstances.

 Clearly
 it is a character on input, but unless it generates a glyph in the
 output stream (which TeX does not, for normal spaces) then it is not
 a character (/qua/ character) on output but rather a formatting
 instruction not dissimilar to (say) end-of-line.

But a formatting instruction for one program cannot serve as reliable input for 
another.
A heuristic is then needed, to attempt to infer that a programming instruction 
must have been used, and guess what kind of instruction it might have been. 
This is not 100% reliable, so is deprecated in modern methods of data storage 
and document formats.
XML-based formats use tagging, rather than programming instructions. This is 
the modern way, which is used extensively for communicating data between 
different software systems.

 
 ** Phil.

TeX's strength is in its superior ability to position characters on the page 
for maximum visual effect. This is done by producing detailed programming 
instructions within the content stream of the PDF output. However, this is not 
enough to meet the needs of formats such as EPUB, non-visual reading software, 
archival formats, searchability, and other needs.
Tagged PDF can be viewed as Adobe's response to address these requirements as 
an extension of the visual aspects of the PDF format. It is a direction in 
which TeX can (and surely must) move, to stay relevant within the publishing 
industry of the future.


Hope this helps,

 Ross



Re: [XeTeX] Whitespace in input

2011-11-17 Thread Ross Moore
Hello Zdenek,

On 18/11/2011, at 7:49 AM, Zdenek Wagner wrote:

 But a formatting instruction for one program cannot serve as reliable input
 for another.
 A heuristic is then needed, to attempt to infer that a programming
 instruction must have been used, and guess what kind of instruction it might
 have been. This is not 100% reliable, so is deprecated in modern methods of
 data storage and document formats.
 XML based formats use tagging, rather that programming instructions. This is
 the modern way, which is used extensively for communicating data between
 different software systems.
 
 Yes, that's the point. The goal of TeX is nice typographical
 appearance. The goal of XML is easy data exchange. If I want to send
 structured data, I send XML, not PDF.

These days people want both.

 
 ** Phil.
 
 TeX's strength is in its superior ability to position characters on the page
 for maximum visual effect. This is done by producing detailed programming
 instructions within the content stream of the PDF output. However, this is
 not enough to meet the needs of formats such as EPUB, non-visual reading
 software, archival formats, searchability, and other needs.
 Tagged PDF can be viewed as Adobe's response to address these requirements
 as an extension of the visual aspects of the PDF format. It is a direction
 in which TeX can (and surely must) move, to stay relevant within the
 publishing industry of the future.
 
 Hope this helps,
 Ross
 
 No, it does not help. Remember that the last (almost) portable version
 of PDF is 1.2. If you are to open tagged PDF or even PDF with a
 toUnicode map or a colorspace other than RGB or CMYK in Acrobat Reader
 3, it displays a fatal error and dies. I reported it to Adobe in March
 2001 and they did nothing.

What else would you expect?
AR is at version 10 now; on Linux it is at version 9,
indeed 9.4.6 is current.

You don't expect TeX formats prior to TeX3 to handle non-ASCII 
characters, so why would you expect other people's older software 
versions to handle documents written for later formats?

 I even reported another fatal bug in
 January 2001. I sent sample files but nothing happened, Adobe just
 stopped development of Acrobat Reader at buggy version 3 for some
 operating systems.

Why should they support OSs that have a limited life-time?
Industry moves on. A new computer is very cheap these days,
with software that can do things your older one never could do.

By all means keep the old one while it still does useful work, 
but you get another to do things that the older cannot handle.

 Why do you so much rely on Adobe? When exchanging
 structured documents I will always do it in XML and never create
 tagged PDF because ...

PDF, as a published standard, is not maintained by Adobe itself 
these days, yet Adobe continues to provide a free reader, at least 
for the visual aspects. That makes documents in PDF viewable by 
everyone (who is only interested in the visual aspect).

It is an ISO standard, which publishers will want to use.
Most of the people who use (La)TeX are academics or others
who need to do a fair amount of publishing, of one kind
or another.

TeX can be modified to become capable of producing Tagged PDF.
 (See the videos of my talks.)
Free software (Poppler) is being developed to handle most aspects
of PDF content, though it hasn't yet progressed enough to support
structure tagging. It's surely on the list of things to do.

  ... I know that some users will be unable to read them
 by Adobe Acrobat Reader.

Why not?
It is not Adobe Reader that is holding them back.

 I do not wish to make them dependent on
 ghostscript and similar tools.

You'll have to give some more details of who you are
referring to here, and why their economic circumstances 
require them to have access to XML-transmitted data,
but preclude them from access to other kinds of standard 
computing software and devices.


 -- 
 Zdeněk Wagner
 http://hroch486.icpf.cas.cz/wagner/
 http://icebearsoft.euweb.cz


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Whitespace in input

2011-11-17 Thread Ross Moore

This is actually my preference in these situations, as there is 
a definite advantage in keeping the (La)TeX input source clean.
At some time you might want to use it with a different processor,
which might not have an easy in-built way to handle the problematic
characters. 

 
 ** Phil.


Hope this helps clarify any misconceptions,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114








Re: [XeTeX] Whitespace in input

2011-11-15 Thread Ross Moore
Hi Zdenek,

On 16/11/2011, at 8:58 AM, Zdenek Wagner wrote:

 2011/11/15 Ross Moore ross.mo...@mq.edu.au:
 
 On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
 
 Given that TeX (and XeTeX too) deals with a non-breakable space already (where 
 we usually use the ~ to represent that space) it seems to me that XeTeX 
 should treat that the same way.
 
 No, I disagree completely.
 
 What if you really want the U+00A0 character to be in the PDF?
 That is, when you copy/paste from the PDF, you want that character
 to come along for the ride.
 
 From the typographical point of view it is the worst of all possible
 methods. If you really wish it,

The *really wish it* is the choice of the author, not the
software.

 then do not use TeX but M$ Word or
 OpenOffice. M$ Word automatically inserts nonbreakable spaces at some
 points in the text written in Czech. As far as grammar is concerned,
 it is correct. However, U+00a0 is fixed width. If you look at the
 output, the nonbreakable spaces are too wide on some lines and too
 thin on other lines. I cannot imagine anything uglier.

I do not disagree with you that this could be ugly.
But that is not the point.

If you want superior aesthetic typesetting, with nice choices
for hyphenation, then don't use U+00A0. Of course!


Whatever the reason for wanting to use this character, there
should be a straight-forward way to do it.
Using the character itself is:
 a.  the most understandable
 b.  currently works
 c.  requires no special explanation.
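
For example, this minimal file (the font is my choice; any font
providing U+00A0 will do):

  \documentclass{minimal}
  \usepackage{fontspec}
  \setmainfont{Linux Libertine O}
  \begin{document}
  % ^^^^00a0 is XeTeX notation for a literal U+00A0; typing the
  % character directly in a UTF-8 source is exactly equivalent.
  A^^^^00a0B
  \end{document}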


 
 
 -- 
 Zdeněk Wagner
 http://hroch486.icpf.cas.cz/wagner/
 http://icebearsoft.euweb.cz

Cheers,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Whitespace in input

2011-11-15 Thread Ross Moore
Hi Phil,

On 16/11/2011, at 8:45 AM, Philip TAYLOR wrote:

 Ross Moore wrote:
 
 On 16/11/2011, at 5:56 AM, Herbert Schulz wrote:
 
 Given that TeX (and XeTeX too) deals with a non-breakable space already (where 
 we usually use the ~ to represent that space) it seems to me that XeTeX 
 should treat that the same way.
 
 No, I disagree completely.
 
 What if you really want the U+00A0 character to be in the PDF?
 That is, when you copy/paste from the PDF, you want that character
 to come along for the ride.
 
 I'm not sure I entirely go along with this argument, Ross.
 What if you really want the \ character to be in the PDF,
 or the ^ character, or the $ character, or any character
 that TeX currently treats specially ?  

TeX already provides \$ \_ \# etc. for (most of) the other special
characters it uses, but does not for ^^A0 --- though it does not
need to, if you can generate it yourself on the keyboard.


 Whilst I can agree
 that there is considerable merit in extending XeTeX such
 that it treats all of these new, special characters
 specially (by creating new catcodes, new node types and so
 on), in the short term I can see no fundamental problem with
 treating U+00A0 in such a way that it behaves indistinguishably
 from the normal expansion of ~.

How do you explain to somebody the need to do something really,
really special to get a character that they can type, or copy/paste?

There is no special role for this character in other vital aspects 
of how TeX works, such as there is for $ _ # etc.


 
 In TeX ~ *simulates* a non-breaking space visually, but there is
 no actual character inserted.
 
 And I don't agree that a space is a character, non-breaking or not !

In this view you are against most of the rest of the world.

If the output is intended to be PDF, as it really has to be with 
XeTeX, then the specifications for the modern variants of PDF 
need to be consulted.

With PDF/A and PDF/UA and anything based on ISO-32000 (PDF 1.7)
there is a requirement that the included content should explicitly
provide word boundaries. Having a space character inserted is by
far the most natural way to meet this specification.
(This does not mean that having such a character in the output
need affect TeX's view of typesetting.)

Before replying to anything in the above paragraph, please
watch the video of my recent talk at TUG-2011.

  http://river-valley.tv/further-advances-toward-tagged-pdf-for-mathematics/

or similar from earlier years where I also talk a bit about such things.

 
 ** Phil.


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Whitespace in input

2011-11-15 Thread Ross Moore
Hi Zdenek,

On 16/11/2011, at 10:08 AM, Zdenek Wagner wrote:

 How do you explain to somebody the need to do something really,
 really special to get a character that they can type, or copy/paste?
 
 There is no special role for this character in other vital aspects
 of how TeX works, such as there is for $ _ # etc.
 
 
 
 In TeX ~ *simulates* a non-breaking space visually, but there is
 no actual character inserted.
 
 And I don't agree that a space is a character, non-breaking or not !
 
 In this view you are against most of the rest of the world.
 
 TeX NEVER outputs a space as a glyph. Text extraction tools usually
 interpret horizontal spaces of sufficient size as U+0020.

I never said that it did, nor that it was necessary to do so.

Those text extraction tools do a pretty reasonable job, but don't
always get it right. Besides, there is reliance on a heuristic,
which can be fallible, especially if there is content typeset in 
a very small font size.
And what about at line-ends? They can get that wrong too.

Such a reliance is rather against the TeX way of doing things,
don't you think?

Better is for TeX itself to apply the heuristic, since it knows
the current font size and the separation between bits of words.

 (The exception to the above mentioned never is the verbatim mode.)

That isn't good enough for TeX to produce PDF/A.
Go and watch the videos that I pointed you to.


Lower down I give a run-down of how a variant of TeX handles
this problem, to very good effect.

 
 If the output is intended to be PDF, as it really has to be with
 XeTeX, then the specifications for the modern variants of PDF
 need to be consulted.
 
 With PDF/A and PDF/UA and anything based on ISO-32000 (PDF 1.7)
 there is a requirement that the included content should explicitly
 provide word boundaries. Having a space character inserted is by
 far the most natural way to meet this specification.
 
 A space character is a fixed-width glyph. If you insist on it, you
 will never be able to typeset justified paragraphs, you will move back
 to the era of mechanical typewriters.

Absolutely wrong!

I'm not insisting on it being included as the natural way to 
separate words within the PDF, though it certainly is a possible
way that is used by other software.

 (This does not mean that having such a character in the output
 need affect TeX's view of typesetting.)

Clearly you never even read this parenthetical statement ...

 
 Before replying to anything in the above paragraph, please
 watch the video of my recent talk at TUG-2011.

 ... and certainly you don't seem to have followed up on this
piece of advice, to get a better perspective of what I'm talking
about.

 
  http://river-valley.tv/further-advances-toward-tagged-pdf-for-mathematics/
 
 or similar from earlier years where I also talk a bit about such things.



Here is how you get *both* TeX-quality typesetting and explicit
spaces as word-boundaries inside the PDF, with no loss of quality.

What the experimental tagged-pdfTeX does is to use a font (called
dummy-space) that contains just a single character at code U+0020,
at a size that is almost zero -- it cannot be exactly zero, else 
PDF browsers may not select it for copy/paste, or other text-extraction.

These extra spaces are inserted into the PDF content stream, *after*
TeX has determined the correct positioning for high-quality typesetting.
That is, it is *not* done by macros or widgets or suchlike, but is
done internally by the pdfTeX engine at shipout time.

The almost-zero size has no perceptible effect on the visual output.
But the existence of these extra space characters means that all
text-extraction methods work much more reliably.

There *are* extra primitives that can be used to turn this off and on
in places where such extra spaces are not wanted; e.g. in math.
And there is a primitive to insert such a space, in case it is required
manually, for whatever reason. All of these primitives are used
extensively when generating tagged PDF of mathematical expressions,
and are thus available for other usage too.
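
In the experimental binaries those primitives are spelled
\pdfinterwordspaceon , \pdfinterwordspaceoff  and  \pdffakespace
(names to be checked against the pdftex manual that matches your
binaries); a usage sketch:

  \pdfinterwordspaceon  % ship real U+0020 glyphs at word gaps
  Words in this sentence copy out of the PDF with explicit boundaries.
  \pdfinterwordspaceoff % suppress them again, e.g. inside mathematics
  $ f(x) = x^{2} $
  \pdfinterwordspaceon
  A\pdffakespace B      % force one dummy-space glyph by hand;
                        % copies as 'A B', typeset with no visible gap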


 
 
 ** Phil.

Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Whitespace in input

2011-11-15 Thread Ross Moore
Hi Zdenek,

On 16/11/2011, at 11:19 AM, Zdenek Wagner wrote:

 Just like any other Unicode character, if you want it then
 you should be able to put it in there.
 
 You ARE able to do it. Choose a font with that glyph, set \catcode to
 11 or 12 and that's it. What else do you wish to do?

The *default* behaviour should stay as it is.
Any other behaviour needs a change of catcode,
and perhaps a definition.

 These are reasons why people might wish it in the source files, not in PDF.
 
 Yes. In the source, to have the occasional such character included
 within the PDF, for whatever reason appropriate to the material
 being typeset -- whether verbatim, or not.


 If you wish to take a [part of] PDF and include it in another PDF as
 is, you can take the PDF directly without the need of grabbing the
 text. If you are interested in the text that will be retypeset, you
 have to verify a lot of other things.
 
 How is any of this relevant to the current discussion?
 
 It was you who came with the argument that you wish to have
 nonbreakable spaces when copying the text from PDF.

No. I said that if you put one in, then you should be
expecting to get one out.
This should be the default behaviour, as it is now.

I certainly suggested nothing like getting out non-breaking
spaces as a replacement for anything else.


 Zdeněk Wagner
 http://hroch486.icpf.cas.cz/wagner/
 http://icebearsoft.euweb.cz



Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114











Re: [XeTeX] Wacky behavior of XeLaTeX in TeXLive 2011

2011-11-07 Thread Ross Moore
Hi Alessandro,

On 07/11/2011, at 10:14 PM, Alessandro Ceschini wrote:

 Hi
 
 To Mojca:
 No, the version I'm running is NOT the one bundled with Ubuntu 11.10 (that's 
 TL 2009), I just downloaded a dvd of TL 2011 and installed it manually via 
 tl-install. Installation is complete (every package), and nothing of the TL 
 2009 package bundled with Ubuntu is installed. I checked this.
 
 To Ross:
 So, are you suggesting I should downgrade to TL 2010?

No.
I was responding to Mojca's comment:

 But Ubuntu probably just took the released version and not the branch
 with fixes. Also, unless I'm mistaken, I'm not aware of official TL
 distribution shipping patched binaries either and I don't exactly
 understand why since the mechanism to ship new binaries is in place.

which seems to be advocating that since patched binaries are available
then everyone (!) should be getting them, more or less automatically.

That is most certainly not the case. By all means get them if you want to,
realising that you could be on the cutting edge, and it is entirely your 
own responsibility --- but you'll find people on this list (and others)
who are willing to help when/if you run into trouble. 


Ubuntu and other package managers must decide when it is appropriate 
to update. This decision will be made in accordance with what they 
know about a large number of their users, or maybe a small number 
of very important users.

As for myself, I would never advise anyone to update onto the cutting
edge, unless I was confident that they knew how to handle it.
This does *not* mean that I am advocating against updating, nor to
downgrading. You must work out what is appropriate for yourself, 
in whatever are your circumstances, then act accordingly. 

 Previously I was using just that version, without any apparent problem, all 
 started when I upgraded to TL 2011 (but after a fresh install, so no remnants 
 of TL 2010 on my hard disk).
 
 Regards
 -- 
 /Alessandro Ceschini/


Hope this helps,

 (and apologies to Mojca if I've just mis-represented him)

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Wacky behavior of XeLaTeX in TeXLive 2011

2011-11-07 Thread Ross Moore
Hi Peter,

On 08/11/2011, at 9:50 AM, Peter Dyballa wrote:

 
 Am 07.11.2011 um 23:31 schrieb Ross Moore:
 
 (and apologies to Mojca if I've just mis-represented him)
 
 Ross,
 
 do you mean Mojca with him? (I'm confident she is still a woman. In which 
 case a her would be appropriate.

Ooops. My mistake, definitely. Sorry.

 But I also wonder why Mail chose this signature…)
 
 --
 Greetings
 
  Pete
 
 Typography exists to honor content.
   – Robert Bringhurst


very apt!

Cheers

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Anchor names

2011-11-06 Thread Ross Moore
Hi Heiko, and Akira,

On 06/11/2011, at 3:55 AM, Heiko Oberdiek wrote:

   \special{%
 pdf:ann width 4bp height 2bp depth 2bp%
   /Type/Annot%
   /foo/ab#abc
   /Subtype/Link%
   /Border[0 0 1]%
   /C[0 0 1]% blue border
   /A%
 /S/GoToR%%
 /F(t.tex)%
  /D<66f6f8>%
  % Result: <66f6f8>, but ** WARNING ** Failed to convert input
  % string to UTF16...
  % /D<c3a46e6368c3b872>%
  % Result: <feff00e4006e0063006800f80072>
%
   %
   }%

I've verified that this is indeed what happens, with 

  This is XeTeX, Version 3.1415926-2.2-0.9997.4 (TeX Live 2010)


Now looking at the source coding, at:

   
http://ftp.tug.org/svn/texlive/trunk/Build/source/texk/xdvipdfmx/src/spc_pdfm.c?diff_format=u&view=log&pathrev=13771

it is hard to see how those results can occur.

The warning message is only produced when the function

   maybe_reencode_utf8(pdf_obj *instring)

returns a negative value (e.g. -1)
viz. lines 571--578:   function:  modstrings

   }
   else {
 r = maybe_reencode_utf8(vp);
   }
   if (r < 0) /* error occured... */
 WARN(Failed to convert input string to UTF16...);
 }
 break;

or  lines 1145--1150  (for  pdf:dest  but not actually used here)

 #ifdef  ENABLE_TOUNICODE
   error = maybe_reencode_utf8(name);
   if (error < 0)
 WARN(Failed to convert input string to UTF16...);
 #endif
 array = parse_pdf_object(args-curptr, args-endptr, NULL);


Now that function should find only ASCII bytes in  '<66f6f8>'
and  '<c3a46e6368c3b872>' .
In both cases the string should have remained silently unmodified.

viz. lines 474--481,  function:  maybe_reencode_utf8

   /* check if the input string is strictly ASCII */
   for (cp = inbuf; cp  inbuf + inlen; ++cp) {
 if (*cp > 127) {
   non_ascii = 1;
 }
   }
   if (non_ascii == 0)
 return 0; /* no need to reencode ASCII strings */


What am I reading wrong? If anything.

Has there been an earlier de-coding of  <...>  hex-strings
into byte values, done either by XeTeX or xdvipdfmx ?
If so, then surely it is this which is unnecessary.
(Not XeTeX, since the string is correct in the .xdv file.)

e.g.  function  pst_string_parse_hex   in  pst_obj.c  seems
to be doing this.  But that is only supposed to be used with  
coding from   cmap_read.c  and  t1-load.c .
And these are only meant for interpreting the font data that goes 
into content streams. So I'm at a loss in understanding this.

But  'modstrings'  is applied recursively, and part of it
seems to be checking for a CMap (when appropriate?).
So maybe there is an unintended un-encoding that precedes 
an encoding?


 
 It seems that *all* literal strings are affected by the
 unhappy reconversions. But the PDF specification lets no choice,
 there are various places for byte strings.
 In the example, if a file name has byte string XY and the destination Z,
 then the file name is XY and the destination Z and nothing else. Otherwise
 neither the file nor the destination will be found.
 
 Thus either (XeTeX/)xdvipdfmx finds a way for specifying arbitrary
 byte strings (at least for PDF strings(/streams)) -- it is a
 requirement of the PDF specification. Or we have to conclude 
 that 8-bit is not supported and that means US-ASCII.
 
 Yours sincerely
  Heiko Oberdiek


Hope this helps --- or you can help me  :-)


Cheers,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] [tex-live] XeTeX/TeX Live : Setting the default language

2011-11-04 Thread Ross Moore
Hi Phil,

On 04/11/2011, at 9:52 PM, Philip TAYLOR (Webmaster, Ret'd) wrote:

 Mojca Miklavec wrote:
 
 Now imagine that you send your document to a friend to make some final
 corrections  submit PDF for printing ... and that friend has set
 French or Russian as his default/preferred language, so the printing
 house will print the document typeset with Russian hyphenation
 patterns. Wouldn't that be nice?
 
 A document for export can contain Khaled's recommended :
 
   \uselanguage {whatever}
 
 A document solely for internal use does not require one,
 nor should one need to be added : the installer should
 ask the user whether it should respect his or her regional
 settings (locale, in Unix-speak, I believe).

Sorry Phil, but I agree with Mojca on this one.

The average (La)TeX user will not understand the issue,
and will not make the right choice when given one at 
installation time, and certainly will not know to put
this extra  \uselanguage{...}  line when needed, unless that 
was required on their own system when preparing the document.

So what Mojca envisions would most certainly happen.

Much and all as it may seem English-centric, or US-biased,
I'd agree that every document that does not just follow 
the world-wide bog-standard LaTeX installation *should*
specify the language explicitly in the document, either 
through use of Babel or Polyglossia or other equivalent method.
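
In practice that is just a line or two in the preamble, e.g.
(package names as in current TeX Live):

  \usepackage{polyglossia}
  \setdefaultlanguage{english}
  %  or, with babel:   \usepackage[english]{babel}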


Those power users who wish to set up their defaults 
differently can certainly do so. It is then up to them 
to make sure that they do things correctly when sending
documents to others to process.

Has it not always been this way in the TeX world?
Is not this consistency in TeX one of its major strengths?
I've not seen any compelling reason to change this.


 
 ** Phil.



All the best,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114








Re: [XeTeX] Hyperref \hyperlink and \hypertarget not working with accented characters

2011-11-02 Thread Ross Moore
Hi Phil,

On 02/11/2011, at 7:54 PM, Philip TAYLOR (Webmaster, Ret'd) wrote:

 Ross Moore wrote:
 
 On 02/11/2011, at 10:40 AM, Andy Black wrote:
 
   \hyperlink{rAsociación}{APLT (1988)}
 
 Don't use non-ASCII characters in the link.
 
 Oh dear, does PDF still live in the TeX 2 era ?  Surely /someone/ in
 Adobe is aware that there are character sets other than US English,
 and that those who write in such languages are perfectly entitled
 to wish to use them in links, whether or not such text ever appears
 on-screen ?

No. I disagree with what you say.

Adobe respects Unicode. It does not have to agree with 
the UTF8 encoding of it.

PDF is an ISO standard now, so is properly published,
and anyone can conform with what has been published.

This is miles (!) better than many other attempts to impose 
acceptance of other preferences for encoding of data.
The PDF specs say what is acceptable in all the different
circumstances where character data is used for different
purposes, and it provides mechanisms for arbitrary content
to be translated to Unicode code-points, irrespective of 
how individual fonts may be encoded.

Specifying an internal representation of a symbolic link 
is a programmatic thing, not a textual content thing.
So of course you need to follow the published syntax.

Thus the question here reduces to whether XeTeX, or the 
hyperref package, should ensure that whatever restrictions 
imposed by the published PDF spec are met, or whether 
the author needs to do it him/herself.

My advice is simply that if you restrict yourself to ASCII 
letters, then you will not face any difficulties.
This is pure pragmatism; nothing less.

 
 Philip Taylor



All the best,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] Hyperref \hyperlink and \hypertarget not working with accented characters

2011-11-02 Thread Ross Moore
Hi Phil,

On 03/11/2011, at 10:26 AM, Philip TAYLOR (Webmaster, Ret'd) wrote:

  And when
 I added a throw-away line to the effect that Heiko,
 as well as Adobe, need to put US ASCII behind them
 and move into the 21st Century

Unless you have actually read the PDF specs, you really
are not qualified to make such a statement.

Adobe *does* support Unicode, in many different ways,
both in Content and in identifiers, as my previous post shows. 

It just doesn't support UTF8 bytes directly in Name strings,
which seems to be what you are complaining about.
But such direct support is simply not possible, within
files that contain both data and programming constructs,
without employing some kind of escaping technique.
There have to be some characters that are treated specially,
to avoid problems with delimiters, and allow for alternative 
ways of referring to extra characters.
Every programming language that I've ever seen has this kind
of feature, somewhere within its rule set.

Adobe have chosen their rules, and they work very well
*when you follow those rules*.
PDF is a published ISO spec, so anyone can read it,
and implement it, or try to determine what is wrong
when things do not go as hoped for.


 (as DEK did in 1990,
 after FMi and others made representations to him
 concerning the need for TeX to support a character
 set that include at least the basic Western diacritics),
 again there is no disrespect, either explicit or
 implied.  I am sure that Heiko was not offended, and
 if he was, he has full access to my personal mailbox
 and is welcome at any time to complain if he feels that
 I have failed to show the respect he clearly deserves.
 
 ** Phil.


Hope this helps,

Ross

(Sorry Arthur, for adding to this thread.
Hopefully this will be the last of it, until someone looks
at the apparent failure to follow the spec that I identified
in my previous message.)


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114








Re: [XeTeX] Hyperref \hyperlink and \hypertarget not working with accented characters

2011-11-01 Thread Ross Moore
Hello Andy,

On 02/11/2011, at 10:40 AM, Andy Black wrote:

 I have not heard back from anyone on this issue.
 
 Has anyone else had success with hyperlinks that use accent characters in the 
 link?
 
 Thanks,
 
 --Andy
 
 On 9/2/2011 12:02 PM, Andy Black wrote:
 Hello,
 
 I'm using XeTeX version 3.1415926-2.2-0.9997.4 (Web2C 2010) (format=xelatex 
 2010.11.15) with hyperref  2010/10/30 v6.81t.

\hyperlink{rAsociación}{APLT (1988)}

Don't use non-ASCII characters in the link.

The link anchor is just a string that is used internally.
It is never displayed in the PDF, so why risk running
into encoding problems by using non-ASCII characters?

PDF does not use UTF8 at all. 
You'll have to transform any UTF8 characters into a UTF16 
ASCII-hex representation of the Unicode code-point,
both in the destination-label and in any corresponding
hyperlink target-labels that point at it.

 
 with
 
\hypertarget{rAsociación}{Asociación para la Promoción de Lecto-Escritura 
 Tlapaneca.  1988.  }
 
 then the hyperlink in the resulting PDF does not go to the target.  If I 
 replace the accented o with an unaccented o, then the hyperlink works fine.
 
 Do I need to do something special to get the hyperref package to produce 
 hyperlinks that work when there are non-A-Z characters?

Hyperref gives the means to do this, using  \pdfstringdef .

But since this label is only used internally, you might as well
save yourself some trouble, and (La)TeX some processing time,
by just using ASCII letters for such things.
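
Concretely, for the example that started this thread, that means
keeping the label ASCII while the visible text keeps its accents:

  \hypertarget{rAsociacion}{Asociación para la Promoción de
    Lecto-Escritura Tlapaneca.  1988.}
  %  ... and wherever the citation occurs:
  \hyperlink{rAsociacion}{APLT (1988)}

The label itself is never displayed, so nothing is lost.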

 
 Thanks,
 
 --Andy


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114









Re: [XeTeX] TECkit map for Latin alphabet to Unicode IPA

2011-10-30 Thread Ross Moore

On 30/10/2011, at 8:11 PM, Daniel Greenhoe wrote:

 On Sun, Oct 30, 2011 at 4:33 PM, Peter Dyballa peter_dyba...@web.de wrote:
 With COMBINING RING BELOW, U+0325?
 
 Yes --- How do I in general put, for example, U+0325 below U+0062,
 while still maintaining proper alignment (e.g. the bottom of the b
 (U+0062) with  a ring (U+0325) below it is still aligned with the
 bottom of an adjacent c (U+0063) with nothing below it)?

With Xunicode loaded, does this not do what you want?

   c\textsubring{b}c

or   cb0325c   (with no extra package).

It is up to the font to implement the placement.
XeTeX just receives the codes for the characters/glyphs. 

You can write a macro to simplify the input, once you are
sure that you know what you want, and how to get it.
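
For instance (a sketch; the macro name is my own invention):

  %  append U+0325 COMBINING RING BELOW to the argument;
  %  correct placement is then entirely the font's job
  \newcommand{\subring}[1]{#1^^^^0325}
  %  usage:   c\subring{b}c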

 
 Dan


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114








