Re: [XeTeX] strange case of overlapping graphics

2021-08-09 Thread Ross Moore
Hello Sej.

This message came through empty to me.

Was it intentional?
How can we help you resolve the difficulty?


All the best.

Ross

On 8 Aug 2021, at 6:26 am, Sej Lyn Jimenez <sejlynjime...@gmail.com> wrote:



Dr Ross Moore
Department of Mathematics and Statistics
12 Wally’s Walk, Level 7, Room 734
Macquarie University, NSW 2109, Australia
T: +61 2 9850 8955  |  F: +61 2 9850 8114
M: +61 407 288 255  |  E: ross.mo...@mq.edu.au
http://www.maths.mq.edu.au
CRICOS Provider Number 2J. Think before you print.
Please consider the environment before printing this email.

This message is intended for the addressee named and may
contain confidential information. If you are not the intended
recipient, please delete it and notify the sender. Views expressed
in this message are those of the individual sender, and are not
necessarily the views of Macquarie University. <http://mq.edu.au/>



Re: [XeTeX] strange case of overlapping graphics

2021-06-10 Thread Ross Moore
Hi Zdenek,

On 11 Jun 2021, at 6:49 am, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:

Yes, it is what I saw in pdflatex as well. Now I see the reason; I had not
read my exiftool output carefully. It says:

X Resolution: 25
Y Resolution: 7

Yes, that does look a bit weird, doesn’t it.
One would expect it to be the same in both directions, unless specifically
set to be different for a special effect.

Notice that it also says  Resolution Unit: inches .
Surely not!  unless it needs ~100 units for the inch.
Further down there are  Pixel units of meters.
Very strange.

Thanks in advance for any lessons on how to properly read/interpret this info.
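
For anyone wanting to inspect this directly, here is a minimal sketch (assuming
Python 3; the filename is my guess from this thread) that reads the PNG  pHYs
chunk, which records pixels-per-unit in each direction plus a flag saying
whether the unit is the metre:

import struct

def png_phys(path):
    # PNG layout: an 8-byte signature, then chunks of
    # [4-byte length][4-byte type][data][4-byte CRC]
    with open(path, "rb") as f:
        data = f.read()
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        if ctype == b"pHYs":
            x, y, unit = struct.unpack(">IIB", data[pos + 8:pos + 17])
            return x, y, unit   # unit 1 = pixels per metre, 0 = ratio only
        pos += 12 + length
    return None

print(png_phys("Zaborowski_MBC_page19y42_a.png"))

If the two pixels-per-unit values differ, a viewer that honours  pHYs  will
show a different aspect ratio from one that just uses the pixel dimensions.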

So the culprit need not be XeTeX after all.


All the best.

Ross


So, pdftex, as well as the viewers used by Phil, honours the resolution in both
directions and the output is tall. Gwenview, GIMP and Okular just honour the
size in pixels. XeLaTeX probably calculates the dimensions correctly, taking
the X/Y resolutions into account, and generates the commands for PNG inclusion,
but xdvipdfmx honours the pixels only, not the resolutions.
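
For instance, taking the exiftool numbers at face value purely to illustrate
the arithmetic: a 500 × 500 px image at 25 px/inch horizontally and 7 px/inch
vertically has a natural size of 500/25 = 20 in wide by 500/7 ≈ 71.4 in tall,
i.e. about 3.6 times as tall as it is wide, whereas a viewer that assumes
square pixels shows it 1:1.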

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml


On Thu 10 Jun 2021 at 22:37, Ross Moore <ross.mo...@mq.edu.au> wrote:
Hi Janusz, and others

For me, using  pdftex, the only issue is that the aspect ratio isn’t correct.

XeTeX gets it wrong:


Notice the apparent lack of centering with respect to the caption.
It’s as if the width has been read incorrectly by XeTeX, for positioning
the 2 instances of the image.



Explicitly specifying a more realistic width fixes this, with both engines.
Examining the information in GraphicConverter (on macOS),
I cannot see anything amiss (see image).




So to me, this looks like a XeTeX issue after all.


Hope this helps.

Ross


On 11 Jun 2021, at 2:12 am, Janusz S. Bień <jsb...@mimuw.edu.pl> wrote:

On Thu, Jun 10 2021 at 11:04 -05, Herbert Schulz wrote:
>> On Jun 10, 2021, at 10:58 AM, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:
>>
>> On Thu 10 Jun 2021 at 17:09, Philip Taylor <p.tay...@hellenic-institute.uk> wrote:
>>>
>>> In both Windows Preview and in Adobe Illustrator CC, the PNG file
>>> is roughly twice as tall (relative to its width) as it appears in
>>> the PDF.
>>> --
>> So this means that the PNG contains something strange which is
>> interpreted by some programs and ignored by other programs. Maybe
>> different vertical and horizontal resolutions? I do not have a tool to
>> analyze it further.

Tomorrow I will make more tests and submit the problem to ddjvu
author(s).

> Howdy,
>
> And
>
> \includegraphics[width=1.3cm]{Zaborowski_MBC_page19y42_a}
> \includegraphics[width=1.3cm]{Zaborowski_MBC_page19y42_a}
>
> works fine (not exactly the same size but that can be adjusted).

Yes. It's great you have found a way to circumvent the problem!

On the other hand this makes the problem more mysterious...

Best regards

Janusz

--
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien





Re: [XeTeX] Coloured fonts

2021-03-18 Thread Ross Moore
Hi David, Philip.


On 19 Mar 2021, at 7:17 am, David Carlisle <d.p.carli...@gmail.com> wrote:

Not sure if xetex can do colour fonts currently,

According to here:

 https://www.colorfonts.wtf

there are not many applications that support this new technology.

The colour doesn’t show in Phil’s example PDF, neither in Adobe’s  Illustrator, 
nor Acrobat Pro,
despite Adobe being one of the instigators of this font format.
So presumably the font isn’t installed correctly into the PDF.

Presumably the  /Style  dictionary here:

9 0 obj
<<
/Descent -173
/StemV 87
/Ascent 631
/FontName /THVNSG+BabelStoneXiangqiColour
/ItalicAngle 0
/Style
<<
/Panose <080002020604010101010101>
>>
/AvgWidth 734
/FontBBox [-14 -232 1014 795]
/Type /FontDescriptor
/CIDSet 16 0 R
/CapHeight 631
/Flags 4
/FontFile2 17 0 R
>>
endobj

is where the colour is specified, by that /Panose  entry.
But there must be something else that is missing.
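
One quick check of what colour data the font file itself carries (a minimal
sketch, assuming Python with fontTools installed, and the .ttf name from
David's example below): colour glyphs normally live in optional tables such as
COLR/CPAL (layered outlines), SVG, or sbix/CBDT (embedded images).

from fontTools.ttLib import TTFont

font = TTFont("BabelStoneXiangqiColour.ttf")
# colour-font data lives in one of these optional tables
for tag in ("COLR", "CPAL", "SVG ", "sbix", "CBDT"):
    print(tag, "present" if tag in font else "absent")

Whichever of those tables the font uses would have to survive into the PDF
for the colour to show.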


Unfortunately the link to get the font doesn’t work for me.


So David, could you possibly send the PDF of the example you posted below, 
please?


You can always experiment with luatex which gets this if using harfbuzz



\documentclass{article}

\usepackage{fontspec}

\newfontfamily\chess[Renderer=HarfBuzz]{BabelStoneXiangqiColour.ttf}
\begin{document}

testing {\chess ^^^^^^01fa64}

\end{document}



On Thu, 18 Mar 2021 at 18:39, Philip Taylor <p.tay...@rhul.ac.uk> wrote:
Seeking to re-typeset a long out-of-print classic on Xiang-Qi ("Chinese 
Chess"), but with the pieces shewn as they really are rather than as upper-case 
Latin letters requiring a gloss (the presentation chosen by the original 
author), I downloaded and installed Andrew West's BabelStone Xiangqi Colour 
font <https://www.babelstone.co.uk/Fonts/Xiangqi.html>.  I then wrote a short 
piece of XeTeX code to check that the glyphs/pieces appear in the PDF as they 
should, and very sadly they do not, coming out as monochrome rather than in 
colour (see attached PDF).

The red pieces are described by Andrew as red Chinese characters on a sandy 
yellow background, and the black pieces as black Chinese characters on a sandy 
yellow background.  In the resulting PDF, however, they appear as white Hanzi 
on a black ground and black Hanzi on a white ground.  Does XeTeX support 
coloured fonts, and if so, how do I persuade it to render these glyphs as 
intended rather than in monochrome ?

I can, of course, load \font \redpieces = "BabelStone Xiangqi 
Colour":color=FF scaled \magstep 5 (see code below), but that still does 
not give me the sandy yellow ground that each glyph was designed to have.

'opentype-info.tex', when run against BabelStone Xiangqi Colour, tells me that 
the font does not provide any Opentype layout features, so it does not look as 
if XeTeX's "/ICU:+abcd" convention would allow me to indicate that I require 
colour support.

% !TeX Program=XeTeX

\font \pieces = "BabelStone Xiangqi Colour" scaled \magstep 5
\font \redpieces = "BabelStone Xiangqi Colour":color=FF scaled \magstep 5
\font \blackpieces = "BabelStone Xiangqi Colour" scaled \magstep 5
\pieces
\centerline {\char "1FA60\relax \ \char "1FA61\relax \ \char "1FA62\relax \ 
\char "1FA63\relax \ \char "1FA64\relax \ \char "1FA65\relax \ \char 
"1FA66\relax}
\centerline {\strut}
\centerline {\char "1FA67\relax \ \char "1FA68\relax \ \char "1FA69\relax \ 
\char "1FA6A\relax \ \char "1FA6B\relax \ \char "1FA6C\relax \ \char 
"1FA6D\relax}
\centerline {\strut}
\centerline {\strut}
\centerline {\redpieces \char "1FA60\relax \ \char "1FA61\relax \ \char 
"1FA62\relax \ \char "1FA63\relax \ \char "1FA64\relax \ \char "1FA65\relax \ 
\char "1FA66\relax}
\centerline {\strut}
\centerline {\blackpieces \char "1FA67\relax \ \char "1FA68\relax \ \char 
"1FA69\relax \ \char "1FA6A\relax \ \char "1FA6B\relax \ \char "1FA6C\relax \ 
\char "1FA6D\relax}
\end
--
Philip Taylor



Cheers.

Ross





Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi Jonathan, and others.

On 22 Feb 2021, at 10:39 am, Jonathan Kew <jfkth...@gmail.com> wrote:

On 21/02/2021 22:55, Ross Moore wrote:
The file reading has failed  before any tex accessible processing has happened 
(see the ebcdic example in the TeXBook)
OK.

Also pdfTeX has no trouble with an xstring example.
It just seems pretty crazy that the comments need to be altered
for that package to be used with XeTeX.

Well as long as the Latin-1 accented characters are only in comments, it 
arguably doesn't "really" matter; xetex logs a warning that it can't interpret 
them, but if you know that part of the line is going to be ignored anyway, you 
can ignore the warning.

There’s actually a pretty easy fix, at least for XeLaTeX.
The package contains 2 files only:   xstring.sty  and  xstring.tex .
The .sty is just a 1-liner to load the .tex .

It could be beefed up with:

\RequirePackage{ifxetex}% is this still the best package for \ifxetex ?
\ifxetex
  \XeTeXdefaultencoding "iso-8859-1"
  \input{xstring.tex}
  \XeTeXdefaultencoding "utf8"
\else
  \input{xstring.tex}
\fi

(ignore if straight quotes have become curly ones in my email editor!)



Even nicer would be to beef it up further by:
 1. recording the current default encoding (is this possible?), then restoring from this;
 2. using a grouping while deciding what to do, expanding the required commands before ending the group.

\showthe\XeTeXdefaultencoding  doesn’t work,
so is there another container that tells what is the default encoding?
Or should we always assume it is UTF-8 and revert to that afterwards?

e.g. something like:

\RequirePackage{ifxetex}
\begingroup
  \def\next{\endgroup \input{xstring.tex}}%
  \ifxetex
    \XeTeXdefaultencoding "iso-8859-1"
    \def\next{\endgroup
      \input{xstring.tex}%
      \XeTeXdefaultencoding "utf8"}%
  \fi
\next




(pdfTeX doesn't care because it simply reads the bytes from the file; any 
interpretation of bytes as one encoding or another is handled at the TeX macro 
level.)

Right.
Which is why I do my PDF development work in pdfTeX before
testing whether it can be adapted also to XeTeX and/or LuaTeX.


JK



Cheers.

Ross





Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi David,

On 22 Feb 2021, at 8:43 am, David Carlisle <d.p.carli...@gmail.com> wrote:

Surely the line-end characters are already known, and the bits&bytes
have been read up to that point *before* tokenisation.

This is not a pdflatex inputenc style utf-8 error failing to map a stream of 
tokens.

It is at the file reading stage and if you have the file encoding wrong you do 
not know reliably what are the ends of lines and you haven't interpreted it as 
tex at all, so the comment character really can't have an effect here.

Ummm. Is that really how XeTeX does it?
How then does Jonathan’s
   \XeTeXdefaultencoding "iso-8859-1"
ever work ?
Just a rhetorical question; don’t bother answering.   :-)

This mapping is invisible to the tex macro layer, just as you can change the
internal character code mapping in classic tex to take an ebcdic stream; if you
do that and then read an ascii file, you get rubbish with no hope of recovery.



So I don't think such a switch should be automatic to avoid reporting encoding 
errors.

I reported the issue at xstring here
https://framagit.org/unbonpetit/xstring/-/issues/4


I looked at what you said here, and some of it doesn’t seem to be in accord with
my TeXLive installations.

viz.

/usr/local/texlive/2016/.../xstring.tex:\expandafter\ifx\csname @latexerr\endcsname\relax% on n'utilise pas LaTeX ?
/usr/local/texlive/2016/.../xstring.tex:\fi% fin des d\'efinitions LaTeX
/usr/local/texlive/2016/.../xstring.tex:%   - Le package ne n\'ecessite plus LaTeX et est d\'esormais utilisable sous
/usr/local/texlive/2016/.../xstring.tex:% Plain eTeX.
/usr/local/texlive/2017/.../xstring.tex:% conditions of the LaTeX Project Public License, either version 1.3
/usr/local/texlive/2017/.../xstring.tex:% and version 1.3 or later is part of all distributions of LaTeX
/usr/local/texlive/2017/.../xstring.tex:\expandafter\ifx\csname @latexerr\endcsname\relax% on n'utilise pas LaTeX ?
/usr/local/texlive/2017/.../xstring.tex:\fi% fin des d\'efinitions LaTeX
/usr/local/texlive/2017/.../xstring.tex:%   - Le package ne n\'ecessite plus LaTeX et est d\'esormais utilisable sous
/usr/local/texlive/2017/.../xstring.tex:% Plain eTeX.
/usr/local/texlive/2018/.../xstring.tex:% !TeX encoding = ISO-8859-1
/usr/local/texlive/2018/.../xstring.tex:% Licence: Released under the LaTeX Project Public License v1.3c %
/usr/local/texlive/2018/.../xstring.tex:% Plain eTeX.
/usr/local/texlive/2019/.../xstring.tex:% !TeX encoding = ISO-8859-1
/usr/local/texlive/2019/.../xstring.tex:% Licence: Released under the LaTeX Project Public License v1.3c %
/usr/local/texlive/2019/.../xstring.tex: Plain eTeX.

Prior to 2018, the accents in comments used ASCII macros (so valid UTF-8,
though not intentionally so).

In 2018, the accents in comments became Latin-1 characters.
A first line was added:  % !TeX encoding = ISO-8859-1
to indicate this.

Such directive comments are useless, except at the beginning of the main 
document source.
They are for Front-End software, not TeX processing, right?

Jonathan, David,
so far as I can tell, it was *never* in UTF-8 with preformed accents.



David


that says what follows next is to be interpreted in a different way to what 
came previously?
Until the next switch that returns to UTF-8 or whatever?


If XeTeX is based on eTeX, then this should be possible in that setting.


Even replacing by U+FFFD
is being lenient.

Why has the mouth not realised that this information is to be discarded?
Then no replacement is required at all.

The file reading has failed  before any tex accessible processing has happened 
(see the ebcdic example in the TeXBook)

OK.
But that’s changing the meaning of bit-order, yes?
Surely we can be past that.



\danger \TeX\ always uses the internal character code of Appendix~C
for the standard ASCII characters,
regardless of what external coding scheme actually appears in the files
being read.  Thus, |b| is 98 inside of \TeX\ even when your computer
normally deals with ^{EBCDIC} or some other non-ASCII scheme; the \TeX\
software has been set up to convert text files to internal code, and to
convert back to the external code when writing text files.


the file encoding is failing at the  "convert text files to internal code" 
stage which is before the line buffer of characters is consulted to produce the 
stream of tokens based on catcodes.

Yes, OK; so my model isn’t up to it, as Bruno said.
 … And Jonathan has commented.

Also pdfTeX has no trouble with an xstring example.
It just seems pretty crazy that the comments need to be altered
for that package to be used with XeTeX.





David




Cheers, and thanks for this discussion.


Ross



Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi Ulrike,

On 22 Feb 2021, at 7:52 am, Ulrike Fischer <ne...@nililand.de> wrote:

Am Sun, 21 Feb 2021 20:26:04 + schrieb Ross Moore:

> Once you have encountered the (correct) comment character,
> what follows on the rest of the line is going to be discarded,
> so its encoding is surely irrelevant.
>
> Why should the whole line need to be fully tokenised,
> before the decision is taken as to what part of it is retained?

Well you need to find the end of the line to know where to stop with
the discarding don't you? So you need to inspect the part after the
comment char until you find something that says "newline”.

My understanding is that this *is* done first.
Similarly to TeX's  \read <number> to <control sequence> , which grabs a line
of input from a file, before doing the tokenisation and storing the result in
the <control sequence>.
   (page 217 of The TeXbook)

If I’m wrong with this, for high-speed input, then yes you need to know where 
to stop.
But that’s just as easy, since you stop when a byte is to be tokenised
as an end-of-line character, and these are known.
You need this anyway, even when you have tokenised every byte.


So all we are saying is that when handling the bytes between
a comment and its end-of-line, just be a bit more careful.

It’s not necessary for each byte to be tokenised as valid for UTF-8.
Maybe change the (Warning) message when you know that you are within
such a comment, to say so.  That would be more meaningful to a package-writer,
and to an author who uses the package, looks in the .log file, and sees the 
message.

None of this is changing how the file is ultimately processed;
it’s just about being friendlier in the human interface.




--
Ulrike Fischer
https://www.troubleshooting-tex.de/


All the best.

Ross





Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi David,

On 21 Feb 2021, at 11:02 pm, David Carlisle <d.p.carli...@gmail.com> wrote:


I don't think there is any reasonable way to say you can comment out parts of a 
file in a different encoding.

I’m not convinced that this ought to be correct for TeX-based software.

TeX (not necessarily XeTeX) has always operated as a finite-state machine.
It *should* be possible to say that this part is encoded as such-and-such,
and a later part encoded differently.

I fully understand that editor software external to TeX might well have 
difficulties
with files that mix encodings this way, but TeX itself has always been 
byte-based
and should remain that way.

A comment character is meant to be viewed as saying that:
 *everything else on this line is to be ignored*
– that’s the impression given by TeX documentation.


But you only know it is a comment character if you can interpret the incoming 
byte stream
If there are encoding errors in that byte stream then everything else is
guesswork.

Who said anything about errors in the byte stream?
Once you have encountered the (correct) comment character,
what follows on the rest of the line is going to be discarded,
so its encoding is surely irrelevant.

Why should the whole line need to be fully tokenised,
before the decision is taken as to what part of it is retained?

In the case of a package file, rather than author input for typesetting,
the intention of the coding is completely unknown,
is probably all ASCII anyway, except (as in this case) for comments intended
for human eyes only, following a properly declared comment-character.


In this particular case with mostly ascii text and a few latin-1 characters it 
may be that you can guess that
the invalid utf-8 is in fact valid latin1 and interpret it that way,

You don’t need to interpret it as anything; that part is to be discarded.

and the guess would be right for this file
but what if the non-utf8 file were utf-16 or latin-2  or

Surely the line-end characters are already known, and the bits&bytes
have been read up to that point *before* tokenisation.
So provided the tokenisation of the comment character has occurred before
tackling what comes after it, why would there be a problem?

... just guessing the encoding (which means guessing where the line and so the 
comment ends)
is just guesswork.

No guesswork intended.


The file encoding specifies the byte stream interpretation before any tex 
tokenization
If the file can not be interpreted as utf-8 then it can't be interpreted at all.

Why not?
Why can you not have a macro — presumably best on a single line by itself –

there is an xetex   primitive that switches the encoding as Jonathan showed, 
but  guessing a different encoding
if a file fails to decode properly against a specified encoding is a dangerous 
game to play.

I don’t think anyone is asking for that.

I can imagine situations where coding for packages that used to work well
without UTF-8 may well carry comments involving non-UTF-8 characters.
(Indeed, there could even be binary bit-mapped images within comment sections;
having bytes not intended to represent any characters at all, in any encoding.)

If such files are now subjected to constraints that formerly did not exist,
then this is surely not a good thing.


Besides, not all the information required to build PDFs need be related to
putting characters onscreen, through the typesetting engine.

For example, when building fully-tagged PDFs, there can easily be more 
information
overall within the tagging (both structure and content) than in the visual 
content itself.
Thank goodness for Heiko’s packages that allow for re-encoding strings between
different formats that are valid for inclusion within parts of a PDF.

I’m thinking here about how a section-title appears in:
 bookmarks, ToC entries, tag-titles, /Alt strings, annotation text for 
hyperlinking, etc.
as well as visually typeset for on-screen.
These different representations need to be either derivable from a common 
source,
or passed in as extra information, encoded appropriately (and not necessarily 
UTF-8).


So I don't think such a switch should be automatic to avoid reporting encoding 
errors.

I reported the issue at xstring here
https://framagit.org/unbonpetit/xstring/-/issues/4


David


that says what follows next is to be interpreted in a different way to what 
came previously?
Until the next switch that returns to UTF-8 or whatever?


If XeTeX is based on eTeX, then this should be possible in that setting.


Even replacing by U+FFFD
is being lenient.

Why has the mouth not realised that this information is to be discarded?
Then no replacement is required at all.


David






Re: [XeTeX] latin-1 encoded characters in commented out parts trigger log warnings

2021-02-21 Thread Ross Moore
Hi David.

On 21 Feb 2021, at 10:12 pm, David Carlisle <d.p.carli...@gmail.com> wrote:

I think that should be taken up with the xstring maintainers.

Is  xstring  intended for use with XeTeX ?
I suspect not.
But anyway, there are still issues with this.

(BTW, I wrote this before Jonathan Kew’s response.)


I don't think there is any reasonable way to say you can comment out parts of a 
file in a different encoding.

I’m not convinced that this ought to be correct for TeX-based software.

TeX (not necessarily XeTeX) has always operated as a finite-state machine.
It *should* be possible to say that this part is encoded as such-and-such,
and a later part encoded differently.

I fully understand that editor software external to TeX might well have 
difficulties
with files that mix encodings this way, but TeX itself has always been 
byte-based
and should remain that way.

A comment character is meant to be viewed as saying that:
 *everything else on this line is to be ignored*
– that’s the impression given by TeX documentation.

If it is the documentation that is incorrect, then it should certainly be 
clarified.

For XeTeX and this particular example, it’s probably just a matter of checking
that the non-UTF8 characters occur *after* a UTF-8  ‘%' , and not issuing
an error message under these conditions.
A warning, maybe, but not an error.


The file encoding specifies the byte stream interpretation before any tex 
tokenization
If the file can not be interpreted as utf-8 then it can't be interpreted at all.

Why not?
Why can you not have a macro — presumably best on a single line by itself –
that says what follows next is to be interpreted in a different way to what 
came previously?
Until the next switch that returns to UTF-8 or whatever?


If XeTeX is based on eTeX, then this should be possible in that setting.


Even replacing by U+FFFD
is being lenient.

David




On Sun, 21 Feb 2021 at 11:04, jfbu <j...@free.fr> wrote:
Hi,

consider this

\documentclass{article}
\usepackage{xstring}
\begin{document}
\end{document}

and call it xexstring.tex

Then xelatex xexstring triggers 136 warnings of the type

Invalid UTF-8 byte or sequence at line 35 replaced by U+FFFD.

Looking at file

/usr/local/texlive/2020/texmf-dist/tex/generic/xstring/xstring.tex

I see that this matches with use of latin-1 encoded characters in comments.

Notice that it is a not a user decision here to use a latin-1
encoded file.

In fact I encountered this in a file I was given where
xstring package was loaded by another package.

Regards,

Jean-François


Cheers.

Ross





Re: [XeTeX] [tex-implementors] Proposal : that TeX engines generating PDF directly should be able to close the output file without terminating.

2020-07-03 Thread Ross Moore
Hi Jonathan,

On 4 Jul 2020, at 8:55 am, Jonathan Kew <jfkth...@gmail.com> wrote:

On 03/07/2020 20:13, Bruno Le Floch wrote:
On 7/3/20 6:50 PM, Jonathan Kew wrote:
On 03/07/2020 16:26, Philip Taylor wrote:
Jonathan Kew wrote:

Many potential use-cases, I think, can be equally well addressed by multiple TeX
invocations under the control of a higher-level script or tool of some kind.
Perhaps there are compelling examples where this would not be the case, but I'm
not aware of them at the moment.

JK

A major use case could be for AucTeX preview of equations, or other wysiwyg-like
interfaces where one wants to compile chunks of TeX code always with the same
preamble, and with no relevant changes in macros: one could have an ongoing TeX
run producing pdfs when provided with further input.

This raises the question of what state the TeX engine should return to when the 
hypothetical \nextpdf primitive is executed. Does it return to a pristine 
"initex" state, or a "freshly-loaded .fmt file" state, or is the current state 
completely unchanged (does \jobname reflect the new output name?), or what? 
Should the \count variables that are by convention used to record page numbers 
get reset?

Does a new .log file also get started? What about \write output files -- are 
they flushed and new files started?

There’s currently a thread called “startup time” about \dump’ing a new format 
file.
Similar kinds of consideration exist there, but at a different level (of 
course),
in choosing the best place to make that  \dump  call.

At least there it is clear what is the purpose for making the new format.
It is much less clear here what would be the new use-cases made possible
if such a  \nextpdf  primitive were made available.

  A currently-working
variant of this is the following (in bash), which ships out a first page, then
waits 10 seconds, then ships out another one.

$ (echo '\relax Hello world!\vfill\break' && sleep 10 && echo '\relax Another pdf.\bye') | xetex

One could imagine a primitive \nextpdf that would make xetex produce 2 separate
pdfs (in the present case texput.pdf and secondfile.pdf)

$ (echo '\relax Hello world!\nextpdf{secondfile}' && sleep 10 && echo '\relax Another pdf.\bye') | xetex

This looks equivalent to (xetex '\relax Hello world!\bye' && sleep 10 && xetex 
--jobname secondfile '\relax Another pdf.\bye'), right?

It's true there would be a difference if there are macros etc. defined while 
processing the first file, and then used while generating the second. But I'm 
not sure this is really a commonly-required use case.

Consider me not yet persuaded……

I’m on the fence too.

Of course another possibility for Phil is to put all the pages of *both* his 
desired PDFs into the same “master PDF”,
then use a command-line tool like  pdftk  to extract the relevant pages for one 
or other into a new PDF.
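
e.g. something like (a sketch; the page ranges are made up):

pdftk master.pdf cat 1-4 output first.pdf
pdftk master.pdf cat 5-end output second.pdf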

This would likely not work with Tagged PDF documents.
There the extraction into separate files would need to be done by Acrobat Pro,
so as to preserve the tagging structures and the page-based relationships of 
marked content.
But that’s an issue for the future.


JK



All the best.
Stay safe.

Ross





Re: [XeTeX] startup time

2020-07-03 Thread Ross Moore
Hi Michael, and others.

On 04/07/2020, at 5:44, "Michael Maxwell" wrote:

> 
> 
> On 7/3/2020 2:28 PM, Zdenek Wagner wrote:
>> There are several options:
>> 1. Dump your own format with your styles. You will have to regenerate the 
>> format after update of any of these style files and you will have to take 
>> care of dependencies

>> I already have 2 and 3, although afaict 3 has little if any effect, because 
>> the processing time appears to be taken up with macro expansion (or whatever 
>> it is that tex does while processing the preamble).
> 
> 1 is what I was looking for, but how do you do it?  I tried
>   xelatex -ini 

You will need to also have
   \input latex.ltx
first, as you are building an extended LaTeX from scratch.

Place the  \dump  command after all your packages and definitions,
but before  \begin{document} .

But there are tricks and traps that you will have to explore for your own
documents and choice of packages.
For example, if any package defines its own auxiliary file for passing 
information to later LaTeX runs, then find exactly when this file is read, and 
opened for writing.
A file-pointer cannot be preserved in a dumped format file.
So you'll probably have to move that package to after the  \dump  rather than 
before it.
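
A rough sketch of the whole workflow (file and format names are my own, and
this variant preloads the existing xelatex format instead of doing
\input latex.ltx from scratch):

%%% mypreamble.tex -- everything slow, ending with \dump
\documentclass{article}
\usepackage{fontspec}% ... all your packages and definitions ...
\dump

Build the format once, then point later runs at it:

xelatex -ini -jobname=myformat "&xelatex" mypreamble.tex
xelatex -fmt myformat mydocument.tex

where  mydocument.tex  then begins at (or shortly before) \begin{document}.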


> but it chokes on the \documentclass; or when I try it on my style sheet 
> alone, it chokes on the \ProvidesPackage.  Apparently it works with plain 
> TeX, but not with LaTeX?  And I'm not sure this would do what I want anyway; 
> what it means to "be xeinitix"; a web search for that term was unproductive 
> (unless you're looking for some kind of gas warning light).
> 
> Can you give me a lead on how to do #1?

I used to do this kind of thing a lot, almost 20 years ago when computers were 
much slower than they are now. I even wrote a package called  ldump  to capture 
some of the definitions that packages delay using \AtBeginDocument .
Moore's Law (no relation) has made it rather unnecessary now, though we are 
putting more and more into our documents, so it may still have a use.

Hope this helps.

   Ross


Re: [XeTeX] Proposal : that TeX engines generating PDF directly should be able to close the output file without terminating.

2020-07-03 Thread Ross Moore
Hi Philip.

On 3 Jul 2020, at 9:46 pm, Philip Taylor <p.tay...@hellenic-institute.uk> wrote:

Zdenek Wagner wrote:

I have never tried but could not you use \write18 to run XeTeX from XeTeX? I am 
not sure whether it is supported.

It would seem that adding the command-line qualifier "--shell-escape" is 
required in order that \write 18 be permitted, but then we run into the problem 
that (with the benefit of hindsight) we should have foreseen at the outset :  
the PDF file is incomplete at this point —


Then the main file should be the TeX job that *requires* something else to be 
available,
not the one that is required.
If the other PDF is not yet up-to-date then you use a  \write 18  call to 
process it
at the *beginning* of the job that requires it, not at the end.

Alternatively, use a  makefile  and run  ‘make'  from the command-line.
This is like a ‘batch’ command that runs all dependent jobs before doing
the final one, with everything up-to-date.
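
e.g. even without a makefile, the batch idea in one line (quoting because of
the spaces in the file names):

xelatex "Hoi-An TA menu (separate pages).tex" && \
xelatex "Hoi-An TA menu (combine pages).tex"

so the second run only starts once the PDF it depends on is complete on disk.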


XeTeX --shell-escape "Hoi-An TA menu (separate pages).tex"
This is XeTeX, Version 3.14159265-2.6-0.92 (TeX Live 2020/W32TeX) (preloaded
 format=xetex)
 \write18 enabled.
entering extended mode
(./Hoi-An TA menu (separate pages).tex (./Hoi-An TA menu.xlsx.tex omnivore
omnivore omnivore omnivore omnivore omnivore vegan vegan vegan vegan vegan
omnivore omnivore omnivore vegan vegan vegan omnivore omnivore omnivore
omnivore omnivore vegan vegan vegan vegan vegan omnivore omnivore vegan
omnivore vegan omnivore omnivore omnivore omnivore vegan vegan vegan vegan
vegan [1] omnivore omnivore omnivore omnivore vegan vegan omnivore omnivore
vegan vegan vegan omnivore [2]) [3] [4] [5] [6]
This is XeTeX, Version 3.14159265-2.6-0.92 (TeX Live 2020/W32TeX) (preloaded format=xetex)
 restricted \write18 enabled.
entering extended mode
(./Hoi-An TA menu (combine pages).tex
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table

! Unable to load picture or PDF file '"Hoi-An TA menu (separate pages).pdf"'.

   }
l.28 ... - 0,666 \rulewidth \relax height \vsize }
  \relax
?

Philip Taylor


Hope this helps.
Stay safe.

Ross





Re: [XeTeX] \font "":color=FFFFFF produces black, not white glyphs, re-visited

2020-05-26 Thread Ross Moore
Hi Phil,

On 26 May 2020, at 5:12 pm, Philip Taylor <p.tay...@hellenic-institute.uk> wrote:

Ross Moore wrote:

I’m sorry, but this just doesn’t make any sense to me — but see further below.

No, the fourth couplet is TT, where T is "Transparency".  Unfortunately, it is 
a misnomer, since 00 = completely transparent and FE is almost opaque, which is 
why I spoke of "opacity" rather than transparency.  Unfortunately FF is not 
opaque when preceded by FF, because the driver treats FF [FF] specially.

As I said, it didn’t make sense to me.  :-)
Thanks for the clarification, and sorry for my added noise.


First it is important to realise that both flattening and conversion to CMYK 
will take place (the document is for digital printing).  When flattening takes 
place, RGB FF text will completely obscure the ground, and after conversion 
to CMYK there will then be no ink where the text occurs.  Unfortunately as 
things are at the moment, there will be 1/256 bleed-through of the ground 
because the RGB white was not perfectly opaque.

"knockout", tho' interesting, should not be needed.  The example earlier sent 
shews that one can get very close to 100% white (and of course there are no 
white inks involved) but not to 100% and this is what I would like to achieve 
(and which should, IMHO, be achievable).  Were it not for the fact that the 
driver treats FF and  specially, there would be no problem at all 
in achieving my aim.

It looks like Akira has done what you wanted, so the exercise was a success. :-)


** Phil.

Cheers.
Stay safe.

Ross





Re: [XeTeX] \font "":color=FFFFFF produces black, not white glyphs, re-visited

2020-05-25 Thread Ross Moore
Hi Zdenek.

We have very aggressive Mail protection software.
Sorry, it has blocked your  .tar.gz  file.

Is there a website that I can download it from?

Sorry for the hassle.

Ross


On 26 May 2020, at 10:35 am, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:

Hi Ross,

I have packed the whole directory with the PDF/X tests, it includes
the generated files as well.

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

From: IT Mail Admin <postmas...@mq.edu.au>
Subject: pdfx-tests_tar_gz
Date: 26 May 2020 at 10:35:43 am AEST
To: Ross Moore <ross.mo...@mq.edu.au>


Attachment Notice

The following attachment was removed from the associated email message.

File Name: pdfx-tests.tar.gz
File Size: 1033141 Bytes

Attachment management policies limit the types of attachments that are allowed 
to pass through the email infrastructure.

Attachments are monitored and audited for security reasons.





Re: [XeTeX] \font "":color=FFFFFF produces black, not white glyphs, re-visited

2020-05-25 Thread Ross Moore
Hi Zdenek,

On 26 May 2020, at 9:31 am, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:

On Tue 26 May 2020 at 0:59, Ross Moore <ross.mo...@mq.edu.au> wrote:
and at the time, that appeared to solve my problem. However, it would appear 
that since then "xdvipdfmx" has been enhanced to support transparency, as a 
result of which Khaled's suggested FF00 no longer works (the text is 
invisible, see attached).  Could anyone tell me how, short of using \specials, 
I can achieve 100% white with 100% opacity (= 0% ink) in XeTeX ?

I’m sorry, but this just doesn’t make any sense to me — but see further below.
Surely 100% opacity means that the blend between background and foreground is 
100% background, 0% foreground.
Thus your text will be invisible, whatever colour has been specified; that this 
is white becomes irrelevant.

The only way to get 100% white, over a coloured background, would be with 100% 
ink, so 0% opacity.
Any other opacity level will allow some of the background colour to be mixed in.
At least that is how I understand what colour mixing is all about.

Sorry, correct me if my English is wrong but I would expect 100% ink = 100% 
opacity = 0% transparency

You’re absolutely correct, my mistake.
Certainly I meant 0% *transparency*, the opposite of what Phil was trying to do.
It was he who said   100% opacity (= 0% ink)  which is where the error lies.
Surely 100% ink = 100% opacity = 0% transparency,  as you say.


However, there is another PDF parameter called “knockout”.
See this link for an brief description of the issue:

   
https://www.enfocus.com/en/products/pitstop-pro/pdf-editing/knockout-white-text-in-a-pdf

This is another topic. This addresses "black overprint" in printing.

Sure, but it is about knocking out the colour in the background.
Using black text is probably its most common usage.
But if you want the natural paper to come through, then surely this is the only 
way to do it conveniently.

Otherwise you would have to manually define regions outlining the letters, and 
make these
boundary curves for your background. Totally impractical.
PDF does this for you, if you have used the correct way to ask for it.


The idea is that process colours are printed in the following order: 
cyan-magenta-yellow-black. If you want to print a yellow text on a cyan 
background, RIP must erase the cyan plate to white where the characters will 
later appear on the yellow plate, otherwise the text would not be yellow. If 
the offset films are not precisely adjusted, you will see colour artifacts at 
the borders of the characters. If you want to print a dark text (usually black) 
to a light background, it can just be overprinted. In order to make it work, 
both colours must be defined in CMYK (not RGB, not grayscale). Professional 
Acrobat since version 9 contains a prepress checking function which can verify 
whether overprint was really used. Black overprint is implemented in my 
zwpagelayout package. It was tested in xelatex, pdflatex, and latex + dvips. 
The package does not include my test files. If you like, I can send them.

Sure; I’d love to see these.
I’m sure that this would most closely approach what Phil seems to want to do.



How to achieve knockout using TeX+gs or pdfTeX or XeTeX?
I’m not at all sure. It must have a graphics state parameter.
The next image shows what I think is the relevant portion of the PDF specs.



There’s a whole section on “Transparency Groups”, but mostly it is about how 
different transparent objects
must combine to produce the final colour where objects overlap.

Transparency should not be used for prepress. It works fine on office printers 
but often comes out as black on phototypesetters and CTP.

Phil hasn’t said what is his application.

After a cursory look, I think you need to use a Form X Object, which can be 
done in pdftex using the  \pdfxform  primitive,
with appropriate attributes specified.
For XeTeX you would need to be using  \special  commands.
Someone here must have some experience with this.
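
For what it's worth, a rough pdfTeX sketch (untested; the attribute dictionary
follows the transparency-group part of the PDF spec, where  /K true  asks for
a knockout group):

\setbox0=\hbox{... the text to knock out of the background ...}
\immediate\pdfxform attr {/Group << /S /Transparency /K true >>} 0
\pdfrefxform\pdflastxform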


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz




Philip Taylor



Sorry for my error adding to confusion.

Cheers.
Stay safe.

Ross



Re: [XeTeX] Translation commands for package dramatist to put in the marathi.ldf file

2020-04-12 Thread Ross Moore
Hi Zdenek,

On 13/04/2020, at 0:00, "Zdenek Wagner" <zdenek.wag...@gmail.com> wrote:

Hi all,

This can be done with \let and \def, not with \newcommand:

\let\savedLabel\label
\def\label#1{Do something with #1 \savedLabel{#1}}

The second line could be replaced with
\renewcommand*\label[1]{Do something with #1 \savedLabel{#1}}

This idea can be extended somewhat.

\makeatletter
\def\MY@label#1{Do something with #1 \LTX@label{#1}}
\AtBeginDocument{
 \let\LTX@label\label
 \let\label\MY@label
}
\makeatother

This way you retain pointers to both your own version and the original,
so can change between them if necessary, within different parts of your
document, or within specially tailored environments.

Delaying the \let rebindings using \AtBeginDocument means that it will still 
work
if some other package (such as  hyperref ) makes its own changes,
which you also want to incorporate.


\providecommand is useful if you assume that a definition exist but you want to 
provide a default definition.

Sure.
It is particularly useful when devising templates that will be filled with 
information
provided by a database, and command-line shell software that automates
calls to TeX or LaTeX or other engine.

latex '\def\name{client name} ... \input{mytemplate.tex}'

where  mytemplate.tex  is the main LaTeX source, having
  \providecommand{\name}{some default}
and similarly for all the variable data fields.
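
e.g. a minimal  mytemplate.tex  along these lines (contents invented):

\providecommand{\name}{some default}% used only if not \def'd on the command line
\documentclass{article}
\begin{document}
Assignment cover sheet for \name.
\end{document}

Run via the command line above, the \def wins; run as plain  latex mytemplate ,
the default kicks in.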

I use this kind of setup to automate personalised assignment cover-sheets,
generated online in response to student requests from a web page.
Sometimes the full question sheet is done this way,
with versions personalized, or randomized, based upon student ID numbers.


The newcommand family is useful because it offers a default first argument, but
if you use arguments with the newcommand family, always use the star version so
that the macro is not \long. If you forget a right brace after an argument, you
will get an error message at the end of a paragraph, but without the star you
get an error message at the end of the file, hence it is difficult to locate
the source of the error.

The construct \csname scenename\endcsname expands to the control sequence
\scenename if it is already defined, or is defined to be identical to \relax if
not yet defined. When checking existence of a definition, LaTeX does the
following:

\expandafter\ifx\csname scenename\endcsname\relax
  code for \scenename not yet defined
\else
  code for \scenename already defined
\fi

With \csname  you can test for all kinds of things,
and even adjust macros like  \begin  and  \end  to patch in extra coding
for specific environments, whether a package is loaded or not.
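
e.g. a small sketch combining the two ideas (the \itshape patch is purely
illustrative):

\makeatletter
% patch the  quote  environment, but only if it is actually defined
\expandafter\ifx\csname quote\endcsname\relax\else
  \let\orig@quote\quote
  \def\quote{\orig@quote\itshape}% every quote now starts in italics
\fi
\makeatother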

The possibilities are endless.


Cheers.
Stay safe.

  Ross


Of course, the whole \else part can be omitted if you have nothing to put there.

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz


On Sun 12 Apr 2020 at 12:49, Ross Moore <ross.mo...@mq.edu.au> wrote:
Hi Phil, Zdeněk and others.

On 12 Apr 2020, at 7:46 pm, Philip Taylor <p.tay...@hellenic-institute.uk> wrote:

Zdeněk Wagner wrote:


I would not do it. Of course, you cannot use \renewcommand because
\scenename is not used in any *.ldf. You could use \def\scenename{दरश्य} […]


LaTeX has  \providecommand  with the same syntax as \newcommand  and  
\renewcommand .

It makes the definition *only* if the c-s is not already known.
This means that you can always use:

  \providecommand\mycs{}
  \renewcommand\mycs{ what I really want }

to get around such issues.


A thought — if \scenename is not known at the point that the last line of 
[gloss-]marathi.ldf is read, would there be any point in using \def \scenename 
{दरश्य}, since such definition would either get over-ridden by whatever 
subsequent definition of \scenename is causing the problem (\def, \renewcommand), 
or would prevent a subsequent \newcommand from working as \scenename would 
already be defined.  Is this not the case (he asked, as someone who barely 
understands anything that LaTeX does ...) ?

There is always a way to get what you want,
whether using Plain TeX or LaTeX or whatever other high-level macro structures.

Thus the important thing is how to make it resistant to updates, as Zdeněk said.


Philip Taylor

Hope this helps.
Stay safe.

Ross






Re: [XeTeX] Math class initialization in Unicde-aware engine

2019-11-28 Thread Ross Moore
Hi Joseph.

On 28 Nov 2019, at 6:29 pm, Joseph Wright <joseph.wri...@morningstar2.co.uk> wrote:

On 28/11/2019 00:16, Ross Moore wrote:
If by ignoring you mean removing the character entirely, then that is surely 
not best at all.
Most  N Class (Normal) characters would be simply of the default  \mathord  
class.

That is already the case: it's where IniTeX starts off, chars are mathord. So 
'nothing to do here'. Also note that some of this information is already set 
from the main Unicode file: it tells us which chars are letters.

OK. That’s what I’d expect.

I’d expect others to be mapped instead into a macro that corresponds to 
something that TeX does support.
e.g.
 space characters for  thinspace, 2-em space, etc.  in  U+2000 – U+200A
can expand into things like:   \, \; \> \quad \qquad  etc.  ( even to 
constructions like  \mskip1mu )

That's not a generic IniTeX thing, I'm afraid.

Yeah, well there are so many of these extra space characters.
I really don’t know where they are all used in practice by other (non-TeX) apps.

The Unicode data loaders are explicitly about setting up the basic data in 
Unicode TeX engines that's held in (primitive) tables.

Creating macros is the job of the 'rest' of the format. Here, presumably you 
are thinking of making chars math-active: that's well out-of-scope for the 
loader.

Fair enough; especially if this is all happening before processing any textual 
input intended for the typeset page.


After all, this is essentially what happens when pdfTeX reads raw Unicode input.

pdfTeX reads bytes, there's not really much comparison. In IniTeX mode, there 
is not much happening with UTF-8 and pdfTeX: perhaps you are thinking of with 
LaTeX?

Yes, sure I’m thinking of LaTeX; at least now that UTF-8 input has become the 
default.
Previously there would be (inputenc) package and  .def  file loading.
But, as you say above, this comes later.

One has to wonder then, how much of the Unicode range needs to be (or can be) 
handled earlier;
e.g., when there is only one sensible interpretation for the use of specific 
characters?
Conversely, how much can, or should, be left to later when there may be a 
better idea of which
(classes of) characters are present within the input source?

I suppose that is the kind of question you are dealing with; so I’ll now butt 
out of this conversation,
but still watch it if there’s further continuation.


Joseph



Cheers,

Ross





Re: [XeTeX] Math class initialization in Unicode-aware engine

2019-11-27 Thread Ross Moore
Hi Joe, Doug

On 28 Nov 2019, at 10:27 am, Joseph Wright <joseph.wri...@morningstar2.co.uk> wrote:

> # N - Normal - includes all digits and symbols requiring only one form
> # D - Diacritic
> # F - Fence - unpaired delimiter (often used as opening or closing)
> # G - Glyph_Part - piece of large operator
> # S - Space
> # U - Unary - operators that are only unary
> # X - Special - characters not covered by other classes


> Unfortunately, the documentation/comments don't say what happens to entries 
> having these other Unicode math codes (N, D, F, G, S, U, and X). Are they 
> completely ignored, or are they mapped to one of the other eight codes that 
> matches what TeX is interested in or only capable of handling?
>
> I can imagine that the space character, given Unicode math class 'S' in 
> MathClass.txt, is ignored during this parse. But what happens to the '¬' 
> character (U+00AC) ("NOT SIGN"), which is assigned 'U' (Unary Operator). 
> Surely the logical not sign is not being ignored during initialization of a 
> Unicode-aware engine, yet the comments in load-unicode-math-classes.tex don't 
> say one way or the other, and it appears to me that the parsing code is 
> ignoring it.

The other Unicode math classes don't really map directly to TeX ones, so
they are currently ignored. Suggestions for improvements here are of
course welcome.

If by ignoring you mean removing the character entirely, then that is surely 
not best at all.

Most  N Class (Normal) characters would be simply of the default  \mathord  
class.

I’d expect others to be mapped instead into a macro that corresponds to 
something that TeX does support.
e.g.
 space characters for  thinspace, 2-em space, etc.  in  U+2000 – U+200A
can expand into things like:   \, \; \> \quad \qquad  etc.  ( even to 
constructions like  \mskip1mu )

After all, this is essentially what happens when pdfTeX reads raw Unicode input.
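As a minimal sketch of the kind of mapping I mean (plain XeTeX syntax; making
U+2009 THIN SPACE act as  \,  is just an illustration):

  \catcode"2009=\active
  \begingroup \lccode`\~="2009
  \lowercase{\endgroup \def~{\,}}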

The G class (Glyph_Part) is a lot harder, as those glyph parts don’t correspond 
to any single
TeX macro. Think about a very large opening brace spanning 3+ ordinary line 
widths, say,
as may be generated by  \left\{ ... \right\}  surrounding some (inner-) 
displayed math alignment.
On input, the whole grouping would need to be identified and mapped to 
appropriate TeX coding.

Basically there is a lot here that needs to be looked at more or less individually.

I’ve been through this kind of exercise, in reverse, to decide what to specify 
as /Alt  and /ActualText
replacements (for accessibility) for what TeX produces with various math 
constructions.
I don’t have definitive answers for everything, but have tried some 
possibilities for many things.


Joseph


Hope this helps.

Ross





Re: [XeTeX] Erroneous \lastkern

2019-06-28 Thread Ross Moore
Hi Ulrike, Jonathan,

On 28 Jun 2019, at 6:59 pm, Jonathan Kew <jfkth...@gmail.com> wrote:

On 28/06/2019 08:32, Ulrike Fischer wrote:
> Am Thu, 27 Jun 2019 23:03:09 + schrieb Ross Moore:
>
>>> there is probably something wrong with XeLaTeX, but I cannot find what.
>>
>> The difference between xetex and xelatex is the font:
>>
>> I’m sorry but I don’t understand this as an answer.
>
> It wasn't meant as an answer. I only explained why you get different
> results with plain and latex: because they use different default
> fonts.

OK. But that just emphasises that there is a bug lurking here.

> xetex has a different typesetting engine: it doesn't handle chars
> but words as units.
>
> See page 31 here http://xml.web.cern.ch/XML/lgc2/xetexmain.pdf.
>
> So I'm not really surprised that you get the y, I was more suprised
> that it doesn't happen with legacy fonts - there it seem to switch
> back to the "handle characters" mode.

Yes, the bug arises because of how xetex collects a series of characters
to be "shaped" by an opentype font, rather than the core tex engine
handling each character individually. So at the point when \lastkern is
encountered, the letter A has not yet been appended to the current node
list being built; it is "pending" in the buffer of characters that will
become a whole-word node.

OK, I understand that now.
In fact I was thinking of surmising that such a mechanism could be in play.

But surely this means that XeTeX's “current list” should be that buffer, not 
TeX’s usual horizontal list.
So  \lastkern , \lastpenalty  and  \lastskip  should be getting their values 
from there.

Alternatively, and perhaps equivalently, just set all these to zero whenever we 
are adding characters
to this “pending” buffer. Only when the words are ready to be put back into the 
horizontal list,
should these be set back to what is there, if those values are still relevant 
to any typesetting tasks
at the beginning of that resulting word.


Still, I would regard this as a bug that we ought to fix. I imagine
similar primitives like \lastpenalty or \lastskip probably share the
same buggy behavior.

Yes, I would think so.

BTW, this is relevant to my Tagged PDF work, as it must insert extra literal 
material via \special  commands.
To ensure the same typesetting as without those tags, it is frequently 
necessary to transfer that previous
\skip , \kern  or \penalty  to come *after* the \special  and nullify the one 
before it.

So far this is only developed for pdfTeX, but in future we’ll want it for XeTeX 
too.
Discovering this difference now, and fixing it, will surely avoid a headache 
when that time comes.


JK


Cheers,

Ross





Re: [XeTeX] Erroneous \lastkern

2019-06-27 Thread Ross Moore
Hi Ulrike,

On 28 Jun 2019, at 6:27 am, Ulrike Fischer <ne...@nililand.de> wrote:

Am Thu, 27 Jun 2019 21:13:05 +0200 schrieb Christian Tellechea:

> there is probably something wrong with XeLaTeX, but I cannot find what.

The difference between xetex and xelatex is the font:

I’m sorry but I don’t understand this as an answer.
I would expect the test of \lastkern  to always give  n
no matter what the driver, font, with LaTeX or not.

The result from  \lastkern  (according to the TeXBook)
is that it is only non-zero when the last item on the current list is a kern.
(It is *not* a place-holder for the last kern that has been used.)


Thus the appearance of the letter A should have reset  \lastkern  to 0,
*except* if there is a kern provided by A itself (font dependent, perhaps yes);
but in that case it is unlikely to have the value of  2019sp .

Of course, maybe this value has been chosen, *because* that font does end with 
such a kern.
This would be the only valid reason that you could ever get a  y  from this 
code.


Put another way, the result of the test should not depend upon the initial use 
of  \kern2019sp  at all.
However, if it is commented-out, then  XeTeX does give  n , so the  y  is *not* 
coming from the letter A .

To me this *does* indicate a bug;
or at least a difference in how XeTeX behaves, compared with the original TeX.



\documentclass{article}

\begin{document}

Should be An: \kern2019sp A\ifnum\lastkern=2019 y\else n\fi\par

\fontencoding{OT1}\selectfont
Should be An: \kern2019sp A\ifnum\lastkern=2019 y\else n\fi\par

\end{document}

or in plain xetex:

Should be An: \kern2019sp A\ifnum\lastkern=2019 y\else n\fi\par

\font\test="[lmroman10-regular.otf]"\test
Should be An: \kern2019sp A\ifnum\lastkern=2019 y\else n\fi\par

\bye


--
Ulrike Fischer
https://www.troubleshooting-tex.de/


Cheers,

Ross





Re: Google Disk (was: XeLaTeX to Word/OpenOffice - the state of the art?)

2019-03-17 Thread Ross Moore
Hi Zdenek,

On 18 Mar 2019, at 7:58 am, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:

On Sun, 17 Mar 2019 at 19:57, Ross Moore <ross.mo...@mq.edu.au> wrote:
>

> Yes, but it is better for the CMaps to at least be appropriate, rather than 
> inaccurate or missing altogether, as can be the case. Different software 
> tools get information from different places, so ideally one needs to provide 
> the best values for all those possible places.
>
No, CMaps help for simple scripts only.

Fine. But a CMap must be present for validation.
As I said earlier (repeated below), they will not always be sufficient for 
proper Accessibility.

Let's imagine a person's name
written বৌমিক in the Bengali script and transliterated as Bowmik. OW
is a two-part matra (dependent vowel) which looks as e-matra preceding
the consonant and o-matra following the consonant. I-matra always
precedes the consonant, thus using a CMap only, the word would become
eboimak, with two spelling errors. An editor will complain about an
e-matra at the beginning of a word and an i-matra following an o-matra; the
editor will indicate missing consonants. Similarly the Hindi word स्थापित
(sthaapit) would be extracted as sthaaipat, which is wrong because
i-matra must not follow aa-matra. If I had time, I could give you
several thousand examples where CMaps fail. In the past I did many tests
with Devanagari, and without ActualText the problem cannot be solved.

I’m really happy that you have done such tests, and determined this.
It’s certainly not an area that I could have researched.
It demonstrates that supporting Accessibility properly can be a lot more 
complicated
than any single simple-minded approach would support.

This is the very reason why \XeTeXgenerateactualtext was implemented.
It is not just a problem of save as text/rtf/doc, in addition search
does not work.
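(For reference, the most basic use of that primitive is simply the following;
a sketch, not a complete document:

  \XeTeXgenerateactualtext=1
  % each shaped run is then wrapped in the page stream, roughly as
  %   /Span << /ActualText (the input characters) >> BDC ... EMC
  % so that extraction can recover the original input
)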

Great addition.
However, it’s useful only insofar as the AT gets its information from the
/ActualText.
Some screen-readers go for other places.

Indeed the PDF/A and PDF/UA specifications expect the Accessible text to come
from the  /Alt(…) tagging of the structure element parent of the tagged text.
(Obviously not all AT follow these specifications.)

This is why I suggest populating more than one place with information that is 
helpful...


> And Zdenek's comment emphasises how what might work well in one language 
> setting can be quite insufficient for others. We need to be able to 
> accommodate all things that are helpful.
> That is surely what the U (for Universal) means in PDF/UA.

… requiring an appreciation for the intricacies of the language and intended 
audiences.

>
> Cheers,
>
> Ross
>

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

I don’t see us as arguing against each other; rather we are sharing
experiences which indicate the depth of what is needed.


Cheers again,

Ross





Re: Google Disk (was: XeLaTeX to Word/OpenOffice - the state of the art?)

2019-03-17 Thread Ross Moore
Hi Andrew,

On 18/03/2019, at 0:18, "Andrew Cunningham" <lang.supp...@gmail.com> wrote:

Ross,

It is also dependent on the fonts themselves and the scripts the language is 
written in.

Absolutely.

Depending on the language and script the only way to ensure accessibility is to 
include the ActualText attributes for each relevant tag.

Indeed, provided you have supplied tagging at all, as of course should be done.

Considering how complex OpenType fonts can become for some scripts, the 
simplistic ToUnicode mappings in a PDF can be insufficient.

Yes, but it is better for the CMaps to at least be appropriate, rather than 
inaccurate or missing altogether, as can be the case. Different software tools 
get information from different places, so ideally one needs to provide the best 
values for all those possible places.

And text in a PDF may by WCAG definition be non-textual content.

Presumably you mean, adding descriptive text to graphics that convey meaningful 
information; e.g. a company logo, and most illustrations.
Of course this should be done too. But this can only be useful if the alternate 
descriptive text can be found via the structure tagging; hence the need for 
fully tagged PDF, navigable via that tagging.

And Zdenek's comment emphasises how what might work well in one language 
setting can be quite insufficient for others. We need to be able to accommodate 
all things that are helpful.
That is surely what the U (for Universal) means in PDF/UA.


Cheers,

  Ross



On Sunday, 17 March 2019, Ross Moore <ross.mo...@mq.edu.au> wrote:
Hi Karljūrgen,

On 17/03/2019, at 1:42, "Karljürgen Feuerherm" <kfeuerh...@kfeuerherm.ca> wrote:

> Ross,
>
> Your reply caught my eye, and I am now looking at the pdfx package 
> documentation.
>
> May I ask, if accessibility is a concern, why a-2b/-2u rather than -ua-1, 
> which seems directly targeted at this?

PDF/UA and PDF/A-1a,2a,3a  require a fully tagged PDF.
This is a highly non-trivial task, which requires adding much extra to the 
document, done almost entirely through \special commands. The pdfx package does 
not provide this, but is useful for meeting the Metadata and other requirements 
of these formats.

Abstractly, accessibility is about having sufficient information stored in the 
PDF for software tools to be able to build and present a description of the 
content and structure, other than the visual one. The same can be said of 
software for converting into a different format.

A significant part of this is being able to correctly identify each character 
in the fonts used within the TeX/produced PDF. Even this is a non-trivial 
problem, due to TeX's non-standard font encodings, and virtual font technique.

>
> Many thanks,
>
> K
>
>> You should use the  pdfx  package and prepare for  PDF/A-2b or -2u.
>> This fixes many of these things that affect conversions, as well as 
>> Accessibility and Archivability.
>>
>> It's not fully tagged PDF, but handles many other technical issues.
>>


Hope this helps.

Ross



--
Andrew Cunningham
lang.supp...@gmail.com





Re: Google Disk (was: XeLaTeX to Word/OpenOffice - the state of the art?)

2019-03-16 Thread Ross Moore
Hi Karljūrgen,

On 17/03/2019, at 1:42, "Karljürgen Feuerherm"  wrote:

> Ross,
> 
> Your reply caught my eye, and I am now looking at the pdfx package 
> documentation.
> 
> May I ask, if accessibility is a concern, why a-2b/-2u rather than -ua-1, 
> which seems directly targeted at this?

PDF/UA and PDF/A-1a,2a,3a  require a fully tagged PDF.
This is a highly non-trivial task, which requires adding much extra to the 
document, done almost entirely through \special commands. The pdfx package does 
not provide this, but is useful for meeting the Metadata and other requirements 
of these formats.

Abstractly, accessibility is about having sufficient information stored in the 
PDF for software tools to be able to build and present a description of the 
content and structure, other than the visual one. The same can be said of 
software for converting into a different format.

A significant part of this is being able to correctly identify each character 
in the fonts used within the TeX/produced PDF. Even this is a non-trivial 
problem, due to TeX's non-standard font encodings, and virtual font technique.

> 
> Many thanks,
> 
> K
> 
>> You should use the  pdfx  package and prepare for  PDF/A-2b or -2u.
>> This fixes many of these things that affect conversions, as well as 
>> Accessibility and Archivability.
>> 
>> It's not fully tagged PDF, but handles many other technical issues.
>> 


Hope this helps.

Ross



Re: Google Disk (was: XeLaTeX to Word/OpenOffice - the state of the art?)

2019-03-15 Thread Ross Moore
Hi Janusz,

On 16/03/2019, at 17:51, "Janusz S. Bień" <jsb...@mimuw.edu.pl> wrote:

Sorry for the previous mail sent by mistake (the shortcuts in Gnus are
sometimes confusing...)


On Fri, Mar 15 2019 at 13:34 +01, BPJ wrote:
> Den 2019-03-15 kl. 08:31, skrev Janusz S. Bień:
>> On Fri, Mar 15 2019 at 7:19 +01, BPJ wrote:
>>> I use, despite myself, Google Docs to convert PDF to DOCX,

For me the quality is similar to Acrobat 9, i.e. completely not
acceptable: spaces between words are often missing.

This is inherent in the way TeX was written.
But there are ways to tackle the issue.

You should use the  pdfx  package and prepare for  PDF/A-2b or -2u.
This fixes many of these things that affect conversions, as well as 
Accessibility and Archivability.

It's not fully tagged PDF, but handles many other technical issues.
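In its simplest form that means just (a sketch; the document metadata goes
into a separate  \jobname.xmpdata  file):

  \documentclass{article}
  \usepackage[a-2u]{pdfx}%  or [a-2b], [x-1a], etc.
  \begin{document}
  ...
  \end{document}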



Best regards

Janusz

--
,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien


Hope this helps.

   Ross


Re: make4ht problem

2019-03-12 Thread Ross Moore
Hi Janusz,

This is the kind of incompatibility problem that I like to try nailing down.
But like Phil, my recommendation is to try Acrobat saving to Word.
Adobe has put a lot of effort into making this better.

It will become easier when TeX supports creating Tagged PDF.
But that's still a fair way off.


Cheers.

   Ross


On 12/03/2019, at 21:10, "Janusz S. Bień" <jsb...@mimuw.edu.pl> wrote:

On Tue, Mar 12 2019 at 10:38 +01, Michal Hoftich wrote:
> Hi Janusz,
>
>> --8<---cut here---start->8---
>> (/usr/share/texlive/texmf-dist/tex/generic/tex4ht/biblatex.4ht
>> ! Undefined control sequence.
>>  ...docsvlist \expandafter {\bbl@loaded
>> }\fi
>> l.228 \fi}{}
>> --8<---cut here---end--->8---
>
> It is hard to say what is going on without a TeX example. This seems
> like an issue with BibLaTeX support, but without trying an actual TeX
> example it is hard to guess what the problem is.

I don't mind sending the source files to anybody interested. I can also
try to prepare a minimal example.

Best regards

Janusz

--
,
Janusz S. Bien
emeryt (emeritus)
https://sites.google.com/view/jsbien


Re: [XeTeX] Fake italics for some characters only

2018-12-05 Thread Ross Moore
BPJ and John Was,

Please join this XeTeX list.
Otherwise I have to authorize each of your postings.
This delays them being sent out to everyone.


Cheers.

 Ross

On 05/12/2018, at 22:10, "BPJ" <b...@melroch.se> wrote:

@Zdenek, the point is that other characters inside `\textit` should be real 
italics. I at least have tried it using a macro around the "culprit" characters, 
and I think it looks better than fake italics throughout, which looks really 
bad (shades of low-budget publications from the early eighties! :-). Anyway I'm 
working on a solution in my head which I'll try when I get back to my desktop. 
I think I'll try to use a boolean which I set/unset at the start/end of my 
`\mytextit`, and a single macro for the active characters which checks this 
boolean. I have no idea yet if it will work, but it seems the semantically 
cleanest way to do it to my mind.

/bpj

On Wed, 5 Dec 2018 at 10:53, Zdenek Wagner <zdenek.wag...@gmail.com> wrote:
Hi,

I am afraid that I do not understand why to make only 4 FakeSlant
characters instead of a FakeSlant font. Does it mean that other
characters will remain upright inside \textit?

Anyway, making a few characters active for \textit is quite simple.
Let's suppose that A and B should be active. You then define:

\def\mytextit{\begingroup \catcode`\A=13 \catcode`\B=13 \dotextit}
\def\dotextit#1{\textit{#1}\endgroup}

You will then call \mytextit{Test of A and B}
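(What the active characters should expand to is left open here. A hedged
completion, with fontspec assumed and the font name and slant value as mere
placeholders, could be:

  \newfontface\fakeslant{Latin Modern Roman}[FakeSlant=0.25]
  \begingroup
  \catcode`\A=13 \catcode`\B=13
  \gdef A{{\fakeslant\char`\A}}%
  \gdef B{{\fakeslant\char`\B}}%
  \endgroup
)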

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

On Wed, 5 Dec 2018 at 5:51, Alan Munn <am...@gmx.com> wrote:
>
> Can you provide a bit more detail? Maybe a small example document?
>
> Alan
>
>
> Benct Philip Jonsson wrote:
> > I have a somewhat unusual problem. In a document produced using
> > XeLaTeX I need to use four Unicode letters with scarce font support in
> > italicized words and passages but the font which I have to use
> > supports these characters only in roman. The obvious solution is to
> > use the FakeSlant feature of fontspec but I don’t want to enclose
> > these characters in a command argument, in the hope that a future
> > version of the document can use an italic font which supports these
> > characters, but neither do I (perhaps needless to say) want to use
> > fake italics except for these four characters. In other words I would
> > like to perform some kind of “keyhole surgery” in the preamble and use
> > these characters normally in the body of the document, which I guess
> > means having to make them active and somehow detect when they are
> > inside the argument of `\textit`. (Note: it is appropriate to use
> > `\textit` rather than `\emph` here because the purpose of the
> > italicization is to mark text as being in an object language in a
> > linguistic text.) Is that at all possible? I guess I could wrap
> > `\textit` in a macro which locally redefines the active characters,
> > but I’m not sure how to do that, nor how to access the glyphs
> > corresponding to the characters once the characters are active. I am a
> > user who isn’t afraid of using and making the most of various packages
> > or of writing an occasional custom command to wrap up some repeatedly
> > needed operation, but I am no expert. I am aware of all the arguments
> > against fake italics — that is why I want to limit the damage as much
> > as possible! — but I have no choice here. Waiting for the/an
> > appropriate font to include italic versions of these characters is not
> > an option at the moment.
> >
> > /Benct
> >
> >
> >


Re: [XeTeX] Controlling font embedding in PDF output

2018-11-16 Thread Ross Moore
Hi Werner,

On 17/11/2018, at 1:36, "Werner LEMBERG"  wrote:

> 
> > > Is there a simple option to make XeTeX (or rather xdvipdfmx) not
> > > embed fonts in PDFs? I'm going to post-process the output, which
> > > will do the embedding.
> >
> > Perhaps it is easier to generate the PDF, then remove the embedded
> > fonts?
> 
> Not for my use case, which is to include many PDFs (generated by
> LilyPond) into a master PDF (generated by XeLaTeX). The
> post-processor (Ghostscript's ps2pdf script) should then compute
> subsetted fonts for the whole document, which can make the final PDF
> *a lot* smaller in comparison to the standard way because subsetted
> fonts usually can't be merged.

Are you sure that this is even feasible, in that the same characters are 
referred to in the same way, in each of the Lilypond PDFs?

If the fonts are all Type1, with the same encodings in each PDF, this would be 
OK.
But I've seen PDFs where the subsetting of Type0 or TTF fonts is done via an array, 
which simply assigns a number to the used glyphs, perhaps in the order of first 
occurrence within the PDF. These certainly cannot be merged, without adjusting 
essentially every string in every embedded PDF.

> 
> In LilyPond I can control whether its output PDF gets generated
> (1) the usual way (using subsetted fonts), (2) with embedded but not
> subsetted fonts, or (3) without embedded fonts. Ideally, I want
> option (3) for XeTeX (and for pdfTeX and luatex also, BTW). If this
> isn't possible, I would like to enforce option (2) so that ps2pdf can
> still do a decent job (at the cost of larger intermediate PDFs).

If you can get this to work, I'd be very interested in the technique.
Otherwise, a possible alternative approach is to combine the PDFs into a single 
Portfolio, using Adobe's Acrobat Pro. However I'd doubt that this gives any 
saving in file size over inclusion as attachments.

> 
> 
> Werner
> 

Hope this helps.

Ross






Re: [XeTeX] Could Adobe Photoshop's "blending options" for text be supported in a future {Pdf|Xe}TeX variant

2018-04-04 Thread Ross Moore
Hi again Phil,

On Apr 5, 2018, at 8:39 AM, Ross Moore <ross.mo...@mq.edu.au> wrote:


  *   https://www.dropbox.com/s/b7a1383rb1dx2vp/Ao%20dais.pdf?dl=0
  *   https://www.dropbox.com/s/7s6s7n9w8popiyg/MENU%20001%20new%20ellipse.pdf?dl=0
  *   https://www.dropbox.com/s/smmcjy9zuuxa1nu/MENU%20001%20%28metallic%20gold%20text%20demo%29.pdf?dl=0

I would be interested in others' reactions to this.

These are using PDF’s concept of “Text Rendering” modes.
In particular  7 Tr   meaning mode 7,
which uses the outlines of characters to be the clipping path for an underlying 
graphic.

I was able to extract the “underlying graphics” from the 1st of your PDFs.
Here is the one for the 2nd line of text.

[embedded image omitted]

You can see the need to have a clipping path, created from the text using the 
same font that was “blended” to make this image.
The need for exact positioning can also be appreciated from the image.


Below is the image for the gold text in the 3rd PDF.
Again the need for clipping is apparent.

[embedded image omitted]


Thus the letter shapes restrict what parts of the graphic come shining through.

This is essentially already available with pdfTeX; viz.

   https://tex.stackexchange.com/questions/250156/problem-with-pdfliteral/250162#250162


There is one part missing:  how to make the underlying graphic correctly?
e.g., to have letters looking like they are embossed, or standing out in 3D, 
etc.

You need to construct the desired view in an image, and then place the actual 
characters,
with appropriate rendering mode, exactly over that image so that only the 
desired parts are shown.
This requires external image-processing software, which is what you paid Adobe 
to do with Photoshop.
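A bare-bones sketch of the mechanism, for pdfTeX with graphicx; the image
name is a placeholder and the exact positioning is glossed over (see the
linked answer above for a worked example):

  \pdfliteral{q 7 Tr}%  glyph outlines now accumulate as a clipping path
  \rlap{\Huge\bfseries GOLD}%  in mode 7 the text itself paints nothing
  \pdfliteral{0 Tr}%
  \includegraphics[width=3cm]{gold-texture}%  shows only through the letters
  \pdfliteral{Q}%  restore the graphics state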




Philip Taylor

Hope this helps.

Ross







Re: [XeTeX] Could Adobe Photoshop's "blending options" for text be supported in a future {Pdf|Xe}TeX variant

2018-04-04 Thread Ross Moore
Hi Phil.

On Apr 5, 2018, at 4:38 AM, Philip Taylor (RHUoL) <p.tay...@rhul.ac.uk> wrote:


I have been playing with Adobe Photoshop's "blending options" for text 
recently, adding a gold or metallic texture to otherwise plain text.  The 
results are visually very striking, and I therefore began to wonder whether 
similar functionality might one day be added to Pdf/XeTeX, in the former case 
natively and in the latter case via \specials and an extended (x)dvipdfm(x) 
driver.

Three examples of the sorts of effect I have in mind can be seen at :

  *   https://www.dropbox.com/s/b7a1383rb1dx2vp/Ao%20dais.pdf?dl=0
  *   https://www.dropbox.com/s/7s6s7n9w8popiyg/MENU%20001%20new%20ellipse.pdf?dl=0
  *   https://www.dropbox.com/s/smmcjy9zuuxa1nu/MENU%20001%20%28metallic%20gold%20text%20demo%29.pdf?dl=0

I would be interested in others' reactions to this.

These are using PDF’s concept of “Text Rendering” modes.
In particular  7 Tr   meaning mode 7,
which uses the outlines of characters to be the clipping path for an underlying 
graphic.
Thus the letter shapes restrict what parts of the graphic come shining through.

This is essentially already available with pdfTeX; viz.

   https://tex.stackexchange.com/questions/250156/problem-with-pdfliteral/250162#250162


There is one part missing:  how to make the underlying graphic correctly?
e.g., to have letters looking like they are embossed, or standing out in 3D, 
etc.

You need to construct the desired view in an image, and then place the actual 
characters,
with appropriate rendering mode, exactly over that image so that only the 
desired parts are shown.
This requires external image-processing software, which is what you paid Adobe 
to do with Photoshop.




Philip Taylor

Hope this helps.

Ross







Re: [XeTeX] xelatex to doc?

2018-01-30 Thread Ross Moore
Hi Robert and Michal,

On Jan 31, 2018, at 8:29 AM, Michal Hoftich <michal@gmail.com> wrote:

Hi Bob,

On Tue, Jan 30, 2018 at 6:57 PM, Hueckstedt, Robert A. (rah2k) <ra...@virginia.edu> wrote:
With a publisher’s permission I used xelatex to provide them copy, not
camera-ready copy, for a long book that has Sanskrit in Devanagari  and an
English translation. Of course, the files I provided the publisher are pdfs.
Now, the publisher wants them in doc. When they try to cut and paste from
the pdf to doc, none of the conjunct consonants are recognized in the doc
file. I used the velthuis-sanskrit mapping, and I am wondering if using the
RomDev mapping would make a difference. I somehow doubt it. Suggestions?

You can try to compile your TeX file to HTML using tex4ht. The HTML
code can be then pasted to Word. Basic command to compile XeTeX file
to HTML is

  make4ht -ux filename.tex

This might work, but first I’d try using Acrobat Pro to save the PDF
directly into a Word document.

This *can* work really well, especially when the PDF is enriched with
some tagging and the correct ToUnicode CMap resources for the fonts.
Try it and see if the result is reasonable.

Alternatively, you can Export to HTML from Acrobat Pro; though I’d
expect that if the .doc export is no good, then the HTML export would
suffer from similar issues.

It may even be that Adobe Reader can do these exports now,
as it is the same code-base.


Development version of make4ht can compile also to the ODT format,
which can be opened directly in Word:

  make4ht -ux -f odt filename.tex

It is possible that you will need some additional configurations for
the correct compilation. It depends on used packages or custom macros
in the document.

Best regards,
Michal


Hope this helps.

Ross


PS. if you don’t have access to Acrobat Pro to try this,
can you send me a few pages. I’ll then try it for you.
If the result is good, that may be sufficient reason for you
to consider investing in a license.







Re: [XeTeX] metalogo and bidi packages

2017-06-19 Thread Ross Moore
Hi Adam,

On Jun 20, 2017, at 8:10 AM, maxwell <maxw...@umiacs.umd.edu> wrote:

I've installed the new TeXLive 2017.  There is a conflict between the metalogo 
and bidi packages.  I don't suppose this would be a biggie, except that the 
xltxtra package loads metalogo.  (And something else I'm using loads xltxtra...)

The conflict is shown in this minimal example:
--
\documentclass{report}
\usepackage{metalogo}
\usepackage{bidi}

It happens while processing the file
   latex-xetex-bidi.def

since metalogo has already defined macros:  \XeTeX  and  \XeLaTeX  .


\begin{document}
hi
\end{document}
--

The error msg is:

(/home/groups/tools/texlive/2017/texmf-dist/tex/xelatex/bidi/latex-xetex-bidi.def

! LaTeX Error: Command \XeTeX already defined.
  Or name \end... illegal, see p.192 of the manual.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H   for immediate help.
...

l.122 ...di@reflect@box{E}}\kern-.1667em \TeX}}$}}


The reverse loading order (bidi, then metalogo) triggers an error msg from bidi 
about loading order, and probably wouldn't help anyway.

This following works smoothly, and allows access to both versions of the logo.
Notice that bidi’s version of \XeLaTeX is slightly narrower than the one from  
metalogo.

\documentclass{report}
\usepackage{graphicx}
\usepackage{fontspec}
\usepackage{bidi}
\usepackage{metalogo}
%\usepackage{bidi}

% comment or delete these lines, in practice
\makeatletter
\show\XeTeX
\show\original@XeTeX
\makeatother

\begin{document}
Hi, from
\XeTeX\ and \XeLaTeX!

\makeatletter
Hi, from
\original@XeTeX\ and \original@XeLaTeX!
\makeatother

\end{document}




For the time being, doing the following before bidi is loaded seems to solve 
the problem:
-
\let\XeTeX\relax
\let\XeLaTeX\relax
-

  Mike Maxwell
  University of Maryland









Re: [XeTeX] pst-fill boxfill failure when compiling with XeLaTeX

2017-06-13 Thread Ross Moore
Hello Daniel,

On 14/06/2017, at 7:45, "Daniel Greenhoe"  wrote:

> Probably the most important reason I would like the XeTeX environment
> is because of the unicode font handling and ease of font switching
> (when the graphic includes text). However, even in that case, I could
> render the graphic with dvips+ps2pdf (as you said) and then apply the
> text on top of that using XeTeX.

There are several environments that help with this kind of thing; e.g.,
LaTeX's {picture} environment, TikZ, and Xy-pic's \xyimport function.

The latter is extremely versatile, as it sets up a coordinate system based on 
the size of the imported image, without needing to know explicit dimensions.
Then you can use it to go anywhere within the image and use any of Xy-pic's 
graphic elements to place text, draw lines and arrows in different styles, put 
frames around parts of the picture, and much more. All this in a coordinate 
independent way, in case you decide to rescale the imported image, but retain 
the same font sizes.
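A rough sketch of that last approach, with \usepackage[all]{xy} and graphicx
loaded; the file name is a placeholder, and the coordinates refer to the
10 x 10 grid that \xyimport imposes on the image:

  \begin{xy}
  \xyimport(10,10){\includegraphics[width=6cm]{picture}}
  ,(5,8)*+{label}%       drop a label at (5,8)
  ,(2,2);(8,2)**@{-}%    draw a line across the lower part
  \end{xy}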

> 
> Thank you again,
> Daniel


Hope this helps.

   Ross




Re: [XeTeX] :letterspace oddities

2016-08-26 Thread Ross Moore
Hi Jonathan, Zdenek, Phil,

On 26/08/2016, at 19:42, "Jonathan Kew"  wrote:

> On 25/8/16 18:02, Philip Taylor wrote:
>> For some time now I have been partially aware of some oddities in the
>> XeTeX implementation of :letterspace, but it was only today that my
>> thoughts crystallised sufficiently for me to attempt to record them on-
>> list :
>> 
>> 1) Search functionality.
>> 
>> For :
>> 
>> \font \errorfont = "Copperplate Gothic Bold:letterspace=8,color=BF"
>> scaled 2260
>> 
>> \newbox \errorbox
>> 
>> \setbox \errorbox = \leftline {\errorfont +++ NOT AT TOP OF PAGE +++}


>> Adobe Acrobat 7.1 has no problem locating the string "+++" if the
>> contents of \errorbox end up in the PDF file; however, for
>> 
>> \font \errorfont = "Copperplate Gothic Bold:letterspace=16,color=BF"
>> scaled 2260
>> 
>> the same string cannot be found.


> 
> Remember that TeX doesn't treat spaces as "characters" but as glue, which 
> means they don't end up as part of the *text* in the resulting DVI or PDF 
> file; they are merely implied by the positioning of the visible glyphs.
> 
> As a result, consider what Acrobat must be doing: it can "see" the visible 
> glyphs and their positions, but it "sees" no space characters separating 
> words. It must be inferring which characters are adjacent in the text stream, 
> and which are separated by spaces, purely from their positions. So when you 
> add a substantial amount of letter-spacing, it seems likely that Acrobat will 
> view the text as being "+ + +" rather than "+++".

Yes, this is a very good way of explaining it.

TeX's failure to include actual spaces in the output text-strings within the 
PDF is a double-edged sword. 
  On one hand, by treating spaces as glue, it is what allows TeX to produce the 
high-quality visual appearance that it does;
  but on the other hand this is the reason why normal TeX-produced PDFs do not 
work with Acrobat's 'Reflow' feature, when this setting is requested in the 
viewer.
(Think about how text in a web browser readjusts to fit a window when the 
viewing font size is increased, or when the window size is reduced.) 
For many people, especially those with eyesight difficulties, Reflow is 
extremely important. 

With small screens, as on smartphones and tablets, the lack of reflow within 
most PDF readers is one of the biggest objections to the use of PDF as a file 
format, as compared with HTML and XML-based formats, which do allow reflow.
With the proliferation of PDF 2.0, PDF/UA and Tagged PDF formats generally 
(e.g., as international standards), TeX will never be properly in the game 
unless the output is adjusted to include spaces within the output strings, in 
the font being used for the text.

Note that pdfTeX now has a mode that allows 'fake' spaces to be inserted, based 
upon the distance between letters, when sufficient for it to be reasonably 
inferred that a space must have been in the original input. But these are in a 
different font to the surrounding text, and as such are not regarded by Adobe 
Acrobat/Reader to be part of normal text strings, for the purpose of reflow.
Besides, the continual switching of fonts between text and fake spaces, adds 
quite a bit to the total size of the PDF file.
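(For reference, that pdfTeX mode is enabled with just the following; a
sketch, requiring pdfTeX 1.40.15 or later:

  \input{glyphtounicode}%  proper ToUnicode CMaps, for text extraction
  \pdfgentounicode=1
  \pdfinterwordspaceon%    insert the 'fake' interword space glyphs
)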

This is one direction that could be explored by the XeTeX, and dvipdfmx 
developers.
Develop a method to reinsert spaces into the PDF output, without altering the 
spacing in the non-reflowed view.


> It's possible that \XeTeXgenerateactualtext=1 would help,

How does this work?
Does it use a heuristic to infer that a space was originally present?
Or does it only work with syllables and special characters?
Can a user provide customized input to the actual-text strings, that will not 
affect typesetting?

> as I think it would annotate the letter-spaced "+++" as a unit with its 
> actual text, allowing Acrobat to find it correctly despite the intervening 
> spaces that *appear* to be present from just looking at the glyphs.

I'd certainly like to see the results of this kind of testing.

> 
> JK

Hope this helps.

Ross






Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-15 Thread Ross Moore
Hi Phil.

On Jul 15, 2016, at 5:28 PM, Philip Taylor <p.tay...@rhul.ac.uk> wrote:

 I am intrigued to know why a package intended to support colour would want to 
set page size, though, and wonder from where it gets its information regarding 
the intended page size,

One of my messages answered this.
It is so that  \setpagecolor  can work correctly.
A coloured rectangle is drawn, at the size of the full page.
 \shipout  is patched to do it on every page.

since by the time that package {color} is loaded I have set all possible page 
dimensions to my intended size (B5, in this case).

Try  \pagecolor{yellow}  or somesuch.
Enjoy.


** Phil.
--

Philip Taylor

Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-15 Thread Ross Moore
Hi David,

On Jul 15, 2016, at 4:47 PM, David Carlisle <d.p.carli...@gmail.com> wrote:

Is there a mechanism similar to  \hypersetup  that allows the options to be 
changed
*after* the package has been loaded?

Not really, although the actual setting is in \AtBeginDocument{\AtBeginDVI  so 
as long
as you set the relevant registers to the right value it'll get set up (but let 
me know if you need more hooks)

No, I don’t need any more hooks.
A previous message explained why.

There is a very minor issue, as follows.

PDF/X generally requires CMYK color space, whereas PDF/A and PDF/E usually use 
RGB colors.
If one specifies  \pagecolor{yellow}  say, then it will use whatever color 
space was stated
when ‘yellow’ was defined as a color.
This would lead to a validation problem if it wasn’t the same space as the PDF 
type requires.

The “fix” for this is to use the  xcolor  package, and force conversions into 
the right color space.
Thank you for writing this, so many years ago!
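Concretely, the forcing is done with xcolor's target-model option; one or
the other, of course:

  \usepackage[cmyk]{xcolor}%  PDF/X: convert all colour definitions to CMYK
  \usepackage[rgb]{xcolor}%   PDF/A: convert them to RGB instead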

Currently  pdfx.sty  force loads  xcolor   for PDF/X.
For some reason the command with PDF/A was commented-out, so that  xcolor would 
be loaded
only if requested by the document author, as per usual.
(I must have been testing something and didn’t uncomment it before releasing 
the package;
or maybe I thought more tests were needed, and just forgot about it.)

I do, however, put in a check that prevents the author from using the wrong 
color space.

The next version of  pdfx.sty  will force loading of  xcolor  in all 
situations, since there’s
no easy way of knowing what or when the author might request colours.


What is the issue?
  xcolor  can be very noisy:
viz.

Package xcolor Warning: Incompatible color definition on input line 3993.

[48]][48

Package xcolor Warning: Incompatible color definition on input line 4027.


Package xcolor Warning: Incompatible color definition on input line 4101.


Package xcolor Warning: Incompatible color definition on input line 4115.

[49]][49

That's 3 warnings per page.
Fortunately the final result is fine, passing validation.

xcolor  has an option  hideerrors  but this doesn’t suppress these warnings.



Cheers

Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-14 Thread Ross Moore
Hi again David,

On Jul 15, 2016, at 9:19 AM, Ross Moore <ross.mo...@mq.edu.au> wrote:

Hi David,

There is a potential conflict with  pdfx.sty  setting the  /MediaBox .

OK, I see what is going on now.
You are allowing a colored rectangle to be drawn the size of the page,
to support coloured pages, yes?

Nothing to do with the PDF Boxes, except of course you want the sizes
to match; especially when a  \mag  is used.



what does \usepackage [nosetpagesize]{color} do?

Is there a mechanism similar to  \hypersetup  that allows the options to be 
changed
*after* the package has been loaded?

OK; it just sets a boolean flag.


Alternatively, can I detect whether the  pagesize  special has been done 
already?
Then not repeat specifying  /MediaBox  when setting the other boxes:  
Bleed/Crop/Trim
which are required for PDF/X validation.

If not loaded yet, I can do  \PassOptionsToPackage{nosetpagesize}{color} .
But I’ll want to catch the case also if it is loaded.

Looks like this won’t be necessary.
The question now will be how having such colored pages affects validation.
Hopefully not at all, for PDF/X and PDF/A.

Maybe PDF/UA, according to the actual colors, but that would be a visual check 
not automated.


Thanks for a new code-branch to try out.


Cheers

 Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-14 Thread Ross Moore
Hi David,

On Jul 15, 2016, at 8:49 AM, David Carlisle <d.p.carli...@gmail.com> wrote:

well that'll be the page size special change as mentioned earlier I assume.

Hmm. In which version of  color.sty  was this introduced?
Presumably later than:

Package: color 2016/05/09 v1.1c Standard LaTeX Color (DPC)

There is a potential conflict with  pdfx.sty  setting the  /MediaBox .


what does \usepackage [nosetpagesize]{color} do?

Is there a mechanism similar to  \hypersetup  that allows the options to be 
changed
*after* the package has been loaded?

Alternatively, can I detect whether the  pagesize  special has been done 
already?
Then not repeat specifying  /MediaBox  when setting the other boxes:  
Bleed/Crop/Trim
which are required for PDF/X validation.

If not loaded yet, I can do  \PassOptionsToPackage{nosetpagesize}{color} .
But I’ll want to catch the case also if it is loaded.
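Something along these lines is what I have in mind (a sketch of the guard):

  \makeatletter
  \@ifpackageloaded{color}{%
    % too late for the option; the page boxes must be re-asserted instead
  }{%
    \PassOptionsToPackage{nosetpagesize}{color}%
  }%
  \makeatother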


Cheers

Ross







Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-14 Thread Ross Moore
Hi Phil,

On Jul 15, 2016, at 6:02 AM, Philip Taylor <p.tay...@rhul.ac.uk> wrote:

Very happy to believe that, Ross (and will test later) but what I do not 
understand is why things have changed so dramatically in going from TeX Live 
2014 to 2016.  In 2014, adding the PDF/X-1A specials made no difference 
whatsoever to the position or size of the page; in 2016, they cause a vertical 
displacement.

There have been significant changes, at least in  xdvipdfmx  since 2014.

And the interaction with the color package is also inexplicable.  Anyhow, I 
first have to come up with a truly MWE and then I will be in a better position 
to investigate and report back.

Please send me such MWE when you have it.

I’ve been trying a real-world example (Serbian version of TeXLive documentation)
that compiles fine (with one small hiccup that doesn’t seem to affect the 
result)
using 2016’s  xdvipdfmx , but which crashes out almost immediately with 2014 
saying:

SCI:TL-SR16 ross$ /usr/local/texlive/2014/bin/universal-darwin/xdvipdfmx -z 0 
-vv --kpathsea-debug 4095 texlive-sr.xdv
texlive-sr.xdv
 -> texlive-sr.pdf
kdebug:fopen(texlive-sr.xdv, rb) => 0xa0e103ec
DVI ID = 7

xdvipdfmx:fatal: Something is wrong. Are you sure this is a DVI file?

Output file removed.
SCI:TL-SR16 ross$

Note that I have maximum verbosity turned on, as well as a lot of  kpathsea  
tracing.
Yet still it isn’t clear where it is going wrong.

So I’d appreciate your cut-down MWE to test with both versions.
Then we can play with /MediaBox and /CropBox values, to see whether that
is the cause of what you are getting. Or whether it is something else.



** Phil.

Cheers

Ross




Re: [XeTeX] [tex-live] Seemingly inexplicable shift in page origin between TL 2014 and TL 2016

2016-07-14 Thread Ross Moore
Hi Phil,

On Jul 14, 2016, at 10:02 PM, Philip Taylor 
mailto:p.tay...@rhul.ac.uk>> wrote:


Hallo David, and thank you for your suggestions.  I have now found the 
following :

1) The primary cause of the displacement is the PDF/X-1A:2003 specials -- 
remove these, and the page returns to its normal vertical position, but shifted 
outwards (leftwards) somewhat;
2) The use of \usepackage {color} within e-plain's \beginpackages ... 
\endpackages; in TeX Live 2016, this makes the page size considerably wider, 
IFF the PDF/X-1A specials are not emitted.

The displacement is visible both in TeXworks viewer and in Adobe Acrobat.


I think what you are seeing is due to the  /CropBox settings.
A PDF viewer shows the contents of the cropped area,
scaling to fill either the width or height or both.

There are 2 kinds of test that you can do.

1.  Print a few pages of your document, once with the \specials, once without.
Is there any difference in the position of your content on the printed page?

2. Vary the numbers in  /CropBox [ a b c d ] ;
such that (a,b) is bottom-left  and  (c,d)  is top-right corners of a 
rectangle.
Values such as  [ 200 200 300 300 ] should crop to a small-ish portion of a 
page,
which is then scaled up to fit your window-size.
   Observe the value of your browser’s scaling factor.

   With different values of [ a b c d ] you can simulate vertical and 
horizontal shifts,
to a small extent, according to how much smaller the /CropBox is, compared to
the /MediaBox.
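
   A quick way to experiment is via xdvipdfmx specials, set page-by-page;
a sketch, with an A4-sized /MediaBox in big points and the small /CropBox
from above (the values are illustrative only):

    \special{pdf: put @thispage << /MediaBox [ 0 0 595.28 841.89 ] >>}%
    \special{pdf: put @thispage << /CropBox [ 200 200 300 300 ] >>}%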

Does this interpretation agree with what you observe?



A4 is indeed the default in both, and the difference between A4 and letter is 
only 3/4", whereas it appears to require a 1" correction ...

** Phil.


Cheers,

Ross




Re: [XeTeX] Σχετ: Plain XeTeX, pdftitle, pdfinfo

2016-07-13 Thread Ross Moore

On Jul 11, 2016, at 7:59 PM, Zdenek Wagner 
mailto:zdenek.wag...@gmail.com>> wrote:

Especially this one, it depends on a lot of files. I wanted to extract ideas 
how to build the XMP, how to include the ICC but I gave up.

XMP is done via a template file; e.g.  pdfx.xmp  or  pdfa.xmp .
There are many places where information can be supplied, via macros
such as  \xmp@Subject  and  \xmp@Author .
Much of the  pdfx  package is about supplying values for these, in UTF8 
encoding.


I know writing XMP is easy; I do not know how to include it. I do not like to
use hyperref for a document that will only be printed and never be online.

Using pdfTeX it is done like this in  pdfx.sty :

   \def\pdfx@numcoords{/N 4}% for CMYK colors
   \immediate\pdfobj stream attr{\pdfx@numcoords} file %
 {\pdfx@CMYKcolorprofiledir\pdfx@cmyk@profile}%
   \edef\OBJ@CMYK{\the\pdflastobj\space 0 R}%

Then that object reference in \OBJ@CMYK  is required for the  OutputIntent .
viz.

  \def\pdfx@outintent@dict{%
/Type/OutputIntent
/S/GTS_PDFX^^J
/OutputCondition (\pdfx@cmyk@intent)^^J
/OutputConditionIdentifier (\pdfx@cmyk@identifier)^^J
/Info(\pdfx@cmyk@intent)^^J
/RegistryName(\pdfx@cmyk@registry)
/DestOutputProfile \OBJ@CMYK
   }%

which is linked to the PDF Catalog via:

 \immediate\pdfobj{<<\pdfx@outintent@dict>>}%
  \edef\pdfx@outintents{[\the\pdflastobj\space 0 R]}%
 \def\pdfx@outcatalog@dict{%
  /ViewerPreferences <>
  /OutputIntents \pdfx@outintents % needs appropriate expansion
 }%
 \pdfcatalog{\pdfx@outcatalog@dict}%


Of course you need to supply all the information for the macros:
  \pdfx@cmyk@….
and  \pdfx@CMYKcolorprofiledir  (possibly empty).


Using XeTeX there is similar coding using  \special s,
including symbolic names for object references.
e.g.

\def\OBJ@CMYK{@colorprofile}%
\special{pdf:fstream @colorprofile %
  (\pdfx@CMYKcolorprofiledir\pdfx@cmyk@profile) <<\pdfx@numcoords >>}
   \def\pdfx@outintents{ @outintentsarray }%
   \def\pdfx@outintentref{ @outintent@dict }%
   \immediate\special{pdf:obj \pdfx@outintentref << \pdfx@outintent@dict >>}
   \immediate\special{pdf:obj \pdfx@outintents [ ]}%
   \immediate\special{pdf:put \pdfx@outintents \pdfx@outintent@dict}%

with \pdfcatalog defined appropriately:

 \def\pdfx@catalog@xetex#1{\special{pdf:put @catalog <<#1>>}}


You should be able to put all the pieces together now.
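
For instance, linking the OutputIntents array into the catalog then reduces
to a sketch like this, using the macros defined above:

 \pdfx@catalog@xetex{/OutputIntents \pdfx@outintents}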

Cheers,

Ross


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz<http://icebearsoft.euweb.cz/>




Re: [XeTeX] Σχετ: Plain XeTeX, pdftitle, pdfinfo

2016-07-11 Thread Ross Moore
There are many places where information can be supplied, via macros
such as  \xmp@Subject  and  \xmp@Author .
Much of the  pdfx  package is about supplying values for these, in UTF8 
encoding.



Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz<http://icebearsoft.euweb.cz/>


Cheers

Ross




Re: [XeTeX] Σχετ: Plain XeTeX, pdftitle, pdfinfo

2016-07-10 Thread Ross Moore
…inserted into \shipout to insert the bounding boxes on each page;

  pdfx.sty  uses\RequirePackage{everyshi}  and  \EveryShipout   for this.
It’s perhaps a bit of overkill, but a standard way to patch  
\shipout .

  2.  the colours will need to be converted to the desired output profile
using Adobe Acrobat;

pdfx.sty  uses the  xcolor  package to handle this.
   Once a Color Profile is declared (either CMYK or RGB) the 
appropriate options
   are prepared for  xcolor  then the package is loaded with these 
options.
   Internal macros are rigged to stop changes being made, if the 
author tries to
   load the package separately. Similarly if  color  was loaded 
before  pdfx ,
   then appropriate coding imposes the correct color space.

   The upshot of this is that whenever a color is requested by name 
(‘blue’, ‘red’,
‘green’, ‘magenta’, etc.) then the correct color space 
coordinates are used.
Also, if a new color is declared (say as RGB) but the color 
model is CMYK,
then a conversion is done on the declaration, giving CMYK 
coords when that
new color is used.


  3.  the file will need to be reduced in size with Acrobat 4+ compatibility,
but with no image compression, in order to convert it to PDF 1.3;

 Not sure of the specifics of this.
 Can anyone provide example documents?
 If this is really an issue, does  xdvipdfmx  have command-line options
 which allow specifying what can be compressed and what not?
 I don’t think so.

Such control is needed also to have uncompressed XMP Metadata
but compressed content streams, in all flavors/levels of PDF/A.
This is something that is highly desirable.

  pdftex and luatex already do this right, as also will Ghostscript when 
v9.20 emerges
from pre-release status.
The next version (1.5.9) of  pdfx.sty  will fully support  latex+dvips+GS  
using this.


  4.  the dimensions of the bounding boxes are for B5 in so-called "big points"
(Postscript points) and will need to be amended for other page sizes;

 Setting these as a constant for all pages figures to be OK for most documents.
 Even better might be to reset to the size of each box being shipped-out.

 Since this can actually be done bypassing the \output  routine, then it
 requires patching  \shipout  rather than \makeheader  or similar.
 This is certainly an issue for further discussion.


  5.  \setboundingboxes will have to be called explicitly for the first page only.

\shipout can be hooked as follows :

\def \setboundingboxes
{%
  \special {pdf: put @thispage << /ArtBox   [0 0 498.89641 708.65968] >>}%
  \special {pdf: put @thispage << /BleedBox [0 0 498.89641 708.65968] >>}%
  \special {pdf: put @thispage << /CropBox  [0 0 498.89641 708.65968] >>}%
  \special {pdf: put @thispage << /MediaBox [0 0 498.89641 708.65968] >>}%
  \special {pdf: put @thispage << /TrimBox  [0 0 498.89641 708.65968] >>}%
}

Yes, (w/o /ArtBox ); but if you are hooking into  \shipout ,
why not measure the size of the box being shipped?
Do the conversion into actual points.
Will the bottom-left corner always be at  [0 0] ?
Probably need to look also at  \hoffset  and  \voffset .
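
A rough, untested sketch of such a measuring hook (plain TeX plus eTeX's
\dimexpr; \strip@pt is the LaTeX kernel helper that drops the "pt" unit,
so this is \makeatletter territory, and 0.99627 approximates the pt-to-bp
factor 72/72.27):

\newdimen\pageW  \newdimen\pageH
\let\Shipout=\shipout
\def\shipout{\afterassignment\MeasuredShipout \setbox0= }
\def\MeasuredShipout{%
  \pageW=0.99627\wd0                      % pt -> bp
  \pageH=0.99627\dimexpr\ht0+\dp0\relax   % pt -> bp
  % NB: ignores \hoffset/\voffset, per the caveat above
  \Shipout\vbox{%
    \special{pdf: put @thispage
      << /MediaBox [0 0 \strip@pt\pageW\space \strip@pt\pageH] >>}%
    \box0 }%
}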



    \newcount \maxpage
\maxpage = 
\let \Shipout = \shipout
\def \shipout {\ifnum \pageno < \maxpage \setboundingboxes \fi \Shipout}
--

Philip Taylor


Hope this helps,

Ross




Re: [XeTeX] xelatex, hyperref, and new TeXLive

2016-06-15 Thread Ross Moore

On Jun 16, 2016, at 9:45 AM, David Carlisle 
mailto:d.p.carli...@gmail.com>> wrote:

The result is that when you subsequently request   [dvipdfmx]  or  any other 
driver,
hyperref thinks that we are in non-dvi mode, so  *incorrectly* throws the error.

So it’s surely an omission in  hyperref.sty .

But you don’t actually need to specify a driver option,
and everything works OK anyway.

It only works with no option if you are not using a hyperref.cfg that specifies 
incompatible options:-)

OK. So [xetex] is the correct option to use, if any is needed.
Besides, the actual driver binary is   xdvipdfmx   not  dvipdfmx .




  Mike Maxwell

David


Cheers,

Ross




Re: [XeTeX] xelatex, hyperref, and new TeXLive

2016-06-15 Thread Ross Moore
Hi Mike, David, Herb,

On Jun 16, 2016, at 8:46 AM, maxwell 
mailto:maxw...@umiacs.umd.edu>> wrote:

With the help of David Carlisle and Herbert Schulz, I've found part of the 
problem.  For some reason, in the (our?) 2016 version, kpsewhich points to this 
hyperref.cfg file:
  ...texlive/2016/texmf-dist/doc/latex/listings-ext/hyperref.cfg

I’m seeing the same behaviour, but for me the packages are as follows:

(/usr/local/texlive/2016/texmf-dist/tex/latex/latexconfig/hyperref.cfg)

/usr/local/texlive/2016/texmf-dist/tex/latex/hyperref/hyperref.sty:4322: Package
hyperref Error: Wrong DVI mode driver option `dvipdfmx',
(hyperref)  because XeTeX is running.




This .cfg file contains a \hypersetup{...} command that specifies 'ps2pdf'.  
Changing that to 'xetex' fixes the problem, at least for xelatex (I'm not sure 
what would happen with other flavors of latex).  (Update: removing the line 
entirely, so it specifies neither xetex nor ps2pdf, works too, and presumably 
won't cause trouble for other latices.)

But:
1) Why does kpsewhich find that file, instead of this one:
  ...texlive/2016/texmf-dist/tex/latex/latexconfig/hyperref.cfg
  which does not have any \hypersetup{} command, and which would
  presumably not cause the same problem?
2) Why did this change from 2015 to 2016?  We did a pretty vanilla
  install, I think the only non-default choice we made was to use
  'letter' instead of 'a4'.
3) Is this a bug? (meaning should I report it?)

Here is the relevant coding from  hyperref.sty  with annotations added by me.

\newif\ifHy@DviMode
    % defines  \ifHy@DviMode  and its switches; leaves it as  \iffalse
\let\Hy@DviErrMsg\ltx@empty
\ifpdf
  \def\Hy@DviErrMsg{pdfTeX or LuaTeX is running in PDF mode}%
\else
  \ifxetex
    % this branch is already  \iftrue
    \def\Hy@DviErrMsg{XeTeX is running}%
    % … but surely we should be setting  \Hy@DviModetrue  here !!!
  \else
    \ifvtex
      \ifvtexdvi
        \Hy@DviModetrue
      \else
        \def\Hy@DviErrMsg{VTeX is running, but not in DVI mode}%
      \fi
    \else
      \Hy@DviModetrue
    \fi
  \fi
\fi

The result is that when you subsequently request   [dvipdfmx]  or  any other 
driver,
hyperref thinks that we are in non-dvi mode, so  *incorrectly* throws the error.

So it’s surely an omission in  hyperref.sty .
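
The fix the annotation above suggests would be just one extra line in the
\ifxetex branch (an untested sketch):

  \ifxetex
    \Hy@DviModetrue
    \def\Hy@DviErrMsg{XeTeX is running}%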

But you don’t actually need to specify a driver option,
and everything works OK anyway.


  Mike Maxwell


Hope this helps,

Ross




Re: [XeTeX] XeTeX - PDF-x1a

2016-04-17 Thread Ross Moore
Hi Zdenek, Bruno, Arthur,

Recall this conversation from several months ago?

I've now got 2 important contributions to make.

On 06/09/2015, at 5:47 PM, Zdenek Wagner wrote:

2015-09-06 8:10 GMT+02:00 Bruno Le Floch 
mailto:blfla...@gmail.com>>:
On 9/4/15, Arthur Reutenauer 
mailto:arthur.reutena...@normalesup.org>> 
wrote:
> On Thu, Sep 03, 2015 at 10:44:19AM -0400, Adam Drissel wrote:
>> I need to be able to use XeTeX while still producing a PDF in x1a format
>> (PDF/A, PDF-x1a).  Do you have any idea how I can do this using XeTeX?
>
>   Unfortunately it's not really possible at the moment; the package pdfx
> aims at producing different standards of the PDF/A and PDF/X families
> but is aimed at pdfTeX.  To my knowledge there has been no serious
> effort to port it to XeTeX.

There has now!
I have successfully produced a validating PDF/A-2u document from
the Serbian version of the TeX-Live documentation for 2015.


As far as I remember from a talk at TUG 2015, the packages used to
produce PDF/A using pdftex were extended to work with LuaTeX (perhaps
using newer versions of the package) and one difficulty they faced
with XeTeX was the lack of a way to compute MD5 sums.  Now IIRC a
primitive was very recently added to XeTeX for MD5 sums.  So perhaps
it wouldn't be too much work to port the package to XeTeX.


#1
What is the name of this primitive please?
Was it really added?

One place where it was being used by  pdfTeX + pdfx.sty
is in generating a UUID; i.e., an identifier that can virtually
be guaranteed to be unique to the document being processed.

It isn't actually needed, for this purpose, but it would save
a significant amount of processing of macro evaluations.

Scott Pakin's coding in  hyperxmp  emulates a seeded RNG
(random number generator) to generate such a unique ID.
Currently I'm using Scott's coding with XeTeX + pdfx.sty
but the md5 sum primitive would shorten this considerably,
and most likely works much faster.
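
For comparison, with pdfTeX's primitive the raw 32-hex-digit seed for such
an identifier is a one-liner (a sketch; the macro name \doc@hash is invented
here, and both primitives are expandable):

  \edef\doc@hash{\pdfmdfivesum{\jobname \pdfcreationdate}}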



Luatex and pdftex have the \pdfminorversion primitive to set the required PDF
version. AFAIK there is no way for XeTeX to communicate such a requirement
to xdvipdfmx. The only way is to call xelatex -no-pdf ... and xdvipdfmx -V4 ...
(default is PDF 1.5, but PDF/X-1a:2003 requires PDF 1.4).

#2
There's more to it than just this.

The command-line needs to be (something like):

xelatex -output-driver="xdvipdfmx -z 0"  .tex

This "-z 0" is needed because the XMP Metadata packet must *not*
be compressed, but must remain readable as plain text, in UTF-8 encoding.
With "-z 1" or higher, the Metadata is compressed.

With  texlive-sr.pdf  the filesize difference is enormous:
 ~798 kb  with  "-z 1"
~10.4 Mb  with  "-z 0"

I've been unable to find a way to specify that parts of the generated PDF
be uncompressed while other parts can be.




I really don't know tbh,
Bruno


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

If any of the XeTeX Development team can help with either of these
issues, I'd be most appreciative.
And we'd be getting a much better product for generating PDF/A files.


Cheers,

Ross






Re: [XeTeX] babel

2016-03-24 Thread Ross Moore
Hi Javier,

On Mar 24, 2016, at 5:59 PM, Javier Bezos 
mailto:jbez...@gmail.com>> wrote:

Apostolos,

preface = \textPi \textrho\acctonos \textomicron\textlambda 
\textomicron\textgamma

XeLaTeX is Unicode aware and can handle Unicode strings. Therefore, I fail to 
see
why you are doing things this way. The LGR font encoding is an ancient hack that
has no usage anymore.

Of course, in Unicode engines the default captions section
apply, not the captions.licr subsection.

I think that it is absolutely correct that you build in continuing support
for old encodings that may no longer be used with new documents.

The existence of old documents using such encodings certainly
warrants this — especially in the case of archives that process
old (La)TeX sources to create PDFs on the fly.

It is quite possible that in future these will be required to conform
to modern standards, rather than just reproduce exactly what those
sources did in past decades. Then there is the issue of old documents
being aggregated with newer ones, for “Collected Works”-like publications.

It is quite wrong to say that because we now have newer, better methods
that those older methods should be discarded entirely.


I’m facing exactly this problem, adapting  pdfx.sty  to be able to translate
Metadata provided in old encodings: KOI8-R, LGR, OT6 etc.
automatically into UTF-8, because the latter is required by XMP for
requirements to satisfy PDF/A, PDF/X and PDF/E standards.



Javier

Keep up the good work.

Cheers,

Ross




Re: [XeTeX] potential new feature: \XeTeXgenerateactualtext

2016-02-24 Thread Ross Moore
Hi Will,

> On Feb 25, 2016, at 5:19 PM, Will Robertson  wrote:
> 
> Hi Ross,
> 
> Great to hear from you.
> I thought of you straight away when writing my email :)
> 
> 
>> On 25 Feb 2016, at 11:35 AM, Ross Moore  wrote:
>> 
>> You have to be *very* careful with /ActualText, since it must be done using 
>> PDFdoc encoding, 
>> as it becomes part of the page contents stream.
>> Any errors will corrupt the PDF file completely — but that’s true of other 
>> things as well.
>> Heiko’s  \pdfstringdef  in the hyperref package is very good for handling 
>> this…
> 
> That’s good to know, thanks.
> I think there has been *some* work by one or two of the LaTeX3 members on 
> general methods for this sort of thing, but it’s been a while.

Send me their names.
I may have a bit more time this year.


>> Look at some of my papers associated with TUG conferences, to see various
>> options that can be used to make mathematics more accessible in PDFs; i.e.,
>> papers numbered as 5, 6, 7 on this page: 
>> 
>>  http://www.tug.org/twg/accessibility/
>> 
>> Although these were done using pdfTeX, some of these things should be able
>> to be implemented for XeTeX + xdvipdfmx  also.
> 
> This is exactly where I was going with all this (so we’re getting quite far 
> away from the new primitive).
> My understanding is that the extended pdfTeX you were using was included in 
> TeX Live 2015, is that right? Or will be in TL2016?

The later papers, which are not directly on “Tagged PDF”, don’t require
the special tagging features.

> How much work would it be to translate that work into something that will 
> also function in XeTeX?

That depends on how easy it is to create PDF objects and object references
between them.
Since I don’t know how  xdvipdfmx  does it — using  pdfmark , as does  dvips ? —
it’s nowhere near as convenient as with pdfTeX.

Hopefully someone with the necessary experience can pick up on those ideas.
That’s why I’ve followed up your comment on this list.
Indeed, we need someone to get  pdfx.sty  working with XeLaTeX;
it’s for similar reasons that it doesn’t do so already.

Switch it to another thread, if you think that is appropriate.

> Cheers,
> Will

Cheers,

Ross








Re: [XeTeX] potential new feature: \XeTeXgenerateactualtext

2016-02-24 Thread Ross Moore
Hi Will, Jonathan, and others

> On Feb 25, 2016, at 10:31 AM, Will Robertson  wrote:
> 
> On 24 Feb 2016, at 2:20 AM, Jonathan Kew  wrote:
>> 
>> For a document that wants some other kind of "ActualText", there's going to 
>> need to be pretty detailed markup in the source, I think. (E.g. each word, 
>> or similar unit, will need to be tagged to provide the desired ActualText 
>> that goes with it.) At that point, I wonder if turning off 
>> \XeTeXgenerateactualtext and just doing it "manually" with macros that 
>> generate \special{}s would be the most reasonable way forward.
> 

You have to be *very* careful with /ActualText, since it must be done using 
PDFdoc encoding, 
as it becomes part of the page contents stream.
Any errors will corrupt the PDF file completely — but that’s true of other 
things as well.
Heiko’s  \pdfstringdef  in the hyperref package is very good for handling 
this...
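
For instance, a hand-rolled span with pdfTeX looks like this (a sketch; the
string here is plain ASCII, so the PDFDocEncoding issue doesn't bite):

  \pdfliteral page {/Span << /ActualText (TeX) >> BDC}%
  T\kern-.1667em\lower.5ex\hbox{E}\kern-.125emX%
  \pdfliteral page {EMC}%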

> This sounds interesting for maths, where there is a chance we could 
> automatically insert \special{}s at the glyph and/or the equation level — has 
> this always been possible in XeTeX or does this require the newest patch for 
> xdvipdfmx you just released?

 … but doing the math-characters correctly, without interfering with spacings, 
is highly non-trivial.

Look at some of my papers associated with TUG conferences, to see various
options that can be used to make mathematics more accessible in PDFs; i.e.,
papers numbered as 5, 6, 7 on this page: 

   http://www.tug.org/twg/accessibility/

Although these were done using pdfTeX, some of these things should be able
to be implemented for XeTeX + xdvipdfmx  also.


> 
> Cheers,
> Will


Cheers,

Ross






Re: [XeTeX] \(pdf)mdfivesum

2015-07-01 Thread Ross Moore
Hi Joseph,

On 01/07/2015, at 23:03, Joseph Wright  wrote:

> Hello all,
> 
> I have a request for a new primitive in XeTeX, not directly related to
> typesetting by I think useful. To understand why I'm asking, a bit of
> background would be useful.
> 
> The LaTeX team have recently taken over looking after catcode/charcode
> info for the Unicode engines from the previous rather diffuse situation.
> As part of that, we were asked to ensure that the derived data was
> traceable and so have included the MD5 sum of the source files in the
> new unicode-letters.def file.

MD5 sums are also required pieces of data with some of the modern PDF 
standards, such as PDF/A, PDF/UA, and especially whenever attachments are 
included.
They are part of the bookkeeping data that can be used to ensure that embedded 
files are indeed  what was intended, and have not been intercepted and changed 
by Malware.

> We can happily generate that file using pdfTeX (\pdfmdfivesum primitive)
> or LuaTeX (using Lua code), but not using XeTeX. That's not a big issue
> but the need for an MD5 sum gives me an idea which would need support in
> XeTeX.
> 
> LaTeX offers \listfiles to help us track down package version issues but
> this fails if files have been locally modified or don't have
> date/version info. It would therefore be useful to have a system that
> can ensure that files match, which is where MD5 sums come in. Once can
> imagine arranging that every file \input (or \read) has the MD5 sum
> calculated as part of document typesetting: this is not LaTeX-specific.
> This data could then be available as an additional file listing to help
> track problems. However, to be truly useful this would need to work with
> all three major engines, and currently XeTeX is out. I'd therefore like
> to ask that \pdfmdfivesum (or perhaps just \mdfivesum) is added to XeTeX.

I fully support this request.
Issues of guaranteeing fidelity and conformance to standards are actually quite 
important in areas other than academia.
It is time TeX caught up with regard to such issues.


> There are a small number of other 'utility' primitives in pdfTeX/LuaTeX
> (some in the latter as Lua code emulation) that might also be looked at
> at the same time (see
> http://chat.stackexchange.com/transcript/message/22496265#22496265):
> 
> - \pdfcreationdate
> - \pdfescapestring
> - \pdfescapename
> - \pdfescapehex
> - \pdfunescapehex
> - \pdfuniformdeviate
> - \pdfnormaldeviate
> - \pdffilemoddate
> - \pdffilesize
> - \pdffiledump
> - \pdfrandomseed
> - \pdfsetrandomseed

Several of these are definitely needed when generating PDFs that conform to 
existing standards, particularly with regard to attached or embedded files.

- \pdffilemoddate
- \pdfcreationdate
- \pdffilesize

Of course it is not hard to get such information from command-line utilities, 
when the files to be included are pre-existing, prior to commencement of a 
typesetting job.
But in cases where TeX is used to itself write out the files before re-reading 
for inclusion, then it is much easier to code when such primitives are 
available within the engine. Otherwise one needs to encode a call-out to 
command-line utilities, then read back the output. This introduces OS system 
dependencies, which is something that we definitely want to avoid with TeX 
systems.
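
For example, pdfTeX can fill in the /Params dictionary of an embedded file
directly from these primitives (a sketch; the file name data.csv is
hypothetical):

  \immediate\pdfobj stream attr {%
    /Type /EmbeddedFile
    /Params << /ModDate (\pdffilemoddate{data.csv})
               /Size \pdffilesize{data.csv} >>%
  } file {data.csv}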

> 
> most of which are not related to PDF output and which may have good use
> cases. I am specifically *not* asking for any of these to be added here
> but note this list as it *may* be that the work may be closely related.
> --
> Joseph Wright

Hope this helps,

Ross






Re: [XeTeX] Bug fixes and new features related to Unicode character codes, surrogates, etc

2015-05-06 Thread Ross Moore
Hi David,

On 07/05/2015, at 9:26 AM, David Carlisle wrote:

>> The character itself, as bytes that is, is not wrong and users should be 
>> able to create these.
>> But preferably through macros that ensure that they come correctly paired.
> 
> placing two character tokens representing a surrogate pair should not
> though magically turn itself
> into a single character.

Agreed.
You don't know whether you want a single character until 
you know what kind of output is being generated.
That need not be known on input.

> The UTF-8 or  encoding should refer to
> the unicode code point not
> to the UTF-16 encoding,

No disagreement to this.

> 
> In the current versions d835dc00 is two characters in luatex
> and one character in xetex
> as the implementation detail that xetex's underlying storage is mostly
> UTF-16 is exposed.

This seems to be premature of XeTeX then.
It seems to be making an assumption on how those bytes 
will ultimately be used.

> If it is
> not possible to prevent ^^^ or utf8 encoded surrogate pairs combining
> then it is better to
> prevent them being formed.

Hmm. 
What if you have an entirely different purpose in mind for those bytes?
You still need to be able to create them and do further processing with them.

Maybe there should be a primitive that sets a flag controlling what
happens to surrogates' bytes on input?
It may well be that XeTeX's current behaviour is best for putting
content into PDF pages; but not best in other situations. So a macro
programmer should have a means to change this, when needed.

> 
> this is no different to XML where & #xd835;& #xdc00; always refers to
> two (invalid) characters not
> to & #x1d400;

Seems fine to me.
If application software wants/needs to combine them, it can do so.

> 
> David


Cheers,

Ross




Re: [XeTeX] Bug fixes and new features related to Unicode character codes, surrogates, etc

2015-05-06 Thread Ross Moore
Hi Arthur,

On 07/05/2015, at 8:04, Arthur Reutenauer  
wrote:

>  While working on these bugs, we also discussed how surrogate
> characters were handled in XeTeX.  Surrogate characters are the 2048
> code points that are used in UTF-16 to encode characters with code
> points above 65536: a pair of them makes up one Unicode character;
> however they're not meant to be used in isolation, even though they have
> code points like other characters (they're not just byte sequences).
> 
>  Right now, XeTeX allows isolated surrogate characters, and also
> combines sequences such as d835dc00 into one Unicode character.
> We want to flag the former case but are not sure how: should we make the
> characters invalid (with catcode 15)?  

That would definitely be wrong.
The character itself, as bytes that is, is not wrong and users should be able 
to create these.
But preferably through macros that ensure that they come correctly paired.

IMHO, this is a macro issue, not an engine issue.

The same kind of thing applies with combining accents and diacritics.
I've written macros that take an argument and follow it with a combining 
character.
This is useful for generating correct UTF8 bytes to put into XML packets, as 
needed for the XMP Metadata that is required in PDF files that must validate 
for ISO specifications.

Similar macros could be used to construct upper-plane characters from 
surrogates, given only the math style and Latin letter. For these, single 
surrogate characters will be needed in the macro definitions, with the ultimate 
matching pair to be determined algorithmically, probably using an \ifcase  
instance. Single characters thus need to be able to be input, so as to create 
the macro definition.
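
For reference, the split is simple arithmetic (all values hexadecimal,
worked here for the  d835dc00  example quoted above):

  high = D800 + ((cp - 10000) >> 10)
  low  = DC00 + ((cp - 10000) & 3FF)

  cp = 1D400:  1D400 - 10000 = D400
               D400 >> 10 = 35   so  high = D835
               D400 & 3FF = 0    so  low  = DC00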

OK, a clever macro programmer can change the catcodes to become valid local to 
the macro definition. But that is really complicating things.


> Or we could map them to the
> standard "unknown" character (U+FFFD).  The latter case is more nasty
> and should definitely be forbidden -- the ^^ notation should only be
> used for "proper" characters (so instead of the above, the Unicode code
> point of the resulting Unicode character should be used, in this case
> ^1d400).

I disagree. 
The ^^ notation can be used in macros to create the required bytes, for writing 
out into a file other than the  .dvi  or .pdf  output.
pdfTeX (or other engine) then can cause that file to become embedded as a file 
object stream in the final PDF.


> 
>  Any thoughts?
> 
>Best,
> 
>Arthur


Hope this helps,

Ross






Re: [XeTeX] XeTeX maintenance

2015-04-27 Thread Ross Moore
Hi all,

On 28/04/2015, at 0:40, Apostolos Syropoulos  wrote:

>> 
>> As to whether "XML is a particularly good format not only here or for
>> anything", all I can say is that in my experience we (humanity, that is)
>> have not yet come up with anything better; LaTeX 2e, by explicitly
>> permitting the conflation of form and content, fails abysmally in this
>> respect (IMHO, of course).
>> 
> 
> 
> Well I think that JSON is currently the next hot thing in computing together 
> with big data.
> Many consider that JSON will eventually replace XML. Also, about the PDF 
> format I think that
> archivable PDF
> 
> 
> http://www.digitalpreservation.gov/formats/fdd/fdd000252.shtml
> 
> is very important.

Agreed.
Now PDF/A-1a, PDF/A-2a, PDF/A-3a are all accessible tagged PDF
(whereas the 'b' and 'u' sub levels need not be tagged).
It isn't much more to get PDF/UA from PDF/A-1a, etc, and so have validation for 
both.
This should be a major aim of our community.

> 
> A.S.
> --
> Apostolos Syropoulos
> Xanthi, Greece


Ross




Re: [XeTeX] XeTeX maintenance

2015-04-27 Thread Ross Moore
Hi Joseph,

On 27/04/2015, at 4:19 PM, Joseph Wright wrote:

> On 27/04/2015 00:22, Ross Moore wrote:
>>> But of course that doesn't address the problem for LaTeXt users until
>>> someone writes a suitable/comparable package (maybe someone did
>>> already, I didn't try to follow).
>> 
>> I have coding for much of what is needed, using the modified pdfTeX.
>> But there is a lot that still needs to be added; e.g. PDF’s table model,
>> References, footnotes, etc. 
> 
> Somewhat away from the original topic, but it strikes me that building a
> tagged PDF is going to be much more problematic at the macro layer than
> at the engine level: is that fair?

Certainly one needs help at the engine level, to build the tree
structures: what is a parent/child of what else.

But macros are needed to determine where new structure starts
and finishes.
Think  \section  and friends, list environments, \item  etc.

Indicators must go in at a high level, before these are decomposed
into the content:  letters, font-switches, etc.


In short, determining where structure is to be found is *much* harder 
at the engine level; but doing the book-keeping to preserve that 
structure, once known, is definitely easier when done at that level.



Philip Taylor is correct in thinking that such things can be 
better controlled in XML. But there the author has to put in 
the extra verbose markup for themselves --- hopefully with help
from some kind of interface.
However, that can involve a pretty steep learning curve anyway.

Word has had styles for decades, but how many authors actually
make proper use of them?  e.g. linking one style to another,
setting space before & after, rather than just using newlines,
and inserting space runs instead of setting tabs.
How many even know of the difference between Return and
Shift-Return (or is it Option-Return)?


The point of (La)TeX is surely to allow the human author
to not worry too much about detailed structure, but still allow
sufficient hints (via the choice of environments and macros used) 
that most things should be able to be worked out.


In particular, you need to hack into  \everypar  to determine
where the TeX mode switches from vertical to horizontal.
(LaTeX already does this, so it is delicate programming to mix
in what (La)TeX wants with what is needed for tagging.)

Doing it this way keeps things well hidden from the author,
who most likely just doesn't want to know anyway.


> Deciding what elements of a document
> are 'structure' is hard, and in 'real' documents it's not unusual to see
> a lot of input that's more about appearance than structure. That of
> course isn't limited to TeX: I suspect anyone trying to generate tagged
> output has the same concern (users do odd things).

Absolutely, as in my Word examples above.

LaTeX wants you to use a \section-like command, rather than
switching to bold-face, perhaps after inserting vertical space.
But if a human can recognise this, it should also be possible
to program TeX to recognise it. A really friendly system would
pause and question the author, perhaps with several options
available on how to proceed --- TeX can do this.
And TeX has a  \nonstopmode  to override such stoppages.


> --
> Joseph Wright

Enough on this for now.  This is surely a topic for TUG-2015.
By then we should know when the revised ADA Section 508 
will come into effect
--- or if it has been delayed or watered down. :-)


Cheers,

Ross




Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Ross Moore
Hi Doug,

On 27/04/2015, at 10:05 AM, Douglas McKenna wrote:

> Given that the number of TeX input files using ^^u is likely miniscule, and 
> the number of those that follow the ^^u or ^^U with four or six hex digits is 
> even smaller, it seemed like a worthwhile benefit vs. cost, 
> compatibility-wise.  Maybe there's something I've not thought out well.

For user-input files, then yes it is probably very small.
But such constructions figure to be used a lot within package sources
--- precisely to create macros that shield users from the syntax.

For example, try in a terminal:

  grep "\^\^\^\^" `kpsewhich unicode-math.dtx` | wc -l
 
There are 160 lines of input and/or macro definitions.
(four of these use ^ )
Doubtless packages supporting other languages are similar.

Of course, since these are in packages the coding can be changed,
if engines need to be changed.
(Except that old versions will still have to be retained for those 
people who do not update to newer versions of the engine.)

> 
> This discussion I just found is both pertinent and frightening, I suppose:
> 
> http://stackroulette.com/tex/62725/the-notation-in-various-engines

Yeah. Thanks for this link. It is from July 2012
--- so maybe some of that incompatibility is fixed now?

If not, then TUG-2015 in Germany this July may be a good 
place to discuss the status of all this?


> 
> 
> Doug McKenna
> 

Cheers,

Ross




Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Ross Moore
Hi Mojca,

> On 27 Apr 2015, at 6:53 am, Mojca Miklavec  <mailto:mojca.miklavec.li...@gmail.com>> wrote:
> 
> On Sun, Apr 26, 2015 at 10:26 PM, Ross Moore wrote:
>> 
>> No standard TeX implementation currently comes close to producing Tagged PDF.
> 
> ConTeXt MkIV does:
>https://www.tug.org/TUGboat/tb31-3/tb99hagen.pdf 
> <https://www.tug.org/TUGboat/tb31-3/tb99hagen.pdf>

Yes; I’m aware of what Hans can achieve, and hold him in awe. :-)
Besides, this uses LuaTeX.  viz. this quote from the end of Hans’ article.
“Also, it is yet another nice test case and torture test for LuaTeX and it
helps us to find buglets and oversights.”

That is precisely why I used the word “standard” qualifying “TeX installation” 
in my statement above.

> 
> But of course that doesn't address the problem for LaTeXt users until
> someone writes a suitable/comparable package (maybe someone did
> already, I didn't try to follow).

I have coding for much of what is needed, using the modified pdfTeX.
But there is a lot that still needs to be added; e.g. PDF’s table model,
References, footnotes, etc. 

> 
> Mojca
> 
> PS: Our government is still mainly depending on documents with a "doc"
> extension.

Right. Conversion to PDF requires Adobe’s converters.
There are known bugs — but this is doubtless being worked on.

The point is that, for people wishing to use TeX-based software to
produce PDFs, then extra converters or manual conversion techniques
(e.g., using Acrobat Pro) will be required to produce a valid PDF/UA document.
Unless, that is, our community takes this seriously and creates a major project.

Another quote from Hans’ article:
 “This is a typical case where more energy has to be spent on driving the voice 
of Acrobat but I will do that when we find a good reason.” 

That reason is getting much, much closer.


All the best,

Ross






Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Ross Moore
Hi all,

On 26/04/2015, at 20:51, Joseph Wright  wrote:

> On 26/04/2015 11:47, Philip Taylor wrote:
>> To my mind, XeTeX /is/ the future of TeX.  The days of entering
>> "français" as "fran\c cais" are surely numbered, and it has never been
>> possible to enter "العربية", "ελληνικά" or "עברית" (etc) in an analogous
>> way.  Therefore, is it not time to petition the TUG Board to adopt XeTeX
>> as a formal TUG project, and to allocate adequate funding to ensure not
>> only its continued existence but its continued development, at least
>> until such time as a clearly superior alternative not only emerges but
>> becomes adopted as the /de facto/ replacement for TeX ?
>> 
>> Philip Taylor
> 
> The problem as always is not so much money as people. [Also, you do know
> about LuaTeX, yes? ;-) More seriously, XeTeX isn't a drop-in replacement
> for TeX90/pdfTeX.]

There is an even bigger issue which is going to affect the future of TeX.

http://www.access-board.gov/guidelines-and-standards/communications-and-it/about-the-ict-refresh/proposed-rule

The laws (in the US, but this will propagate) are going to become much
tougher about requiring Accessibility of electronic documents, both for
websites and PDFs.
Basically all PDFs (produced by government agencies for public consumption) 
*must* satisfy the PDF/UA published standard. That is, they must be Tagged PDF, 
and satisfy all the extra recommendations for enhancing Accessibility of the 
document's content and structure.
Being a legal requirement for Government Agencies has a knock-on effect for 
everyone, so TeX software will need to be enhanced to meet such requirements, 
else extinction becomes a real possibility.

No standard TeX implementation currently comes close to producing Tagged PDF.
LuaTeX, with its extra scripting, has the potential to do so.
Extra primitives for pdfTeX go a long way, but require 1000s of extra lines of 
TeX/LaTeX coding to implement proper structure tagging without placing a burden 
on authors.
(Those primitives are not yet standard in pdfTeX, but are in a separate 
development branch.)

It may be possible to continue with a  .tex  —> .dvi —> .pdf  workflow, but I 
doubt it very much.
Structure tagging requires a completely separate tree-like view of a document’s
structure, which must be interleaved with the content within the page-tree
structure. Storing everything that will be required into the .dvi  file, on a 
page-page basis for later processing by a separate program, is unlikely to give 
a viable solution; at least not without substantial extension of  dvips , 
dvipdfmx, etc. and Ghostscript itself perhaps.

Direct production of the PDF by a single engine is surely the best approach.


> --
> Joseph Wright


Hope this helps,

Ross




Re: [XeTeX] additional beginL endL nodes in math

2015-04-17 Thread Ross Moore
Hi David, Zdenek, and others

On 18/04/2015, at 1:05, Zdenek Wagner  wrote:

> 2015-04-17 16:31 GMT+02:00 David Carlisle :
> 
> package, anyway)
> 
> Colour is a font attribut in XeTeX but AFAIK it allowx RGB and RGBA only, 
> CMYK is not supported. If I want to print the document by offset, I have to 
> use colour specials, otherwise I risk unwanted result. 

It is not just colour information that one may want to insert.

Here are some more instances in support of  boojums  (using  pdfTeX , not 
XeTeX).

A.
For tagged PDF you may need to tag individual characters with attributes for 
accessibility and Copy/Paste, such as to get the PDF page-contents stream 
looking like:

/Span << … >> BDC
  ... normal TeX-placed material ...
EMC

\pdfliteral page {…}  can provide the extra PDF coding lines,
but this is 2 extra  boojums  for each actual character in the math list.

B.
I'm currently writing a paper describing a method to attach tooltips to 
mathematical symbols and expressions. After setting the chunks in boxes for 
measuring, this ultimately puts
   \pdfannot 
into the math-stream.  
To not affect spacing, this would need to be a  boojum  surely.

I can supply instances where spacing has changed, by an extra thin space.
Sometimes placing extra {...} avoids this extra space, other cases require a \! 
to fix.



> 
> Zdeněk Wagner
> http://hroch486.icpf.cas.cz/wagner/
> http://icebearsoft.euweb.cz
> 
> 
> 
> David


Is there a good tool or TeXnique that lets one see the contents of a 
math-stream, after all macros are processed, and during the internal 
page-construction stage?
I'd like something a bit better than just examining box contents after using 
\tracingall .


Cheers,

 Ross





Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-22 Thread Ross Moore
Hi Axel, Mike and others.

On 23/09/2014, at 12:04 PM, Axel E. Retif wrote:

> On 09/22/2014 08:42 PM, Mike Maxwell wrote:
> 
> 
>> I guess these jokers haven't heard of Unicode.  Are they stuck back in
>> the 1990s?
> 
> Are you and Philip Taylor even aware that you're replying directly to an 
> arXiv administrator?
> 
> I think arXiv and Cornell University are doing a great service to the 
> scientific community and public in general and deserve more respect.

Yes, they are doing a great service.

But, having said that, there should still be an obligation 
to keep up with the times, and not *prevent* the archiving
of publications that have special typesetting requirements.

How else can we advance aspects of mathematical/scientific 
publishing, when the main repository refuses to accept works 
that ably demonstrate useful new ideas?


Not that XeTeX is really all that special any more.
It started in 2004, on the Mac only.
Support for Unicode math is a bit more recent, 
starting around 2006, similarly to when XeTeX went
multi-platform (I think).



We talked about this at TUG 2014, in one of the discussion
sessions, where someone had reported dissatisfaction with 
what could be submitted to arXiv. 
Other issues were raised as well, including the fact that
the current TeXLive version being used there is ~3 years
out of date.


Earlier this year I submitted a paper that was meant 
to demonstrate use of PDF/A-3u features for publishing 
*accessible* mathematical content. 
But because the version of pdfTeX was outdated at 2011, 

http://arxiv.org/help/faq/texlive

the PDFs produced on-the-fly by arXiv do not validate to 
the standard declared within them.

They would not accept the PDF that I myself can compile,
in which validation is 100%.

(The particular differences in the PDF output are due 
to a mistake in 2011 and later versions of pdfTeX itself.
This has now been fixed, but perhaps is available only
by download from the  pdftex  source repository.)



The upshot of this is that it is not possible to *lead by 
example* with PDFs that are meant to demonstrate the value 
of new and emerging standards.
This includes standards that are accepted elsewhere within 
the publishing industry, and are to some extent mandated 
by existing US accessibility laws, applicable to many 
government and academic institutions.


> 
> It seems to me that if they start accepting Xe(La)TeX submissions they will 
> be receiving documents with strange fonts,
> the license of which they will have to investigate first to see if they can 
> post the articles.

Most fonts are allowed to be subsetted and included within PDFs.
The subsetting prevents sensible extraction of the font as 
a whole, so foundries do not object.
After all, how can the beauty and craftsmanship within a font 
be displayed, and its popularity increased to the benefit of
the designer and foundry, unless documents using it are allowed 
to be distributed?

So no, that is *not* the crux of the issue.

It is the insistence on being able to reproduce the PDF
*automatically from source* that is where the problem lies.


There should be more circumstances under which users' PDFs 
would be accepted *as-is*, and distributed from arXiv.

Sources should certainly be included in the arXiv, primarily 
for verification purposes, even when not able to be presently 
compiled to the desired satisfaction.
 

If font licensing is still deemed to be an issue, then surely
there is a difference between recreating the PDF from source 
using a purchased, fully-licensed copy of the font, and simply 
serving a copy of a document for which the author has used 
their own (presumably purchased or licensed) copy of that font.

By all means tell the author that full acceptance of the paper
may be delayed if some investigation needs to be carried out.
Tell them the real reason; but *do not* insult the author 
by saying that (s)he must submit in a completely different 
format to what is best for the content of the work that 
(s)he has already prepared. 



Hope this helps,

Ross Moore
Director, TeX Users Group




Re: [XeTeX] problem xelatex+tikz

2014-09-11 Thread Ross Moore
Hi Francois,

On 12/09/2014, at 7:39 AM, François Patte wrote:

> I wanted to get rid of these "one inch" to provide the publisher a
> "camera ready" file and the way I did was an easy way to give a
> centered text with the text width and height given by the publisher...

Then the way to do this in TeX is *not* with a single TeX run.
Typeset your book using LaTeX, using the publisher-supplied dimensions
for the page content.

Then run a 2nd (La)TeX job that simply includes the pages of your
typeset book, as if images.
The  pdfpages.sty  package is very good for this.
You lose annotations this way; but it is for printing, isn't it?

In that 2nd job you can reorder and reposition your included pages 
in whatever way you like; face-to-face for 2-up, another facing pair
upside down for 4-up folding to create a booklet,  whatever ...
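
A minimal such second job might be (a sketch; book.pdf stands in for your
typeset file, and the pdfpages options are the documented ones):

  \documentclass{article}
  \usepackage{pdfpages}
  \begin{document}
  \includepdf[pages=-, nup=2x1, landscape]{book.pdf}
  \end{document}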

Centering this is fine, so adjusting \hoffset and \voffset is OK
as it cannot affect the details contained in the pages of your book.


Changing those global settings is very risky when you are using
packages that manipulate raw PDF structures and coordinates, 
as does  tikz  via  pgfgraphics.
It makes good sense to me that different typesetting engines might 
well give you different results, as each has had to find its own 
way to implement how raw PDF graphics streams need to be handled.


So, I'd have to disagree with Ulrike that it is necessarily a  tikz
bug. I'd say that if you want to employ such effects, then beware
of how you are interacting with the graphics environments that
your ultimate PDF engine needs to work within.

There may not be any documentation to help you, so encapsulate
your tasks better to eliminate any unwanted effects.

> 
> When I have done this, it was the first edition of the latex companion
> and in the second edition, the offset commands are still there!
> 
> 
> 
> - -- 
> François Patte


Hope this helps,

Ross

----
Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114





--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] rangecheck in --run--

2014-08-27 Thread Ross Moore
Hi Mike,

On 28/08/2014, at 7:27 AM, maxwell wrote:

> One of our people is getting a crash in xetex, which I can't reproduce.  It's 
> very odd, since afaik we're both using the same input files, the same 
> instance of xetex, the same TeXLive 2014 files, and so forth, and running on 
> the same machine.  Clearly s.t. is different, but I'm not sure what, and this 
> email is a query about what I should be looking for.
> 
> The error msg is:
> --
> Error: /rangecheck in --run--
> Operand stack:
>   --dict:11/20(L)--   TT0   1   FontObject   --dict:8/8(L)--   
> --dict:8/8(L)--   TimesNewRomanPSMT   --dict:13/13(L)--   Times-Roman   
> Times-Roman
> Execution stack:
>   %interp_exit   .runexec2   --nostringval--   --nostringval--   
> --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   
> --nostringval--   false   1   %stopped_push   1862   1   3   %oparray_pop   
> 1861   1   3   %oparray_pop   1845   1   3   %oparray_pop   --nostringval--   
> --nostringval--   2   1   1   --nostringval--   %for_pos_int_continue   
> --nostringval--   --nostringval--   false   1   %stopped_push   
> --nostringval--   --nostringval--
> Dictionary stack:
>   --dict:1155/1684(ro)(G)--   --dict:1/20(G)--   --dict:76/200(L)--   
> --dict:76/200(L)--   --dict:106/127(ro)(G)--   --dict:286/300(ro)(G)--   
> --dict:22/25(L)--   --dict:4/6(L)--   --dict:26/40(L)--
> Current allocation mode is local
> Command /groups/tools/texlive/2014/bin/x86_64-linux/xelatex   -halt-on-error 
> -output-directory=./LinguistInABox/output/latex 
> ./LinguistInABox/output/latex/linguistInABoxGrammar.xetex -no-pdf died with 
> signal 13, without coredump

The problem looks to be with Ghostscript.
You may be using different versions, so check that first.
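
A quick check from the shell (assuming  gs  is on the PATH),
run as each user, may show a difference:

  gs --version
  which gs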


> 
> 
> Signal 13 is "Write on a pipe with no reader, Broken pipe."
> 
> I believe the crash is happening at the point xelatex is trying to embed an 
> existing PDF.  

Yes. That PDF presumably has some text in it, using Times font as  
TimesNewRomanPSMT .
Others used to using XeTeX under Linux may be able to offer a more detailed 
understanding
of the specific kind of error.


> If I'm right (we're going to verify it tomorrow), the command that crashes is
> --
> \imgexists{list_intonation.pdf}{{\imgevalsize{list_intonation.pdf}{\includegraphics[width=\imgwidth,height=\imgheight,keepaspectratio=true]{list_intonation.pdf}}}}{}
> -
> 
> Googling this:
>xetex OR xelatex "rangecheck in --run--"
> brings up about six msgs from 2011, which seem to be the same thread, and 
> afaict are irrelevant.
> 
> We're running the version of xetex that came with TeXLive 2014 
> (3.14159265-2.6-0.1) on Linux.
> 
> Any suggestions as to what I should be looking for?
> 
>   Mike Maxwell
>   University of Maryland


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114





--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] FakeBold vs TikZ

2014-06-18 Thread Ross Moore
Hi Khaled,

On 18/06/2014, at 14:04, Khaled Hosny  wrote:

> On Wed, Jun 18, 2014 at 10:53:40AM +1000, Ross Moore wrote:
>> It seems that one cannot break up the processing any more,
>> or at least not in the simple-minded way.
>> Would one of the developers please explain how to do this 
>> kind of testing now.
> 
> So called “native” fonts (AKA non-TFM fonts) are now stored in the XDV
> files using their full path not font name, so you can’t process XDV
> files using such fonts unless you have fonts in the exact same location
> (but whether this relates to the error you are seeing or not, I don’t
> know).

Yes, I discovered this.
First, to get XeTeX to work, I had to make various TeX fonts
system-accessible.
I did this by 
 1. making symbolic links from a subdirectory of /Library/Fonts  (on a Mac) to 
appropriate directories in the  texmf-dist/fonts/otf/  hierarchy (see the sketch 
after this list);
 2. opening  FontBook.app   and choosing to load more fonts.
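
For step 1, the commands were something along these lines
(paths illustrative only):

  sudo mkdir -p /Library/Fonts/TeXLive
  sudo ln -s /usr/local/texlive/2014/texmf-dist/fonts/otf \
             /Library/Fonts/TeXLive/otf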

This was done on both systems, so the paths to the font names should indeed be 
the same...

  ...except that my account names are different on the 2 machines.
Symbolic links are again your friend here. I made symlinks in  /Users  so that 
the name used on one system becomes valid also on the other.

With these symlinks in place, XeTeX worked just fine to create the .xdv  files,
and  xdvipdfmx  no longer complained about not finding fonts by the full path 
in an .xdv file created on the other system. 
However, it does barf with the TFM error that I stated in the previous email.

That error occurs also when I split the job on the same system;
that is,
 xelatex -no-pdf  testfile.tex
 xdvipdfmx testfile.xdv

but all is fine with   xelatex testfile.tex  
so there must be some extra information that is being used when  xdvipdfmx  is 
called automatically. Any idea on what this could be? or how to run some 
tracing of the xdv processing?


My next step will be to install both TeXLive versions on one of the machines, 
and create TeXShop engine scripts to be able to easily choose which one to use.
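
Such an engine script is just a small shell wrapper (name and paths
illustrative), placed in  ~/Library/TeXShop/Engines  and made executable:

  #!/bin/bash
  # XeLaTeX-2013.engine : pin the PATH to the TeX Live 2013 binaries
  export PATH=/usr/local/texlive/2013/bin/x86_64-darwin:$PATH
  xelatex "$1"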



> 
> Regards,
> Khaled

Cheers,

Ross






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] FakeBold vs TikZ

2014-06-17 Thread Ross Moore
Hi Stefan, Marcin, and others,

On 17/06/2014, at 7:29 AM, Stefan Solbrig wrote:

> Hi,
> 
>>> Just compiled the document under TeX Live 2013 (all updates till the 
>>> freeze) and still no dot.

Interesting.

I just tried with TeXLive 2013 and 2014 on different Macs.

2013 has the dot!
2014 does *not* have the dot.

For the 2013 installation we have xdvipdfmx-0.7.9
For the 2014 installation,  xdvipdfmx  is version  20140317

I tried using  xelatex -no-pdf  to keep the  .xdv  file.
Then renamed these and copied to the other machine, for processing
with the other version of  xdvipdfmx .
If this would work, then it could identify whether the problem
was in  xdvipdfmx  or due to what is put into the .xdv  file.

No joy came from this test.
Instead, all I get is a fatal error:   Invalid TFM ID: 0 .


It seems that one cannot break up the processing any more,
or at least not in the simple-minded way.
Would one of the developers please explain how to do this 
kind of testing now.



Earlier testing with TeX Live 2012, with  xdvipdfmx-0.7.8 
was just giving the similar error

>>>> Output written on Tikz-test.xdv (1 page, 4856 bytes).
>>>> Transcript written on Tikz-test.log.
>>>> /usr/local/texlive/2012/bin/x86_64-darwin/xdvipdfmx
>>>> Tikz-test.xdv -> Tikz-test.pdf
>>>> [1
>>>> ** ERROR ** TFM: Invalid TFM ID: 0
>>>> 
>>>> Output file removed.




Cheers,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114





--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeTeX (/not/ XeLaTeX) : Marginal kerning, font protrusion, hyperlinks

2014-04-26 Thread Ross Moore
I'm finding this thread to be quite distressing.

On 26/04/2014, at 6:45, Philip Taylor  wrote:

>  Zdeněk Wagner wrote:
> 
>> XeTeX can do via xdvipdfmx specials almost everything explained in the
>> PDF reference and pdfmark reference. Will you insist to include these
>> 1000+ pages in the XeTeX manual?
> 
> No, Zdeněk; I am asking for /important/ facts to be documented,
> not the entire known universe.

Phil, who is supposed to write the documentation that you desire?

You know that all TeX development is done voluntarily.
Most documentation is written by whomever wrote the code.
So you cannot expect any XeTeX developer to write beyond  \special  and its 
initial keywords.

Anything about the arguments to \special  will necessarily be written by the 
authors of the driver applications, or someone else who has kindly donated 
their time to contribute their less than complete knowledge, based upon their 
own specific experience.

Asking for anything else is unreasonable, and insisting upon it is arrogance.

Yes, you do need to understand that there is a driver application, even if only 
one is currently supported with XeTeX, and that it currently — it wasn't always 
this way — is called automatically. The TeX world has always been this way, in 
that tasks are devolved to different application programs, and their 
documentation is typically written independently. 

> 
> And how about the code below:
> 
>> \bf\special{pdf: code q 2 Tr 0.4 w 0 .5 .5 .05 k 1 1 0 .1 K}Hello
>> world!\special{pdf:code}
>> \bye
> 
> The number of such productions is infinite; no documentation system,
> no no matter how complex and complete, can fully document an infinite
> universe of discourse, and therefore unless you can show that whatever
> your write-only code accomplishes is something that B L User is likely
> to want to accomplish, then I can see little point in documenting it.

Of course.
So please take the obvious hints, face reality, and desist on pursuing this 
thread.

Off list I have given you the advice of:
  1.  employing the   miniltx.tex   input file, to enable you to load important 
LaTeX internals, without having to submit to LaTeX's model of what is a 
document and how it might be structured; and
  2.  use  \tracingall  with LaTeX examples to see what is really happening.
And being prepared to open the package files themselves, to see what other 
branches are possible with the package's internal coding.

Method 2. has always worked for me, as it reveals far more accurate information 
than any documentation can ever do. Yes, it can be difficult and daunting, but 
it is accurate.
I mean, if a computer can understand it, then surely so can I.

> 
> ** Phil.
> -- 
> All duplicate recycled material deleted on principle.

A fine principle.

Please apply such empathetic principles also to developers, who supply their 
efforts entirely voluntarily, and respect the fact that they may have a 
different perception to you of what is important, and what is not.


Best regards,

Ross






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeTeX (/not/ XeLaTeX) : Marginal kerning, font protrusion, hyperlinks

2014-04-15 Thread Ross Moore
Hi Phil,

Have you ever tried

\input miniltx.tex

This then allows a subset of LaTeX structural commands and internals to be used 
without the documentclass stuff — which is what I think you detest most.
Now many LaTeX packages can be loaded and used, without problems, in what are 
otherwise plain TeX documents. Just use, e.g.

  \usepackage{color}

as in a LaTeX document.
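
For example, a complete plain-TeX document along these lines
(an untested sketch; the text and colour are placeholders):

  \input miniltx.tex
  \usepackage{color}
  \centerline{\color{blue} Hello, colourful world!}
  \bye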

I'm pretty sure that  graphics.sty  and  graphicx.sty  are usable this way, and 
also  hyperref.sty ,
and its driver files, including  hdvipdfmx.def  — I think this is the correct 
name.

Now all the documentation you want about hyperlinks is in the book The LaTeX 
Web Companion,  or in the PDFs built from the .dtx  documentation files for the 
LaTeX packages. 
Use the  texdoc  command to access these.

Beware that not all LaTeX packages work this way. That will depend upon what 
the authors of the packages have used internally. There can be inter-package 
dependencies, ultimately leading back to the parts of LaTeX that you have not 
loaded. You have to just try things out, and use what works, perhaps keeping 
records of what does or does not.


Hope this helps,

 Ross

On 15/04/2014, at 19:07, Philip Taylor  wrote:

> 
> 
> Khaled Hosny wrote:
> 
>> On Thu, Apr 10, 2014 at 12:58:23PM +0100, Philip Taylor wrote:
>>> 
>>> Why are these key XeTeX primitives (\XeTeXprotrudechars, \rpcode, etc)
>>> not documented in /The XƎTEX reference guide/ ?   Will, Khaled,
>>> Jonathan :  can you comment on this, and will these (and any other
>>> currently undocumented primitives) be documented in the version of
>>> /The XƎTEX reference guide/ which accompanies TeX Live 2014 ?
>> 
>> From me: simply because I know near nothing about them.
> 
> Fully understood.  In that case, may I ask Jonathan where these primitives 
> are, in fact, documented, so that Khaled and Will can
> potentially make use of this information if they choose to
> prepare a TeX Live 2014 edition of /The XƎTEX reference guide/ ?
> 
> Also, are there any other currently undocumented XeTeX primitives,
> and where can be found any information on embedding hyperlinks
> using XeTeX ?
> 
> Philip Taylor
> 
> 
> 
> 
> --
> Subscriptions, Archive, and List information, etc.:
> http://tug.org/mailman/listinfo/xetex




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] turn off special characters in PDF

2014-01-06 Thread Ross Moore
Hi Joe,

On 04/01/2014, at 8:43 AM, Joe Corneli wrote:

> Hi All:
> 
> I'm glad my message sparked some discussion.  My M[N]WE for my
> specific use case on tex.stackexchange.com has not gotten much
> attention - I recently attached a +200 bounty.
> 
> http://tex.stackexchange.com/questions/151835/actualtext-in-small-cap-hyperlinks
> 
> I figured I should put in a plug for that here.  I already got a reply
> from one of the main authors of hyperref, but patching \href at the
> necessary level is beyond me.  Finally, I realize a detailed
> discussion of this issue is probably not germane to this list, so if
> you feel that way, please direct further comments there, or to me off
> list.

No, it is quite germane for this list, and relates to
a very recent thread.

The attached PDF is a variant of your example.
Copy/Paste the text using Adobe Reader or Acrobat Pro.
You should get:

Old: Sexy tex: .
New: Sexy tex: sxe .

Apple's Preview (at least within TeXShop) doesn't seem to recognise
the /ActualText  tagging.




accsupp-href.pdf
Description: Adobe PDF document


To achieve this I had to do several things.
Here are the relevant definitions:

\newcommand*{\hrefnew}[2]{%
\hrefold{#1}{\BeginAccSupp{method=pdfstringdef,unicode,ActualText={#2}}#2\EndAccSupp{}}}
\AtBeginDocument{%
 \let\hrefold\href 
 \let\href\hrefnew
}

Notes:
  1. Use \BeginAccSupp and \EndAccSupp  as tightly
 as possible around the text needing to be tagged.

  2. You want the  method=pdfstringdef   option.
 (It is  pdfstringdef  not  pdfstring .)
 This results in appropriate strings for the /ActualText value;
 either ASCII if possible (as here) or UTF16 strings with BOM.

 3.  Delay the rebinding of \href  to \AtBeginDocument .
 This way you do not interfere with any other package making
 its own redefinition of what \href does.



What follows is highly technical and of no real concern to anyone
just wanting to use /ActualText tagging.
Rather it is about implementing this (and more general kinds of)
tagging in the most efficient way.


The result of the above coding is to adjust the PDF page stream 
to include:

  q 
  1 0 0 1 129.04 -82.56 cm 
  /Span<</ActualText(sxe)>>BDC
  Q BT /F1 11.955 Tf 129.04 -82.56 Td[<095e09630950>]TJ ET q 
  1 0 0 1 145.89 -82.56 cm EMC 
  Q

where you can see the /Span tagging of the content between BDC and EMC.
This works, but is excessive, to my mind, by duplicating some operations.

Now the xdvipdfmx processor allows an alternative form for
the \special  used to place the tagging.
It can be invoked with the following redefinition of internals
from the  accsupp.sty  package:

\makeatletter
 \def\ACCSUPP@bdc{\special {pdf:literal direct \ACCSUPP@span BDC}}
 \def\ACCSUPP@emc{\special {pdf:literal direct EMC}}
\makeatother


This gives a much more efficient PDF stream:

   ...>6<0059001b>]TJ ET
   /Span<</ActualText(sxe)>>BDC 
   BT /F1 11.955 Tf 129.04 -82.56 Td[<095e09630950>]TJ ET 
   EMC
   BT /F1 11.955 Tf ...

in which the irrelevant coordinate/matrix changes (using 'cm')
no longer occur.


But even this could possibly be improved further to avoid the
extra BT ... ET :

   ...>6<0059001b>]TJ 
   /Span<</ActualText(sxe)>>BDC 
   /F1 11.955 Tf 129.04 -82.56 Td[<095e09630950>]TJ 
   EMC
   /F1 11.955 Tf ...


In the experimental version of  pdfTeX  there is a
keyword 'noendtext' that can be used with the new 
 \pdfstartmarkedcontent  primitive:

  \pdfstartmarkedcontent attr{} noendtext ...

which is designed with this aim in mind.
Use of this keyword sets a flag so that the matching  
 \pdfendmarkcontent  can keep the BT/ET nesting consistent.


> 
> Thank you!
> 
> Joe


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] turn off special characters in PDF

2014-01-01 Thread Ross Moore
Hi Zdeněk,

On 02/01/2014, at 2:14 AM, Zdenek Wagner wrote:

> 2014/1/1 Ross Moore :

>> In the example PDF that I attached to my previous message, each mathematical
>> character is mapped to a big-endian UTF-16 hexadecimal string, with Plane-1
>> alphanumerics expressed using surrogate pairs.
>> 
> Thank you, now I see it. The book where I read about /ActualText did
> not mention that I can use UTF16 if I start the string with BOM.

Fair enough; this I had to discover for myself.
The PDF Reference Manual (e.g. for ISO 32000) has no such examples,
so I had to experiment with different ways to specify strings requiring
non-ascii characters. UTF16 is the most elegant, and avoids the messiness
of using escape characters and octal codes, even for some non-letter
ASCII characters.

> Can I
> see the source of the PDF? It could help me much to see how you do all
> these things.

Each piece of mathematics is captured, saved to a file, converted to MathML,
then run through my Perl script to create alternative (La)TeX source.
This is done to be able to create a fully-tagged PDF description of the 
mathematical content, using a special version of  pdftex  that Han The Thanh
created for me (and others) --- still in experimental stage.

You should not need all of this machinery, but I'm happy to answer
any questions you may have.

I've attached a couple of examples of the output from my Perl script, 
in which you can see how the /ActualText  replacement strings
are specified, using a macro \SMC — which ultimately expands to use
the  \pdfstartmarkedcontent  primitive.



2013-Assign2-soln-inline-2-tags.tex
Description: Binary data


2013-Assign2-soln-inline-1-tags.tex
Description: Binary data


Without the special primitives, you should be able to use  \pdfliteral 
to insert the tagging needed for just using  /ActualText .
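
A minimal sketch, in pdfTeX terms (the replacement string and the
tagged content are placeholders):

  \pdfliteral direct {/Span <</ActualText (...)>> BDC}%
  ... the typeset content being tagged ...
  \pdfliteral direct {EMC}%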

>> 
>> I see no reason why Indic character strings could not be done similarly.
>> You probably need some on-the-fly preprocessing to work out the required
>> strings to use.


I'm not sure whether there is a LaTeX package that allows you to get the
literal bits into the correct place without upsetting other fine
details of the typesetting with Indic characters.
This certainly should be possible, at least when using  pdfLaTeX .
Not sure of the details using XeTeX — but you work with the source code,
so can devise anything that is needed, right?

> 
> -- 
> Zdeněk Wagner
> http://hroch486.icpf.cas.cz/wagner/
> http://icebearsoft.euweb.cz



Hope this helps,

    Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] turn off special characters in PDF

2014-01-01 Thread Ross Moore
Hi Zdenek, and others,

On 01/01/2014, at 11:53, Zdenek Wagner  wrote:

>> The attached file (produced using pdfTeX, not XeTeX) is an example
>> that I've used in TUG talks, and elsewhere.
>> Try copy/paste of portions of the mathematics. Be aware that you can
>> get different results depending upon the PDF viewer used when
>> extracting the text.  (The file has uncompressed streams, so you
>> can view it in a decent text editor to see the tagging structures
>> used within the PDF content.)
>> 
> If I remember it well, ActualString supports only bytes, not
> codepoints. Thus accented characters cannot be encoded, nor can Indic
> characters.

I don't know what you mean by this.
In my testing I can tag pretty-much any piece of content, and map it to any 
string using /ActualText .
Mostly I use Adobe's Acrobat Pro as the PDF reader, and this works fine with it,
modulo some bugs that have been reported when using very long replacement 
strings.

In the example PDF that I attached to my previous message, each mathematical 
character is mapped to a big-endian UTF-16 hexadecimal string, with Plane-1 
alphanumerics expressed using surrogate pairs. 
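
For example, U+1D44E (MATHEMATICAL ITALIC SMALL A) lies in Plane 1;
its UTF-16BE form is the surrogate pair D835 DC4E, so the entry
looks like (the BOM FEFF marks the string as UTF-16):

  /ActualText <FEFF D835 DC4E>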

I see no reason why Indic character strings could not be done similarly.
You probably need some on-the-fly preprocessing to work out the required 
strings to use.
This is certainly possible, and is what I do with mathematical expressions.
It should be possible to do it entirely within TeX, but the programming can get 
very tricky, so I use Perl instead.

> ToUnicode supports one byte to many bytes, not many bytes
> to many bytes.

Exactly. This is why /ActualText  is the structure to use.


> Indic scripts use reordering where a matra precedes the
> consonants or some scripts contain two-piece matras. Unless the
> specification was corrected the ToUnicode map is unable to handle the
> Indic scritps properly.

Agreed;  /ToUnicode  is not what is needed here.
This sounds like precisely the kind of situation where you want to tag an 
extended block of content and use /ActualText  to map it to a pre-constructed 
Unicode string.
I'm no expert in Indic languages, so cannot provide specific details or 
examples.


>>> 
>>> --
>>> Regards,
>>> Alexey Kryukov 
>>> 
>>> Moscow State University
>>> Faculty of History
>> 
>> 
>> 
>> Hope this helps,
>> 
>>Ross

>> -- 
> Zdeněk Wagner
> http://hroch486.icpf.cas.cz/wagner/
> http://icebearsoft.euweb.cz

Happy New Year,


Ross

--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] turn off special characters in PDF

2013-12-29 Thread Ross Moore
Hi Joe,

On 30/12/2013, at 8:12 AM, Joe Corneli wrote:

> This answer talks about how to turn off ligatures:
> http://tex.stackexchange.com/a/5419/4357
> 
> Is there a way to turn off *all* special characters (e.g. small caps)
> and just get ASCII characters in the copy-and-paste level of the PDF?

In short, no!
 — because this is against the idea of making more use of Unicode,
across all computing platforms.

Certainly a ligature can have an /ActualText replacement consisting
of the separate characters, but this requires the PDF producer
to have supplied this within the PDF, as it is being generated.

I've played a lot with this kind of thing, and think that this
is the wrong approach. One should use /ActualText to provide
the correct Unicode replacement, when one exists. Thus one
can extract textual information reliably, even when the PDF
uses legacy fonts that may not contain a /ToUnicode resource,
or if that resource is inadequate in special situations.


Besides, do you really mean *all* special characters?
What about simple symbols like: ß∑∂√∫Ω  and all the other 
myriad foreign/accented letters and mathematical symbols?

If you want these to Copy/Paste as TeX coding (\beta  \Sum \delta  
\sqrt etc.) within documents that you write yourself, then I wrote 
a package called  mmap  where this is an option for the original 
Computer Modern fonts.


Alternatively, a PDF reader might supply a filtering mode that
converts the ligatures back to separate characters. Then the
user ought to be able to choose whether or not to use this filter.
I don't know of any that actually do this.
(In any case, you would want such a tool to allow you to specify
which characters to replace, and which to preserve.)


Your best option is surely to (get someone else to) write such 
a filter that meets your needs, and use it to post-process the text 
extracted via Copy/Paste or with other text-extraction tools.

Of course this is no use if your aim is to create documents for
which others get the desired result via Copy/Paste.
For this, the /ActualText approach is what you need.



Hope this helps,

Ross

----
Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] wrong glyphs with FreeSerif Italic

2013-12-26 Thread Ross Moore

On 27/12/2013, at 9:39 AM, Zdenek Wagner wrote:

> Sorry, I cannot reproduce it, there must be something wrong in your
> installation. I tried both with TeX Live 2011 and TeX Live 2013 and I
> get the expected result.

Me too, with:

 This is XeTeX, Version 3.1415926-2.2-0.9997.4 (TeX Live 2010)
and
 This is XeTeX, Version 3.1415926-2.4-0.9998 (TeX Live 2012)


With 2010 the font versions, as encoded in the font itself, are
  FontForge 2.0 : Free Serif : 4-1-2009   Version $Revision: 1.358 $
  FontForge 2.0 : Free Serif Italic : 4-1-2009   Version $Revision: 1.175 $

With 2012 the font versions, as encoded in the font itself, are 
  GNU: FreeSerif Normal: 2012   Version 0412.2263
  GNU: FreeSerif Italic: 2012   Version 0412.2268


With the 2012 font, I get a lot of warnings about unsupported features;
viz.

*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif',
* with script 'Latin', and language 'Default'.
*
*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif',
* with script 'Latin', and language 'Default'.
*
*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif/B',
* with script 'Latin', and language 'Default'.
*
*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif/I',
* with script 'Latin', and language 'Default'.
*
*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif/BI',
* with script 'Latin', and language 'Default'.
*
*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif',
* with script 'Latin', and language 'Default'.
*
*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif',
* with script 'Latin', and language 'Default'.
*
*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif',
* with script 'Latin', and language 'Default'.
*
*
* fontspec warning: "icu-feature-not-exist-in-font"
* 
* OpenType feature 'Fractions=Off' (-frac) not available
* for font 'FreeSerif',
* with script 'Latin', and language 'Default'.
*



> 
> 2013/12/26 Julian Bradfield :
>> This is probably a FAQ, but I haven't found it by searching...
>> 
>> I'm a first-time user of xelatex (but 30-year user of TeX in general),
>> and have used it to typeset a linguistic article with Charis SIL. I
>> then wanted to switch to GNU Freefont, and encountered the weird
>> symptom that all the glyphs are displaced by two codepoints in the
>> Italic version.
>> Here's a minimal example:
>> 
>> \documentclass{article}
>> \usepackage{mathspec}
>> \setallmainfonts(Digits,Latin,Greek,Special)[Mapping=tex-text,Fractions=Off]{FreeSerif}
>> \begin{document}
>> ABCabc \it ABCabc
>> \end{document}
>> 
>> 
>> On processing, the PDF shows ABCabc CDEcde; the right character
>> metrics appear to have been used, but the glyphs are wrong.
>> 
>> My xelatex version is
>> This is XeTeX, Version 3.1415926-2.3-0.9997.5 (TeX Live 2011) 
>> (format=xelatex 20
>> 12.11.27)
>> and the Freefont is the release of

Re: [XeTeX] XeTeX : images as links

2013-09-20 Thread Ross Moore
Hi Phil,

On 21/09/2013, at 3:23 AM, Philip Taylor wrote:

> In a forthcoming PDF catalogue of Greek MSS, a number of "thumbnail"
> images of folia, bindings, etc., will appear, many if not all of
> which will be expected to function as hyperlinks to full-sized
> (or perhaps pannable/zoomable) versions of the same.  However,
> endeavouring to achieve this functionality using either Eplain's
> \hyperref or Eplain's \hlstart/\hlend fails to produce the desired
> effect -- whilst text can act as a clickable region for a hyperlink,
> an image included using \XeTeXpicfile seemingly cannot.   
> 
> The following, a verbatim copy from the test file, demonstrates
> the problem --

Can you post a PDF, preferably uncompressed,
so we can look at how the hyperlink is specified.

> 
> 
>   \catcode `\< = \catcode `\@
>   \input eplain
>   \catcode `\< = \active
>   
>   \enablehyperlinks
>   
>   \uselanguage {UKenglish}
>   
>   \hlstart {url}{}{http://example.org/fullsize}
>   \hbox
>   \bgroup
>   \XeTeXpicfile Images/LPL-MS-1214/JPG/f1r.jpg height 0,25\vsize
>   \egroup
>   \hlend
>   
>   \vskip \baselineskip
>   
>   \hbox
>   \bgroup
>   \hlstart {url}{}{http://example.org/fullsize}
>   \XeTeXpicfile Images/LPL-MS-1214/JPG/f1r.jpg height 0,25\vsize
>   \hlend
>   \egroup
>   
>   \vskip \baselineskip
>   
>   \href{http://example.org/fullsize}{\hbox {\XeTeXpicfile
> Images/LPL-MS-1214/JPG/f1r.jpg height 0,25\vsize}}
>   
>   \end

What is this  0,25\vsize  ?
Should it not be  0.25\vsize  for TeX to get the correct
vertical dimension?

Try also:  
  \setbox0=\hbox{\XeTeXpicfile Images/LPL-MS-1214/JPG/f1r.jpg height 0,25\vsize}
then
  \message{height=\the\ht0 + \the\dp0, width=\the\wd0}
to see what height you are really getting.

Then set the link with:
  \href{http://example.org/fullsize}{\box0}

> 
> Needless to say, if any of the \XeTeXpicfiles are replaced by text,
> all works as expected.
> 
> Can anyone please explain why this does not work, and how the
> problem can best be transcended ?  (NB.  UNIV = Plain XeTeX, not XeLaTeX)
> 
> Philip Taylor
> 
> P.S. The catcode stuff at the top is because the real project goes on
> to load an XML file.


I'd doubt that there's any problem caused by this, with < and >
**unless**
 these are used internally by packages which you load *after* the
catcode changes.

e.g. conditionals may fail
   \ifnum ... <    
if the catcodes had not been setup robustly within the package.

But surely you would have noticed problems of this kind already,
if they were indeed going to occur with your larger document.


Cheers

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114




--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeLaTeX PDF Glitch with Serbian Glyphs

2013-09-16 Thread Ross Moore
Hello Alessandro,

On 17/09/2013, at 6:16, Alessandro Ceschini  wrote:

> Serbian Cyrillic requires a peculiar localisation because some glyphs are 
> different from the standard. The PDF produced by XeLaTeX however must have 
> some glitch because if I try copy/paste from it to another document the 
> characters affected by Serbian localisation simply disappear :-\ ! This 
> doesn't happen with PDF produced by LibreOffice 4.1, which now supports 
> OpenType and therefore localised glyphs: characters are correctly copied, and 
> even if the recipient program doesn't support OpenType, then standard glyphs 
> are displayed.

Please attach PDFs which show the non-standard glyphs, produced by different 
means,
and one or more screenshots which show the poor results you are getting with 
different applications. Otherwise it is very hard to understand just what is 
the problem,
and impossible to advise of a fix or workaround.

It sounds like you have a font that provides alternative glyphs for some 
code-points.
When you do Copy/Paste from the PDF, you are extracting only the code-point, 
not the glyph itself, nor the font that it came from. Hence you would get only 
the standard glyph showing in the result, since whatever system font is being 
used does not have the special variant available.
Some software may allow a Rich-Text Copy/Paste, which might carry the extra 
font information.

With some examples, people on this list can test other software applications or 
check exactly what is contained within the PDFs.
It may well be that an  /ActualText  entry is what you need.

> -- 
> Alessandro Ceschini

Hope this helps,

  Ross

--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Letterly fonts

2013-08-07 Thread Ross Moore
Hi Wilfried,

On 07/08/2013, at 9:27 PM, Wilfred van Rooijen wrote:

> Hello all,
> 
> Polyglossia knows many languages, but also lacks support for many languages. 
> An example is Japanese. I have looked at it two times and concluded that 
> polyglossia cannot support Japanese. The reason in this case is simple: in 
> Japanese, "chapter 1" is written as 第1章, literally "number 1 chapter", and 
> similar for all other "document divisions". The "standard" latex classes 
> encode the chapter numbering as "chapter_name_string chapter_number", and it 
> is not possible to change this to "prefix chapter_number chapter_name_string" 
> from polyglossia. It has to be done at the level of the documentclass (or 
> maybe a stylefile). But, bottom line, polyglossia cannot do it. I suspect 
> there are (potentially many) languages with similar problems. 

Surely this is not an insurmountable problem.
In (La)TeX it is certainly possible to achieve this split and reorder.
One just has to look rather carefully at what the document-class actually
does and devise a suitable patch.

For example, in  report.cls  and  book.cls  the \chaptername is placed
using a sequence
\@chapapp \ \thechapter. 
(or \@chapapp\space\thechapter. )
where  \@chapapp  is a variable which expands to either 
\chaptername  or  \appendixname .

So redefine \@chapapp  to take 2 parameters:

  \def\@chapapp#1#2.{\@numapp#1#2#1\chaptername.} 

where  \@numapp --> 第  and  \chaptername --> 章

Similarly patch  \appendix as follows:

\renewcommand\appendix{\par
  \setcounter{chapter}{0}%
  \setcounter{section}{0}%
  \gdef\@chapapp##1##2.{\@numapp##1##2##1\appendixname.}%
  \gdef\thechapter{\@Alph\c@chapter}}

These redefinitions should be made using \AtBeginDocument{...}
and only done subject to a conditional according to what is 
chosen as the main document language. 
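
For instance, a sketch of such a guarded patch, with a hypothetical
\ifjapanesemain conditional standing in for the actual language test,
and \@numapp and \chaptername set up as described above:

  \makeatletter
  \AtBeginDocument{%
    \ifjapanesemain
      \gdef\@chapapp#1#2.{\@numapp#1#2#1\chaptername.}%
    \fi}
  \makeatother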

Such patching could be added to Polyglossia or, as you note,
done within a separate style file.


Patching other levels of section-heading seems not to be so straightforward,
as there is no analog of the  \@chapapp  variable macro.
But at least in principle, there should be a way to do it, according
to what you actually need with Japanese.

On the other hand, if the  hyperref  package is loaded, such that  nameref.sty
is used, then there is a macro  \Sectionformat  that is a convenient place where
patches can be applied to the way the actual section heading is put onto the 
page.
Used as:  \Sectionformat{#8}{#2}
  where  #8 = the mandatory argument to  \section{...}, \paragraph{...}, etc.
  and  #2 = sectioning-counter name  e.g. 'section', 'paragraph', etc.
and with a default definition of  \providecommand\Sectionformat[2]{#1} ,
this is a macro just begging to be redefined according to need.


> 
> As far as code2000 is concerned, indeed, it supports a tremendous amount of 
> scripts, but it is __not__ intended to typeset entire documents. It lacks 
> boldface, slanted, italic, etc. In short, it is not a complete font at the 
> level of "professional" typesetting. 
> 
> Code2000 was developed (IIRC Netscape and Bitstream) for internet browsers to 
> be able to support a wide range of scripts at the level of being able to put 
> something on the screen even if the system does not have the specific fonts 
> for (uncommon) scripts. As such, it was never intended for "real" typesetting.
> 
> Cheers,
> Wilfred


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114





--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] On an ugly hack to mathbf only in the local style.

2012-10-30 Thread Ross Moore
Hi Michaël,

On 31/10/2012, at 1:39 AM, Michaël Cadilhac wrote:

>> Howdy,
>> 
>> \vec{v}_1 ?
> 
> Herb,
> 
> Thanks, but of course, I'd like to avoid going through hundreds of pages (ok,
> a script would be easy to write, but still...).  Also, I'd like to keep the
> semantics "\vec{T} is for a vector T", whether T=v or T=v_1.

It's a pity that you chose to write your manuscripts this way.
You can see how difficult it gets when you write a macro
that represents just an abstract concept, without detailed thought
for all the different ways it may be used.

What I do for this kind of thing is:

 \newcommand{\TT}{\boldsymbol{T}}
 \newcommand{\vv}{\boldsymbol{v}}

% \boldsymbol gives a bold-italic, rather than bold-upright;
% it is provided by the amsmath (or amsbsy) package

then use it in the body material as:

  \TT  or  \TT_1  or  \vv^{(1)}_2  etc.

When reading your own source coding, you see `\TT' and
think `vector T' or just `T' --- which are what you would 
say out loud if you were writing on a black/white-board
while giving a lecture.

The other advantage of doing it this way is that you do not
need to change the body of your document when you choose, in
future, to use a different kind of processor, creating a view 
of your document for a different format: HTML, XML, tagged-PDF, 
ePub, MathML, etc.

You'll only need to adjust the macro definitions to add whatever
is necessary for the required kind of enrichment.


> 
> Thanks!
> 
> M.



Cheers

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] how to do (better) searchable PDFs in xelatex?

2012-10-15 Thread Ross Moore
Hi Peter, Jonathan,

On 16/10/2012, at 2:02, Peter Baker  wrote:

> On 10/15/12 10:59 AM, Jonathan Kew wrote:
>> 
>> That's exactly the problem - these glyphs are encoded at PUA codepoints, so 
>> that's what (most) tools will give you as the corresponding character data. 
>> If they were unencoded, (some) tools would use the glyph names to infer the 
>> relevant characters, which would work better.
>> 
>>> Small caps are named like "a.sc" and they are unencoded.
>> And as they're unencoded, (some) tools will look at the glyph name and map 
>> it to the appropriate character.
> 
> I've been trying to explain this:  but Jonathan does it much better than I 
> did, and with more authority.

Yes, but why would the tools be designed this way?
Surely unencoded means that the code-point has not been assigned yet, and may 
be assigned in future. So using these is asking for trouble.
Was not the intention of PUA to be the place to put characters that you need 
now, but have no corresponding Unicode point? This is precisely where using the 
glyph name should work. Or am I missing something?

So why would the tool be designed to infer the right composition of characters 
when a ligature is properly named at an unencoded point, but that same 
algorithm is not used when it is at a PUA point?

> 
> P.

Perplexed.

Ross

PS. Would not this particular issue with ligatures be resolved with a 
/ToUnicode  CMap for the font, which can do one–many assignments? 
Yes, this does not handle the many–one and many–many requirements of complex 
scripts, but that isn't what was being reported here, and is a much harder 
recognition problem.
Besides, in those cases it isn't clear what copy-paste should best produce, 
nor how to specify the desired search.


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] how to do (better) searchable PDFs in xelatex?

2012-10-14 Thread Ross Moore
Hi Peter,

On 15/10/2012, at 1:45 PM, Peter Baker wrote:

> On 10/14/12 7:47 PM, msk...@ansuz.sooke.bc.ca wrote:
>> If font designers did that, and if PDF readers looked at the glyph names 
>> according to Adobe's directions, then searches would work regardless of PUA 
>> use. However, not all fonts and not all readers do this.
> My experience is that "not all" = "none." I've tested my own font (Junicode) 
> in Adobe Reader, Preview, Evince and Goodreader (with PDFs generated by 
> XeTeX), and the result is the same in all. Standard ligatures (those encoded 
> at FB00 and following) work fine, but others do not. For example, Junicode 
> has an f_t ligature in the PUA, properly named, and when that is used you 
> cannot search for "after" or "often" in any of those PDF readers. But when I 
> move it out of the PUA into an unencoded slot, it works fine.

Any chance of providing example PDFs of this?
(preferably using uncompressed streams, to more easily
examine the raw PDF content)

Do the documents also have CMap resources for the fonts,
or is the sole means of identifying the meaning of the
ligature characters coming from their names only?

Have these difficulties been reported to Adobe recently?
If not, would you mind me doing so?

> 
> Same with Libertine.
> 
> Peter


Cheers

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] The future of XeTeX

2012-07-30 Thread Ross Moore
Hi Phil,

On 31/07/2012, at 7:35 AM, Philip TAYLOR wrote:

> 
> OK, but how about things such as graphics ?  I already know
> that I cannot use (e.g.,) \pdfximage, \pdfrefximage and
> \pdflastximage in XeTeX, and have to use instead \XeTeXpicfile;
> if I were to migrate to LuaTeX, would I be correct in expecting
> to find a whole new set of graphics primitives ?

I know you don't like using LaTeX, but really people have
put a huge amount of effort into that, and its packages.

When I occasionally still use Plain TeX, almost always
I make use of:

   \input miniltx.tex 

This loads a *very limited* subset of LaTeX, but enough
to allow  \usepackage  to work.
Some LaTeX packages have been written in TeX, so as to not 
depend on other parts of LaTeX for loading and doing their stuff.
This includes graphics and graphicx, and perhaps color too.
(Xy-pic was written this way too!)

So you can make use of the  \includegraphics  command,
along with its options and driver support, from within
your Plain TeX documents.
This then allows you to just use them with all their
associated power and effects, without having to worry 
about how it all works ...

OR
   ... turn on some  \tracingall  to find out just what
primitives are being constructed, to fully satiate your 
ever-inquiring mind.


>  And what of
> \XeTeX's extensions to the \font primitive ?

Isn't there some documentation about that?

> 
> ** Phil.


Cheers,

Ross

----
Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Footnote rule line without any footnote text material

2012-07-12 Thread Ross Moore
Hi Andy, and Peter,

On 13/07/2012, at 1:38 AM, Andy Black wrote:

> On 7/12/2012 2:29 AM, Peter Dyballa wrote:
>> Am 12.07.2012 um 02:02 schrieb Andy Black:
>> 

>> Is it for an almost minimal test case necessary to set up fancyhdr and 
>> hyperref?
> 
> As you may have guessed, the TeX code is generated automatically from an XML 
> mark-up language

That explains all the unnecessary extra grouping,
and in-body commands where an environment would be more appropriate
 --- well, it would help to make the coding easier to read,
and debug, by making the structure clearer.
But none of this is actually wrong.


> for linguistic documents (see http://www.xlingpaper.org/). This generation 
> process is attempting to use many of the features that LaTeX and friends 
> provide (to avoid having to re-invent the wheel) while still allowing for 
> larger variations in layout parameters than basic LaTeX has.  What is here is 
> what I found to work.  I'm not recalling the exact reason why I used MainFont 
> but I do know that the
> 
>\font\MainFont="/font family name/" at /pointsize/pt
> 
> was a way to allow for varying font families and point sizes, including 
> larger or smaller than the three I understand LaTeX provides (10, 11, and 12).
> 
> I would not be shocked to learn that there is a better way to do this, but I 
> found this to work.

First off, dump the XML elements as environments.
Put the constant coding into the \newenvironment  definitions,
read from a separate package or document-class file.

If there are attribute values that need to be passed as parameters,
then that's OK. You can specify the number of parameters, via:

  \newenvironment{<env-name>}[<number-of-args>]{...
... coding at the start of the environment ...
  }{
 ... coding at the end of the environment ...
  }

> 
>> 
>> Why is
>> 
>>  \protect\footnote
>> 
>> necessary?

Irrelevant. Tracing shows that  \protect = \relax  
at the time this is called.

> 
> There are situations when a footnote is embedded within other constructions 
> (perhaps a table within a table -I'm not recalling the exact context) where 
> the \protect was necessary.  Rather than coding the TeX generator to have to 
> determine the set of contexts where the \protect was required, I opted to 
> just always use it.
> 
>> Could this provoke setting the footnotes line on every page?
> 
> No, it doesn't.  

Correct.

> I'm using the \protect\footnote for every footnote and it is only in very, 
> very rare circumstances that we get the extra footnotes line.
> 
> In addition, I also just removed the \protect in the .tex file I sent and 
> re-ran it using tl 2012.  The extra footnotes line is still there.
> 
> Thanks again so much for exploring this with me.

The problem is definitely related to  {longtable} 
since it calls \output  and fiddles with the \pagegoal .

Your example has a  \begin{longtable} ... \end{longtable}
at the bottom of the page, prior to where the unwanted rule
occurs.

I can make the extra footnote rule go away, by including
some extra space at the bottom of the table; viz.
  
  ... table cell data ...
  \\ \noalign{\vspace{<some amount>}}
  \end{longtable}

varying the "some amount" one can either get the extra rule,
or suppress it.  More space, beyond some limit, suppresses
the rule, and has no other effect.
What that limit is, may vary according to the actual vertical
size of the tabular material, so this does *not* give an easy
way to solve the problem programmatically.
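
For the record, a sketch of how the padding attaches to the table
(the 2ex amount is purely illustrative, and this toy document will
not by itself reproduce the page-break situation):

  \documentclass{article}
  \usepackage{longtable}
  \begin{document}
  Some text.\footnote{A footnote that falls near the table break.}
  \begin{longtable}{ll}
  A & B \\
  C & D \\ \noalign{\vspace{2ex}}% trial amount of extra space
  \end{longtable}
  \end{document}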


What I suspect is happening is that, when \LT@output  is called,
{longtable} is aware that a footnote is around, and splits its
contents, perhaps leaving only glue to go onto the next page.
On that next page, the {longtable} environment finishes, and still
thinks that there is a footnote to be placed, even though there
is no content remaining.
The glue at the top of the page gets discarded, but the apparent
presence of footnote material is retained.

When extra space is added at the end of the {longtable} the 
desirability of where to split the table changes, and perhaps
the whole table now gets placed on the first page. 
There is no carry-over of any knowledge of a footnote, so no
extra line is drawn on the next page.

This is all pretty-much speculation.
Someone more familiar with the inner workings of {longtable}
may be able to make more sense of what is happening.

> 
> --Andy
> 
>> 
>> --
>> Greetings
>> 
>>   Pete
>> 
>> The human brain operates at only 10% of its capacity. The rest is overhead 
>> for the operating system.

Needed more than 10% to study this weirdness!


Cheers,

Ross



Ross Moore

Re: [XeTeX] Extraneous comma shows conflict between memoir and xecjk package?

2012-06-07 Thread Ross Moore
Hi Jon,

On 07/06/2012, at 2:53 PM, jon wrote:

> Thanks, Ross.
> 
> That was a plunge into the deep end of the pool! I gave it a try, nevertheless, 
> and was still searching the log for the extraneous comma when Qing Lee's 
> message came in, a virtual life-saver for a poor swimmer. 

Yeah.
That package uses the LaTeX-3 syntax, which is *a lot* harder to search
than classical LaTeX.
Sorry for subjecting you to that --- but it does make you marvel
at the kind of program that is TeX !


Cheers

Ross

> 
> Jon

--------
Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Extraneous comma shows conflict between memoir and xecjk package?

2012-06-06 Thread Ross Moore
Hello Jon,

On 07/06/2012, at 5:15 AM, jon wrote:

> THE BAD
> The following does not produce expected results. There is an *extraneous 
> comma* (or superscripted comma) inserted  into the body text directly before 
> every footnote number.
> 
> \documentclass[12pt]{memoir}
> 
> \usepackage{fontspec}
> \setmainfont[Mapping=tex-text,Numbers=OldStyle]{Linux Libertine O}
> 
> 
> \usepackage{xeCJK}
> 
> \setCJKmainfont[]{AR PL UKai TW}
> 
> \begin{document}

{\tracingall
> But if I don't ``\emph{See through it and let it 
> go},"\footnote{看得破,放得下。\emph{Kàn de pò, fàng de xià}.} the karma will carry 
> forward
> 
> 
> 
}

> \end{document}

> The book is 99.5% typeset so it is not practical to switch away from memoir 
> at this point. If there is a way to use Chinese fonts without using xeCJK, I 
> could try that. Any ideas how to get rid of that extraneous comma?

First you need to find out where it is coming from.
Inserting \tracingall (within a group {...} to limit the scope)
will produce lots of output in the .log  or Console window.

You should be able to search this to find where the extra comma
is inserted, then follow backwards the various macro-expansions
that caused this.

Once the source is located, you should be able to make a single
macro re-definition in your document preamble, to prevent 
the behaviour that you do not want.


This may seem a rather tedious way to tackle the problem, 
but it is reliable and very instructive for solving such
delicate problems, if you are interested in programming. 
— Not everyone's cup of tea, though.

> 
> Thanks!
> 
> Jon
> --
> Jon Babcock


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeLaTeX and SIunitx

2012-05-14 Thread Ross Moore
Hi Bruno,

On 14/05/2012, at 4:04 PM, Bruno Le Floch wrote:

>> If by styles, you mean in a macro definition made within
>> a separate style file, then I agree with you 100%.
>> 
>> But ...
>> 
>>> But it doesn't
>>> solve the problem here as pdftex chokes if it sees more than two ^^:

OK. Mea Culpa.

But this just reinforces the need for having style files,
and engine-dependent subsidiary files.

>>  ... this is not a good example to support this view.
>> 
>>> \documentclass{article}
>>> \begin{document}
>>> ^^^^00b5
>>> \end{document}
>> 
>> The body of your document source should be engine independent,
>> so this should look more like:
>> 
>> 
>> \documentclass{article}
>> \usepackage{ifxetex}
>> 
>> \ifxetex
\input{mydefs_xetex.def} ...
> 
>> \else
>> \if ...
>>  % \inputs  to handle other engines
>>  ...
>> \fi
>> \fi
>> 
>> \begin{document}
>> \micronChar
>> \end{document}

>> You want to avoid having to find and replace multiple instances
>> of the special characters, when you share your work with colleagues
>> or need to reuse your own work in other contexts.
>> Instead you should simply need to adjust the macro expansions,
>> and all that previous work will adapt automatically.


> You cannot do  " \ifxetex ^^^^00b5 \else ^^c2^^b5 \fi "  because the
> character ^^^ is invalid in pdfTeX (catcode 15), hence pdfTeX chokes
> whenever it sees that character in a line, with the exception of \^^^,
> the command symbol (otherwise it would be difficult to change the
> catcode of ^^^).  On the other hand, you can do
> 
> \ifxetex
>\expandafter \@gobble \string \^^^^00b5
> \else
>   ...
> \fi

This kind of coding may have a place somewhere, but surely it is 
better to use different input files for different engines.

This is a similar concept to "Content Negotiation" for websites, 
whereby a web server delivers different versions of the content 
according to which web browser is making the request.

> 
> Regards,
> Bruno

Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeLaTeX and SIunitx

2012-05-13 Thread Ross Moore
Hi Ulrike, and Bruno,

On 13/05/2012, at 11:05 PM, Ulrike Fischer wrote:

> Am Fri, 11 May 2012 19:44:00 +0200 schrieb Bruno Le Floch:
> 
>> I'm really no expert, but the siunitx package could include, e.g., µ
>> as ^^^^00b5.  This would not make pdftex choke when appearing in the
>> false branch of an engine-dependent conditional.
> 
> Using ^^..-notation is certainly a good idea in styles - regardless
> of the engine - as it avoids encoding confusing.

If by styles, you mean in a macro definition made within 
a separate style file, then I agree with you 100%.

But ...

> But it doesn't
> solve the problem here as pdftex chokes if it sees more than two ^^: 
> 

  ... this is not a good example to support this view.

> \documentclass{article}
> \begin{document}
> ^^^^00b5
> \end{document}

The body of your document source should be engine independent,
so this should look more like:


\documentclass{article}
\usepackage{ifxetex}

\ifxetex
 \newcommand{\micronChar}{^^^^00b5}
  % handle other characters
  ...
\else
 \if ... 
  % handle other possibilities
  %  e.g.  ^^c2^^b5
  ...
 \fi
\fi

\begin{document}
\micronChar
\end{document}


Better still, of course is to have the conditional
definitions made in a separate file, so that similar things
can all be handled together and used in multiple documents.

You want to avoid having to find and replace multiple instances
of the special characters, when you share your work with colleagues
or need to reuse your own work in other contexts.
Instead you should simply need to adjust the macro expansions,
and all that previous work will adapt automatically.

> 
> 
> ! Text line contains an invalid character.
> l.9  ^^^
>^00b5
> ? x
> 
> 
> For pdftex you would have to code it as two 8-bit octets:  ^^c2^^b5
> But this naturally will assume that pdftex is expecting utf8-input.
> 
> -- 
> Ulrike Fischer 

Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Producer entry in info dict

2012-02-28 Thread Ross Moore
Hi Heiko,

On 29/02/2012, at 8:44 AM, Heiko Oberdiek wrote:

> Hello,
> 
> the entries in the information dictionary can be controlled
> at TeX macro level except for /Producer:
> 
> % xetex --ini
> \catcode`\{=1
> \catcode`\}=2
> \shipout\hbox{%
>  \special{pdf:docinfo<<%
>/Producer(MyProducer)%
>/Creator(MyCreator)%
>/Author(MyAuthor)%
>/Title(MyTitle)%
>/Subject(MySubject)%
>/Keywords(MyKeywords)%
>/CreationDate(D:2012010100Z)%
>/ModDate(D:2012010100Z)%
>/MyKey(MyValue)%
>>> }%
> }
> \csname @@end\endcsname\end

Surely  /Creator  is (La)TeX, Xe(La)TeX, ConTeXt, etc.
while   /Producer  is the PDF engine:  
   Ghostscript, xdvipdfmx, pstopdf, Acrobat Distiller, etc.
and  /Author  is the person who wrote the bulk of
the document source.

Why should it be reasonable that an author can set the
 /Producer and /Creator  arbitrarily within the document 
source?

The author chooses his workflow, and should pass this
information on to the appropriate package ...

> 
> The entry for /Producer gets overwritten by xdvipdfmx,
> e.g. "xdvipdfmx (0.7.8)". Result:
> 
> * Bug-reports/hyperref: pdfproducer={XeTeX ...} does not work.
> * hyperxmp is at a loss, it *MUST* know the value of the
>  /Producer, because the setting in the XMP part has to be
>  the same.

  ... via options to  \usepackage[...]{hyperxmp}

and the package should be kept up-to-date with the exact strings
that will be produced by the different processing engines, in all 
their existing versions.
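
As a sketch (assuming hyperref is loaded alongside hyperxmp, and
using the version string reported above --- whatever is set here must
match what the driver will really write):

\usepackage{hyperxmp}
\hypersetup{pdfproducer={xdvipdfmx (0.7.8)}}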


I know that one processor cannot know in advance how its output
will be further processed, but that is not the point of XMP.

The person who is the author, or production editor, *does* know 
this information (at least in principle) and should ensure that 
this gets encoded properly within the final PDF --- if complete 
validation against an existing standard is of any importance.


> 
> Please fix this issue in xdvipdfmx.

I'm not sure that it is  xdvipdfmx's duty to handle this
issue; though see my final words below.

My initial thoughts are as follows:

The nature and purpose of XMP  is such that an author
cannot just  \usepackage{hyperxmp}   with no extra options,
and expect the XMP information to be created automagically,
correctly in every detail.


The alternative is to have an auxiliary file that contains
macro definitions, to be used both in the  docinfo  and XMP.
This auxiliary file needs to be created either manually,
or automatically extracting the information from a PDF,
first time it is created.
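
Such an auxiliary file might look like this (a hypothetical sketch;
the macro names are illustrative, the values are those from the
example above):

% metadata.def --- shared by the docinfo dictionary and the XMP packet
\def\myProducer{xdvipdfmx (0.7.8)}
\def\myCreationDate{D:2012010100Z}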

With PDF/A and PDF/UA the PDF file is not supposed to be 
compressed, so automating this is not so hard --- though 
it may well be platform-dependent.
(Not sure about other flavours of PDF/??? .)


> 
> Yours sincerely
>  Heiko Oberdiek


BTW, what about the  /CreationDate  and  /ModDate ?
Surely these should be set automatically too ?
Doesn't  pdfTeX  have the means to do this?

Of course when it is a 2-engine process, such as
  XeTeX + xdvipdfmx 
then which time should be encoded here?
XeTeX cannot know the time at which  xdvipdfmx  will do 
its work.  Maybe it can extrapolate ahead, from information
saved from the previous run ?


So maybe what is really desirable is for  xdvipdfmx  to write
out an auxiliary file containing all relevant metadata, including
timings, that can then be used by the next run of  XeLaTeX .
A  \special{ ... }  command could be used to trigger the need
for such an action to be performed.

Is that what you had in mind?



Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] How to Convert LaTeX to Html

2012-02-13 Thread Ross Moore
Hi Peter,

On 13/02/2012, at 9:32 AM, Peter Dyballa wrote:

> 
> Am 12.2.2012 um 01:10 schrieb A u:
> 
>> (--- xdv font = Sanskrit2003 (not implemented) ---)
>> --- warning --- Couldn't find font `Sanskrit2003.htf' (char codes: 0--255)
>> (--- xdv font = Sanskrit2003 (not implemented) ---)
> 
> In TeX4ht you cannot use every font, only those that are already converted to 
> HTF format – I don't know how this conversion works.
> 

There is an explanation in The LaTeX Web Companion, §4.6.7 .

> --
> Greetings
> 
>  Pete



Cheers,

Ross

--------
Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Typesetting Greek mathematical text using Unicode

2012-01-18 Thread Ross Moore

On 19/01/2012, at 7:05 AM, Ross Moore wrote:

> Hello D. and Andrew,
> 
> On 19/01/2012, at 3:06 AM, Andrew Moschou wrote:
> 
>> I opened Cambria in a font editor, and it looks fine. I also opened Cambria 
>> in the character map, and it displayed it wrong just as you describe. No 
>> idea where the problem lies...
> 
> When I look into it on a Mac, then Cambria does not support that
> code-point, at least not as a single pre-composed glyph.  
> (see image, where Cambria does not appear in the list of fonts)

Furthermore, when I typeset your example, modified slightly to

>> \documentclass[a4paper]{article}
>> \usepackage{xltxtra}
>> \setmainfont{Cambria}
>> \begin{document}{\tracingall
>> Ἀριθμὸς  \char"1F08
>> }\end{document}

then I get messages:

>>> {into \tracingonline=1}
>>> {the letter Ἀ}
>>> {horizontal mode: the letter Ἀ}
>>> Missing character: There is no Ἀ in font 
>>> Cambria/ICU:script=latn;language=DFL
>>> T;!
>>> Missing character: There is no ὸ in font 
>>> Cambria/ICU:script=latn;language=DFL
>>> T;!
>>> {blank space  }
>>> {\char}
>>> Missing character: There is no Ἀ in font 
>>> Cambria/ICU:script=latn;language=DFL
>>> T;!
>>> {end-group character }}


See attached image as output:


Perhaps my 2006 version of Cambria is older than yours?

Version       Version 1.00
Location      /Users/rossmoor/Library/Fonts/CAMBRIA.TTC
Unique name   Microsoft: Cambria: 2006
Manufacturer  Microsoft Corporation
Designer      Agfa Monotype Corporation
Copyright     © 2006 Microsoft Corporation. All Rights Reserved.



It works fine, using   \setmainfont{Linux Libertine O}
(see other image)


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] fontspec loading the wrong font?

2011-12-15 Thread Ross Moore
Hi Daniel,

On 16/12/2011, at 9:44 AM, Daniel Greenhoe wrote:

> On Fri, Dec 16, 2011 at 5:56 AM, Ross Moore  wrote:
>> try doing some detailed tracing, using  \tracingall
> 
> Thank you for your help. I did try it with the \tracingall directive.
> However, the compilation crashes with message
> ! Undefined control sequence.
> l.85   GNU FreeSerif:   & \fntFreeSerif


Yes, because of the {...} delimiting.
This makes part of the \newfontfamily  stuff local to that group.

You can scan through that tracing to see whether all is as
it is supposed to be, with respect to filenames of the fonts
being loaded, or being setup for later loading.

The fact that the document fails is irrelevant to obtaining
that information.


If you remove those braces, then you'll trace an awful lot more
of the document, getting masses more output into the .log  file.

The document may now process to completion, but it may be a lot
harder to find the relevant parts to font-loading.


> 
> Maybe the log file would still be helpful to someone; but it is huge
> (about 2.4MByte), so I won't attach it to this email. If anyone wants
> the file, I can email it or put it on a publicly accessible server.
> 
> Dan


Hope this helps,

Ross

----
Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] fontspec loading the wrong font?

2011-12-15 Thread Ross Moore
Hello Daniel,

On 16/12/2011, at 8:43 AM, Daniel Greenhoe wrote:

> I have run into a very strange problem when using fontspec and trying
> to test a new experimental version of GNU FreeSerif. In particular,
> suppose I try labeling the old FreeSerif as \fntFreeSerif and the new
> experimental FreeSerif as \fntFreeSerifx like this:

try doing some detailed tracing, using  \tracingall

{\tracingall % detailed trace of just the next 2 top-level commands
> \newfontfamily{\fntFreeSerif}[
>   ExternalLocation,
>   Path   = {/xfonts/gnuFreeFont/},
>   Extension  = {.otf},
>   UprightFont= {*},
>   BoldFont   = {*Bold},
>   ItalicFont = {*Italic},
>   BoldItalicFont = {*BoldItalic},
>   ]{FreeSerif}
> 
> \newfontfamily{\fntFreeSerifx}[
>  ExternalLocation,
>  Path   = {/xfonts/gnuFreeFont/2011dec12/},
>  Extension  = {.ttf},
>  UprightFont= {*},
>  BoldFont   = {*Bold},
>  ItalicFont = {*Italic},
>  BoldItalicFont = {*BoldItalic},
>  ]{FreeSerif}
} % closing delimiter to restrict the scope of \tracingall


Then study the .log file output.
There will be *masses* of extra output lines, most of which
is quite irrelevant to your needs.
Nevertheless, you may be able to spot where something is obviously
not how you would like it to be.

> 
> Then XeLaTeX seems to get confused and does not seem to find the new
> \fntFreeSerifx font, but is maybe using \fntFreeSerif or another
> version of FreeSerif, perhaps one in my Texlive setup.
> 
> In the log file, both fonts are assigned the same label FreeSerif(0):
> 
> . Font family 'FreeSerif(0)' created for font 'FreeSerif' with options [
> . ExternalLocation, Path = {/xfonts/gnuFreeFont/}, Extension = {.otf},
> . UprightFont = {*}, BoldFont = {*Bold}, ItalicFont = {*Italic},
> . BoldItalicFont = {*BoldItalic}, ].
> 
> . Font family 'FreeSerif(0)' created for font 'FreeSerif' with options [
> . ExternalLocation, Path = {/xfonts/gnuFreeFont/2011dec12/}, Extension =
> . {.ttf}, UprightFont = {*}, BoldFont = {*Bold}, ItalicFont = {*Italic},
> . BoldItalicFont = {*BoldItalic}, ].
> 
> 
> But if I comment out any *one* (or all four) of the shape directive
> lines like this
> 
> \newfontfamily{\fntFreeSerifx}[
>  ExternalLocation,
>  Path   = {/xfonts/gnuFreeFont/2011dec12/},
>  Extension  = {.ttf},
>   UprightFont= {*},
>   BoldFont   = {*Bold},
>   ItalicFont = {*Italic},
> %   BoldItalicFont = {*BoldItalic},
>  ]{FreeSerif}
> 
> then the problem goes away, and the two fonts are given different labels:
> 
> . Font family 'FreeSerif(0)' created for font 'FreeSerif' with options [
> . ExternalLocation, Path = {/xfonts/gnuFreeFont/}, Extension = {.otf},
> . UprightFont = {*}, BoldFont = {*Bold}, ItalicFont = {*Italic},
> . BoldItalicFont = {*BoldItalic}, ].
> 
> . Font family 'FreeSerif(1)' created for font 'FreeSerif' with options [
> . ExternalLocation, Path = {/xfonts/gnuFreeFont/2011dec12/}, Extension =
> . {.ttf}, UprightFont = {*}, BoldFont = {*Bold}, ItalicFont = {*Italic}, ].
> 
> Is this something I am doing wrong, a fontspec bug, or a problem with
> FreeSerif and variants?

The .log output using  \tracingall  may offer some clues to help
someone to answer this question.

> 
> Many thanks in advance,
> Dan


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] tabular in footnote

2011-12-06 Thread Ross Moore
Hello Daniel,

On 07/12/2011, at 8:16 AM, Daniel Greenhoe wrote:

> Thank you everyone for your help with this problem. I will regard it
> as a bug. I hope that someday it can be fully resolved.

Heiko explains why the table doesn't align as you want.

Try this variant of your example.


\documentclass[12pt]{book}
\usepackage[a4paper,noheadfoot,nomarginpar,margin=20mm,showframe]
 {geometry}


% adjust this value to suit
\def\foottableraise{2ex}

% define a new environment
\newenvironment{foottable}{%
 \raise\foottableraise\hbox\bgroup\space
 \begin{tabular}[t]%
 }{%
 \end{tabular}\egroup\vskip\foottableraise
 }

\begin{document}%
  xyz\footnote{%
\raisebox{\foottableraise}{ % inserts a space
\begin{tabular}[t]{|l|}
   \hline
abc\\
def\\
ghj\\
klm\\
\hline
  \end{tabular}%\\
  }%
  \vskip \foottableraise
}
  xyz\footnote{%
\begin{foottable}{|l|}
   \hline
abc\\
def\\
ghj\\
klm\\
\hline
  \end{foottable}%\\
   }
\end{document}%

Note that you need to use TeX's  \raise  and  \bgroup ... \egroup
in the environment definition.
This is because \raisebox reads its argument too soon, so the
start and end of the box cannot then be split between the
\begin and \end of the \newenvironment .

> 
> Dan


Hope this helps,

Ross

--------
Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Diacritics in color

2011-12-02 Thread Ross Moore
Hi Tobias,


On 03/12/2011, at 6:06, Tobias Schoel  wrote:

> As a teacher I can think of some more Applications. Of course, these are 
> pedagogical:
> 
> Teaching scripts to beginners (learning to write at primary school, learning 
> to write in a different script when learning another language (or even in the 
> same language: Mongol?):
> 
> You might want to color single parts of a glyph in order to highlight them. 
> So, for example in a handwritten (see 
> http://de.wikipedia.org/wiki/Schulausgangsschrift or English equivalents I 
> haven't found in the time available) "a" the beginning or end-strokes might be colored.

Yes, but do these examples really require parts of the same whole character 
to be coloured differently?

Presuming that the font did allow access to individual glyphs, as if separate 
characters, then would not all meaningful aspects be equally well (if not 
better) encoded by an overlay?
That is, position a coloured version of the required glyph over the full 
character in monochrome.
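
A rough sketch of that overlay idea (assuming xcolor is loaded, and
that the font exposes the coloured part as a glyph in its own right;
the macro name is illustrative):

\newcommand{\overlaid}[2]{% #1 = full character, #2 = part to colour
  \rlap{\textcolor{red}{#2}}#1}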

In the pedagogical setting, you are presumably talking about the single stroke 
as a sub-part of the whole character, so it deserves to be placed as an entity 
in itself.
This is quite different to a colored diacritical mark modifying the meaning of 
a character.

> 
> Of course the font creator has to create sub-glyphs or other fancy stuff, 
> but XeTeX should allow (re)composition of the glyph with different colors.
> 
> bye
> 
> Toscho

Hope this helps,

   Ross


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Diacritics in color

2011-11-30 Thread Ross Moore
Hi Heiko, and others

On 30/11/2011, at 8:56 PM, Heiko Oberdiek wrote:

> The PDF stuff:
> 
>  % without color:
>  0 -99.63 Td[<0024>]TJ
>  54.19 15.57 Td[<0301>]TJ
> 
>  % with color:
>  -54.19 -115.2 Td[<0024>]TJ
>  ET 1 0 0 RG 1 0 0 rg BT /F1 99.626 Tf
>  59.3 -278.84 Td[<0301>]TJ
> 
>  % with color via \special{pdf:code ...}:
>  0 -378.46 Td[<0024>]TJ
>  ET 1 0 0 rg 1 0 0 RG BT /F1 99.626 Tf
>  59.3 -378.46 Td[<0301>]TJ
> 
> It seems that in XeTeX the color cannot be inserted without
> breaking the PDF text sections (BT...ET). \special{pdf:literal direct ...}
> (or short \special{pdf:code ...}) is not equivalent to
> \pdfliteral direct{...} but rather to \pdfliteral page{...}
> that ends a text section in the PDF output.
> 
> If someone want's this issue fixed, make a feature request
> for XeTeX/xdvipdfmx to provide a real "\pdfliteral direct"/
> "direct mode for colors" without breaking text sections.

In the experimental version of pdfTeX with support for Tagged PDF,
we need a similar kind of variant to \pdfliteral .
It is called 'enclose'.

Here is how, in the coding of the corresponding pdftex.web 
the various options are read, following an occurrence 
of \pdfliteral :

@ @<Implement \.{\\pdfliteral}@>=
begin
check_pdfoutput("\pdfliteral", true);
new_whatsit(pdf_literal_node, write_node_size);
if scan_keyword("direct") then
pdf_literal_mode(tail) := direct_always
else if scan_keyword("page") then
pdf_literal_mode(tail) := direct_page
else if scan_keyword("enclose") then
pdf_literal_mode(tail) := enclose
else
pdf_literal_mode(tail) := set_origin;
scan_pdf_ext_toks;
pdf_literal_data(tail) := def_ref;
end

and here is the documentation on those options:

@# {data structure for \.{\\pdfliteral}; node size = 2}
@d pdf_literal_data(#) == link(#+1) {data}
@d pdf_literal_mode(#) == info(#+1) {mode of resetting the text matrix
  while writing data to the page stream}
@# {modes of setting the current transformation matrix (CTM)}
@d set_origin    == 0 {end text (ET) if needed, set CTM to current point}
@d direct_page   == 1 {end text (ET) if needed, but don't change the CTM}
@d direct_always == 2 {don't end text, don't change the CTM}
@d scan_special  == 3 {look into special text}
@d enclose       == 4 {like |direct_always|, but end current string and sync pos}


The 'enclose' option is used to position the 'BDC' and 'EMC'
as closely around the text snippets as can be achieved,
without breaking the BT ... ET.
This seems to be the same kind of requirement here.
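
For illustration only, a tagging fragment using that option might
look like this (the /Span and its ActualText value are arbitrary
examples, not the actual tagging code):

\pdfliteral enclose{/Span << /ActualText (x) >> BDC}%
x%
\pdfliteral enclose{EMC}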

It would be nice if any extension to literal specials in  xdvipdfmx
used the same keywords for similar functionality.

> 
> Yours sincerely
>  Heiko Oberdiek


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Diacritics in color (was Re: XETEX cannot access OpenType features in PUA?)

2011-11-28 Thread Ross Moore
Hi Aleks,

On 29/11/2011, at 6:18 AM, Aleksandr Andreev wrote:

> Jonathan Kew writes:
> 
>>> Making this work in xetex would require a different approach to 
>>> implementing color.
> 
> I have been able to get it to work (the base glyph in black and the
> "diacritic" in red) in LuaTeX using the luacolor package.
> 
> Here's a minimal example:
> 
> \documentclass{minimal}
> \usepackage{fontspec}
> \usepackage{xcolor}
> \usepackage{luacolor}
> 
> \newfontface\moo{MezenetsUnicode}
> 
> \begin{document}
> \moo
> \textcolor{red}{}
> \end{document}

Would you be so kind as to post the PDF from this?
And where does one obtain the font MezenetsUnicode ?
 --- Google gives nothing with this name.

Furthermore my LuaTeX gives a Segmentation Fault, so I cannot
just try with a different font!

> 
> I'm not much of an expert in the inner workings of TeX and I know
> absolutely nothing about Lua (is that a derivative of LISP?) so I
> can't comment on whether the luacolor package could be ported to
> XeTeX.

I'd doubt that this could work currently.

My guess is that you would need to do some post-processing
of the PDF code snippet returned from the OS positioning
the glyphs. Once positioned, you would need to wrap the colour
commands around the part which places the diacritic.


> Any insights?

But XeTeX currently does not give you access to that PDF string,
and it is well past the place of macro-expansion in LaTeX, so 
there wouldn't be a mechanism for such late adjustments.

It can be done with LuaTeX, since it does have the appropriate
mechanism for such post-processing.

Others more familiar with how LuaTeX works can confirm this
explanation -- or shoot it down, as appropriate.


> Aleks


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Problems with lmroman10 fonts after Life TeX Update

2011-11-27 Thread Ross Moore
Hi Eckart,
On 28/11/2011, at 4:21 AM, Eckart Hasselbrink wrote:

> Hi,
> 
> I need help: I let the Life TeX Utility do its thing last night (I have 
> LifeTeX-2011-BasicTeX installed). Since then I have problems with a document 
> which compiled fine before.
> 
> I got the error that t3enc.def could not be found. I learned from the web, 
> that I need to install the tipa package as this now required by XeTeX.

Xunicode  most certainly does *not* require the tipa package, 
just the file  t3enc.def  for the font encoding that  tipa.sty  uses.
This is so that constructions from  tipa.sty  can be replaced by
their Unicode equivalents, enabling older documents using TIPA
constructions to be processed correctly.

Not sure why TIPA isn't part of your TeX installation anyway.
If there is a good reason for this, I can adjust that part
of  xunicode.sty  to bypass this dependency and the
accompanying TIPA functionality.

Have you tried a version of TeX from TeX Live that is 
a bit more advanced than  BasicTeX ?


> Now, I am stuck with the following error messages:
> 
>> (/usr/local/texlive/2011basic/texmf-dist/tex/latex/euenc/eu1lmr.fd)kpathsea: 
>> Invalid fontname `[lmroman10-regular]', contains '['

Parsing to find the font name has failed somehow here.
Someone else can comment.

>> 
>> ! Font EU1/lmr/m/n/10=[lmroman10-regular]:mapping=tex-text at 10.0pt not 
>> loadab
>> le: Metric (TFM) file or installed font not found.
>>  
>>   relax 
>> l.100 \fontencoding\encodingdefault\selectfont
>> 
>> ? 
>> ) (/usr/local/texlive/2011basic/texmf-dist/tex/xelatex/xunicode/xunicode.sty
>> (/usr/local/texlive/2011basic/texmf-dist/tex/latex/tipa/t3enc.defkpathsea: 
>> Invalid fontname `[lmromanslant10-regular]', contains '['
>> 
>> ! Font EU1/lmr/m/sl/10=[lmromanslant10-regular]:mapping=tex-text at 10.0pt 
>> not 
>> loadable: Metric (TFM) file or installed font not found.
>>  
>>   relax 

> 
> These do not make any sense to me. How do I remedy this situation?
> 
> TIA,
> 
> Eckart




Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] VIQR pre-processor wrotten in (Xe)TeX ?

2011-11-24 Thread Ross Moore
Hi Andrew, and Phil,

On 25/11/2011, at 9:38 AM, Andrew Cunningham wrote:

> Word final vowels with punctuation following, e.g. full stop, question mark.
> 
> the following sentence:
> 
> Tôi yêu tiếng nước tôi từ khi mới ra đời.
> 
> is represented in strict VIQR as:
> 
> To^i ye^u tie^'ng nu+o+'c to^i tu+` khi mo+'i ra ddo+`i\.
> 
> Notice the escaping of the full stop at the end of the sentence.
> Without the escaping the full stop would be converted to diacritic
> below the letter "i".

What a pain for TeX, which already uses  '\.' to put 
a dot accent on a character.


At least here we have \.  (since '.' has catcode 12, not 11).
So it should be possible to modify the definition of \.{ } within
the NFSS processing of \.  to assign a special meaning.
This is coding that could be easily included within Xunicode.sty , 
but applied only optionally, according to encoding or within a font-switch.

You would need the user-level  \.  (and equivalently  \.{ } )
to expand to a sequence of:
   '\' + '.'  each with catcode 12, followed by a space.
Then your TecKit .map patterns can be defined to respect this.

There will be problems when \. is meant to be used as a symbolic
separator for other purposes; e.g. as a decimal point,
or something like:  \.   (for whatever purpose).
Or at the end of a block of characters with no trailing space.

An alternative may be to define some other way to get the '\'
with catcode 12.

Or within environments that want to use VIQR data, the \catcode
of '\' is set to 12, with some other character ('|' say) becoming
TeX's category 0  escape-character.

This is probably best, as you'll need to define such environments
anyway. However, it makes it awkward to pass VIQR strings around 
within the arguments of macros. 
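
A minimal, untested sketch of such an environment; note that it must
then be closed as  |end{viqr} , since '\' is no longer an escape
character inside it:

\newenvironment{viqr}{%
  \catcode`\|=0 % '|' takes over as the escape character
  \catcode`\\=12 % '\' becomes an ordinary character, as VIQR expects
}{}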


> 
> or another example:
> 
> Anh ddi dda^u\?
> 
> This escaping is part of VIQR and any input system that is based on VIQR.

Offhand I cannot think of any alternative meaning assigned to \? .
Is there one, in any special language, implemented in (La)TeX ?


> 
> 
> Andrew
> -- 
> Andrew Cunningham
> Senior Project Manager, Research and Development
> Vicnet
> State Library of Victoria
> Australia
> 
> andr...@vicnet.net.au
> lang.supp...@gmail.com


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex



Re: [XeTeX] Whitespace in input

2011-11-18 Thread Ross Moore
Hi Zdenek,

On 19/11/2011, at 9:51 AM, Zdenek Wagner wrote:

> This is a demonstration that glyphs are not the same as characters. I
> will startt with a simpler case and will not put Devanagari to the
> mail message. If you wish to write a syllable RU, you have to add a
> dependent vowel (matra) U to a consonant RA. There is a ligature RU,
> so in PDF you will not see RA consonant with U matra but a RU glyph.
> Similarly, TRA is a single glyph representing the following
> characters: TA+VIRAMA+RA. The toUnicode map supports 1:1 and 1:many
> mappings thus it is possible to handle these cases when copying text
> from a PDF or when searching. More difficult case is I matra (short
> dependent vowel I). As a character it must always follow a consonant
> (this is a general rule for all dependent vowels) but visually (as a
> glyph) it precedes the consonant group after which it is pronounced.
> The sample word was kitab (it means a book). In Unicode (as
> characters) the order is KA+I-matra+TA+A-matra(long)+BA. Visually
> I-matra precedes KA. XeTeX (knowing that it works with a Devanagari
> script) runs the character sequence through ICU and the result is the
> glyph sequence. The original sequence is lost so that when the text is
> copied from PDF, we get (not exactly) i*katab.

/ActualText is your friend here.
You tag the content and provide the string that you want to appear
with Copy/Paste as the value associated to a dictionary key.

There is a macro package that can do this with pdfTeX, and it is 
a vital part of my Tagged PDF work for mathematics.
Also, I have an example where the CJK.sty package is extended
to tag Chinese characters built from multiple glyphs so that
Copy/Paste works correctly (modulo PDF reader quirks).
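
For pdfTeX the idea looks something like this (a sketch using the
accsupp package; the hex string is the character sequence
KA+I-matra+TA+AA-matra+BA of the word kitab discussed above):

\usepackage{accsupp}
...
\BeginAccSupp{method=hex,unicode,ActualText=0915093F0924093E092C}%
  % the glyphs as actually typeset (ligature forms, reordered matra)
\EndAccSupp{}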

Not sure about XeTeX.

I once tried to talk with Jonathan Kew about what would be needed 
to implement this properly, but he got totally the wrong idea 
concerning glyphs and characters, and what was needed to be done
internally and what by macros. The conversation went nowhere.

> Microsoft suggested
> what additional characters should appear in Indic OpenType fonts. One
> of them is a dotted ring which denotes a missing consonant. I-matra
> must always follow a consonant (in character order). If it is moved to
> the beginning of a word, it is wrong. If you paste it to a text
> editor, the OpenType rendering engine should display a missing
> consonant as a dotted ring (if it is present in the font). In
> character order the dotted ring will precede I-matra but in visual
> (glyph) order it will be just opposite. Thus the asterisk shows the
> place where you will see the dotted circle. This is just one simple
> case. I-matra may follow a consonant group, such as in word PRIY
> (dear) which is PA+VIRAMA+RA+I-matra+YA or STRIYOCIT (good for women)
> which is SA+VIRAMA+TA+VIRAMA+RA+I-matra+YA+O-matra+CA+I-matra+TA. Both
> words will start with the I-matra glyph. The latter will contain two
> ordering bugs after copy&paste. Consider also word MURTI (statue)
> which is a sequence of characters

This sounds like each word needs its own /ActualText .
So some intricate programming is certainly necessary.
But \XeTeXinterchartoks  (is that the right spelling?)
should make this possible.

> MA+U-matra(long)+RA+VIRAMA+TA+I-matra. Visually the long U-matra will
> appear as an accent below the MA glyph. The next glyph will be I-matra
> followed by TA followed by RA shown as an upper accent at the right
> edge of the syllable. Generally in RA+VIRAMA+consonant+matra the RA
> glyph appears at the end of the syllable although logically (in
> character order) it belongs to the beginning. These cases cannot be
> solved by toUnicode map because many-to-many mappings are not allowed.

Agreed.  /ToUnicode  is not the right PDF construction for this.

> Moreover, a huge amount of mappings will be needed. It would be better
> to do the reverse processing independent of toUnicode mappings, to use
> ICU or Pango or Uniscribe or whatever to analyze the glyphs and
> convert them to characters. The rules are unambiguous but AR does not
> do it.

Having an external pre-processor is what I do for tagging mathematics.
It seems like a similarly intricate problem here.

> 
> We discuss nonbreakable spaces while we are not yet able to convert
> properly printable glyphs to characters when doing copy&paste from
> PDF...

  :-)

> 
> 
> -- 
> Zdeněk Wagner
> http://hroch486.icpf.cas.cz/wagner/
> http://icebearsoft.euweb.cz

Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department  

Re: [XeTeX] Whitespace in input

2011-11-17 Thread Ross Moore


Alternatively, use the editor to change the unwanted characters 
to ordinary spaces, or whatever else works well with TeX processing.

This is actually my preference in these situations, as there is 
a definite advantage in keeping the (La)TeX input source clean.
At some time you might want to use it with a different processor,
which might not have an easy in-built way to handle the problematic
characters. 

> 
> ** Phil.


Hope this helps clarify any misconceptions,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114






--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Whitespace in input

2011-11-17 Thread Ross Moore
Hello Zdenek,

On 18/11/2011, at 7:49 AM, Zdenek Wagner wrote:

>> But a formatting instruction for one program cannot serve as reliable input
>> for another.
>> A heuristic is then needed, to attempt to infer that a programming
>> instruction must have been used, and guess what kind of instruction it might
>> have been. This is not 100% reliable, so is deprecated in modern methods of
>> data storage and document formats.
>> XML based formats use tagging, rather than programming instructions. This is
>> the modern way, which is used extensively for communicating data between
>> different software systems.
>> 
> Yes, that's the point. The goal of TeX is nice typographical
> appearance. The goal of XML is easy data exchange. If I want to send
> structured data, I send XML, not PDF.

These days people want both.

> 
>> ** Phil.
>> 
>> TeX's strength is in its superior ability to position characters on the page
>> for maximum visual effect. This is done by producing detailed programming
>> instructions within the content stream of the PDF output. However, this is
>> not enough to meet the needs of formats such as EPUB, non-visual reading
>> software, archival formats, searchability, and other needs.
>> Tagged PDF can be viewed as Adobe's response to address these requirements
>> as an extension of the visual aspects of the PDF format. It is a direction
>> in which TeX can (and surely must) move, to stay relevant within the
>> publishing industry of the future.
>> 
>> Hope this helps,
>> Ross
>> 
> No, it does not help. Remember that the last (almost) portable version
> of PDF is 1.2. If you are to open tagged PDF or even PDF with a
> toUnicode map or a colorspace other than RGB or CMYK in Acrobat Reader
> 3, it displays a fatal error and dies. I reported it to Adobe in March
> 2001 and they did nothing.

What else would you expect?
AR is at version 10 now.
On Linux it is at version 9 now, indeed 9.4.6 is current.

You don't expect TeX formats prior to TeX3 to handle non-ascii 
characters, so why would you expect other people's older software 
versions to handle documents written for later formats?

> I even reported another fatal bug in
> January 2001. I sent sample files but nothing happened, Adobe just
> stopped development of Acrobat Reader at buggy version 3 for some
> operating systems.

Why should they support OSs that have a limited life-time?
Industry moves on. A new computer is very cheap these days,
with software that can do things your older one never could do.

By all means keep the old one while it still does useful work, 
but you get another to do things that the older cannot handle.

> Why do you so much rely on Adobe? When exchanging
> structured documents I will always do it in XML and never create
> tagged PDF because ...

PDF, as a published standard, is not maintained by Adobe itself 
these days, yet Adobe continues to provide a free reader, at least 
for the visual aspects. That makes documents in PDF viewable by 
everyone (who is only interested in the visual aspect).

It is an ISO standard, which publishers will want to use.
Most of the people who use (La)TeX are academics or others
who need to do a fair amount of publishing, of one kind
or another.

TeX can be modified to become capable of producing Tagged PDF.
 (See the videos of my talks.)
Free software (Poppler) is being developed to handle most aspects
of PDF content, though it hasn't yet progressed enough to support
structure tagging. It's surely on the list of things to do.

>  ... I know that some users will be unable to read them
> by Adobe Acrobat Reader.

Why not?
It is not Adobe Reader that is holding them back.

> I do not wish to make them dependent on
> ghostscript and similar tools.

You'll have to give some more details of who you are
referring to here, and why their economic circumstances 
require them to have access to XML-transmitted data,
but preclude them from access to other kinds of standard 
computing software and devices.


> -- 
> Zdeněk Wagner
> http://hroch486.icpf.cas.cz/wagner/
> http://icebearsoft.euweb.cz


Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Whitespace in input

2011-11-17 Thread Ross Moore
Hi Phil,

On 17/11/2011, at 23:53, Philip TAYLOR  wrote:

> Keith J. Schultz wrote:
>>> 
>>> You mention in a later post that you do consider a space as a printable 
>>> character.
>>This line should read as:
>>  You mention in a later post that you consider a space as a 
>> non-printable character.
> 
> No, I don't think of it as a "character" at all, when we are talking
> about typeset output (as opposed to ASCII (or Unicode) input).  

This is fine, when all that you require of your output is that it be visible on
a printed page. But modern communication media go far beyond that.
A machine needs to be able to tell where words and lines end, reflowing 
paragraphs when appropriate and able to produce a flat extraction of all the 
text, perhaps also with some indication of the purpose of that text (e.g. by 
structural tagging).

In short, what is output for one format should also be able to serve as input 
for another.

Thus the space certainly does play the role of an output character – though the 
presence of a gap in the positioning of visible letters may serve this role in 
many, but not all, circumstances.

> Clearly
> it is a character on input, but unless it generates a glyph in the
> output stream (which TeX does not, for normal spaces) then it is not
> a character (/qua/ character) on output but rather a formatting
> instruction not dissimilar to (say) end-of-line.

But a formatting instruction for one program cannot serve as reliable input for 
another.
A heuristic is then needed, to attempt to infer that a programming instruction 
must have been used, and guess what kind of instruction it might have been. 
This is not 100% reliable, so is deprecated in modern methods of data storage 
and document formats.
XML based formats use tagging, rather than programming instructions. This is 
the modern way, which is used extensively for communicating data between 
different software systems.

> 
> ** Phil.

TeX's strength is in its superior ability to position characters on the page 
for maximum visual effect. This is done by producing detailed programming 
instructions within the content stream of the PDF output. However, this is not 
enough to meet the needs of formats such as EPUB, non-visual reading software, 
archival formats, searchability, and other needs.
Tagged PDF can be viewed as Adobe's response to address these requirements as 
an extension of the visual aspects of the PDF format. It is a direction in 
which TeX can (and surely must) move, to stay relevant within the publishing 
industry of the future.


Hope this helps,

 Ross

--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Whitespace in input

2011-11-15 Thread Ross Moore
Hi Zdenek,

On 16/11/2011, at 11:19 AM, Zdenek Wagner wrote:

>> Just like any other Unicode character, if you want it then
>> you should be able to put it in there.
> 
> You ARE able to do it. Choose a font with that glyph, set \catcode to
> 11 or 12 and that's it. What else do you wish to do?

The *default* behaviour should stay as this.
Any other behaviour needs to change the catcode
and make perhaps a definition.

>>> These are reasons why people might wish it in the source files, not in PDF.
>> 
>> Yes. In the source, to have the occasional such character included
>> within the PDF, for whatever reason appropriate to the material
>> being typeset -- whether verbatim, or not.


>>> If you wish to take a [part of] PDF and include it in another PDF as
>>> is, you can take the PDF directly without the need of grabbing the
>>> text. If you are interested in the text that will be retypeset, you
>>> have to verify a lot of other things.
>> 
>> How is any of this relevant to the current discussion?
>> 
> It was you who came with the argument that you wish to have
> nonbreakable spaces when copying the text from PDF.

No. I said that if you put one in, then you should be
expecting to get one out.
This should be the default behaviour, as it is now.

I certainly suggested nothing like getting out non-breaking
spaces as a replacement for anything else.


> Zdeněk Wagner
> http://hroch486.icpf.cas.cz/wagner/
> http://icebearsoft.euweb.cz



Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Whitespace in input

2011-11-15 Thread Ross Moore
Hi Phil,

On 16/11/2011, at 10:08 AM, Zdenek Wagner wrote:

>> How do you explain to somebody the need to do something really,
>> really special to get a character that they can type, or copy/paste?
>> 
>> There is no special role for this character in other vital aspects
>> of how TeX works, such as there is for $ _ # etc.
>> 
>> 
>>>> 
>>>> In TeX ~ *simulates* a non-breaking space visually, but there is
>>>> no actual character inserted.
>>> 
>>> And I don't agree that a space is a character, non-breaking or not !
>> 
>> In this view you are against most of the rest of the world.
>> 
> TeX NEVER outputs a space as a glyph. Text extraction tools usually
> interpret horizontal spaces of sufficient size as U+0020.

I never said that it did, nor that it was necessary to do so.

Those text extraction tools do a pretty reasonable job, but don't
always get it right. Besides, there is reliance on a heuristic,
which can be fallible, especially if there is content typeset in 
a very small font size.
And what about at line-ends? They can get that wrong too.

Such a reliance is rather against the TeX way of doing things,
don't you think?

Better is for TeX itself to apply the heuristic, since it knows
the current font size and the separation between bits of words.

> (The exception to the above mentioned "never" is the verbatim mode.)

That isn't good enough for TeX to produce PDF/A.
Go and watch the videos that I pointed you to.


Lower down I give a run-down of how a variant of TeX handles
this problem, to very good effect.

> 
>> If the output is intended to be PDF, as it really has to be with
>> XeTeX, then the specifications for the modern variants of PDF
>> need to be consulted.
>> 
>> With PDF/A and PDF/UA and anything based on ISO-32000 (PDF 1.7)
>> there is a requirement that the included content should explicitly
>> provide word boundaries. Having a space character inserted is by
>> far the most natural way to meet this specification.
> 
> A space character is a fixed-width glyph. If you insist on it, you
> will never be able to typeset justified paragraphs, you will move back
> to the era of mechanical typewriters.

Absolutely wrong!

I'm not insisting on it being included as the natural way to 
separate words within the PDF, though it certainly is a possible
way that is used by other software.

>> (This does not mean that having such a character in the output
>> need affect TeX's view of typesetting.)

Clearly you never even read this parenthetical statement ...

>> 
>> Before replying to anything in the above paragraph, please
>> watch the video of my recent talk at TUG-2011.

 ... and certainly you don't seem to have followed up on this
piece of advice, to get a better perspective of what I'm talking
about.

>> 
>>  http://river-valley.tv/further-advances-toward-tagged-pdf-for-mathematics/
>> 
>> or similar from earlier years where I also talk a bit about such things.



Here is how you get *both* TeX-quality typesetting and explicit
spaces as word-boundaries inside the PDF, with no loss of quality.

What the experimental tagged-pdfTeX does is to use a font (called
"dummy-space") that contains just a single character at code Ux0020,
at a size that is almost zero -- it cannot be exactly zero, else 
PDF browsers may not select it for copy/paste, or other text-extraction.

These extra spaces are inserted into the PDF content stream, *after*
TeX has determined the correct positioning for high-quality typesetting.
That is, it is *not* done by macros or widgets or suchlike, but is
done internally by the pdfTeX engine at shipout time.

The almost-zero size has no perceptible effect on the visual output.
But the existence of these extra space characters means that all
text-extraction methods work much more reliably.

There *are* extra primitives that can be used to turn this off and on
in places where such extra spaces are not wanted; e.g. in math.
And there is a primitive to insert such a space, in case it is required
manually, for whatever reason. All of these primitives are used
extensively when generating tagged PDF of mathematical expressions,
and are thus available for other usage too.
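
A sketch of their use, with the names under which these primitives
later shipped in released pdfTeX (to the best of my knowledge):

\pdfinterwordspaceon % real space glyphs at each word boundary
... text ...
\pdfinterwordspaceoff % suppress them, e.g. within math
\pdffakespace % insert one such space manually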


>> 
>>> 
>>> ** Phil.

Hope this helps,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Whitespace in input

2011-11-15 Thread Ross Moore

On 16/11/2011, at 9:45 AM, Zdenek Wagner wrote:

> 2011/11/15 Ross Moore :

>>>> What if you really want the Ux00A0 character to be in the PDF?
>>>> That is, when you copy/paste from the PDF, you want that character
>>>> to come along for the ride.
>>>> 
>>> From the typographical point of view it is the worst of all possible
>>> methods. If you really wish it,

Maybe you misunderstood what I meant here.

I'm not saying that you might want Ux00A0 for *every* place
where there is a word-breaking space.
Just that there may be individual instance(s) where you have
a reason to want it.

Just like any other Unicode character, if you want it then
you should be able to put it in there.
That's what XeTeX currently does (with the TeX-wise familiar 
ASCII exceptions) for any code-point supported by the
chosen font.

>> 
>> The *really wish it* is the choice of the author, not the
>> software.
>> 
>>> then do not use TeX but M$ Word or
>>> OpenOffice. M$ Word automatically inserts nonbreakable spaces at some
>>> points in the text written in Czech. As far as grammar is concerned,
>>> it is correct. However, U+00a0 is fixed width. If you look at the
>>> output, the nonbreakable spaces are too wide on some lines and too
>>> thin on other lines. I cannot imagine anything uglier.
>> 
>> I do not disagree with you that this could be ugly.
>> But that is not the point.
>> 
>> If you want superior aesthetic typesetting, with nice choices
>> for hyphenation, then don't use Ux00A0. Of course!
>> 
>> 
>> Whatever the reason for wanting to use this character, there
>> should be a straight-forward way to do it.
>> Using the character itself is:
>>  a.  the most understandable
>>  b.  currently works
>>  c.  requires no special explanation.
>> 
> These are reasons why people might wish it in the source files, not in PDF.

Yes. In the source, to have the occasional such character included
within the PDF, for whatever reason appropriate to the material
being typeset -- whether verbatim, or not.

> 
> If you wish to take a [part of] PDF and include it in another PDF as
> is, you can take the PDF directly without the need of grabbing the
> text. If you are interested in the text that will be retypeset, you
> have to verify a lot of other things.

How is any of this relevant to the current discussion?

> If the text contained hyphenated
> words, you have to join the parts manually. You will have a lot of
> other work and the time saved by U+00a0 will be negligible. There are
> tools that may help you to insert nonbreakable spaces. I have even my
> own special tools written in perl to handle one class of input files
> that are really plain texts and the result is (almost) correctly
> marked LaTeX source.

All well and good. 
But how is that relevant to anything I said?

>> 
>>> 
>>> 
>>> --
>>> Zdeněk Wagner
>>> http://hroch486.icpf.cas.cz/wagner/
>>> http://icebearsoft.euweb.cz


Cheers,

Ross


Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-419  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex

