Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-24 Thread Zdenek Wagner
2014-09-24 23:34 GMT+02:00 maxwell :
> On 2014-09-22 22:04, Axel E. Retif wrote:
>>
>> On 09/22/2014 08:42 PM, Mike Maxwell wrote:
>>
>>> I guess these jokers haven't heard of Unicode.  Are they stuck back in
>>> the 1990s?
>>
>>
>> Are you and Philip Taylor even aware that you're replying directly to
>> an arXiv administrator?
>>
>> I think arXiv and Cornell University are doing a great service to the
>> scientific community and public in general and deserve more respect.
>
>
> For the record, I was on the other side of this issue in the early 2000s,
> and was told I should move into the 21st century.  The person who told me
> that was right, and I was wrong.  Having been converted, I feel the need to
> proselytize; apologies, though, for coming across as brash.
>
> I'm a linguist, so I constantly deal with other scripts.  Unicode is
> essential for our work, and its use has been routine in linguistics and
> computational linguistics publications and data archiving for over a decade.
> All the language archiving sites I know about will accept *only* Unicode (or
> at the very least discourage non-Unicode submissions).
>
A few years ago I was asked by editors of a linguistics journal to
make a LaTeX template for them. They told me that many authors use
Arabic and Chinese and therefore they decided to use XeLaTeX only.
However, they also wanted an old-style LaTeX template for those
authors who do not need such scripts and do not wish to install XeTeX,
and the document had to be easily convertible from old-style LaTeX to
XeLaTeX. Authors are allowed to use free fonts only. It was
easy to develop such a template.
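
A minimal sketch of how one preamble can serve both engines, assuming the
iftex package and the stock Latin Modern fonts. It is not the journal
template described above, and the commented-out font name is only a
placeholder for whichever free font a journal prescribes:

```latex
\documentclass{article}

% iftex provides \ifXeTeX, so one source file compiles with either engine.
\usepackage{iftex}

\ifXeTeX
  % XeLaTeX: Unicode input and OpenType fonts via fontspec.
  % Without \setmainfont, fontspec falls back to Latin Modern (a free font).
  \usepackage{fontspec}
  % \setmainfont{FreeSerif} % hypothetical choice; needs the font installed
\else
  % Old-style (pdf)LaTeX: 8-bit font encoding plus UTF-8 input decoding.
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
  \usepackage{lmodern}
\fi

\begin{document}
Text that compiles unchanged with \texttt{pdflatex} or \texttt{xelatex}.
\end{document}
```

Authors who need Arabic or Chinese would load their script-specific
packages inside the \ifXeTeX branch only, so the old-style branch stays
usable for everyone else.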

> So no, I don't understand why an archiving service would not allow
> Unicode-encoded papers, even if it does require xelatex.  (For the record, I
> think the font is a red herring, since afaik the font license issue comes up
> regardless of whether you're using latex or xelatex.)
>
Even worse, if you need a non-Latin script, have only a Unicode
OpenType font, and are forced to use old-style LaTeX, you have to
convert the font, but most often such a conversion is explicitly
prohibited by the font license.

>Mike Maxwell
>
>
>
> --
> Subscriptions, Archive, and List information, etc.:
>  http://tug.org/mailman/listinfo/xetex



-- 
Zdeněk Wagner
http://hroch486.icpf.cas.cz/wagner/
http://icebearsoft.euweb.cz





Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-24 Thread maxwell

On 2014-09-22 22:04, Axel E. Retif wrote:

On 09/22/2014 08:42 PM, Mike Maxwell wrote:


I guess these jokers haven't heard of Unicode.  Are they stuck back in
the 1990s?


Are you and Philip Taylor even aware that you're replying directly to
an arXiv administrator?

I think arXiv and Cornell University are doing a great service to the
scientific community and public in general and deserve more respect.


For the record, I was on the other side of this issue in the early 
2000s, and was told I should move into the 21st century.  The person who 
told me that was right, and I was wrong.  Having been converted, I feel 
the need to proselytize; apologies, though, for coming across as brash.


I'm a linguist, so I constantly deal with other scripts.  Unicode is 
essential for our work, and its use has been routine in linguistics and 
computational linguistics publications and data archiving for over a 
decade.  All the language archiving sites I know about will accept 
*only* Unicode (or at the very least discourage non-Unicode 
submissions).


So no, I don't understand why an archiving service would not allow 
Unicode-encoded papers, even if it does require xelatex.  (For the 
record, I think the font is a red herring, since afaik the font license 
issue comes up regardless of whether you're using latex or xelatex.)


   Mike Maxwell




Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-23 Thread Philip Taylor


Axel E. Retif wrote:

> On 09/22/2014 08:42 PM, Mike Maxwell wrote:
> 
> 
>> I guess these jokers haven't heard of Unicode.  Are they stuck back in
>> the 1990s?
> 
> Are you and Philip Taylor even aware that you're replying directly to an
> arXiv administrator?

Since Philip Taylor wrote /only/ to the XeTeX list and did not cc
[address removed], presumably he was not only aware of the risk but also
took care to obviate it.

Philip Taylor






Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-23 Thread Dominik Wujastyk
I have had similar problems with PubMedCentral.  While I was a Wellcome
Trust Senior Research Fellow, I was contractually obliged to submit all my
publications to PMC.  But in every case it took over a year for my work to
appear, and involved a huge wrangle about XML, XeTeX, and conversion.

PMC uses tools to convert the author's PDF into XML.  Then they generate a
new PDF from their XML.  They publish their own XML and their own PDF.

I get it that they want XML.  But their conversion pipeline is not good for
complex work, especially if it includes Unicode characters.  Their
re-generated PDFs were a complete mess and my articles were quite literally
unreadable.  (And the page numbers were all changed, making reference
ambiguous.)  Admittedly, my articles use Sanskrit in Unicode and complex
layout formatting.  That's why I use XeTeX, of course.

For an example, see especially pp.211 onwards of my article here:

   - http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2772122/

In the end, PMC agreed that their tech could not handle my writings, so
they published my PDF and no XML.

It sounds as if arXiv is facing similar difficulties.  The best way forward
for arXiv and PMC is to identify authors who are knowledgeable about
advanced document processing (i.e., the members of this list!), and talk to
them in a cooperative spirit about complex documents, metadata, and
conversion issues.  This would be better than treating such authors as
"difficulties."

Best,

Dominik



On 23 September 2014 08:16, Matthew Skala wrote:

> On Tue, 23 Sep 2014, Ross Moore wrote:
> > It is the insistence on being able to reproduce the PDF
> > *automatically from source* that is where the problem lies.
>
> From reading Norbert's Web blog, it appears that that's also an issue for
> Debian packaging of TeX-related software.  Debian has a formal requirement
> for everything that can possibly be built from source, to be built from
> source, and it's not practical to do that automatically with many
> TeX-related documentation files.  My own horoscop LaTeX package, whose
> documentation requires many megabytes of astrological software (free, but
> not typically packaged by Linux distributions) to compile properly, is
> only one example.  I think there are other packages that exist
> specifically to support expensive commercial products and require those
> products in order to compile, notwithstanding that the results of
> compilation are free to distribute.  This kind of thing is definitely a
> problem; I'm not sure it is TeX's problem.
>
> As for arXiv, what bothers me is that in the case of XeLaTeX, they will
> accept neither the source code *nor* the compiled PDF.  All an author can
> do is circumvent the rules by lying in the document metadata, or else go
> through contortions to compile a special arXiv-only version with some
> other software.  I found this page helpful in my efforts to do that:
>    http://member.ipmu.jp/yuji.tachikawa/cjk-on-arxiv/
>
> --
> Matthew Skala
> msk...@ansuz.sooke.bc.ca People before principles.
> http://ansuz.sooke.bc.ca/
>
>




Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-22 Thread mskala
On Tue, 23 Sep 2014, Ross Moore wrote:
> It is the insistence on being able to reproduce the PDF
> *automatically from source* that is where the problem lies.

From reading Norbert's Web blog, it appears that that's also an issue for
Debian packaging of TeX-related software.  Debian has a formal requirement
for everything that can possibly be built from source, to be built from
source, and it's not practical to do that automatically with many
TeX-related documentation files.  My own horoscop LaTeX package, whose
documentation requires many megabytes of astrological software (free, but
not typically packaged by Linux distributions) to compile properly, is
only one example.  I think there are other packages that exist
specifically to support expensive commercial products and require those
products in order to compile, notwithstanding that the results of
compilation are free to distribute.  This kind of thing is definitely a
problem; I'm not sure it is TeX's problem.

As for arXiv, what bothers me is that in the case of XeLaTeX, they will
accept neither the source code *nor* the compiled PDF.  All an author can
do is circumvent the rules by lying in the document metadata, or else go
through contortions to compile a special arXiv-only version with some
other software.  I found this page helpful in my efforts to do that:
   http://member.ipmu.jp/yuji.tachikawa/cjk-on-arxiv/

-- 
Matthew Skala
msk...@ansuz.sooke.bc.ca People before principles.
http://ansuz.sooke.bc.ca/




Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-22 Thread Ross Moore
Hi Axel, Mike and others.

On 23/09/2014, at 12:04 PM, Axel E. Retif wrote:

> On 09/22/2014 08:42 PM, Mike Maxwell wrote:
> 
> 
>> I guess these jokers haven't heard of Unicode.  Are they stuck back in
>> the 1990s?
> 
> Are you and Philip Taylor even aware that you're replying directly to an 
> arXiv administrator?
> 
> I think arXiv and Cornell University are doing a great service to the 
> scientific community and public in general and deserve more respect.

Yes, they are doing a great service.

But, having said that, there should still be an obligation 
to keep up with the times, and not *prevent* the archiving
of publications that have special typesetting requirements.

How else can we advance aspects of mathematical/scientific 
publishing, when the main repository refuses to accept works 
that ably demonstrate useful new ideas?


Not that XeTeX is really all that special any more.
It started in 2004, on the Mac only.
Support for Unicode math is a bit more recent, 
starting around 2006, similarly to when XeTeX went
multi-platform (I think).



We talked about this at TUG 2014, in one of the discussion
sessions, where someone had reported dissatisfaction with 
what could be submitted to arXiv. 
Other issues were raised as well, including the fact that
the current TeXLive version being used there is ~3 years
out of date.


Earlier this year I submitted a paper that was meant 
to demonstrate use of PDF/A-3u features for publishing 
*accessible* mathematical content. 
But because the version of pdfTeX was outdated at 2011, 

http://arxiv.org/help/faq/texlive

the PDFs produced on-the-fly by arXiv do not validate to 
the standard declared within them.

They would not accept the PDF that I myself can compile,
in which validation is 100%.

(The particular differences in the PDF output are due 
to a mistake in 2011 and later versions of pdfTeX itself.
This has now been fixed, but perhaps is available only
by download from the  pdftex  source repository.)



The upshot of this is that it is not possible to *lead by 
example* with PDFs that are meant to demonstrate the value 
of new and emerging standards.
This includes standards that are accepted elsewhere within 
the publishing industry, and are to some extent mandated 
by existing US accessibility laws, applicable to many 
government and academic institutions.


> 
> It seems to me that if they start accepting Xe(La)TeX submissions they will 
> be receiving documents with strange fonts,
> the license of which they will have to investigate first to see if they can 
> post the articles.

Most fonts are allowed to be subsetted and included within PDFs.
The subsetting prevents sensible extraction of the font as 
a whole, so foundries do not object.
After all, how can the beauty and craftsmanship within a font 
be displayed, and its popularity increased to the benefit of
the designer and foundry, unless documents using it are allowed 
to be distributed?

So no, that is *not* the crux of the issue.

It is the insistence on being able to reproduce the PDF
*automatically from source* that is where the problem lies.


There should be more circumstances under which users' PDFs 
would be accepted *as-is*, and distributed from arXiv.

Sources should certainly be included in the arXiv, primarily 
for verification purposes, even when not able to be presently 
compiled to the desired satisfaction.
 

If font licensing is still deemed to be an issue, then surely
there is a difference between recreating the PDF from source 
using a purchased, fully-licensed copy of the font, and simply 
serving a copy of a document for which the author has used 
their own (presumably purchased or licensed) copy of that font.

By all means tell the author that full acceptance of the paper
may be delayed if some investigation needs to be carried out.
Tell them the real reason; but *do not* insult the author 
by saying that (s)he must submit in a completely different 
format to what is best for the content of the work that 
(s)he has already prepared. 



Hope this helps,

Ross Moore
Director, TeX Users Group



Ross Moore   ross.mo...@mq.edu.au 
Mathematics Department   office: E7A-206  
Macquarie University tel: +61 (0)2 9850 8955
Sydney, Australia  2109  fax: +61 (0)2 9850 8114







Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-22 Thread Axel E. Retif

On 09/22/2014 08:42 PM, Mike Maxwell wrote:



I guess these jokers haven't heard of Unicode.  Are they stuck back in
the 1990s?


Are you and Philip Taylor even aware that you're replying directly to an 
arXiv administrator?


I think arXiv and Cornell University are doing a great service to the 
scientific community and public in general and deserve more respect.


It seems to me that if they start accepting Xe(La)TeX submissions they 
will be receiving documents with strange fonts, the license of which 
they will have to investigate first to see if they can post the articles.








Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-22 Thread Mike Maxwell

On 9/22/2014 9:41 AM, arXiv Help wrote:

Dear Chandra,

arXiv does not support XeTeX/XeLaTeX. Please export the source to the 
appropriate *.tex file format and submit that file. arXiv does not currently 
have any plans to support XeTeX/XeLaTeX. Please make the necessary changes and 
attempt to reprocess your submission at your convenience.

--
arXiv admin


I guess these jokers haven't heard of Unicode.  Are they stuck back in the 
1990s?

   Mike Maxwell
   University of Maryland




Re: [XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-22 Thread Philip Taylor


arXiv Help wrote:
> Dear Chandra,
> 
> arXiv does not support XeTeX/XeLaTeX. Please export the source to the 
> appropriate *.tex file format and submit that file. arXiv does not currently 
> have any plans to support XeTeX/XeLaTeX. Please make the necessary changes 
> and attempt to reprocess your submission at your convenience.
> 
> --
> arXiv admin

Tell them there are accessibility features in XeTeX that are essential
to you, and that if they refuse to accept XeTeX input you will be forced
to bring a case claiming that you are being discriminated against by
virtue of disability.  With any luck they will immediately concede
defeat without even asking what these accessibility features are :-)




[XeTeX] [arXiv #128410] Re: XeLaTeX generated pdf metadata

2014-09-22 Thread arXiv Help
Dear Chandra,

arXiv does not support XeTeX/XeLaTeX. Please export the source to the 
appropriate *.tex file format and submit that file. arXiv does not currently 
have any plans to support XeTeX/XeLaTeX. Please make the necessary changes and 
attempt to reprocess your submission at your convenience.

--
arXiv admin


On Sat Sep 20 03:36:19 2014, chyav...@gmail.com wrote:
> Could you try including this in your preamble:
> 
> \usepackage{hyperref}
>   \hypersetup{pdfcreator=XeLaTeX with hyperref}
> 
> and compile and see if the resulting PDF is sufficient for your
> purposes.
> 
> Section 3.7 at
> 
> https://www.tug.org/applications/hyperref/manual.html
> 
> explains these options in full.
> 
> HTH
> 
> Chandra
> 
> On 20/09/14 12:38, Daniel Greenhoe wrote:
> > Dear XeTex,
> >
> > I think my original email was not so clear. ArXiv.org of course
> > accepts papers generated using LaTeX, but they want to be given the
> > source files (.tex files, etc) rather than a pdf file. However, they
> > apparently sometimes make exceptions to this rule if the pdf file was
> > generated using XeTeX/XeLaTeX rather than LaTeX. That is, they *may*
> > in at least some cases accept a pdf generated by XeLaTeX, but will
> > *not* accept a pdf generated by LaTeX.
> >
> > Therefore, if it is not too much trouble, I would like the metadata in
> > my XeLaTeX/xdvipdfmx generated pdf file to clearly indicate that it
> > was generated by XeLaTeX (*not* by LaTeX). Would any one have time for
> > this?
> >
> > Many thanks in advance,
> > Dan
> >
> > - please ignore the following -
> > Dear arXiv-moderation
> > [arXiv #128343]
> > [arXiv #128400]
> >
> > On Sat, Sep 20, 2014 at 10:15 AM, Daniel Greenhoe wrote:
> >> Dear XeTeX,
> >>
> >> I have tried uploading a paper in pdf format that I typeset using
> >> XeLaTeX to arXiv.org. However, it was later removed because it
> >> "appeared to be PostScript/PDF generated from TeX source". I wrote to
> >> arXiv-moderation, strongly arguing my case for using XeLaTeX rather
> >> than LaTeX. They responded saying "In order to approve such a request
> >> we'd have to have a PDF which includes it's XeTeX nature within the
> >> pdf properties...".
> >>
> >> When I typeset my paper using xelatex.exe, an xdv file is generated
> >> which contains this in the metadata:
> >>    \Creator(LaTeX with hyperref package)\Author()\Producer(XeTeX 0.1)
> >> Later I use xdvipdfmx.exe to generate a pdf file. It contains this
> >> information in the metadata:
> >>    Creator:    LaTeX with hyperref package
> >>    Producer:   xdvipdfmx (20140317)
> >>
> >> So although the producer field provides evidence that I am using
> >> XeLaTeX, the creator field erroneously implies that I have typeset
> >> using LaTeX. Hence, there will be a high probability that my paper
> >> will be removed, either by an automated server at arXiv.org or by a
> >> human administrator.
> >>
> >> I realize that I could possibly hand edit the xdv file or use a
> >> metadata editor to edit the pdf file. But I would rather not do this.
> >> I would rather work transparently, not surreptitiously changing the
> >> metadata of files.
> >>
> >> Would it be possible that some qualified person could correct the
> >> creator metadata output of XeLaTeX? I am currently using xelatex from
> >> TeXLive 2014 running on Windows. Here is the first line from the log
> >> file:
> >>    This is XeTeX, Version 3.14159265-2.6-0.1 (TeX Live 2014/W32TeX)
> >>    (preloaded format=xelatex 2014.9.20)  20 SEP 2014 07:16
> >>
> >> Many thanks in advance,
> >> Dan
> >>
> >> -please ignore the following-
> >> Dear arXiv-moderation,
> >
> >
> 
> 
> ## Keyword: Dear arXiv ##
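
The hyperref suggestion quoted above can be tried in a complete minimal
document. This is only a sketch, assuming a reasonably current hyperref;
the file name paper.tex and the title text are placeholders, and nothing
in this thread confirms which metadata fields arXiv actually inspects:

```latex
% paper.tex (placeholder name); compile with: xelatex paper.tex
\documentclass{article}
\usepackage{hyperref}
\hypersetup{
  % Replaces hyperref's default Creator string ("LaTeX with hyperref ...").
  pdfcreator={XeLaTeX with hyperref},
  % Placeholder title, included only to show a second metadata field.
  pdftitle={Placeholder title}
}
\begin{document}
Hello from XeLaTeX.
\end{document}
```

In the metadata quoted above, the Producer field comes from xdvipdfmx
rather than from hyperref, so this sketch changes only the Creator field.
A tool such as pdfinfo can be used to inspect the fields of the
resulting PDF.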



