Re: support figure space (U+2007)

2022-01-15 Thread Masamichi Hosoda
>> the /a "flag" can also be used to limit the character class to ASCII
>> space characters:
>>  
>> -  $content =~ s/^\s*//;
>> -  $content =~ s/\s*$//;
>> +  $content =~ s/^\s*//a;
>> +  $content =~ s/\s*$//a;
> 
> This looks good, thanks.  However, it is probably necessary to filter
> out the ideographic space (U+3000), too.
> 
> Masamichi-san, what do you think?  Here is a link to the complete
> conversation:
> 
>   https://lists.gnu.org/archive/html/bug-texinfo/2022-01/msg5.html
> 
> Other CJK users, please also comment!

I think it is not necessary to filter out U+3000.

In the widely used Japanese TeX systems (pTeX, upTeX, and LuaTeX-ja),
U+3000 is output as-is and is not filtered out.
If you process the following `.tex` file, which contains U+3000, with lualatex,
you get a PDF in which the text appears right-aligned.

```
\documentclass{ltjsarticle}

\begin{document}

\begin{tabular}{ll}
  　　あ & 　　あ \\
  　ああ & 　ああ \\
  あああ & あああ \\
\end{tabular}

\end{document}
```
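
For reference, here is a minimal standalone Perl sketch (an illustration only,
not part of texi2any) of how the /a flag behaves with these spaces:

```perl
#!/usr/bin/perl
# Minimal sketch: with /a the trimming regex leaves U+3000 IDEOGRAPHIC SPACE
# (and likewise U+2007 FIGURE SPACE) alone, while the unflagged \s strips
# them.  Requires Perl >= 5.14 for the /a modifier.
use strict;
use warnings;

my $content = "\x{3000}\x{3000}text";   # two leading ideographic spaces

(my $ascii_trimmed   = $content) =~ s/^\s*//a;  # /a: \s is ASCII whitespace only
(my $unicode_trimmed = $content) =~ s/^\s*//;   # default: \s also matches U+3000

printf "with /a:    %d character(s) removed\n", length($content) - length($ascii_trimmed);
printf "without /a: %d character(s) removed\n", length($content) - length($unicode_trimmed);
```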



Re: please migrate to git

2018-11-20 Thread Masamichi Hosoda
> Masamichi-san, how did you do the migration?  Maybe Gavin can simply
> clone your version, which would be the simplest solution, of course.

I used `git svn`, something like the following.

First:
$ git svn init -s --no-metadata --prefix=svn/ svn://svn.savannah.gnu.org/texinfo

Update:
$ git svn fetch
$ git checkout master
$ git merge svn/trunk
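
For reference, the update steps could also be wrapped in a small script, e.g.
run from cron.  This is only a hypothetical sketch (the local path and remote
name are assumptions), not the setup actually used:

```
#!/bin/sh
# Hypothetical update wrapper: re-run the "Update" steps above and push
# the result to a git mirror.
set -e
cd /path/to/texinfo        # working clone created with "git svn init" above
git svn fetch
git checkout master
git merge svn/trunk
git push origin master     # e.g. a GitHub mirror configured as "origin"
```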



Re: please migrate to git

2018-11-19 Thread Masamichi Hosoda
>> Here is a git mirror I've converted.
>> https://github.com/trueroad/texinfo
> 
> Ahh, great. Is that automatically/regularly updated?

No, I update it manually.



Re: please migrate to git

2018-11-19 Thread Masamichi Hosoda
Hi Werner, Norbert, all

>> Given that Savannah provides git support and that it is rather easy to
>> import an SVN repository to git (using `git svn ...') I strongly
>> suggest to migrate to git.
> 
> If there is interest, I can set up a git svn mirror as I have done for
> luatex, texlive, texlive-source ..., that gets automatically updated.
> Since texinfo is not so big, it could be mirrored into github,
> eg. into the TeX-Live organization, but wherever you prefer.
> 
> The setup I am using is already prepared to mirror other svn repos, too,
> via texlive.info server.

Here is a git mirror I've converted.
https://github.com/trueroad/texinfo



Re: stale git repository on Savannah?

2018-01-15 Thread Masamichi Hosoda
>> Sorry if I am opening a can of worms, but what about moving the main
>> development repository from Subversion to Git?  :-)
> 
> +1
> 
> 
> Werner

I converted the Texinfo SVN repository to Git.
However, it is not up to date.

https://github.com/trueroad/texinfo



Re: texinfo.tex assorted special characters issue

2016-09-17 Thread Masamichi Hosoda
> I've found that compiling the following texi file failed.
> 
> ```
> \input texinfo
> @documentencoding UTF-8
> 
> @node «
> @section a
> 
> @bye
> ```

I've created a patch which fixes the issue.
If there is no problem, I'd like to commit it.
Index: ChangeLog
===
--- ChangeLog	(revision 7367)
+++ ChangeLog	(working copy)
@@ -1,3 +1,10 @@
+2016-09-XX  Masamichi Hosoda  <truer...@trueroad.jp>
+
+	* doc/texinfo.tex
+	(\latonechardefs, \latninechardefs)
+	(\lattwochardefs, \unicodechardefs):
+	Add missing braces for symbol insertion commands with no arguments.
+
 2016-09-12  Gavin Smith  <gavinsmith0...@gmail.com>
 
 	* doc/texinfo.tex
Index: doc/texinfo.tex
===
--- doc/texinfo.tex	(revision 7367)
+++ doc/texinfo.tex	(working copy)
@@ -10037,18 +10037,18 @@
   \gdefchar^^a0{\tie}
   \gdefchar^^a1{\exclamdown}
   \gdefchar^^a2{{\tcfont \char162}} % cent
-  \gdefchar^^a3{\pounds}
+  \gdefchar^^a3{\pounds{}}
   \gdefchar^^a4{{\tcfont \char164}} % currency
   \gdefchar^^a5{{\tcfont \char165}} % yen
   \gdefchar^^a6{{\tcfont \char166}} % broken bar
   \gdefchar^^a7{\S}
   \gdefchar^^a8{\"{}}
-  \gdefchar^^a9{\copyright}
+  \gdefchar^^a9{\copyright{}}
   \gdefchar^^aa{\ordf}
-  \gdefchar^^ab{\guillemetleft}
+  \gdefchar^^ab{\guillemetleft{}}
   \gdefchar^^ac{\ensuremath\lnot}
   \gdefchar^^ad{\-}
-  \gdefchar^^ae{\registeredsymbol}
+  \gdefchar^^ae{\registeredsymbol{}}
   \gdefchar^^af{\={}}
   %
   \gdefchar^^b0{\textdegree}
@@ -10062,7 +10062,7 @@
   \gdefchar^^b8{\cedilla\ }
   \gdefchar^^b9{$^1$}
   \gdefchar^^ba{\ordm}
-  \gdefchar^^bb{\guillemetright}
+  \gdefchar^^bb{\guillemetright{}}
   \gdefchar^^bc{$1\over4$}
   \gdefchar^^bd{$1\over2$}
   \gdefchar^^be{$3\over4$}
@@ -10142,7 +10142,7 @@
   % Encoding is almost identical to Latin1.
   \latonechardefs
   %
-  \gdefchar^^a4{\euro}
+  \gdefchar^^a4{\euro{}}
   \gdefchar^^a6{\v S}
   \gdefchar^^a8{\v s}
   \gdefchar^^b4{\v Z}
@@ -10171,7 +10171,7 @@
   \gdefchar^^ae{\v Z}
   \gdefchar^^af{\dotaccent Z}
   %
-  \gdefchar^^b0{\textdegree}
+  \gdefchar^^b0{\textdegree{}}
   \gdefchar^^b1{\ogonek{a}}
   \gdefchar^^b2{\ogonek{ }}
   \gdefchar^^b3{\l}
@@ -10483,18 +10483,18 @@
   \DeclareUnicodeCharacter{00A0}{\tie}%
   \DeclareUnicodeCharacter{00A1}{\exclamdown}%
   \DeclareUnicodeCharacter{00A2}{{\tcfont \char162}}% 0242=cent
-  \DeclareUnicodeCharacter{00A3}{\pounds}%
+  \DeclareUnicodeCharacter{00A3}{\pounds{}}%
   \DeclareUnicodeCharacter{00A4}{{\tcfont \char164}}% 0244=currency
   \DeclareUnicodeCharacter{00A5}{{\tcfont \char165}}% 0245=yen
   \DeclareUnicodeCharacter{00A6}{{\tcfont \char166}}% 0246=brokenbar
   \DeclareUnicodeCharacter{00A7}{\S}%
   \DeclareUnicodeCharacter{00A8}{\"{ }}%
-  \DeclareUnicodeCharacter{00A9}{\copyright}%
+  \DeclareUnicodeCharacter{00A9}{\copyright{}}%
   \DeclareUnicodeCharacter{00AA}{\ordf}%
-  \DeclareUnicodeCharacter{00AB}{\guillemetleft}%
+  \DeclareUnicodeCharacter{00AB}{\guillemetleft{}}%
   \DeclareUnicodeCharacter{00AC}{\ensuremath\lnot}%
   \DeclareUnicodeCharacter{00AD}{\-}%
-  \DeclareUnicodeCharacter{00AE}{\registeredsymbol}%
+  \DeclareUnicodeCharacter{00AE}{\registeredsymbol{}}%
   \DeclareUnicodeCharacter{00AF}{\={ }}%
   %
   \DeclareUnicodeCharacter{00B0}{\ringaccent{ }}%
@@ -10508,7 +10508,7 @@
   \DeclareUnicodeCharacter{00B8}{\cedilla{ }}%
   \DeclareUnicodeCharacter{00B9}{$^1$}%
   \DeclareUnicodeCharacter{00BA}{\ordm}%
-  \DeclareUnicodeCharacter{00BB}{\guillemetright}%
+  \DeclareUnicodeCharacter{00BB}{\guillemetright{}}%
   \DeclareUnicodeCharacter{00BC}{$1\over4$}%
   \DeclareUnicodeCharacter{00BD}{$1\over2$}%
   \DeclareUnicodeCharacter{00BE}{$3\over4$}%
@@ -10980,36 +10980,36 @@
   % Punctuation
   \DeclareUnicodeCharacter{2013}{--}%
   \DeclareUnicodeCharacter{2014}{---}%
-  \DeclareUnicodeCharacter{2018}{\quoteleft}%
-  \DeclareUnicodeCharacter{2019}{\quoteright}%
-  \DeclareUnicodeCharacter{201A}{\quotesinglbase}%
-  \DeclareUnicodeCharacter{201C}{\quotedblleft}%
-  \DeclareUnicodeCharacter{201D}{\quotedblright}%
-  \DeclareUnicodeCharacter{201E}{\quotedblbase}%
+  \DeclareUnicodeCharacter{2018}{\quoteleft{}}%
+  \DeclareUnicodeCharacter{2019}{\quoteright{}}%
+  \DeclareUnicodeCharacter{201A}{\quotesinglbase{}}%
+  \DeclareUnicodeCharacter{201C}{\quotedblleft{}}%
+  \DeclareUnicodeCharacter{201D}{\quotedblright{}}%
+  \DeclareUnicodeCharacter{201E}{\quotedblbase{}}%
   \DeclareUnicodeCharacter{2020}{\ensuremath\dagger}%
   \DeclareUnicodeCharacter{2021}{\ensuremath\ddagger}%
-  \DeclareUnicodeCharacter{2022}{\bullet}%
+  \DeclareUnicodeCharacter{2022}{\bullet{}}%
   \DeclareUnicodeCharacter{202F}{\thinspace}%
-  \DeclareUnicodeCharacter{2026}{\dots}%
-  \DeclareUnicodeCharacter{2039}{\guilsinglleft}%
-  \DeclareUnicodeCharacter{203A}{\guilsinglright}%
+  \DeclareUnicodeCharacter{2026}{\do

texinfo.tex assorted special characters issue

2016-09-16 Thread Masamichi Hosoda
I've found that compiling the following texi file failed.

```
\input texinfo
@documentencoding UTF-8

@node «
@section a

@bye
```

Here are the error messages from my environment.

```
$ texi2pdf aaa.texi
This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016/Cygwin) 
(preloaded format=pdfetex)
 restricted \write18 enabled.
entering extended mode
(./aaa.texi (/home/trueroad/tex/texinfo.tex
Loading texinfo [version 2016-09-12.20]: pdf, fonts, markup, glyphs,
page headings, tables, conditionals, indexing, sectioning, toc, environments,
defuns, macros, cross references, insertions,
(/usr/share/texmf-site/tex/generic/epsf/epsf.tex
This is `epsf.tex' v2.7.4 <14 February 2011>
) localization, formatting, and turning on texinfo input format.)
./aaa.texi:5: Argument of @guillemetleft has an extra }.

@par

   }
@txiescapepdf ...se @xdef #1{@pdfescapestring {#1}
  }@fi
@setpdfdestname ...{#1}@txiescapepdf @pdfdestname
  }
@pdfmkdest #1->@setpdfdestname {#1}
   @safewhatsit {@pdfdest name{@pdfdestname ...

@setref #1#2->@pdfmkdest {#1}
 @iflinks {@requireauxfile @atdummies @def @valu...
...
l.5 @section a

?
```

In my investigation:
  With texinfo.tex ver. 2016-08-09.22: failed
  With texinfo.tex ver. 2016-08-09.20: succeeded

The difference is the following commit.
http://lists.gnu.org/archive/html/texinfo-commits/2016-08/msg00018.html


Re: PDF destination names for pdfTeX and LuaTeX

2016-08-08 Thread Masamichi Hosoda
>> I wonder if it's possible to put the code for getting the "destination
>> name" in a macro (a \def) so that it can be used when outputting a
>> target (e.g. @node) and when referring to a target (e.g. @xref or the
>> PDF sidebar). That could simplify the code and reduce duplication.
> 
> Sounds good.

I've created two patches.
One is for pdfTeX / LuaTeX.
The other is for XeTeX.

May I commit them?

ChangeLog:

2016-08-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex (\setpdfdestname): New macro for XeTeX.
(\pdfdestname): Escaped PDF destination name
is set by \setpdfdestname.
(\setpdfoutlinetext): New macro for XeTeX.
(\pdfoutlinetext): Converted and escaped outline text
is set by \setpdfoutlinetext.
(\pdfmkdest): Use \setpdfdestname.
(\dopdfoutline): Use \setpdfdestname and \setpdfoutlinetext.
(\xrefX): Use \setpdfdestname.

2016-08-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex (\setpdfdestname): New macro for pdfTeX and LuaTeX.
(\pdfdestname): Escaped PDF destination name
is set by \setpdfdestname.
(\setpdfoutlinetext): New macro for pdfTeX and LuaTeX.
(\pdfoutlinetext): Converted and escaped outline text
is set by \setpdfoutlinetext.
(\pdfmkdest): Use \setpdfdestname.
(\dopdfoutline): Use \setpdfdestname and \setpdfoutlinetext.
(\xrefX): Use \setpdfdestname.
--- texinfo.tex.org	2016-08-09 01:05:21.591393200 +0900
+++ texinfo.tex	2016-08-09 01:52:46.474094400 +0900
@@ -1338,7 +1338,7 @@
   \pdfrefximage \pdflastximage
 \fi}
   %
-  \def\pdfmkdest#1{{%
+  \def\setpdfdestname#1{{%
 % We have to set dummies so commands such as @code, and characters
 % such as \, aren't expanded when present in a section title.
 \indexnofonts
@@ -1362,9 +1362,53 @@
 \fi
 \def\pdfdestname{#1}%
 \txiescapepdf\pdfdestname
-\safewhatsit{\pdfdest name{\pdfdestname} xyz}%
   }}
   %
+  \def\setpdfoutlinetext#1{{%
+\indexnofonts
+\makevalueexpandable
+\turnoffactive
+\ifx \declaredencoding \latone
+  % The PDF format can use an extended form of Latin-1 in bookmark
+  % strings.  See Appendix D of the PDF Reference, Sixth Edition, for
+  % the "PDFDocEncoding".
+  \passthroughcharstrue
+  % Pass through Latin-1 characters.
+  %   LuaTeX: Convert to Unicode
+  %   pdfTeX: Use Latin-1 as PDFDocEncoding
+  \def\pdfoutlinetext{#1}%
+\else
+  \ifx \declaredencoding \utfeight
+\ifx\luatexversion\thisisundefined
+  % For pdfTeX  with UTF-8.
+  % TODO: the PDF format can use UTF-16 in bookmark strings,
+  % but the code for this isn't done yet.
+  % Use ASCII approximations.
+  \passthroughcharsfalse
+  \def\pdfoutlinetext{#1}%
+\else
+  % For LuaTeX with UTF-8.
+  % Pass through Unicode characters for title texts.
+  \passthroughcharstrue
+  \def\pdfoutlinetext{#1}%
+\fi
+  \else
+% For non-Latin-1 or non-UTF-8 encodings.
+% Use ASCII approximations.
+\passthroughcharsfalse
+\def\pdfoutlinetext{#1}%
+  \fi
+\fi
+% LuaTeX: Convert to UTF-16
+% pdfTeX: Use Latin-1 as PDFDocEncoding
+\txiescapepdfutfsixteen\pdfoutlinetext
+  }}
+  %
+  \def\pdfmkdest#1{%
+\setpdfdestname{#1}%
+\safewhatsit{\pdfdest name{\pdfdestname} xyz}%
+  }
+  %
   % used to mark target names; must be expandable.
   \def\pdfmkpgn#1{#1}
   %
@@ -1392,72 +1436,13 @@
 % page number.  We could generate a destination for the section
 % text in the case where a section has no node, but it doesn't
 % seem worth the trouble, since most documents are normally structured.
-{
-  \turnoffactive
-  \ifx \declaredencoding \latone
-% The PDF format can use an extended form of Latin-1 in bookmark
-% strings.  See Appendix D of the PDF Reference, Sixth Edition, for
-% the "PDFDocEncoding".
-\passthroughcharstrue
-% Pass through Latin-1 characters.
-%   LuaTeX: Convert to Unicode
-%   pdfTeX: Use Latin-1 as PDFDocEncoding
-\edef\pdfoutlinetext{#1}%
-\iftxiuseunicodedestname
-  % Pass through Latin-1 characters.
-  % LuaTeX with byte wise I/O converts Latin-1 characters to Unicode.
-  \edef\pdfoutlinedest{#3}%
-\else
-  % Use ASCII approximations in destination names.
-  \passthroughcharsfalse
-  \edef\pdfoutlinedest{#3}% 
-\fi
-  \else
-\ifx \declaredencoding \utfeight
-  \ifx\luatexversion\thisisundefined
-% For pdfTeX  with UTF-8.
-% TODO: the PDF format can use UTF-16 in bookmark strings,
-% but the code for this isn't done yet.
-% Use ASCI

Japanese Texinfo support files

2016-05-05 Thread Masamichi HOSODA
Hi folks,

I've made Japanese Texinfo support files, texinfo-ja.tex and txi-ja.tex.

  texinfo-ja.tex: Japanese texinfo.tex loader
  Some CJK packages need to be loaded before texinfo.tex.

  txi-ja.tex: Japanese translations and font definitions for texinfo.tex.

I'd like Texinfo to contain these two files.

I've also made a sample Japanese texi file, short-sample-ja.texi.
These files require LuaTeX or XeTeX, plus some CJK packages and fonts,
as follows.

For LuaTeX
  Required:
LuaTeX 0.95 (TeX Live 2016 pretest)
  - LuaTeX 0.80 (TeX Live 2015) cannot compile it.
LuaTeX-ja
  http://www.ctan.org/pkg/luatexja
IPAex fonts
  http://www.ctan.org/tex-archive/fonts/ipaex
  Usage:
$ PDFTEX=luatex texi2pdf short-sample-ja.texi

For XeTeX
  Required:
XeTeX
  - Both XeTeX 0.2 (TeX Live 2015) and
0.6 (TeX Live 2016 pretest) can compile it.
zhspacing
  http://www.ctan.org/pkg/zhspacing
IPAex fonts
  http://www.ctan.org/tex-archive/fonts/ipaex
  Usage:
$ PDFTEX=xetex texi2pdf short-sample-ja.texi

Again, I'd like Texinfo to contain these files.
Are any improvements or other changes necessary?

Thanks
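
For illustration, a minimal sketch of what such a Japanese sample source might
look like (this is hypothetical, not the actual short-sample-ja.texi; the
\input line, node name, and text are assumptions):

```
\input texinfo-ja.tex @c -*- coding: utf-8 -*-
@documentencoding UTF-8
@documentlanguage ja

@node Top
@top 日本語のサンプル

これは日本語の Texinfo のサンプルです。

@bye
```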
% texinfo-ja.tex -- Japanese texinfo.tex loader
% Some CJK packages need to be loaded before texinfo.tex.
%
% Copyright 2016 Free Software Foundation, Inc.
%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 3 of the license, or (at
% your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with this program.  If not, see <http://www.gnu.org/licenses/>.
%
% Written by Masamichi Hosoda, 5 May 2016, <truer...@trueroad.jp>

%
% For LuaTeX
%
\ifx\luatexversion\thisisundefined
\else
  % LuaTeX-ja: Typeset Japanese with Lua(La)TeX
  % http://www.ctan.org/tex-archive/macros/luatex/generic/luatexja
  \openin 1 luatexja.sty \ifeof 1
\errmessage{LuaTeX-ja is not found.
It is required for Japanese Texinfo files with LuaTeX.
http://www.ctan.org/tex-archive/macros/luatex/generic/luatexja
It might be contained in the texlive-lang-japanese package}
  \else
\input luatexja.sty
\def\txijapackage{LuaTeX-ja}
  \fi
\fi

%
% For XeTeX
%
\ifx\XeTeXrevision\thisisundefined
\else
  % zhspacing: Spacing for mixed CJK-English documents in XeTeX
  % http://www.ctan.org/tex-archive/macros/xetex/generic/zhspacing
  %
  % This package is originally for Chinese,
  % but can also be used for Japanese.
  %
  \openin 1 zhspacing.sty \ifeof 1
\errmessage{zhspacing is not found.
It is required for Japanese Texinfo files with XeTeX.
http://www.ctan.org/tex-archive/macros/xetex/generic/zhspacing
It might be contained in the texlive-lang-chinese package.
(This package is for Chinese, but can also be used for Japanese)}
  \else
\def\zhfont{dummy} % Cancel the request of SimSun font
\def\zhpunctfont{dummy} % Cancel the request of SimSun font
\input zhspacing.sty
\zhspacing
\def\txijapackage{zhspacing}
  \fi
\fi

%
% For others
%
\ifx\luatexversion\thisisundefined
  \ifx\XeTeXrevision\thisisundefined
\errmessage{The TeX engine is not LuaTeX / XeTeX.
LuaTeX / XeTeX is required for Japanese Texinfo files}
  \fi
\fi

% Original texinfo.tex
\input texinfo.tex
% $Id$
% txi-ja.tex -- Japanese translations and font definitions for texinfo.tex.
%
% Copyright 1999, 2007, 2008, 2016 Free Software Foundation, Inc.
% 
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 3 of the license, or (at
% your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with this program.  If not, see <http://www.gnu.org/licenses/>.
%
% Written by Masamichi Hosoda, 5 May 2016, <truer...@trueroad.jp>

\txisetlanguage{USenglish}{2}{3}

\plainnonfrenchspacing

\gdef\putwordAppendix{付録}
\gdef\putwordChapter{Chapter}
\gdef\putworderror{エラー}
\gdef\putwordfile{ファイル}
\gdef\putwordin{in}
\gdef\putwordIndexIsEmpty{(インデックスが空です)}
\gdef\putwordIndexNonexistent{(インデックスがありません)}
\gdef\putwordInfo{Info}
\gdef\putwordInstanceVariableof{Instance Variable of}
\gdef\putwordMethodon{Method on}
\gdef\putwordNoTitle{無題}
\gdef\putwordof{of}
\gdef\putwordon{on}
\gdef\putwordpage{p.}
\gdef\putwo

Re: XeTeX encoding problem

2016-03-27 Thread Masamichi HOSODA
Thank you for your advice.

> By the way, I don't understand ChangeLog entries like this:
> 
> 2016-03-23  Masamichi Hosoda  <truer...@trueroad.jp>
> * doc/texinfo.tex (\pdfgettoks, \pdfaddtokens, \adn, \poptoks, 
> \maketoks, \makelink, \pdflink, \done): New Macro.
> Add XeTeX PDF table of contents page number link support.
> 
> \pdfgettoks and the rest aren't "new macros" here.  Are they?  I don't
> see new definitions being added.

I'll change them like the following:

  Copy from the definition for pdfTeX and modify for XeTeX
  or
  Copy from the definition for pdfTeX

> I also happened to notice:
>   % In the case of XeTeX, xdvipdfmx converts strings to UTF-16.
>   % Therefore \txiescapepdf is not necessary.
> 
> A string like "\node" (where we don't want \n to be interpreted as a
> newline) or unbalanced parentheses won't cause problems?  -k

I had not noticed the case of "\n" and unbalanced parentheses.
In my experiments, they do cause problems.
I'll add \txiescapepdf to the XeTeX support.

However, XeTeX does not have \pdfescapestring.
So even if \txiescapepdf is added, the problems are not solved.



Re: XeTeX encoding problem

2016-03-23 Thread Masamichi HOSODA
>> I've made a patch that improves XeTeX PDF support.
>> This patch adds PDF table of contents page number link support.
>>
>> Would you commit it?
>> Or, may I commit it?
>>
>> Additionally,
>> I'll add @email, @xref, and \urefurlonlylinktrue link support for XeTeX.
> 
> Please do, if you haven't committed it already.

I've committed PDF table of contents page number link support and
@email and \urefurlonlylinktrue link support for XeTeX.

I'll add @xref link support for XeTeX.
I'm investigating it.

Thank you.



Re: XeTeX encoding problem

2016-03-22 Thread Masamichi HOSODA
I've made a patch that improves XeTeX PDF support.
This patch adds PDF table of contents page number link support.

Would you commit it?
Or, may I commit it?

Additionally,
I'll add @email, @xref, and \urefurlonlylinktrue link support for XeTeX.

ChangeLog:

XeTeX PDF TOC page number link support

2016-03-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex (\pdfgettoks, \pdfaddtokens, \adn, \poptoks, 
\maketoks, \makelink, \pdflink, \done): New Macro.
Add XeTeX PDF table of contents page number link support.
--- texinfo.tex.org	2016-03-22 23:56:13.420162400 +0900
+++ texinfo.tex	2016-03-23 00:06:02.766729700 +0900
@@ -1645,6 +1645,32 @@
 /Subtype /Link /A << /S /URI /URI (#1) >> >>}%
 \endgroup}
   \def\endlink{\setcolor{\maincolor}\special{pdf:eann}}
+  \def\pdfgettoks#1.{\setbox\boxA=\hbox{\toksA={#1.}\toksB={}\maketoks}}
+  \def\addtokens#1#2{\edef\addtoks{\noexpand#1={\the#1#2}}\addtoks}
+  \def\adn#1{\addtokens{\toksC}{#1}\global\countA=1\let\next=\maketoks}
+  \def\poptoks#1#2|ENDTOKS|{\let\first=#1\toksD={#1}\toksA={#2}}
+  \def\maketoks{%
+\expandafter\poptoks\the\toksA|ENDTOKS|\relax
+\ifx\first0\adn0
+\else\ifx\first1\adn1 \else\ifx\first2\adn2 \else\ifx\first3\adn3
+\else\ifx\first4\adn4 \else\ifx\first5\adn5 \else\ifx\first6\adn6
+\else\ifx\first7\adn7 \else\ifx\first8\adn8 \else\ifx\first9\adn9
+\else
+  \ifnum0=\countA\else\makelink\fi
+  \ifx\first.\let\next=\done\else
+\let\next=\maketoks
+\addtokens{\toksB}{\the\toksD}
+\ifx\first,\addtokens{\toksB}{\space}\fi
+  \fi
+\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi
+\next}
+  \def\makelink{\addtokens{\toksB}%
+{\noexpand\pdflink{\the\toksC}}\toksC={}\global\countA=0}
+  \def\pdflink#1{%
+\special{pdf:bann << /Border [0 0 0]
+  /Type /Annot /Subtype /Link /A << /S /GoTo /D (name#1) >> >>}%
+\setcolor{\linkcolor}#1\endlink}
+  \def\done{\edef\st{\global\noexpand\toksA={\the\toksB}}\st}
 %
   %
   % @image support
@@ -5479,7 +5505,14 @@
 % preserve coloured links across page boundaries.  Otherwise the marks
 % would get in the way of \lastbox in \insertindexentrybox.
   \else
-\hskip\skip\thinshrinkable #1%
+\ifx\XeTeXrevision\thisisundefined
+  \hskip\skip\thinshrinkable #1%
+\else
+  \pdfgettoks#1.%
+  \bgroup\let\domark\relax
+\hskip\skip\thinshrinkable\the\toksA
+  \egroup
+\fi
   \fi
 \fi
 \egroup % end \boxA
@@ -5614,7 +5647,11 @@
   \ifpdf
 \pdfgettoks#2.\ \the\toksA % The page number ends the paragraph.
   \else
-#2
+\ifx\XeTeXrevision\thisisundefined
+  #2
+\else
+  \pdfgettoks#2.\ \the\toksA % The page number ends the paragraph.
+\fi
   \fi
   \par
 }}


Re: XeTeX encoding problem

2016-03-22 Thread Masamichi HOSODA
>> but I want to understand what the point of the \edef was in the first
>> place, and what the point was of changing the catcode of backslash,
>> and test whether this is still necessary. Hopefully I'll get to this
>> soon.
> 
> I've committed a new change that should make special Unicode
> characters work again in @copying. Please let me know if there are any
> problems. Thanks.

texinfo.tex ver. 2016-03-22.07 works fine for all LilyPond texi documents.
Thank you.



Re: XeTeX encoding problem

2016-03-21 Thread Masamichi HOSODA
> Thanks for working on this. I'd like to avoid going back to the way it
> was done before if possible because this means that all the
> definitions of the Unicode characters are run through every time a
> macro is used. The following patch seems to give good results:
> 
> Index: doc/texinfo.tex
> ===
> --- doc/texinfo.tex (revision 7047)
> +++ doc/texinfo.tex (working copy)
> @@ -7823,7 +7823,9 @@
>% backslash to get it printed correctly.
>% FIXME: This may not be needed.
>%\catcode`\@=0 \catcode`\\=\active \escapechar=`\@
> +  \passthroughcharstrue
>\edef\temp{\noexpand\scanmacro{#1}}%
> +  \passthroughcharsfalse
>\temp
>\egroup
>  }
> 
> ===
> 
> but I want to understand what the point of the \edef was in the first
> place, and what the point was of changing the catcode of backslash,
> and test whether this is still necessary. Hopefully I'll get to this
> soon.

Your patch works fine in my environment.
Thank you.



Re: XeTeX encoding problem

2016-03-20 Thread Masamichi HOSODA
> I've noticed an issue of texinfo.tex ver. 2016-03-06.18.
> It can not compile the attached texi file.
>
[snip...]
> 
> All the following engines fail.
> LuaTeX 0.89.2
> XeTeX 0.2
> XeTeX 0.5
> pdfTeX 1.40.16
> 
> With texinfo.tex ver. 2016-03-05.11, they work fine.

Here is a patch that fixes the issue.

ChangeLog:

Fix Unicode character in @copying

2016-03-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex: Fix Unicode character in @copying.
(\scanctxt): Add using \setcharscatcodeothernonglobal.
(\nativeunicodecharscatcodeothernonglobal):
Revert to 2015-03-15.
(\setcharscatcodeothernonglobal):
Revert to 2015-03-15.
--- texinfo.tex.org	2016-03-08 22:46:15.850782600 +0900
+++ texinfo.tex	2016-03-20 23:42:05.382585100 +0900
@@ -7896,6 +7896,7 @@
   \catcode`\|=\other
   \catcode`\~=\other
   \passthroughcharstrue
+  \ifx\declaredencoding\ascii \else \setcharscatcodeothernonglobal \fi
 }
 
 \def\scanargctxt{% used for copying and captions, not macros.
@@ -10912,6 +10913,22 @@
   \unicodechardefs
 }
 
+% Native Unicode (XeTeX and LuaTeX) catcode other non global definitions
+\def\nativeunicodecharscatcodeothernonglobal{%
+  \let\DeclareUnicodeCharacter\DeclareUnicodeCharacterNativeOther
+  \unicodechardefs
+}
+
+% Catcode (non-ASCII or native Unicode) are set to \other (non-global
+% assignments).
+\def\setcharscatcodeothernonglobal{%
+  \iftxiusebytewiseio
+\setnonasciicharscatcodenonglobal\other
+  \else
+\nativeunicodecharscatcodeothernonglobal
+  \fi
+}
+
 % US-ASCII character definitions.
 \def\asciichardefs{% nothing need be done
\relax


Re: XeTeX encoding problem

2016-03-14 Thread Masamichi HOSODA
>> I've finally finished this. It does appear a bit faster as well as
>> being simpler. Where it could potentially break is chapter names with
>> non-ASCII characters, and index entries with non-ASCII characters, as
>> well as macros using non-ASCII characters (either in the macro
>> definition or in the arguments), as well as macros giving chapter
>> names and index entries. I'll do a bit more testing later.
>> 
>> Masamichi, does the native multibyte support still work well? Is there
>> any more that you wanted to do with it? If it looks good we should
>> look to releasing it as an official version for non-developers.
> 
> LGTM
> 
> Native Unicode support works fine with some texi source including Japanese.
> Thank you.

I've noticed that the definition of \ifpassthroughchars is duplicated.
Here is the patch to fix it.

ChangeLog:

Remove duplicated definition of \ifpassthroughchars

2016-03-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex (\ifpassthroughchars):
Remove duplicated definition.
--- texinfo.tex.org	2016-03-08 22:46:15.850782600 +0900
+++ texinfo.tex	2016-03-14 22:34:30.708418900 +0900
@@ -10865,9 +10865,6 @@
   \unicodechardefs
 }
 
-\newif\ifpassthroughchars
-\passthroughcharsfalse
-
 % For native Unicode (XeTeX and LuaTeX)
 % Definition macro to replace / pass-through the Unicode character
 %


Re: LuaTeX >= 0.85 support

2016-02-22 Thread Masamichi HOSODA
>>> If LuaTeX breaks compatibility with earlier LuaTeX files, then it
>>> seems acceptable in some sense not to support older versions of
>>> LuaTeX.  If LuaTeX is stable at some point, then I'd have no
>>> problem with removing support for any earlier versions of LuaTeX.
>>> As Karl said, there's no harm if you want to keep on updating to
>>> track LuaTeX's changes.
>> 
>> Thank you for your information.  I'll keep watching LuaTeX's changes.
> 
> I guess it makes sense only to support TeXLive LuaTeX (binary)
> versions.

Thank you for your suggestion.

TeX Live 2015 has LuaTeX 0.80.
TeX Live 2016 will have LuaTeX 0.90 if I understand correctly.

I'll watch TeX Live 2016 and LuaTeX 0.90.



Re: LuaTeX >= 0.85 support

2016-02-21 Thread Masamichi HOSODA
>> If it were up to me, I would simply declare LuaTeX unsupported at least
>> until 1.0.  It seems that tracking Hans's changes from now on will imply
>> a huge investment of time and effort, let alone doing it in a compatible
>> way so that people not running the bleeding edge will keep working as
>> well.  I don't see that the return is worth it.  But of course, if you
>> want to, more power to you!
> 
> If LuaTeX breaks compatibility with earlier LuaTeX files, then it
> seems acceptable in some sense not to support older versions of
> LuaTeX. If LuaTeX is stable at some point, then I'd have no problem
> with removing support for any earlier versions of LuaTeX. As Karl
> said, there's no harm if you want to keep on updating to track
> LuaTeX's changes.

Thank you for your information.
I'll keep watching LuaTeX's changes.



Re: LuaTeX >= 0.85 support

2016-02-17 Thread Masamichi HOSODA
> Thanks very much, installed. Does this give LuaTeX support for the
> features that are supported with pdftex that were being removed or
> renamed from LuaTeX?

If I understand correctly,
the PDF-related primitives of LuaTeX 0.80 and pdfTeX are almost the same.
However, the LuaTeX team changed them significantly after ver. 0.80.
(0.80, 0.81, and 0.85 are all different.)
From ver. 0.85 to 0.89, there are almost no changes.
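
As a rough illustration of what this tracking involves, pdfTeX-style code can
be kept working on LuaTeX >= 0.85 by aliasing the renamed primitives.  This is
only a minimal sketch based on the renamings in the patch below; it covers a
small subset of the primitives:

```
% Minimal sketch, not the full mapping: alias a few renamed PDF primitives
% on LuaTeX >= 0.85 so that pdfTeX-style code keeps working.
\ifx\luatexversion\thisisundefined
\else
  \ifnum\luatexversion>84
    \let\pdfoutput\outputmode
    \def\pdfliteral{\pdfextension literal}
    \def\pdfcatalog{\pdfextension catalog}
    \def\pdfdest{\pdfextension dest}
  \fi
\fi
```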



LuaTeX >= 0.85 support

2016-02-15 Thread Masamichi HOSODA
Hello,

I've made LuaTeX >= 0.85 support patch.

ChangeLog

2016-02-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex: Add LuaTeX >= 0.85 support.
(\txipagewidth): Rename from \pagewidth.
(\txipageheight): Rename from \pageheight.
--- texinfo.tex.org	2016-02-16 00:22:38.761365600 +0900
+++ texinfo.tex	2016-02-16 00:58:09.264846200 +0900
@@ -310,7 +310,7 @@
 % Margin to add to right of even pages, to left of odd pages.
 \newdimen\bindingoffset
 \newdimen\normaloffset
-\newdimen\pagewidth \newdimen\pageheight
+\newdimen\txipagewidth \newdimen\txipageheight
 
 % Main output routine.
 %
@@ -334,7 +334,7 @@
   % Common context changes for both heading and footing.
   % Do this outside of the \shipout so @code etc. will be expanded in
   % the headline as they should be, not taken literally (outputting ''code).
-  \def\commmonheadfootline{\let\hsize=\pagewidth \texinfochars}
+  \def\commmonheadfootline{\let\hsize=\txipagewidth \texinfochars}
   %
   % Retrieve the information for the headings from the marks in the page,
   % and call Plain TeX's \makeheadline and \makefootline, which use the
@@ -433,7 +433,7 @@
 \newinsert\margin \dimen\margin=\maxdimen
 
 % Main part of page, including any footnotes
-\def\pagebody#1{\vbox to\pageheight{\boxmaxdepth=\maxdepth #1}}
+\def\pagebody#1{\vbox to\txipageheight{\boxmaxdepth=\maxdepth #1}}
 {\catcode`\@ =11
 \gdef\pagecontents#1{\ifvoid\topins\else\unvbox\topins\fi
 % marginal hacks, j...@viisa.uucp (Juha Takala)
@@ -724,11 +724,11 @@
   % \dimen0 is the vertical size of the group's box.
   \dimen0 = \ht\groupbox  \advance\dimen0 by \dp\groupbox
   % \dimen2 is how much space is left on the page (more or less).
-  \dimen2 = \pageheight   \advance\dimen2 by -\pagetotal
+  \dimen2 = \txipageheight   \advance\dimen2 by -\pagetotal
   % if the group doesn't fit on the current page, and it's a big big
   % group, force a page break.
   \ifdim \dimen0 > \dimen2
-\ifdim \pagetotal < \vfilllimit\pageheight
+\ifdim \pagetotal < \vfilllimit\txipageheight
   \page
 \fi
   \fi
@@ -1100,6 +1100,35 @@
 \newif\ifpdf
 \newif\ifpdfmakepagedest
 
+%
+% For LuaTeX
+%
+
+\ifx\luatexversion\thisisundefined
+\else
+  \ifnum\luatexversion>84
+% For LuaTeX >= 0.85
+\def\pdfdest{\pdfextension dest}
+\let\pdfoutput\outputmode
+\def\pdfliteral{\pdfextension literal}
+\def\pdfcatalog{\pdfextension catalog}
+\def\pdftexversion{\numexpr\pdffeedback version\relax}
+\let\pdfximage\saveimageresource
+\let\pdfrefximage\useimageresource
+\let\pdflastximage\lastsavedimageresourceindex
+\def\pdfendlink{\pdfextension endlink\relax}
+\def\pdfoutline{\pdfextension outline}
+\def\pdfstartlink{\pdfextension startlink}
+\def\pdffontattr{\pdfextension fontattr}
+\def\pdfobj{\pdfextension obj}
+\def\pdflastobj{\numexpr\pdffeedback lastobj\relax}
+\let\pdfpagewidth\pagewidth
+\let\pdfpageheight\pageheight
+\edef\pdfhorigin{\pdfvariable horigin}
+\edef\pdfvorigin{\pdfvariable vorigin}
+  \fi
+\fi
+
 % when pdftex is run in dvi mode, \pdfoutput is defined (so \pdfoutput=1
 % can be set).  So we test for \relax and 0 as well as being undefined.
 \ifx\pdfoutput\thisisundefined
@@ -3579,7 +3608,7 @@
   %
   % Leave some space for the footline.  Hopefully ok to assume
   % @evenfooting will not be used by itself.
-  \global\advance\pageheight by -12pt
+  \global\advance\txipageheight by -12pt
   \global\advance\vsize by -12pt
 }
 
@@ -5651,7 +5680,7 @@
   \wd0=\hsize \wd2=\hsize
   \vbox{%
 \vskip\doublecolumntopgap
-\hbox to\pagewidth{\box0\hfil\box2}}%
+\hbox to\txipagewidth{\box0\hfil\box2}}%
 }
 
 
@@ -5678,7 +5707,7 @@
   % goal.  When TeX sees \eject from below which follows the final
   % section, it invokes the new output routine that we've set after
   % \balancecolumns below; \onepageout will try to fit the two columns
-  % and the final section into the vbox of \pageheight (see
+  % and the final section into the vbox of \txipageheight (see
   % \pagebody), causing an overfull box.
   %
   % Note that glue won't work here, because glue does not exercise the
@@ -9052,7 +9081,7 @@
   % We want to typeset this text as a normal paragraph, even if the
   % footnote reference occurs in (for example) a display environment.
   % So reset some parameters.
-  \hsize=\pagewidth
+  \hsize=\txipagewidth
   \interlinepenalty\interfootnotelinepenalty
   \splittopskip\ht\strutbox % top baseline for broken footnotes
   \splitmaxdepth\dp\strutbox
@@ -11016,12 +11045,12 @@
   \advance\vsize by \topskip
   \outervsize = \vsize
   \advance\outervsize by 2\topandbottommargin
-  \pageheight = \vsize
+  \txipageheight = \vsize
   %
   \hsize = #2\relax
   \outerhsize = \hsize
   \advance\outerhsize by 0.5in
-  \pagewidth = \hsize
+  \txipagewidth = \hsize
   %
   \normaloffset = #4\relax
   \bindingoffset = #5\relax


Re: XeTeX encoding problem

2016-02-08 Thread Masamichi HOSODA
> I hope it's clear what I'm trying to do here: instead of redefining,
> change the value of a conditional that is used within the macro. I
> thought that something similar might be possible with XeTeX's native
> Unicode.

Thank you for your advice.
I've made a patch.

ChangeLog:

Native Unicode replace switching instead of re-definition

2016-02-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex:
Native Unicode replace switching instead of re-definition.

(\ifpassthroughchars): New switch.
(\DeclareUnicodeCharacterNative):
Integrate \DeclareUnicodeCharacterNativeThru.
Add the capability to switch between replacing and passing through characters.
(\DeclareUnicodeCharacterNativeThru): Remove.
(\nativeunicodechardefsthru): Remove.
(\passthroughcharacters):
Use switch instead of \nativeunicodechardefsthru.
--- texinfo.tex.org	2016-02-08 22:43:03.863678800 +0900
+++ texinfo.tex	2016-02-08 22:51:57.774630900 +0900
@@ -10816,15 +10816,32 @@
   \unicodechardefs
 }
 
+\newif\ifpassthroughchars
+\passthroughcharsfalse
+
 % For native Unicode (XeTeX and LuaTeX)
-% Definition macro to replace the Unicode character
+% Definition macro to replace / pass-through the Unicode character
 %
 \def\DeclareUnicodeCharacterNative#1#2{%
   \catcode"#1=\active
+  \def\dodeclareunicodecharacternative##1##2##3{%
+\begingroup
+  \uccode`\~="##2\relax
+  \uppercase{\gdef~}{%
+\ifpassthroughchars
+  ##1%
+\else
+  ##3%
+\fi
+  }
+\endgroup
+  }
   \begingroup
-\uccode`\~="#1\relax
-\uppercase{\gdef~}{#2}%
-  \endgroup}
+\uccode`\.="#1\relax
+\uppercase{\def\UTFNativeTmp{.}}%
+\expandafter\dodeclareunicodecharacternative\UTFNativeTmp{#1}{#2}%
+  \endgroup
+}
 
 % Native Unicode (XeTeX and LuaTeX) character replacing definitions
 % It makes the setting that replace the Unicode characters.
@@ -10833,27 +10850,6 @@
   \unicodechardefs
 }
 
-% For native Unicode (XeTeX and LuaTeX)
-% Definition macro not to make the Unicode character expand to a non-active 
-% token with the same character code.  Used when writing to auxiliary files.
-%
-\def\DeclareUnicodeCharacterNativeThru#1#2{%
-  \catcode"#1=\active
-  \begingroup
-\uccode`\.="#1\relax
-\uppercase{\endgroup \def\UTFNativeTmp{.}}%
-  \begingroup
-\uccode`\~="#1\relax
-\uppercase{\endgroup \edef~}{\UTFNativeTmp}%
-}
-
-% Native Unicode (XeTeX and LuaTeX) character ``through'' definitions.
-% It makes the setting that does not replace the Unicode characters.
-\def\nativeunicodechardefsthru{%
-  \let\DeclareUnicodeCharacter\DeclareUnicodeCharacterNativeThru
-  \unicodechardefs
-}
-
 % For native Unicode (XeTeX and LuaTeX).  Make the character token expand
 % to the sequences given in \unicodechardefs for printing.
 \def\DeclareUnicodeCharacterNativeAtU#1#2{%
@@ -10941,7 +10937,7 @@
   \iftxiusebytewiseio
 \nonasciistringdefs
   \else
-\nativeunicodechardefsthru
+\passthroughcharstrue
   \fi
 }
 


Re: XeTeX encoding problem

2016-02-07 Thread Masamichi HOSODA
>>  \def\DeclareUnicodeCharacterNative#1#2{%
>>\catcode"#1=\active
>> -  \begingroup
>> -\uccode`\~="#1\relax
>> -\uppercase{\gdef~}{#2}%
>> -  \endgroup}
>> +  \ifnativeunicodereplace
>> +\begingroup
>> +  \uccode`\~="#1\relax
>> +  \uppercase{\gdef~}{#2}%
>> +\endgroup
>> +  \else
>> +\begingroup
>> +  \uccode`\.="#1\relax
>> +  \uppercase{\endgroup \def\UTFNativeTmp{.}}%
>> +\begingroup
>> +  \uccode`\~="#1\relax
>> +  \uppercase{\endgroup \edef~}{\UTFNativeTmp}%
>> +  \fi
>> +}
> 
> I'm not sure if this is correct: shouldn't the conditional be inside a
> single definition, instead of two definitions (starting \gdef~ and
> \edef~) inside the conditional?

Sorry.
It's completely incorrect.
It cannot switch to ``pass-through''.

Even if \gdef is used for ``pass-through'',
it cannot switch to ``replace'' again.

Would you revert it?



Re: XeTeX encoding problem

2016-02-07 Thread Masamichi HOSODA
>> I'm not sure if this is correct: shouldn't the conditional be inside a
>> single definition, instead of two definitions (starting \gdef~ and
>> \edef~) inside the conditional?
> 
> Sorry.
> It's completely incorrect.
> It cannot switch to ``pass-through''.
>
> Even if \gdef is used for ``pass-through'',
> it cannot switch to ``replace'' again.
> 
> Would you revert it?

Thank you for reverting it.
I have no idea how to switch to ``replace'' again.
Sorry.

However, the attached
test-ref-extra-space.texi
generates extra spaces because of extra spaces in the re-definitions.

The attached
texinfo.tex.remove-extra-space2.diff
removes those extra spaces from the re-definitions.
\input texinfo.tex @c -*- coding: utf-8 -*-

@documentencoding UTF-8

@node für
@unnumbered für

für
@xref{für}

@bye
--- texinfo.tex.org	2016-02-07 21:50:24.667799700 +0900
+++ texinfo.tex	2016-02-08 00:45:15.702194600 +0900
@@ -10117,675 +10117,675 @@
 % least make most of the characters not bomb out.
 %
 \def\unicodechardefs{%
-  \DeclareUnicodeCharacter{00A0}{\tie}
-  \DeclareUnicodeCharacter{00A1}{\exclamdown}
+  \DeclareUnicodeCharacter{00A0}{\tie}%
+  \DeclareUnicodeCharacter{00A1}{\exclamdown}%
   \DeclareUnicodeCharacter{00A2}{{\tcfont \char162}}% 0242=cent
-  \DeclareUnicodeCharacter{00A3}{\pounds}
+  \DeclareUnicodeCharacter{00A3}{\pounds}%
   \DeclareUnicodeCharacter{00A4}{{\tcfont \char164}}% 0244=currency
   \DeclareUnicodeCharacter{00A5}{{\tcfont \char165}}% 0245=yen
   \DeclareUnicodeCharacter{00A6}{{\tcfont \char166}}% 0246=brokenbar
-  \DeclareUnicodeCharacter{00A7}{\S}
-  \DeclareUnicodeCharacter{00A8}{\"{ }}
-  \DeclareUnicodeCharacter{00A9}{\copyright}
-  \DeclareUnicodeCharacter{00AA}{\ordf}
-  \DeclareUnicodeCharacter{00AB}{\guillemetleft}
-  \DeclareUnicodeCharacter{00AC}{\ensuremath\lnot}
-  \DeclareUnicodeCharacter{00AD}{\-}
-  \DeclareUnicodeCharacter{00AE}{\registeredsymbol}
-  \DeclareUnicodeCharacter{00AF}{\={ }}
-  %
-  \DeclareUnicodeCharacter{00B0}{\ringaccent{ }}
-  \DeclareUnicodeCharacter{00B1}{\ensuremath\pm}
-  \DeclareUnicodeCharacter{00B2}{$^2$}
-  \DeclareUnicodeCharacter{00B3}{$^3$}
-  \DeclareUnicodeCharacter{00B4}{\'{ }}
-  \DeclareUnicodeCharacter{00B5}{$\mu$}
-  \DeclareUnicodeCharacter{00B6}{\P}
-  \DeclareUnicodeCharacter{00B7}{\ensuremath\cdot}
-  \DeclareUnicodeCharacter{00B8}{\cedilla{ }}
-  \DeclareUnicodeCharacter{00B9}{$^1$}
-  \DeclareUnicodeCharacter{00BA}{\ordm}
-  \DeclareUnicodeCharacter{00BB}{\guillemetright}
-  \DeclareUnicodeCharacter{00BC}{$1\over4$}
-  \DeclareUnicodeCharacter{00BD}{$1\over2$}
-  \DeclareUnicodeCharacter{00BE}{$3\over4$}
-  \DeclareUnicodeCharacter{00BF}{\questiondown}
-  %
-  \DeclareUnicodeCharacter{00C0}{\`A}
-  \DeclareUnicodeCharacter{00C1}{\'A}
-  \DeclareUnicodeCharacter{00C2}{\^A}
-  \DeclareUnicodeCharacter{00C3}{\~A}
-  \DeclareUnicodeCharacter{00C4}{\"A}
-  \DeclareUnicodeCharacter{00C5}{\AA}
-  \DeclareUnicodeCharacter{00C6}{\AE}
-  \DeclareUnicodeCharacter{00C7}{\cedilla{C}}
-  \DeclareUnicodeCharacter{00C8}{\`E}
-  \DeclareUnicodeCharacter{00C9}{\'E}
-  \DeclareUnicodeCharacter{00CA}{\^E}
-  \DeclareUnicodeCharacter{00CB}{\"E}
-  \DeclareUnicodeCharacter{00CC}{\`I}
-  \DeclareUnicodeCharacter{00CD}{\'I}
-  \DeclareUnicodeCharacter{00CE}{\^I}
-  \DeclareUnicodeCharacter{00CF}{\"I}
-  %
-  \DeclareUnicodeCharacter{00D0}{\DH}
-  \DeclareUnicodeCharacter{00D1}{\~N}
-  \DeclareUnicodeCharacter{00D2}{\`O}
-  \DeclareUnicodeCharacter{00D3}{\'O}
-  \DeclareUnicodeCharacter{00D4}{\^O}
-  \DeclareUnicodeCharacter{00D5}{\~O}
-  \DeclareUnicodeCharacter{00D6}{\"O}
-  \DeclareUnicodeCharacter{00D7}{\ensuremath\times}
-  \DeclareUnicodeCharacter{00D8}{\O}
-  \DeclareUnicodeCharacter{00D9}{\`U}
-  \DeclareUnicodeCharacter{00DA}{\'U}
-  \DeclareUnicodeCharacter{00DB}{\^U}
-  \DeclareUnicodeCharacter{00DC}{\"U}
-  \DeclareUnicodeCharacter{00DD}{\'Y}
-  \DeclareUnicodeCharacter{00DE}{\TH}
-  \DeclareUnicodeCharacter{00DF}{\ss}
-  %
-  \DeclareUnicodeCharacter{00E0}{\`a}
-  \DeclareUnicodeCharacter{00E1}{\'a}
-  \DeclareUnicodeCharacter{00E2}{\^a}
-  \DeclareUnicodeCharacter{00E3}{\~a}
-  \DeclareUnicodeCharacter{00E4}{\"a}
-  \DeclareUnicodeCharacter{00E5}{\aa}
-  \DeclareUnicodeCharacter{00E6}{\ae}
-  \DeclareUnicodeCharacter{00E7}{\cedilla{c}}
-  \DeclareUnicodeCharacter{00E8}{\`e}
-  \DeclareUnicodeCharacter{00E9}{\'e}
-  \DeclareUnicodeCharacter{00EA}{\^e}
-  \DeclareUnicodeCharacter{00EB}{\"e}
-  \DeclareUnicodeCharacter{00EC}{\`{\dotless{i}}}
-  \DeclareUnicodeCharacter{00ED}{\'{\dotless{i}}}
-  \DeclareUnicodeCharacter{00EE}{\^{\dotless{i}}}
-  \DeclareUnicodeCharacter{00EF}{\"{\dotless{i}}}
-  %
-  \DeclareUnicodeCharacter{00F0}{\dh}
-  \DeclareUnicodeCharacter{00F1}{\~n}
-  \DeclareUnicodeCharacter{00F2}{\`o}
-  \DeclareUnicodeCharacter{00F3}{\'o}
-  \DeclareUnicodeCharacter{00F4}{\^o}
-  \DeclareUnicodeCharacter{00F5}{\~o}
-  \DeclareUnicodeCharacter{00F6}{\"o}
-  \DeclareUnicodeCharacter{00F7}{\ensuremath\div}
-  \DeclareUnicodeCharacter{00F8}{\o}
-  

Re: XeTeX encoding problem

2016-02-07 Thread Masamichi HOSODA
> I have a different suggestion for fixing this issue: execute
> \unicodechardefs only once in each run, and make the expansion of each
> character use a condition. The value of the condition can be changed
> to control what the characters do without redefining all of the
> characters.
> 
> The same could be done for \nonasciistringdefs. I was thinking of
> making this change before when I was looking at an log of macro
> expansion and was scrolling past many lines that resulted from the
> redefinitions of non-ASCII characters.

Thank you for your suggestion.
I've made the native Unicode replace switching patch.

ChangeLog:

Native Unicode replace switching instead of re-definition

2016-02-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex:
Native Unicode replace switching instead of re-definition.

(\ifnativeunicodereplace): New switch.
(\DeclareUnicodeCharacterNative):
Integrate \DeclareUnicodeCharacterNativeThru.
Add the capability to switch between replacing and passing through characters.
(\DeclareUnicodeCharacterNativeThru): Remove.
(\nativeunicodechardefsthru): Remove.
(\passthroughcharacters):
Use switch instead of \nativeunicodechardefsthru
--- texinfo.tex.org	2016-02-07 21:50:24.667799700 +0900
+++ texinfo.tex	2016-02-07 22:56:21.374334100 +0900
@@ -10798,12 +10798,25 @@
 % For native Unicode (XeTeX and LuaTeX)
 % Definition macro to replace the Unicode character
 %
+\newif\ifnativeunicodereplace
+\nativeunicodereplacetrue
+
 \def\DeclareUnicodeCharacterNative#1#2{%
   \catcode"#1=\active
-  \begingroup
-\uccode`\~="#1\relax
-\uppercase{\gdef~}{#2}%
-  \endgroup}
+  \ifnativeunicodereplace
+\begingroup
+  \uccode`\~="#1\relax
+  \uppercase{\gdef~}{#2}%
+\endgroup
+  \else
+\begingroup
+  \uccode`\.="#1\relax
+  \uppercase{\endgroup \def\UTFNativeTmp{.}}%
+\begingroup
+  \uccode`\~="#1\relax
+  \uppercase{\endgroup \edef~}{\UTFNativeTmp}%
+  \fi
+}
 
 % Native Unicode (XeTeX and LuaTeX) character replacing definitions
 % It makes the setting that replace the Unicode characters.
@@ -10812,27 +10825,6 @@
   \unicodechardefs
 }
 
-% For native Unicode (XeTeX and LuaTeX)
-% Definition macro not to make the Unicode character expand to a non-active 
-% token with the same character code.  Used when writing to auxiliary files.
-%
-\def\DeclareUnicodeCharacterNativeThru#1#2{%
-  \catcode"#1=\active
-  \begingroup
-\uccode`\.="#1\relax
-\uppercase{\endgroup \def\UTFNativeTmp{.}}%
-  \begingroup
-\uccode`\~="#1\relax
-\uppercase{\endgroup \edef~}{\UTFNativeTmp}%
-}
-
-% Native Unicode (XeTeX and LuaTeX) character ``through'' definitions.
-% It makes the setting that does not replace the Unicode characters.
-\def\nativeunicodechardefsthru{%
-  \let\DeclareUnicodeCharacter\DeclareUnicodeCharacterNativeThru
-  \unicodechardefs
-}
-
 % For native Unicode (XeTeX and LuaTeX).  Make the character token expand
 % to the sequences given in \unicodechardefs for printing.
 \def\DeclareUnicodeCharacterNativeAtU#1#2{%
@@ -10920,7 +10912,7 @@
   \iftxiusebytewiseio
 \nonasciistringdefs
   \else
-\nativeunicodechardefsthru
+\nativeunicodereplacefalse
   \fi
 }
 


Improve XeTeX PDF outline support

2016-02-07 Thread Masamichi HOSODA
texinfo.tex ver. 2016-02-07.16 cannot compile the following attached files.

test-U201E.texi
test-set-value.texi

I've fixed it.
Here's the patch for texinfo.tex ver. 2016-02-07.16.

ChangeLog:

Improve XeTeX PDF outline support

2016-02-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex:
Improve XeTeX PDF outline support.
(\pdfmkdest): Add \indexnofonts and \makevalueexpandable,
(\dopdfoutline): Add \turnoffactive,
(\pdfmakeoutlines): Add some comments. Use \let instead of \def.
\input texinfo.tex @c -*- coding: utf-8 -*-

@documentencoding UTF-8

@contents

@node „
@chapter test „: U+201E

„: U+201E DOUBLE LOW-9 QUOTATION MARK

@bye
\input texinfo.tex

@documentencoding UTF-8

@contents

@set foo-bar_ test test test

@node nøùü @code{@value{foo-bar_}}
@chapter øùü @code{@value{foo-bar_}} test

øùü

@bye
--- texinfo.tex.org	2016-02-08 01:22:13.800799100 +0900
+++ texinfo.tex	2016-02-08 01:23:28.667073400 +0900
@@ -1455,26 +1455,38 @@
 \ifx\XeTeXrevision\thisisundefined
 \else
   \pdfmakepagedesttrue \relax
+  % Emulate the primitive of pdfTeX
   \def\pdfdest name#1 xyz{%
 \special{pdf:dest (name#1) [@thispage /XYZ @xpos @ypos]}%
   }
-  \def\pdfmkdest#1{%
-\special{pdf:dest (name#1) [@thispage /XYZ @xpos @ypos]}%
-  }
+  \def\pdfmkdest#1{{%
+% We have to set dummies so commands such as @code, and characters
+% such as \, aren't expanded when present in a section title.
+\indexnofonts
+\makevalueexpandable
+% In the case of XeTeX, xdvipdfmx converts strings to UTF-16.
+% Therefore \txiescapepdf is not necessary.
+\safewhatsit{\pdfdest name{#1} xyz}%
+  }}
   %
   \def\dopdfoutline#1#2#3#4{%
 \edef\pdfoutlinedest{#3}%
 \ifx\pdfoutlinedest\empty
   \def\pdfoutlinedest{#4}%
 \fi
-%
-\edef\pdfoutlinetext{#1}%
-%
+\turnoffactive
+% In the case of XeTeX, xdvipdfmx converts strings to UTF-16.
+% Therefore \txiescapepdf is not necessary.
 \special{pdf:out [-] #2 << /Title (#1) /A << /S /GoTo /D (name\pdfoutlinedest) >> >> }%
   }
   %
   \def\pdfmakeoutlines{%
 \begingroup
+  %
+  % In the case of XeTeX, counting subentries is not necessary.
+  % Therefore, read toc only once.
+  %
+  % We use the node names as the destinations.
   \def\partentry##1##2##3##4{}% ignore parts in the outlines
   \def\numchapentry##1##2##3##4{%
 \dopdfoutline{##1}{1}{##3}{##4}}%
@@ -1485,24 +1497,33 @@
   \def\numsubsubsecentry##1##2##3##4{%
 \dopdfoutline{##1}{4}{##3}{##4}}%
   %
-  \def\appentry{\numchapentry}%
-  \def\appsecentry{\numsecentry}%
-  \def\appsubsecentry{\numsubsecentry}%
-  \def\appsubsubsecentry{\numsubsubsecentry}%
-  \def\unnchapentry{\numchapentry}%
-  \def\unnsecentry{\numsecentry}%
-  \def\unnsubsecentry{\numsubsecentry}%
-  \def\unnsubsubsecentry{\numsubsubsecentry}%
+  \let\appentry\numchapentry%
+  \let\appsecentry\numsecentry%
+  \let\appsubsecentry\numsubsecentry%
+  \let\appsubsubsecentry\numsubsubsecentry%
+  \let\unnchapentry\numchapentry%
+  \let\unnsecentry\numsecentry%
+  \let\unnsubsecentry\numsubsecentry%
+  \let\unnsubsubsecentry\numsubsubsecentry%
+  %
+  % In the case of XeTeX, xdvipdfmx converts strings to UTF-16.
+  % Therefore, the encoding and the language may not be considered.
   %
   \indexnofonts
   \setupdatafile
-  %
+  % We can have normal brace characters in the PDF outlines, unlike
+  % Texinfo index files.  So set that up.
   \def\{{\lbracecharliteral}%
   \def\}{\rbracecharliteral}%
   \catcode`\\=\active \otherbackslash
   \input \tocreadfilename
 \endgroup
   }
+  {\catcode`[=1 \catcode`]=2
+   \catcode`{=\other \catcode`}=\other
+   \gdef\lbracecharliteral[{]%
+   \gdef\rbracecharliteral[}]%
+  ]
 
   \special{pdf:docview << /PageMode /UseOutlines >> }
   \special{pdf:tounicode UTF8-UTF16 }


Re: XeTeX PDF outline support

2016-02-04 Thread Masamichi HOSODA
>> I've made XeTeX PDF outline support patch.
> 
> Excellent, thanks!

My previous XeTeX PDF outline support patch could not compile
LilyPond German texi documents.
I've fixed it.

It can compile the LilyPond texi documents in all languages
by combining the following patches.

http://lists.gnu.org/archive/html/bug-texinfo/2016-02/msg9.html
http://lists.gnu.org/archive/html/bug-texinfo/2016-02/msg00010.html
http://lists.gnu.org/archive/html/bug-texinfo/2016-02/msg00011.html
--- texinfo.tex.org	2016-02-03 22:33:14.500957900 +0900
+++ texinfo.tex	2016-02-04 23:17:51.688395300 +0900
@@ -1450,6 +1450,86 @@
 \fi  % \ifx\pdfoutput
 
 %
+% PDF outline support for XeTeX
+%
+\ifx\XeTeXrevision\thisisundefined
+\else
+  \pdfmakepagedesttrue \relax
+  % Emulate the primitive of pdfTeX
+  \def\pdfdest name#1 xyz{%
+\special{pdf:dest (name#1) [@thispage /XYZ @xpos @ypos]}%
+  }
+  \def\pdfmkdest#1{{%
+% We have to set dummies so commands such as @code, and characters
+% such as \, aren't expanded when present in a section title.
+\indexnofonts
+\turnoffactive
+\makevalueexpandable
+% In the case of XeTeX, xdvipdfmx converts strings to UTF-16.
+% Therefore \txiescapepdf is not necessary.
+\safewhatsit{\pdfdest name{#1} xyz}%
+  }}
+  %
+  \def\dopdfoutlinexetex#1#2#3#4{%
+\edef\pdfoutlinedest{#3}%
+\ifx\pdfoutlinedest\empty
+  \def\pdfoutlinedest{#4}%
+\fi
+% In the case of XeTeX, xdvipdfmx converts strings to UTF-16.
+% Therefore \txiescapepdf is not necessary.
+\special{pdf:out [-] #2 << /Title (#1) /A << /S /GoTo /D (name\pdfoutlinedest) >> >> }%
+  }
+  %
+  \def\pdfmakeoutlines{%
+\begingroup
+  %
+  % In the case of XeTeX, counting subentries is not necessary.
+  % Therefore, read toc only once.
+  %
+  % We use the node names as the destinations.
+  \def\partentry##1##2##3##4{}% ignore parts in the outlines
+  \def\numchapentry##1##2##3##4{%
+\dopdfoutlinexetex{##1}{1}{##3}{##4}}%
+  \def\numsecentry##1##2##3##4{%
+\dopdfoutlinexetex{##1}{2}{##3}{##4}}%
+  \def\numsubsecentry##1##2##3##4{%
+\dopdfoutlinexetex{##1}{3}{##3}{##4}}%
+  \def\numsubsubsecentry##1##2##3##4{%
+\dopdfoutlinexetex{##1}{4}{##3}{##4}}%
+  %
+  \let\appentry\numchapentry%
+  \let\appsecentry\numsecentry%
+  \let\appsubsecentry\numsubsecentry%
+  \let\appsubsubsecentry\numsubsubsecentry%
+  \let\unnchapentry\numchapentry%
+  \let\unnsecentry\numsecentry%
+  \let\unnsubsecentry\numsubsecentry%
+  \let\unnsubsubsecentry\numsubsubsecentry%
+  %
+  % In the case of XeTeX, xdvipdfmx converts strings to UTF-16.
+  % Therefore, the encoding and the language may not be considered.
+  %
+  \indexnofonts
+  \setupdatafile
+  % We can have normal brace characters in the PDF outlines, unlike
+  % Texinfo index files.  So set that up.
+  \def\{{\lbracecharliteral}%
+  \def\}{\rbracecharliteral}%
+  \catcode`\\=\active \otherbackslash
+  \input \tocreadfilename
+\endgroup
+  }
+  {\catcode`[=1 \catcode`]=2
+   \catcode`{=\other \catcode`}=\other
+   \gdef\lbracecharliteral[{]%
+   \gdef\rbracecharliteral[}]%
+  ]
+
+  \special{pdf:docview << /PageMode /UseOutlines >> }
+  \special{pdf:tounicode UTF8-UTF16 }
+\fi
+
+%
 % @image support for XeTeX
 %
 \newif\ifxeteximgpdf
\input texinfo.tex @c -*- coding: utf-8 -*-

@documentencoding UTF-8
@documentlanguage de

@node „
@subsection test U+201E

„: U+201E DOUBLE LOW-9 QUOTATION MARK

@bye


Re: XeTeX @image support

2016-02-03 Thread Masamichi HOSODA
I've improved the XeTeX @image support patch.

ChangeLog:

Add @image support for XeTeX

2016-02-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex (\doxeteximage):
@image support for XeTeX.
(\image): @image support for XeTeX.
--- texinfo.tex.org	2016-02-03 22:01:13.758884100 +0900
+++ texinfo.tex	2016-02-03 22:33:14.500957900 +0900
@@ -1449,6 +1449,56 @@
   \let\pdfmakeoutlines = \relax
 \fi  % \ifx\pdfoutput
 
+%
+% @image support for XeTeX
+%
+\newif\ifxeteximgpdf
+\ifx\XeTeXrevision\thisisundefined
+\else
+  %
+  % #1 is image name, #2 width (might be empty/whitespace), #3 height (ditto).
+  \def\doxeteximage#1#2#3{%
+\def\xeteximagewidth{#2}\setbox0 = \hbox{\ignorespaces #2}%
+\def\xeteximageheight{#3}\setbox2 = \hbox{\ignorespaces #3}%
+%
+% XeTeX (and the PDF format) support .pdf, .png, .jpg (among
+% others).  Let's try in that order, PDF first since if
+% someone has a scalable image, presumably better to use that than a
+% bitmap.
+\let\xeteximgext=\empty
+\xeteximgpdffalse
+\begingroup
+  \openin 1 #1.pdf \ifeof 1
+\openin 1 #1.PDF \ifeof 1
+  \openin 1 #1.png \ifeof 1
+\openin 1 #1.jpg \ifeof 1
+  \openin 1 #1.jpeg \ifeof 1
+\openin 1 #1.JPG \ifeof 1
+  \errmessage{Could not find image file #1 for XeTeX}%
+\else \gdef\xeteximgext{JPG}%
+\fi
+  \else \gdef\xeteximgext{jpeg}%
+  \fi
+\else \gdef\xeteximgext{jpg}%
+\fi
+  \else \gdef\xeteximgext{png}%
+  \fi
+\else \gdef\xeteximgext{PDF} \global\xeteximgpdftrue%
+\fi
+  \else \gdef\xeteximgext{pdf} \global\xeteximgpdftrue%
+  \fi
+  \closein 1
+\endgroup
+%
+\ifxeteximgpdf
+  \XeTeXpdffile "#1".\xeteximgext ""
+\else
+  \XeTeXpicfile "#1".\xeteximgext ""
+\fi
+\ifdim \wd0 >0pt width \xeteximagewidth \fi
+\ifdim \wd2 >0pt height \xeteximageheight \fi \relax
+  }
+\fi
 
 \message{fonts,}
 
@@ -9084,12 +9134,21 @@
   %
   % Output the image.
   \ifpdf
+% For pdfTeX and LuaTeX <= 0.80
 \dopdfimage{#1}{#2}{#3}%
   \else
-% \epsfbox itself resets \epsf?size at each figure.
-\setbox0 = \hbox{\ignorespaces #2}\ifdim\wd0 > 0pt \epsfxsize=#2\relax \fi
-\setbox0 = \hbox{\ignorespaces #3}\ifdim\wd0 > 0pt \epsfysize=#3\relax \fi
-\epsfbox{#1.eps}%
+\ifx\XeTeXrevision\thisisundefined
+  % For epsf.tex
+  % \epsfbox itself resets \epsf?size at each figure.
+  \setbox0 = \hbox{\ignorespaces #2}%
+\ifdim\wd0 > 0pt \epsfxsize=#2\relax \fi
+  \setbox0 = \hbox{\ignorespaces #3}%
+\ifdim\wd0 > 0pt \epsfysize=#3\relax \fi
+  \epsfbox{#1.eps}%
+\else
+  % For XeTeX
+  \doxeteximage{#1}{#2}{#3}%
+\fi
   \fi
   %
   \ifimagevmode


Re: XeTeX encoding problem

2016-02-03 Thread Masamichi HOSODA
This patch fixes the ``reference has extra space in native Unicode'' issue.

ChangeLog:

Remove references extra space for native Unicode

2016-02-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex (\unicodechardefs):
Remove references extra space for native Unicode.
--- texinfo.tex.org	2016-02-03 21:57:42.276462000 +0900
+++ texinfo.tex	2016-02-03 22:01:13.758884100 +0900
@@ -10024,675 +10024,675 @@
 % least make most of the characters not bomb out.
 %
 \def\unicodechardefs{%
-  \DeclareUnicodeCharacter{00A0}{\tie}
-  \DeclareUnicodeCharacter{00A1}{\exclamdown}
+  \DeclareUnicodeCharacter{00A0}{\tie}%
+  \DeclareUnicodeCharacter{00A1}{\exclamdown}%
   \DeclareUnicodeCharacter{00A2}{{\tcfont \char162}}% 0242=cent
-  \DeclareUnicodeCharacter{00A3}{\pounds}
+  \DeclareUnicodeCharacter{00A3}{\pounds}%
   \DeclareUnicodeCharacter{00A4}{{\tcfont \char164}}% 0244=currency
   \DeclareUnicodeCharacter{00A5}{{\tcfont \char165}}% 0245=yen
   \DeclareUnicodeCharacter{00A6}{{\tcfont \char166}}% 0246=brokenbar
-  \DeclareUnicodeCharacter{00A7}{\S}
-  \DeclareUnicodeCharacter{00A8}{\"{ }}
-  \DeclareUnicodeCharacter{00A9}{\copyright}
-  \DeclareUnicodeCharacter{00AA}{\ordf}
-  \DeclareUnicodeCharacter{00AB}{\guillemetleft}
-  \DeclareUnicodeCharacter{00AC}{\ensuremath\lnot}
-  \DeclareUnicodeCharacter{00AD}{\-}
-  \DeclareUnicodeCharacter{00AE}{\registeredsymbol}
-  \DeclareUnicodeCharacter{00AF}{\={ }}
-  %
-  \DeclareUnicodeCharacter{00B0}{\ringaccent{ }}
-  \DeclareUnicodeCharacter{00B1}{\ensuremath\pm}
-  \DeclareUnicodeCharacter{00B2}{$^2$}
-  \DeclareUnicodeCharacter{00B3}{$^3$}
-  \DeclareUnicodeCharacter{00B4}{\'{ }}
-  \DeclareUnicodeCharacter{00B5}{$\mu$}
-  \DeclareUnicodeCharacter{00B6}{\P}
-  \DeclareUnicodeCharacter{00B7}{\ensuremath\cdot}
-  \DeclareUnicodeCharacter{00B8}{\cedilla{ }}
-  \DeclareUnicodeCharacter{00B9}{$^1$}
-  \DeclareUnicodeCharacter{00BA}{\ordm}
-  \DeclareUnicodeCharacter{00BB}{\guillemetright}
-  \DeclareUnicodeCharacter{00BC}{$1\over4$}
-  \DeclareUnicodeCharacter{00BD}{$1\over2$}
-  \DeclareUnicodeCharacter{00BE}{$3\over4$}
-  \DeclareUnicodeCharacter{00BF}{\questiondown}
-  %
-  \DeclareUnicodeCharacter{00C0}{\`A}
-  \DeclareUnicodeCharacter{00C1}{\'A}
-  \DeclareUnicodeCharacter{00C2}{\^A}
-  \DeclareUnicodeCharacter{00C3}{\~A}
-  \DeclareUnicodeCharacter{00C4}{\"A}
-  \DeclareUnicodeCharacter{00C5}{\AA}
-  \DeclareUnicodeCharacter{00C6}{\AE}
-  \DeclareUnicodeCharacter{00C7}{\cedilla{C}}
-  \DeclareUnicodeCharacter{00C8}{\`E}
-  \DeclareUnicodeCharacter{00C9}{\'E}
-  \DeclareUnicodeCharacter{00CA}{\^E}
-  \DeclareUnicodeCharacter{00CB}{\"E}
-  \DeclareUnicodeCharacter{00CC}{\`I}
-  \DeclareUnicodeCharacter{00CD}{\'I}
-  \DeclareUnicodeCharacter{00CE}{\^I}
-  \DeclareUnicodeCharacter{00CF}{\"I}
-  %
-  \DeclareUnicodeCharacter{00D0}{\DH}
-  \DeclareUnicodeCharacter{00D1}{\~N}
-  \DeclareUnicodeCharacter{00D2}{\`O}
-  \DeclareUnicodeCharacter{00D3}{\'O}
-  \DeclareUnicodeCharacter{00D4}{\^O}
-  \DeclareUnicodeCharacter{00D5}{\~O}
-  \DeclareUnicodeCharacter{00D6}{\"O}
-  \DeclareUnicodeCharacter{00D7}{\ensuremath\times}
-  \DeclareUnicodeCharacter{00D8}{\O}
-  \DeclareUnicodeCharacter{00D9}{\`U}
-  \DeclareUnicodeCharacter{00DA}{\'U}
-  \DeclareUnicodeCharacter{00DB}{\^U}
-  \DeclareUnicodeCharacter{00DC}{\"U}
-  \DeclareUnicodeCharacter{00DD}{\'Y}
-  \DeclareUnicodeCharacter{00DE}{\TH}
-  \DeclareUnicodeCharacter{00DF}{\ss}
-  %
-  \DeclareUnicodeCharacter{00E0}{\`a}
-  \DeclareUnicodeCharacter{00E1}{\'a}
-  \DeclareUnicodeCharacter{00E2}{\^a}
-  \DeclareUnicodeCharacter{00E3}{\~a}
-  \DeclareUnicodeCharacter{00E4}{\"a}
-  \DeclareUnicodeCharacter{00E5}{\aa}
-  \DeclareUnicodeCharacter{00E6}{\ae}
-  \DeclareUnicodeCharacter{00E7}{\cedilla{c}}
-  \DeclareUnicodeCharacter{00E8}{\`e}
-  \DeclareUnicodeCharacter{00E9}{\'e}
-  \DeclareUnicodeCharacter{00EA}{\^e}
-  \DeclareUnicodeCharacter{00EB}{\"e}
-  \DeclareUnicodeCharacter{00EC}{\`{\dotless{i}}}
-  \DeclareUnicodeCharacter{00ED}{\'{\dotless{i}}}
-  \DeclareUnicodeCharacter{00EE}{\^{\dotless{i}}}
-  \DeclareUnicodeCharacter{00EF}{\"{\dotless{i}}}
-  %
-  \DeclareUnicodeCharacter{00F0}{\dh}
-  \DeclareUnicodeCharacter{00F1}{\~n}
-  \DeclareUnicodeCharacter{00F2}{\`o}
-  \DeclareUnicodeCharacter{00F3}{\'o}
-  \DeclareUnicodeCharacter{00F4}{\^o}
-  \DeclareUnicodeCharacter{00F5}{\~o}
-  \DeclareUnicodeCharacter{00F6}{\"o}
-  \DeclareUnicodeCharacter{00F7}{\ensuremath\div}
-  \DeclareUnicodeCharacter{00F8}{\o}
-  \DeclareUnicodeCharacter{00F9}{\`u}
-  \DeclareUnicodeCharacter{00FA}{\'u}
-  \DeclareUnicodeCharacter{00FB}{\^u}
-  \DeclareUnicodeCharacter{00FC}{\"u}
-  \DeclareUnicodeCharacter{00FD}{\'y}
-  \DeclareUnicodeCharacter{00FE}{\th}
-  \DeclareUnicodeCharacter{00FF}{\"y}
-  %
-  \DeclareUnicodeCharacter{0100}{\=A}
-  \DeclareUnicodeCharacter{0101}{\=a}
-  \DeclareUnicodeCharacter{0102}{\u{A}}
-  \DeclareUnico

XeTeX PDF outline support

2016-02-03 Thread Masamichi HOSODA
I've made XeTeX PDF outline support patch.

ChangeLog:

Add PDF outline support for XeTeX

2016-02-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex:
Add PDF outline support for XeTeX.
(\pdfdest): set destination.
(\pdfmkdest): set destination.
(\dopdfoutline): make outline element.
(\pdfmakeoutlines): make PDF outline.
--- texinfo.tex.org	2016-02-03 22:33:14.500957900 +0900
+++ texinfo.tex	2016-02-03 22:45:51.999474500 +0900
@@ -1450,6 +1450,65 @@
 \fi  % \ifx\pdfoutput
 
 %
+% PDF outline support for XeTeX
+%
+\ifx\XeTeXrevision\thisisundefined
+\else
+  \pdfmakepagedesttrue \relax
+  \def\pdfdest name#1 xyz{%
+\special{pdf:dest (name#1) [@thispage /XYZ @xpos @ypos]}%
+  }
+  \def\pdfmkdest#1{%
+\special{pdf:dest (name#1) [@thispage /XYZ @xpos @ypos]}%
+  }
+  %
+  \def\dopdfoutline#1#2#3#4{%
+\edef\pdfoutlinedest{#3}%
+\ifx\pdfoutlinedest\empty
+  \def\pdfoutlinedest{#4}%
+\fi
+%
+\edef\pdfoutlinetext{#1}%
+%
+\special{pdf:out [-] #2 << /Title (#1) /A << /S /GoTo /D (name\pdfoutlinedest) >> >> }%
+  }
+  %
+  \def\pdfmakeoutlines{%
+\begingroup
+  \def\partentry##1##2##3##4{}% ignore parts in the outlines
+  \def\numchapentry##1##2##3##4{%
+\dopdfoutline{##1}{1}{##3}{##4}}%
+  \def\numsecentry##1##2##3##4{%
+\dopdfoutline{##1}{2}{##3}{##4}}%
+  \def\numsubsecentry##1##2##3##4{%
+\dopdfoutline{##1}{3}{##3}{##4}}%
+  \def\numsubsubsecentry##1##2##3##4{%
+\dopdfoutline{##1}{4}{##3}{##4}}%
+  %
+  \def\appentry{\numchapentry}%
+  \def\appsecentry{\numsecentry}%
+  \def\appsubsecentry{\numsubsecentry}%
+  \def\appsubsubsecentry{\numsubsubsecentry}%
+  \def\unnchapentry{\numchapentry}%
+  \def\unnsecentry{\numsecentry}%
+  \def\unnsubsecentry{\numsubsecentry}%
+  \def\unnsubsubsecentry{\numsubsubsecentry}%
+  %
+  \indexnofonts
+  \setupdatafile
+  %
+  \def\{{\lbracecharliteral}%
+  \def\}{\rbracecharliteral}%
+  \catcode`\\=\active \otherbackslash
+  \input \tocreadfilename
+\endgroup
+  }
+
+  \special{pdf:docview << /PageMode /UseOutlines >> }
+  \special{pdf:tounicode UTF8-UTF16 }
+\fi
+
+%
 % @image support for XeTeX
 %
 \newif\ifxeteximgpdf
\input texinfo.tex @c -*- coding: utf-8 -*-

@documentencoding UTF-8

@contents

@node node-fur
@chapter für

für

@node node-hello1
@section hello1

@node node-world1
@subsection world1

@node node-world2
@subsubsection world2

† ‡ § ¶

@node node-hello2
@section hello2

foobar

@node node-chapter2
@chapter chapter2

@node node-chapter2section
@section chapter2section

barbaz

@bye


Re: texinfo-6.0.93 pretest

2016-02-03 Thread Masamichi HOSODA
>>> I would like to have Masamichi-san's Unicode support stuff in 6.1...
>> 
>> Unfortunately, My native Unicode patch can not compile the attached
>> file.  I'm investigating, but it is still unexplained.
> 
> Well, I could imagine to tag your stuff as experimental so that more
> people try it, hopefully sending bug reports if something fails.

I've fixed it.

By combining the following four patches,
I've succeeded in compiling all of the English texi documents of LilyPond.

http://lists.gnu.org/archive/html/bug-texinfo/2016-02/msg9.html
http://lists.gnu.org/archive/html/bug-texinfo/2016-02/msg00010.html
http://lists.gnu.org/archive/html/bug-texinfo/2016-02/msg00011.html
http://lists.gnu.org/archive/html/bug-texinfo/2016-02/msg00012.html



Re: texinfo-6.0.93 pretest

2016-02-02 Thread Masamichi HOSODA
>> I've found and removed one error where extra space could occur in
>> the text of a cross-reference (and possibly elsewhere as well) when
>> processing with TeX.
> 
> I would like to have Masamichi-san's Unicode support stuff in 6.1...

Unfortunately, my native Unicode patch cannot compile the attached file.
I'm still investigating, but the cause is so far unexplained.
\input texinfo.tex @c -*- coding: utf-8 -*-

@documentencoding UTF-8

@macro testmacro
ı
@end macro

@testmacro

@bye


Re: XeTeX encoding problem

2016-01-31 Thread Masamichi HOSODA
>>> I noticed page breaking issue in my patch.
>>> I've fixed it.
> 
> Please provide a sample to reproduce the issue.

I've attached it.

>> The empty lines in \utfeightchardefs? I'll commit that separately.
> 
> If the empty lines are really the cause, I agree that it deserves a
> separate commit since it doesn't seem to be related to the encoding
> problem.

The issue occurs with native Unicode only.

If native Unicode is enabled,
\nativeunicodechardefsthru may be used at a page break.
It contains \unicodechardefs (renamed from \utfeightchardefs in my patch).
The extra empty lines there cause an infinite loop.

If @documentencoding is US-ASCII or ISO-8859-1, the issue does not occur.
In that case, \nativeunicodechardefsthru is not used;
\nonasciistringdefs is used instead, and it has no extra empty lines.
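
To make this concrete, here is a minimal sketch with just two of the
entries (the real \unicodechardefs is of course far longer):

```
% Problem: a blank line between the entries ends up in the macro body
% and, as described above, leads to an infinite loop when the
% definitions are used at a page break.
\def\unicodechardefs{%
  \DeclareUnicodeCharacter{00A0}{\tie}

  \DeclareUnicodeCharacter{00A1}{\exclamdown}
}

% After the fix: no empty lines; the trailing "%" signs additionally
% keep stray spaces out of cross-references.
\def\unicodechardefs{%
  \DeclareUnicodeCharacter{00A0}{\tie}%
  \DeclareUnicodeCharacter{00A1}{\exclamdown}%
}
```
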
\input texinfo.tex @c -*- coding: utf-8 -*-

@documentencoding UTF-8

@example
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
a
@end example

@bye


Re: XeTeX encoding problem

2016-01-31 Thread Masamichi HOSODA
>> Have you ever got the CJK characters to work in a Texinfo file with
>> XeTeX or LuaTeX? If so, maybe we should conditionally load the fonts
>> that you got to work. Can you satisfactorily typeset Japanese text
>> with XeTeX without the use of LaTeX packages? If not, it very likely
>> won't be practical to implement special rules for typesetting Japanese
>> in Texinfo itself.
> 
> Not yet. I want to try it.
> I'm going to use LuaTeX-ja.
> 
> https://osdn.jp/projects/luatex-ja/wiki/FrontPage%28en%29
> 
> It can set Japanese fonts separately from alphabetic font settings.
> It also has special rules for typesetting Japanese.
> It does not require LaTeX.
> 
> On the other hand, in XeTeX, it is difficult.
> However, Japanese characters and fonts can be used by plain XeTeX at least.

I'm trying Japanese Texinfo with XeTeX/LuaTeX native Unicode.
https://github.com/trueroad/texinfo-sample-jp

In my experiments, in both XeTeX and LuaTeX,
the Japanese font and the alphabetic font can be set separately.
Even in XeTeX, the typesetting result for Japanese text is acceptable,
in my humble opinion.



Re: XeTeX encoding problem

2016-01-28 Thread Masamichi HOSODA
I noticed a page-breaking issue in my patch.
I've fixed it.
--- texinfo.tex.org	2016-01-21 23:04:22.405562200 +0900
+++ texinfo.tex	2016-01-28 22:23:50.283561700 +0900
@@ -9433,43 +9433,68 @@
   \global\righthyphenmin = #3\relax
 }
 
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX, 
-% otherwise the encoding support is completely broken.
+% XeTeX and LuaTeX can handle native Unicode.
+% Their default I/O is UTF-8 sequence instead of byte-wise.
+% Other TeX engine (pdfTeX etc.) I/O is byte-wise.
+%
+\newif\iftxinativeunicodecapable
+\newif\iftxiusebytewiseio
+
 \ifx\XeTeXrevision\thisisundefined
+  \ifx\luatexversion\thisisundefined
+\txinativeunicodecapablefalse
+\txiusebytewiseiotrue
+  \else
+\txinativeunicodecapabletrue
+\txiusebytewiseiofalse
+  \fi
 \else
-\XeTeXdefaultencoding "bytes"  % For subsequent files to be read
-\XeTeXinputencoding "bytes"  % Effective in texinfo.tex only
-% Unfortunately, there seems to be no corresponding XeTeX command for
-% output encoding.  This is a problem for auxiliary index and TOC files.
-% The only solution would be perhaps to write out @U{...} sequences in
-% place of UTF-8 characters.
+  \txinativeunicodecapabletrue
+  \txiusebytewiseiofalse
 \fi
 
-\ifx\luatexversion\thisisundefined
-\else
-\directlua{
-local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
-local function convert_char (char)
-  return utf8_char(byte(char))
-end
-
-local function convert_line (line)
-  return gsub(line, ".", convert_char)
-end
-
-callback.register("process_input_buffer", convert_line)
-
-local function convert_line_out (line)
-  local line_out = ""
-  for c in string.utfvalues(line) do
- line_out = line_out .. string.char(c)
-  end
-  return line_out
-end
+% Set I/O by bytes instead of UTF-8 sequence for XeTeX and LuaTex
+% for non-UTF-8 (byte-wise) encodings.
+%
+\def\setbytewiseio{%
+  \ifx\XeTeXrevision\thisisundefined
+  \else
+\XeTeXdefaultencoding "bytes"  % For subsequent files to be read
+\XeTeXinputencoding "bytes"  % For document root file
+% Unfortunately, there seems to be no corresponding XeTeX command for
+% output encoding.  This is a problem for auxiliary index and TOC files.
+% The only solution would be perhaps to write out @U{...} sequences in
+% place of non-ASCII characters.
+  \fi
 
-callback.register("process_output_buffer", convert_line_out)
+  \ifx\luatexversion\thisisundefined
+  \else
+\directlua{
+local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
+local function convert_char (char)
+  return utf8_char(byte(char))
+end
+
+local function convert_line (line)
+  return gsub(line, ".", convert_char)
+end
+
+callback.register("process_input_buffer", convert_line)
+
+local function convert_line_out (line)
+  local line_out = ""
+  for c in string.utfvalues(line) do
+ line_out = line_out .. string.char(c)
+  end
+  return line_out
+end
+
+callback.register("process_output_buffer", convert_line_out)
+}
+  \fi
+
+  \txiusebytewiseiotrue
 }
-\fi
 
 
 % Helpers for encodings.
@@ -9496,13 +9521,6 @@
 %
 \def\documentencoding{\parseargusing\filenamecatcodes\documentencodingzzz}
 \def\documentencodingzzz#1{%
-  % Get input by bytes instead of by UTF-8 codepoints for XeTeX,
-  % otherwise the encoding support is completely broken.
-  % This settings is for the document root file.
-  \ifx\XeTeXrevision\thisisundefined
-  \else
-\XeTeXinputencoding "bytes"
-  \fi
   %
   % Encoding being declared for the document.
   \def\declaredencoding{\csname #1.enc\endcsname}%
@@ -9519,22 +9537,37 @@
  \asciichardefs
   %
   \else \ifx \declaredencoding \lattwo
+ \iftxinativeunicodecapable
+   \setbytewiseio
+ \fi
  \setnonasciicharscatcode\active
  \lattwochardefs
   %
   \else \ifx \declaredencoding \latone
+ \iftxinativeunicodecapable
+   \setbytewiseio
+ \fi
  \setnonasciicharscatcode\active
  \latonechardefs
   %
   \else \ifx \declaredencoding \latnine
+ \iftxinativeunicodecapable
+   \setbytewiseio
+ \fi
  \setnonasciicharscatcode\active
  \latninechardefs
   %
   \else \ifx \declaredencoding \utfeight
- \setnonasciicharscatcode\active
- % since we already invoked \utfeightchardefs at the top level
- % (below), do not re-invoke it, then our check for duplicated
- % definitions triggers.  Making non-ascii chars active is enough.
+ \iftxinativeunicodecapable
+   % For native Unicode (XeTeX and LuaTeX)
+   \nativeunicodechardefs
+ \else
+   % For UTF-8 byte sequence (pdfTeX)
+   \setnonasciicharscatcode\active
+   % since we already invoked \utfeightchardefs at the top level
+   % (below), do not re-invoke it, then our check for duplicated
+   % definitions triggers.  Making non-ascii chars active is enough.
+ \fi
   %
   \else
 \message{Ignoring unknown document 

Re: XeTeX encoding problem

2016-01-23 Thread Masamichi HOSODA
>> In XeTeX and LuaTeX, is "@documentencoding ISO-8859-1" support required?
>> If so, I'll improve the patch.
>> It will use byte-wise input when "@documentencoding ISO-8859-1" is used.
>>
>> However, if you want ISO-8859-1,
>> you can use pdfTeX instead of XeTeX/LuaTex or you can convert to UTF-8,
>> in my humble opinion.
> 
> It would be inconvenient to remember to use pdfTeX whenever you had to
> process a Texinfo document in ISO-8859-1. We should process
> byte-by-byte for an encoding like that, using the existing code in
> texinfo.tex to do so. It isn't perfect, as you say: for example, it
> looks like we couldn't include another Texinfo file the filename of
> which was in a single-byte encoding, but that's better than breaking
> it altogether.

Thank you for your comments.
I've improved the patch so that ISO-8859-1 can be used with XeTeX/LuaTeX.
The ChangeLog is below.

>> I want Unicode which contains CJK characters. Not only ISO-8859-1.
>> In byte-wise input, CJK characters can not be used.
> 
> Have you ever got the CJK characters to work in a Texinfo file with
> XeTeX or LuaTeX? If so, maybe we should conditionally load the fonts
> that you got to work. Can you satisfactorily typeset Japanese text
> with XeTeX without the use of LaTeX packages? If not, it very likely
> won't be practical to implement special rules for typesetting Japanese
> in Texinfo itself.

Not yet. I want to try it.
I'm going to use LuaTeX-ja.

https://osdn.jp/projects/luatex-ja/wiki/FrontPage%28en%29

It can set Japanese fonts separately from alphabetic font settings.
It also has special rules for typesetting Japanese.
It does not require LaTeX.

On the other hand, with XeTeX it is more difficult.
However, Japanese characters and fonts can at least be used with plain XeTeX.

>>> I don't see the problem with Unicode filenames: files are named with a
>>> series of bytes; does this mean that XeTeX (or LuaTeX?) has problems
>>> accessing files with names which aren't in UTF-8?
>>>
> 
>>
>> In native Unicode, word sequence 0x0066 0x00FC 0x0072
>> is converted to UTF-8 byte sequence 0x66 0xC3 0xBC 0x72.
>> It means "Für", then filename "Für" can be handled.
>>
>> In byte-wise input, word sequence 0x0066 0x00C3 0x00BC 0x0072
>> is converted to byte sequence 0x66 0xC3 0x83 0xC2 0xBC 0x72.
>> It does not mean "Für", then filename "Für" can not be handled.
> 
> Thank you for the thorough explanation; it appears that the native
> support for reading files by UTF-8 sequence (instead of by byte) needs
> to be used for opening files with non-ASCII filenames.

Exactly.


ChangeLog:

Add native Unicode support for XeTeX and LuaTeX

2016-01-XX  Masamichi Hosoda  <truer...@trueroad.jp>

* doc/texinfo.tex:
Add native Unicode support for XeTeX and LuaTeX.

(\iftxinativeunicodecapable): New switch.
(\iftxiusebytewiseio): New switch.

(\setbytewiseio): Set I/O by bytes instead of UTF-8 sequence
for XeTeX and LuaTeX non-UTF-8 (byte-wise) encodings.

(\documentencoding): Remove input by bytes settings for XeTeX.
Add I/O by bytes settings for single-byte encodings.
Add native Unicode settings for UTF-8 encoding.

(\U): Any Unicode character can be used with native Unicode.

(\DeclareUnicodeCharacterUTFviii): Rename from
\DeclareUnicodeCharacter.
(\DeclareUnicodeCharacterNative): For native Unicode,
Definition macro to replace the Unicode character.
(\DeclareUnicodeCharacterNativeThru): For native Unicode,
Definition macro not to replace (through) the Unicode character.
(\DeclareUnicodeCharacterNativeAtU): For native Unicode,
Definition macro that is used by @U command.

(\unicodechardefs): Rename from \utfeightchardefs.
(\utfeightchardefs): UTF-8 byte sequence definitions (replacement and
@U command). It makes the settings that replace UTF-8 byte sequences.
(\nativeunicodechardefs): Native Unicode character replacement
definitions. It makes the settings that replace the Unicode characters.
(\nativeunicodechardefsthru): Native Unicode character ``through''
definitions. It makes the settings that do not replace
the Unicode characters.
(\nativeunicodechardefsatu): Native Unicode @U command definitions.

(\throughcharactersdefs): Character ``through'' definitions.
It makes the settings that do not replace the characters.


--- texinfo.tex.org	2016-01-21 23:04:22.405562200 +0900
+++ texinfo.tex	2016-01-24 02:20:37.523179700 +0900
@@ -9433,43 +9433,68 @@
   \global\righthyphenmin = #3\relax
 }
 
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX, 
-% otherwise th

Re: XeTeX encoding problem

2016-01-22 Thread Masamichi HOSODA
> I think it misses some percent signs, e.g.
> 
>   \def\utfeightchardefs{%  <- here
> \let\DeclareUnicodeCharacter\DeclareUnicodeCharacterUTFviii
> \unicodechardefs
>   }
> 
> Maybe they aren't necessary, but I would add them for consistency.

Thank you for your advice.
Here is the fixed patch.
--- texinfo.tex.org	2016-01-21 23:04:22.405562200 +0900
+++ texinfo.tex	2016-01-22 22:07:54.739606200 +0900
@@ -9433,42 +9433,18 @@
   \global\righthyphenmin = #3\relax
 }
 
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX, 
-% otherwise the encoding support is completely broken.
-\ifx\XeTeXrevision\thisisundefined
-\else
-\XeTeXdefaultencoding "bytes"  % For subsequent files to be read
-\XeTeXinputencoding "bytes"  % Effective in texinfo.tex only
-% Unfortunately, there seems to be no corresponding XeTeX command for
-% output encoding.  This is a problem for auxiliary index and TOC files.
-% The only solution would be perhaps to write out @U{...} sequences in
-% place of UTF-8 characters.
-\fi
+% XeTeX and LuaTeX can handle native Unicode.
+%
+\newif\iftxinativeunicodecapable
 
-\ifx\luatexversion\thisisundefined
+\ifx\XeTeXrevision\thisisundefined
+  \ifx\luatexversion\thisisundefined
+\txinativeunicodecapablefalse
+  \else
+\txinativeunicodecapabletrue
+  \fi
 \else
-\directlua{
-local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
-local function convert_char (char)
-  return utf8_char(byte(char))
-end
-
-local function convert_line (line)
-  return gsub(line, ".", convert_char)
-end
-
-callback.register("process_input_buffer", convert_line)
-
-local function convert_line_out (line)
-  local line_out = ""
-  for c in string.utfvalues(line) do
- line_out = line_out .. string.char(c)
-  end
-  return line_out
-end
-
-callback.register("process_output_buffer", convert_line_out)
-}
+  \txinativeunicodecapabletrue
 \fi
 
 
@@ -9496,13 +9472,6 @@
 %
 \def\documentencoding{\parseargusing\filenamecatcodes\documentencodingzzz}
 \def\documentencodingzzz#1{%
-  % Get input by bytes instead of by UTF-8 codepoints for XeTeX,
-  % otherwise the encoding support is completely broken.
-  % This settings is for the document root file.
-  \ifx\XeTeXrevision\thisisundefined
-  \else
-\XeTeXinputencoding "bytes"
-  \fi
   %
   % Encoding being declared for the document.
   \def\declaredencoding{\csname #1.enc\endcsname}%
@@ -9531,10 +9500,16 @@
  \latninechardefs
   %
   \else \ifx \declaredencoding \utfeight
- \setnonasciicharscatcode\active
- % since we already invoked \utfeightchardefs at the top level
- % (below), do not re-invoke it, then our check for duplicated
- % definitions triggers.  Making non-ascii chars active is enough.
+ \iftxinativeunicodecapable
+   % For native Unicode (XeTeX and LuaTeX)
+   \nativeunicodechardefs
+ \else
+   % For UTF-8 byte sequence (pdfTeX)
+   \setnonasciicharscatcode\active
+   % since we already invoked \utfeightchardefs at the top level
+   % (below), do not re-invoke it, then our check for duplicated
+   % definitions triggers.  Making non-ascii chars active is enough.
+ \fi
   %
   \else
 \message{Ignoring unknown document encoding: #1.}%
@@ -9849,13 +9824,26 @@
 % @U{} to produce U+, if we support it.
 \def\U#1{%
   \expandafter\ifx\csname uni:#1\endcsname \relax
-\errhelp = \EMsimple	
-\errmessage{Unicode character U+#1 not supported, sorry}%
+\iftxinativeunicodecapable
+  % Any Unicode characters can be used by native Unicode.
+  % However, if the font does not have the glyph, the letter will miss.
+  \begingroup
+\uccode`\.="#1\relax
+\uppercase{.}
+  \endgroup
+\else
+  \errhelp = \EMsimple	
+  \errmessage{Unicode character U+#1 not supported, sorry}%
+\fi
   \else
 \csname uni:#1\endcsname
   \fi
 }
 
+% For UTF-8 byte sequence (pdfTeX)
+% Definition macro to replace the Unicode character
+% Definition macro that is used by @U command
+%
 \begingroup
   \catcode`\"=12
   \catcode`\<=12
@@ -9864,7 +9852,7 @@
   \catcode`\;=12
   \catcode`\!=12
   \catcode`\~=13
-  \gdef\DeclareUnicodeCharacter#1#2{%
+  \gdef\DeclareUnicodeCharacterUTFviii#1#2{%
 \countUTFz = "#1\relax
 %\wlog{\space\space defining Unicode char U+#1 (decimal \the\countUTFz)}%
 \begingroup
@@ -9922,6 +9910,37 @@
 \uppercase{\gdef\UTFviiiTmp{#2#3#4}}}
 \endgroup
 
+% For native Unicode (XeTeX and LuaTeX)
+% Definition macro to replace the Unicode character
+%
+\def\DeclareUnicodeCharacterNative#1#2{%
+  \catcode"#1=\active
+  \begingroup
+\uccode`\~="#1\relax
+\uppercase{\gdef~}{#2}%
+  \endgroup}
+
+% For native Unicode (XeTeX and LuaTeX)
+% Definition macro not to replace (through) the Unicode character
+%
+\def\DeclareUnicodeCharacterNativeThru#1#2{%
+  \catcode"#1=\active
+  \begingroup
+\uccode`\.="#1\relax
+\uppercase{\endgroup \def\UTFNativeTmp{.}}%
+  \begingroup
+

Re: XeTeX encoding problem

2016-01-22 Thread Masamichi HOSODA
>> Thank you for your comments.
>> I've updated the patch.
>>
>> I want the following.
>>   UTF-8 auxiliary file.
>>   Handling Unicode filename (image files and include files).
>>   Handling Unicode PDF bookmark strings.
> 
> Thanks for working on this. I've had a look at the most recent patch,
> which resolves the category code fixing problem. I see you are using
> native UTF-8 input throughout, but I can't see how this could support
> "@documentencoding ISO-8859-1" (or any other single-byte encoding). I
> think the things you mention above could be supported without using
> native UTF-8 support.

Thank you for reviewing.

In XeTeX and LuaTeX, is "@documentencoding ISO-8859-1" support required?
If so, I'll improve the patch.
It will use byte-wise input when "@documentencoding ISO-8859-1" is used.

However, if you want ISO-8859-1,
you can use pdfTeX instead of XeTeX/LuaTeX, or you can convert to UTF-8,
in my humble opinion.

I want Unicode, which includes CJK characters, not only ISO-8859-1.
With byte-wise input, CJK characters cannot be used.

> I don't see the problem with Unicode filenames: files are named with a
> series of bytes; does this mean that XeTeX (or LuaTeX?) has problems
> accessing files with names which aren't in UTF-8?
> 
> Are PDF bookmarks written out incorrectly also?

If I understand correctly,
XeTeX/LuaTeX's inner encoding is UTF-16 instead of UTF-8.
XeTeX/LuaTeX converts UTF-8 input to UTF-16 by default.

For example,

"Für" in UTF-8  -> XeTeX/LuaTeX inner UTF-16
0x66 0xC3 0xBC 0x72 -> 0x0066 0x00FC 0x0072

If byte-wise input is used, 

"Für" in UTF-8  -> XeTeX/LuaTeX inner UTF-16???
0x66 0xC3 0xBC 0x72 -> 0x0066 0x00C3 0x00BC 0x0072

On Windows, the native filesystem encoding is UTF-16 instead of UTF-8.
That is, the XeTeX/LuaTeX inner UTF-16 word sequence is passed through to Windows as-is.

With native Unicode, the word sequence 0x0066 0x00FC 0x0072 means "Für",
so the filename "Für" can be handled.
With byte-wise input, the word sequence 0x0066 0x00C3 0x00BC 0x0072
does not mean "Für", so the filename "Für" cannot be handled.

PDF bookmarks also require UTF-16 for Unicode support.

On the other hand, on Linux, the filesystem may be UTF-8.
In that case, the XeTeX/LuaTeX inner UTF-16 word sequence
is converted to UTF-8 and passed through to the system call.

With native Unicode, the word sequence 0x0066 0x00FC 0x0072
is converted to the UTF-8 byte sequence 0x66 0xC3 0xBC 0x72.
It means "Für", so the filename "Für" can be handled.

With byte-wise input, the word sequence 0x0066 0x00C3 0x00BC 0x0072
is converted to the byte sequence 0x66 0xC3 0x83 0xC2 0xBC 0x72.
It does not mean "Für", so the filename "Für" cannot be handled.
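
To make the filename case concrete, a hypothetical Texinfo fragment
(the file names are invented for illustration):

```
@c With native Unicode input both names resolve to the files on disk;
@c with byte-wise input the engine looks for the mangled byte sequence
@c described above and cannot find the files.
@include einführung.texi
@image{bild-für-kapitel-1,,8cm}
```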

> It's useful to give a ChangeLog entry when posting patches to this
> list, because this gives a summary behind what was changed. One thing
> I wondered about was whether \DeclareUnicodeCharacterNativeAtU and
> \DeclareUnicodeCharacterNative needed to be separate macros.

I'll write a ChangeLog entry.

\DeclareUnicodeCharacterNativeAtU is always required,
even when the encoding is not UTF-8.
In US-ASCII, @U{00FC} etc. can still be used.

\DeclareUnicodeCharacterNative is only required when the encoding is UTF-8.
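
For example, a minimal US-ASCII document that still exercises @U
(and therefore \DeclareUnicodeCharacterNativeAtU); the content is
illustrative only:

```
\input texinfo.tex @c -*- coding: us-ascii -*-
@documentencoding US-ASCII

F@U{00FC}r @c typeset as "Für" although the source is pure ASCII

@bye
```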



Re: XeTeX encoding problem

2016-01-21 Thread Masamichi HOSODA
> Thank you for your comments.
> I've updated the patch.
> 
> I want the following.
>   UTF-8 auxiliary file.
>   Handling Unicode filename (image files and include files).
>   Handling Unicode PDF bookmark strings.
> 
> For this purpose, I used the method that changes catcode.
> The patch that is attached to this mail
> uses different method for this purpose.
> It uses re-defining replacing macros.

I've improved the native Unicode replacement patch.

My previous patch could not handle the @U command.
This patch can handle it.
In addition, I've added some comments and fixed some macro names.

How about this?
--- texinfo.tex.org	2016-01-21 23:04:22.405562200 +0900
+++ texinfo.tex	2016-01-21 23:10:48.289263700 +0900
@@ -9433,42 +9433,18 @@
   \global\righthyphenmin = #3\relax
 }
 
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX, 
-% otherwise the encoding support is completely broken.
-\ifx\XeTeXrevision\thisisundefined
-\else
-\XeTeXdefaultencoding "bytes"  % For subsequent files to be read
-\XeTeXinputencoding "bytes"  % Effective in texinfo.tex only
-% Unfortunately, there seems to be no corresponding XeTeX command for
-% output encoding.  This is a problem for auxiliary index and TOC files.
-% The only solution would be perhaps to write out @U{...} sequences in
-% place of UTF-8 characters.
-\fi
+% XeTeX and LuaTeX can handle native Unicode.
+%
+\newif\iftxinativeunicodecapable
 
-\ifx\luatexversion\thisisundefined
+\ifx\XeTeXrevision\thisisundefined
+  \ifx\luatexversion\thisisundefined
+\txinativeunicodecapablefalse
+  \else
+\txinativeunicodecapabletrue
+  \fi
 \else
-\directlua{
-local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
-local function convert_char (char)
-  return utf8_char(byte(char))
-end
-
-local function convert_line (line)
-  return gsub(line, ".", convert_char)
-end
-
-callback.register("process_input_buffer", convert_line)
-
-local function convert_line_out (line)
-  local line_out = ""
-  for c in string.utfvalues(line) do
- line_out = line_out .. string.char(c)
-  end
-  return line_out
-end
-
-callback.register("process_output_buffer", convert_line_out)
-}
+  \txinativeunicodecapabletrue
 \fi
 
 
@@ -9496,13 +9472,6 @@
 %
 \def\documentencoding{\parseargusing\filenamecatcodes\documentencodingzzz}
 \def\documentencodingzzz#1{%
-  % Get input by bytes instead of by UTF-8 codepoints for XeTeX,
-  % otherwise the encoding support is completely broken.
-  % This settings is for the document root file.
-  \ifx\XeTeXrevision\thisisundefined
-  \else
-\XeTeXinputencoding "bytes"
-  \fi
   %
   % Encoding being declared for the document.
   \def\declaredencoding{\csname #1.enc\endcsname}%
@@ -9531,10 +9500,16 @@
  \latninechardefs
   %
   \else \ifx \declaredencoding \utfeight
- \setnonasciicharscatcode\active
- % since we already invoked \utfeightchardefs at the top level
- % (below), do not re-invoke it, then our check for duplicated
- % definitions triggers.  Making non-ascii chars active is enough.
+ \iftxinativeunicodecapable
+   % For native Unicode (XeTeX and LuaTeX)
+   \nativeunicodechardefs
+ \else
+   % For UTF-8 byte sequence (pdfTeX)
+   \setnonasciicharscatcode\active
+   % since we already invoked \utfeightchardefs at the top level
+   % (below), do not re-invoke it, then our check for duplicated
+   % definitions triggers.  Making non-ascii chars active is enough.
+ \fi
   %
   \else
 \message{Ignoring unknown document encoding: #1.}%
@@ -9849,13 +9824,26 @@
 % @U{} to produce U+, if we support it.
 \def\U#1{%
   \expandafter\ifx\csname uni:#1\endcsname \relax
-\errhelp = \EMsimple	
-\errmessage{Unicode character U+#1 not supported, sorry}%
+\iftxinativeunicodecapable
+  % Any Unicode characters can be used by native Unicode.
+  % However, if the font does not have the glyph, the letter will miss.
+  \begingroup
+\uccode`\.="#1\relax
+	\uppercase{.}
+  \endgroup
+\else
+  \errhelp = \EMsimple	
+  \errmessage{Unicode character U+#1 not supported, sorry}%
+\fi
   \else
 \csname uni:#1\endcsname
   \fi
 }
 
+% For UTF-8 byte sequence (pdfTeX)
+% Definition macro to replace the Unicode character
+% Definition macro that is used by @U command
+%
 \begingroup
   \catcode`\"=12
   \catcode`\<=12
@@ -9864,7 +9852,7 @@
   \catcode`\;=12
   \catcode`\!=12
   \catcode`\~=13
-  \gdef\DeclareUnicodeCharacter#1#2{%
+  \gdef\DeclareUnicodeCharacterUTFviii#1#2{%
 \countUTFz = "#1\relax
 %\wlog{\space\space defining Unicode char U+#1 (decimal \the\countUTFz)}%
 \begingroup
@@ -9922,6 +9910,37 @@
 \uppercase{\gdef\UTFviiiTmp{#2#3#4}}}
 \endgroup
 
+% For native Unicode (XeTeX and LuaTeX)
+% Definition macro to replace the Unicode character
+%
+\def\DeclareUnicodeCharacterNative#1#2{%
+  \catcode"#1=\active
+  \begingroup
+\uccode`\~="#1\relax
+\uppercase{\gdef~}{#2}%
+  \endgroup}
+
+% 

Re: XeTeX encoding problem

2016-01-18 Thread Masamichi HOSODA
> If I understand correctly, you are changing the category codes of the
> Unicode characters when writing out to an auxiliary file, but only for
> those Unicode characters that are defined. This leads the Unicode
> character to be written out as a UTF-8 sequence. For the regular
> output, the definitions given with \DeclareUnicodeCharacter are used
> instead of trying to get a glyph for the Unicode character from a
> font. If there's no definition given, then the character must be in
> the font.
> 
> I don't know why you did it this way; maybe you could explain? Or if
> my explanation above is incorrect, could you correct it?
> 
> There is a potential problem with changing the category codes of the
> Unicode characters, in that any tokens that have already been read in
> won't be affected, depending on the implementation. For example, with
> 
> @chapter é,
> 
> whether this works depends on whether the argument "é" was read before
> or after the category codes changed. It would be less fragile to keep
> the characters as active but make them expand to a token with category
> code "other".
> 
> Using the character definitions built in to texinfo.tex with
> \DeclareUnicodeCharacter may give less good results than using the
> glyphs from a proper Unicode font.

Thank you for your comments.
I've updated the patch.

I want the following:
  UTF-8 auxiliary files.
  Handling of Unicode filenames (image files and include files).
  Handling of Unicode PDF bookmark strings.

For this purpose, I previously used a method that changes catcodes.
The patch attached to this mail uses a different method:
it redefines the replacement macros.
--- texinfo.tex.org	2016-01-15 07:41:42.861186100 +0900
+++ texinfo.tex	2016-01-18 23:04:55.714317700 +0900
@@ -9428,45 +9428,18 @@
   \global\righthyphenmin = #3\relax
 }
 
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX, 
-% otherwise the encoding support is completely broken.
-\ifx\XeTeXrevision\thisisundefined
-\else
-\XeTeXdefaultencoding "bytes"  % For subsequent files to be read
-\XeTeXinputencoding "bytes"  % Effective in texinfo.tex only
-% Unfortunately, there seems to be no corresponding XeTeX command for
-% output encoding.  This is a problem for auxiliary index and TOC files.
-% The only solution would be perhaps to write out @U{...} sequences in
-% place of UTF-8 characters.
-\fi
+\newif\iftxinativeunicodecapable
 
-\ifx\luatexversion\thisisundefined
+\ifx\XeTeXrevision\thisisundefined
+  \ifx\luatexversion\thisisundefined
+\txinativeunicodecapablefalse
+  \else
+\txinativeunicodecapabletrue
+  \fi
 \else
-\directlua{
-local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
-local function convert_char (char)
-  return utf8_char(byte(char))
-end
-
-local function convert_line (line)
-  return gsub(line, ".", convert_char)
-end
-
-callback.register("process_input_buffer", convert_line)
-
-local function convert_line_out (line)
-  local line_out = ""
-  for c in string.utfvalues(line) do
- line_out = line_out .. string.char(c)
-  end
-  return line_out
-end
-
-callback.register("process_output_buffer", convert_line_out)
-}
+  \txinativeunicodecapabletrue
 \fi
 
-
 % Helpers for encodings.
 % Set the catcode of characters 128 through 255 to the specified number.
 %
@@ -9491,13 +9464,6 @@
 %
 \def\documentencoding{\parseargusing\filenamecatcodes\documentencodingzzz}
 \def\documentencodingzzz#1{%
-  % Get input by bytes instead of by UTF-8 codepoints for XeTeX,
-  % otherwise the encoding support is completely broken.
-  % This settings is for the document root file.
-  \ifx\XeTeXrevision\thisisundefined
-  \else
-\XeTeXinputencoding "bytes"
-  \fi
   %
   % Encoding being declared for the document.
   \def\declaredencoding{\csname #1.enc\endcsname}%
@@ -9526,10 +9492,12 @@
  \latninechardefs
   %
   \else \ifx \declaredencoding \utfeight
- \setnonasciicharscatcode\active
- % since we already invoked \utfeightchardefs at the top level
- % (below), do not re-invoke it, then our check for duplicated
- % definitions triggers.  Making non-ascii chars active is enough.
+ \iftxinativeunicodecapable
+   \nativeunicodechardefs
+ \else
+   \setnonasciicharscatcode\active
+   \utfeightchardefs
+ \fi
   %
   \else
 \message{Ignoring unknown document encoding: #1.}%
@@ -9859,7 +9827,7 @@
   \catcode`\;=12
   \catcode`\!=12
   \catcode`\~=13
-  \gdef\DeclareUnicodeCharacter#1#2{%
+  \gdef\DeclareUnicodeCharacterUTFviii#1#2{%
 \countUTFz = "#1\relax
 %\wlog{\space\space defining Unicode char U+#1 (decimal \the\countUTFz)}%
 \begingroup
@@ -9917,6 +9885,23 @@
 \uppercase{\gdef\UTFviiiTmp{#2#3#4}}}
 \endgroup
 
+\def\DeclareUnicodeCharacterNative#1#2{%
+  \catcode"#1=\active
+  \begingroup
+\uccode`\~="#1\relax
+\uppercase{\gdef~}{#2}%
+  \endgroup}
+
+\def\DeclareUnicodeCharacterNativeThru#1#2{%
+  \catcode"#1=\active
+  \begingroup
+

Re: XeTeX encoding problem

2016-01-17 Thread Masamichi HOSODA
> Instead, I would like to have the ucharclasses style file (for XeTeX)
> ported to texinfo (also part of TeXLive, BTW).
> 
>   https://github.com/Pomax/ucharclasses
> 
> It should also be ported to luatex so that Unicode blocks
> automatically access associated fonts.
> 
> But this is the future.  Right now, I favor a simple solution, namely
> native UTF8 support using the CM super fonts, even if there are
> missing characters (which ones, BTW?).  I guess this covers 99% of the
> current need.

I have another solution.
The sample patch is attached to this mail.

Unicode fonts are not required (the default Computer Modern is used).
Byte-wise input is *NOT* used.
Unicode glyphs (U+00FC etc.) can be used.

How about this?
--- texinfo.tex.org	2016-01-15 07:41:42.861186100 +0900
+++ texinfo.tex	2016-01-18 00:11:11.797800800 +0900
@@ -9428,45 +9428,18 @@
   \global\righthyphenmin = #3\relax
 }
 
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX, 
-% otherwise the encoding support is completely broken.
-\ifx\XeTeXrevision\thisisundefined
-\else
-\XeTeXdefaultencoding "bytes"  % For subsequent files to be read
-\XeTeXinputencoding "bytes"  % Effective in texinfo.tex only
-% Unfortunately, there seems to be no corresponding XeTeX command for
-% output encoding.  This is a problem for auxiliary index and TOC files.
-% The only solution would be perhaps to write out @U{...} sequences in
-% place of UTF-8 characters.
-\fi
+\newif\iftxinativeunicodecapable
 
-\ifx\luatexversion\thisisundefined
+\ifx\XeTeXrevision\thisisundefined
+  \ifx\luatexversion\thisisundefined
+\txinativeunicodecapablefalse
+  \else
+\txinativeunicodecapabletrue
+  \fi
 \else
-\directlua{
-local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
-local function convert_char (char)
-  return utf8_char(byte(char))
-end
-
-local function convert_line (line)
-  return gsub(line, ".", convert_char)
-end
-
-callback.register("process_input_buffer", convert_line)
-
-local function convert_line_out (line)
-  local line_out = ""
-  for c in string.utfvalues(line) do
- line_out = line_out .. string.char(c)
-  end
-  return line_out
-end
-
-callback.register("process_output_buffer", convert_line_out)
-}
+  \txinativeunicodecapabletrue
 \fi
 
-
 % Helpers for encodings.
 % Set the catcode of characters 128 through 255 to the specified number.
 %
@@ -9491,13 +9464,6 @@
 %
 \def\documentencoding{\parseargusing\filenamecatcodes\documentencodingzzz}
 \def\documentencodingzzz#1{%
-  % Get input by bytes instead of by UTF-8 codepoints for XeTeX,
-  % otherwise the encoding support is completely broken.
-  % This settings is for the document root file.
-  \ifx\XeTeXrevision\thisisundefined
-  \else
-\XeTeXinputencoding "bytes"
-  \fi
   %
   % Encoding being declared for the document.
   \def\declaredencoding{\csname #1.enc\endcsname}%
@@ -9526,10 +9492,12 @@
  \latninechardefs
   %
   \else \ifx \declaredencoding \utfeight
- \setnonasciicharscatcode\active
- % since we already invoked \utfeightchardefs at the top level
- % (below), do not re-invoke it, then our check for duplicated
- % definitions triggers.  Making non-ascii chars active is enough.
+ \iftxinativeunicodecapable
+   \nativeunicodechardefs
+ \else
+   \setnonasciicharscatcode\active
+   \utfeightchardefs
+ \fi
   %
   \else
 \message{Ignoring unknown document encoding: #1.}%
@@ -9859,7 +9827,7 @@
   \catcode`\;=12
   \catcode`\!=12
   \catcode`\~=13
-  \gdef\DeclareUnicodeCharacter#1#2{%
+  \gdef\DeclareUnicodeCharacterUTFviii#1#2{%
 \countUTFz = "#1\relax
 %\wlog{\space\space defining Unicode char U+#1 (decimal \the\countUTFz)}%
 \begingroup
@@ -9917,6 +9885,21 @@
 \uppercase{\gdef\UTFviiiTmp{#2#3#4}}}
 \endgroup
 
+\def\DeclareUnicodeCharacterNative#1#2{%
+  \catcode"#1=\active
+  \begingroup
+\uccode`\~="#1\relax
+\uppercase{\gdef~}{#2}%
+  \endgroup}
+
+\def\DeclareUnicodeCharacterNativeCatcodeActive#1#2{%
+  \catcode"#1=\active
+}
+
+\def\DeclareUnicodeCharacterNativeCatcodeOther#1#2{%
+  \catcode"#1=\other
+}
+
 % https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_M
 % U+..U+007F = https://en.wikipedia.org/wiki/Basic_Latin_(Unicode_block)
 % U+0080..U+00FF = https://en.wikipedia.org/wiki/Latin-1_Supplement_(Unicode_block)
@@ -9931,7 +9914,7 @@
 % We won't be doing that here in this simple file.  But we can try to at
 % least make most of the characters not bomb out.
 %
-\def\utfeightchardefs{%
+\def\unicodechardefs{%
   \DeclareUnicodeCharacter{00A0}{\tie}
   \DeclareUnicodeCharacter{00A1}{\exclamdown}
   \DeclareUnicodeCharacter{00A2}{{\tcfont \char162}}% 0242=cent
@@ -10601,7 +10584,33 @@
 
   \global\mathchardef\checkmark="1370 % actually the square root sign
   \DeclareUnicodeCharacter{2713}{\ensuremath\checkmark}
-}% end of \utfeightchardefs
+}% end of \unicodechardefs
+
+\def\utfeightchardefs{
+  \let\DeclareUnicodeCharacter\DeclareUnicodeCharacterUTFviii
+  

Re: XeTeX encoding problem

2016-01-16 Thread Masamichi HOSODA
> > For example, if you want to use Japanese characters,
> > I think that it is possible to set the Japanese font in txi-ja.tex.
> 
> To reiterate: as far as I know, it is not possible to set the font for
> Japanese only in texinfo[.tex].  Thus the ja font, wherever it is
> specified, would be used for every character.  That sounds like just
> shunting the problem off in another direction, not actually solving it.
> 
> Really solving it is not a trivial matter and there is no simple
> dozen-line patch that will overcome it.

These packages can set Japanese fonts and alphabetic fonts independently.

LuaTeX-ja
https://osdn.jp/projects/luatex-ja/wiki/FrontPage%28en%29

ZXjatype
http://www.ctan.org/pkg/zxjatype

I think it is *possible* to set a font for Japanese only.
For example, the alphabetic fonts are set to Computer Modern or Latin Modern,
and the Japanese fonts are set to IPA Mincho and IPA Gothic.
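
A rough plain-XeTeX sketch of what I mean (the font names are only
examples of fonts installed here; this is not an actual txi-ja.tex):

```
% Latin text stays in the default Computer Modern fonts, while Japanese
% runs are typeset in a separately loaded Japanese font.
\font\tenjp="IPAMincho" at 10pt
Latin text in Computer Modern. {\tenjp 日本語の本文} More Latin text.
\bye
```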



Re: XeTeX encoding problem

2016-01-15 Thread Masamichi HOSODA
>> By switching to native UTF-8, the support in texinfo.tex for characters
>> outside the base font is lost, as far as I can see.  Yes, you get some
>> characters "for free" (the ones in the lmodern*.otf fonts now being
>> loaded instead of the traditional cm*) but you also lose some characters
>> (the ones that aren't in lmodern).
> 
> That's quite a major problem, I think. I didn't realise that so many
> characters would be missing - this negates much of the benefit of
> using native Unicode support. Is there really no font that aims to
> include every single Unicode character?

By ``every single Unicode character'', do you mean
Basic Latin U+0020 - U+007E and
Latin-1 Supplement U+00A0 - U+00FF?

It seems that Linux Libertine O has those glyphs.

>> (something like ``Table of Contents'' broken etc.)
>>
>> That can be fixed in other ways, without resorting to native UTF-8.
> 
> I agree.

In the case of LuaTeX, exactly, it can be fixed.
In the case of XeTeX, unfortunately,
it cannot be fixed, if I understand correctly.

>> CJK characters can not be used without native UTF-8 support.
>>
>> They still won't work without loading a font that has them (at the right
>> time, without interfering with other fonts already loaded, etc.).  Not
>> simple.  There are no CJK characters in lmodern, unless I'm totally
>> missing them.

Yes, CJK fonts are required.
For example, if you want to use Japanese characters,
I think that it is possible to set the Japanese font in txi-ja.tex.
However, if the native Unicode support is disabled,
the Japanese characters cannot be used in this way.

>> Anyway, it's up to Gavin whether to install your patch.  I don't have
>> strong feelings about it.  Just pointing out that there are both gains
>> and losses.
> 
> It would be fine as an option. If it's substandard in its glyph
> support there's always the chance of improvements later.
> 
> That said, if there's a fix for the table of contents issue, maybe the
> desire for native UTF-8 support will go away.

If that is your decision, I'm OK with it.
However, I want native Unicode support even if it is only an option.

> I don't think we should use my previous idea of only using native
> UTF-8 support if "@documentencoding UTF-8" is not given. I thought it
> was a neat idea but I can see that some people would find it
> confusing.

In the case of texi2html, "@documentencoding UTF-8" should be given.
Most HTML browsers recognize "charset=utf-8" in the generated HTML files
and use something like native Unicode support.
Therefore, Japanese characters can be used.

In the case of texi2pdf,
it should be possible to use the same texi files,
in my humble opinion.



Re: XeTeX encoding problem

2016-01-15 Thread Masamichi HOSODA
 (something like ``Table of Contents'' broken etc.)

 That can be fixed in other ways, without resorting to native UTF-8.
>>>
>>> I agree.
>>
>> In the case of LuaTex, exactly, it can be fixed.
>> In the case of XeTeX, unfortunately,
>> it cannot be fixed if I understand correctly.
> 
> I think it could be done by changing the active definitions of bytes
> 128-256 when writing to an auxiliary file to read a single Unicode
> character and write out an ASCII sequence that represents that
> character, probably involving the @U command. Do you know how to do
> this?

If I understand correctly, the active definitions are unrelated.
In the case where native Unicode is enabled,

"Für" in UTF-8 ".tex":
letter -> ".tex"
F  -> 0x66
ü  -> 0xC3, 0xBC
r  -> 0x72

XeTeX reads ".tex" files as native Unicode:
letter -> ".tex" -> inner XeTeX
F  -> 0x66   -> U+0066
ü  -> 0xC3, 0xBC -> U+00FC
r  -> 0x72   -> U+0072

XeTeX writes ".toc" files in UTF-8:
letter -> ".tex" -> inner XeTeX -> ".toc"
F  -> 0x66   -> U+0066  -> 0x66
ü  -> 0xC3, 0xBC -> U+00FC  -> 0xC3, 0xBC
r  -> 0x72   -> U+0072  -> 0x72

As a result, ".tex" and ".toc" are same.
Therefore, table of contents is not broken.


On the other hand, in the case of "bytes" encoding,

XeTeX reads as following:
letter -> ".tex" -> inner XeTeX
F  -> 0x66   -> U+0066
ü  -> 0xC3, 0xBC -> U+00C3, U+00BC
r  -> 0x72   -> U+0072

XeTeX writes ".toc" files in UTF-8 *always*.
It cannot change without something like \XeTeXoutputencoding primitive:
letter -> ".tex" -> inner XeTeX-> ".toc"
F  -> 0x66   -> U+0066 -> 0x66
ü  -> 0xC3, 0xBC -> U+00C3, U+00BC -> 0xC3, 0x83, 0xC2, 0xBC
r  -> 0x72   -> U+0072 -> 0x72

As a result, ".tex" and ".toc" are different.
Moreover, ".toc" is broken. It cannot be repaired.

"0xC3, 0xBC" is replaced to \"u by \DeclareUnicodeCharacter etc.
It is correctly "ü".

However, "0xC3, 0x83" is replaced to \~A and
"0xC2, 0xBC" is replaced to $1\over4$.
It is not "ü".

Therefore, table of contents is broken.

I've posted a feature request for \XeTeXoutputencoding etc.
http://sourceforge.net/p/xetex/feature-requests/22/

>> Yes, CJK fonts are required.
>> For example, if you want to use Japanese characters,
>> I think that it is possible to set the Japanese font in txi-ja.tex.
>> However, if the native Unicode support is disabled,
>> the Japanese characters cannot be used in this way.
> 
> Good idea to put the font loading in the translation files.

Thank you.

Alternatively, it might be good to have font configuration files
like txi-font-latinmodern.tex, txi-font-computermodern.tex, etc.



Re: XeTeX encoding problem

2016-01-15 Thread Masamichi HOSODA
> the following is created in the output auxiliary table of contents file:
> 
> @numchapentry{f@"ur}{1}{}{1}
> 
> Without it, it would be
> 
> @numchapentry{für}{1}{}{1}
> 
> Do you understand now how changing the active definitions can change
> what's written to the output files?

Thank you for your demonstration.
I understand it now.

Exactly, that approach can be effective.

I've come up with another idea.

If I understand correctly, in the current texinfo.tex,
0xC3 and 0xBC are made active and are replaced with \"u.

This behavior would change as follows:
if native Unicode support is enabled,
U+00FC is made active and is replaced with \"u.

I do not know yet whether this is possible.
If it is, the texinfo.tex Unicode support
and native Unicode support can coexist.
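
A minimal sketch of the idea, assuming a native-Unicode engine such as
XeTeX or LuaTeX (and the texinfo.tex context, where \active is defined):

```
% Sketch only: make the code point U+00FC active and let it expand to
% \"u, analogous to what the current texinfo.tex does for the byte
% pair 0xC3 0xBC.
\catcode"00FC=\active
\begingroup
  \uccode`\~="00FC\relax
  \uppercase{\gdef~}{\"u}%
\endgroup
```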



Re: luatex problems with texinfo.tex

2016-01-14 Thread Masamichi HOSODA
> Thanks for preparing the test files. Experimenting, I found it wasn't
> related to the character encoding problem, because removing the
> \directlua code made no difference.
> 
> I got it down to the following:
> 
[...snip...]
> 
> This discussion on the lualatex-dev mailing list suggested the problem
> could be related to \newlinechar:
> http://tug.org/pipermail/lualatex-dev/2011-November/001379.html
> 
> I tried commenting out the use of \newlinechar in \scanmacro in
> texinfo.tex, and this changed the results: both lines appeared from
> the example above.
> 
> The page http://tracker.luatex.org/view.php?id=733 reports this bug as
> fixed, although according to
> http://tug.org/pipermail/luatex/2015-March/005149.html, it wasn't
> fixed in March 2015.

Thank you.
I've reported it to the LuaTeX tracker.
http://tracker.luatex.org/view.php?id=962



Re: XeTeX encoding problem

2016-01-13 Thread Masamichi HOSODA
>> I've created a patch that uses native unicode support of both XeTeX and 
>> LuaTex.
>> It works fine in my XeTeX, LuaTeX and pdfTeX environment.
>> Except, LuaTeX create broken PDF bookmark.
>>
>> How about this?
> 
> It looks mostly all right. We'd need to wait until we have your
> copyright assignment on file before merging a patch of this size.

I sent it today.

> The main change I'd make is to turn it off by default to avoid the
> risk of breaking something that worked before (for example: if someone
> didn't have the right fonts installed), but it would be easy to modify
> your patch to do this.

If XeTeX/LuaTeX is used and @documentencoding is UTF-8,
using native UTF-8 support is very natural for me.

Of course, even when using XeTeX/LuaTeX, if @documentencoding is US-ASCII,
I think that native UTF-8 support may be turned off.

Most users use pdfTeX instead of XeTeX/LuaTeX.
The risk of enabling native UTF-8 support is small
because pdfTeX does not have that capability anyway.

If you consciously use XeTeX/LuaTeX instead of pdfTeX
and consciously set @documentencoding to UTF-8 instead of US-ASCII,
it means that you want to use native UTF-8 support in my humble opinion.

Otherwise, if you use UTF-8 characters with XeTeX/LuaTeX
without native UTF-8 support, some problems can happen
(something like a broken ``Table of Contents'', etc.).
I think this is obviously a risk.

Additionally, CJK characters cannot be used without native UTF-8 support.



Re: XeTeX encoding problem

2016-01-11 Thread Masamichi HOSODA
>> On the other hands, in XeTeX,
>> it seems that XeTeX does not have something like \XeTeXoutputencoding.
> 
> It appears not, from what I could find out.
> 
> For now, if you need to use XeTeX, you'd have to avoid any non-ASCII
> characters in anything written to an auxiliary file, e.g. use @"u
> instead of ü - that would be in section titles and index entries.
> 
> In theory, it should be possible to write out @U sequences to the
> auxiliary files wherever a non-ASCII character is used. I won't be
> working on this myself.
> 
>> At least in XeTeX, byte wise input is hard to use, isn't it?
>> To use XeTeX (and also maybe LuaTex) native Unicode support is better
>> than byte wise input in my humble opinion.
> 
> Maybe. I don't have anything to add to what's been said earlier in
> this discussion.

I've created a patch that uses the native Unicode support of both XeTeX and LuaTeX.
It works fine in my XeTeX, LuaTeX, and pdfTeX environments,
except that LuaTeX creates broken PDF bookmarks.

How about this?
--- texinfo.tex.org	2016-01-09 09:38:07.812241700 +0900
+++ texinfo.tex	2016-01-12 01:10:58.012335400 +0900
@@ -1779,7 +1779,7 @@
 % #4 = \mainmagstep
 % #5 = OT1
 %
-\def\setfont#1#2#3#4#5{%
+\def\setfontdefault#1#2#3#4#5{%
   \font#1=\fontprefix#2#3 scaled #4
   \csname cmap#5\endcsname#1%
 }
@@ -1811,6 +1811,91 @@
 \def\scshape{csc}
 \def\scbshape{csc}
 
+% Native Unicode fonts settings for XeTeX and LuaTeX engine
+\newif\iftxiusenativeunicode
+\ifx\XeTeXrevision\thisisundefined
+  \ifx\luatexversion\thisisundefined
+\txiusenativeunicodefalse
+  \else
+\txiusenativeunicodetrue
+\input luaotfload.sty
+  \fi
+\else
+  \txiusenativeunicodetrue
+\fi
+
+\iftxiusenativeunicode
+  \def\setfontunicode#1#2#3#4#5{%
+\def\fontprefix{roman}
+\def\fontsuffix{regular}
+\edef\fontshape{#2}
+\ifx\fontshape\rmshape % r
+  \def\fontprefix{roman}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\rmbshape % bx
+  \def\fontprefix{roman}
+  \def\fontsuffix{bold}
+\fi
+\ifx\fontshape\bfshape % b
+  \def\fontprefix{romandemi}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\bxshape % bx
+  \def\fontprefix{roman}
+  \def\fontsuffix{bold}
+\fi
+\ifx\fontshape\ttshape % tt
+  \def\fontprefix{mono}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\ttbshape % tt
+  \def\fontprefix{mono}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\ttslshape % sltt
+  \def\fontprefix{monoslant}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\itshape % ti
+  \def\fontprefix{roman}
+  \def\fontsuffix{italic}
+\fi
+\ifx\fontshape\itbshape % bxti
+  \def\fontprefix{roman}
+  \def\fontsuffix{bolditalic}
+\fi
+\ifx\fontshape\slshape % sl
+  \def\fontprefix{romanslant}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\slbshape % bxsl
+  \def\fontprefix{romanslant}
+  \def\fontsuffix{bold}
+\fi
+\ifx\fontshape\sfshape % ss
+  \def\fontprefix{sans}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\sfbshape % ss
+  \def\fontprefix{sans}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\scshape % csc
+  \def\fontprefix{romancaps}
+  \def\fontsuffix{regular}
+\fi
+\ifx\fontshape\scbshape %csc
+  \def\fontprefix{romancaps}
+  \def\fontsuffix{regular}
+\fi
+\font#1="[lm\fontprefix#3-\fontsuffix.otf]" scaled #4
+  }%
+  \let\setfont\setfontunicode
+\else
+  \let\setfont\setfontdefault
+\fi
+
 % Definitions for a main text size of 11pt.  (The default in Texinfo.)
 %
 \def\definetextfontsizexi{%
@@ -9428,32 +9513,6 @@
   \global\righthyphenmin = #3\relax
 }
 
-% Get input by bytes instead of by UTF-8 codepoints for XeTeX and LuaTeX, 
-% otherwise the encoding support is completely broken.
-\ifx\XeTeXrevision\thisisundefined
-\else
-\XeTeXdefaultencoding "bytes"  % For subsequent files to be read
-\XeTeXinputencoding "bytes"  % Effective in texinfo.tex only
-\fi
-
-\ifx\luatexversion\thisisundefined
-\else
-\directlua{
-local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub
-
-local function convert_char (char)
-  return utf8_char(byte(char))
-end
-
-local function convert_line (line)
-  return gsub(line, ".", convert_char)
-end
-
-callback.register("process_input_buffer", convert_line)
-}
-\fi
-
-
 % Helpers for encodings.
 % Set the catcode of characters 128 through 255 to the specified number.
 %
@@ -9478,13 +9537,6 @@
 %
 \def\documentencoding{\parseargusing\filenamecatcodes\documentencodingzzz}
 \def\documentencodingzzz#1{%
-  % Get input by bytes instead of by UTF-8 codepoints for XeTeX,
-  % otherwise the encoding support is completely broken.
-  % This settings is for the document root file.
-  \ifx\XeTeXrevision\thisisundefined
-  \else
-\XeTeXinputencoding "bytes"
-  \fi
   %
   % Encoding being declared for the document.
   \def\declaredencoding{\csname #1.enc\endcsname}%
@@ 

Re: XeTeX encoding problem

2016-01-10 Thread Masamichi HOSODA
> Here's the code that worked for me:
> 
> local function convert_line_out (line)
>   local line_out = ""
>   for c in string.utfvalues(line) do
>  line_out = line_out .. string.char(c)
>   end
>   return line_out
> end
> 
> callback.register("process_output_buffer", convert_line_out)
> 
> Apparently LuaTeX will freeze if there's an error in the lua code.
> Also it has its own versions of the Lua libraries: string.utfvalues
> was mentioned in the LuaTeX reference manual, and the other functions
> I was trying to use evidently weren't there.

Thank you.
It works for my LuaTeX environment.

On the other hand, it seems that XeTeX does not have anything
like \XeTeXoutputencoding.

At least in XeTeX, byte-wise input is hard to use, isn't it?
In my humble opinion, using the native Unicode support of XeTeX
(and maybe also LuaTeX) is better than byte-wise input.
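
For comparison, here is a minimal sketch of the input-side switches that
XeTeX does provide (both primitives and their effects appear in the
texinfo.tex hunks elsewhere in this thread); the limitation discussed here
is that there is no output-side counterpart to LuaTeX's
process_output_buffer callback.

```
\ifx\XeTeXrevision\thisisundefined
\else
  \XeTeXdefaultencoding "bytes"  % for files opened afterwards
  \XeTeXinputencoding "bytes"    % effective in the current file only
  % There is no \XeTeXoutputencoding, so text written with \write or
  % \message cannot be re-encoded the way the Lua callback quoted above does.
\fi
```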



Re: XeTeX encoding problem

2016-01-10 Thread Masamichi HOSODA
In XeTeX and LuaTeX, a non-ASCII chapter name is broken in the
``Table of contents''.
In pdfTeX, it is not broken.

Attached are a texi file and screenshots of the resulting PDFs.
\input texinfo.tex

@documentencoding UTF-8

@contents

@chapter für

für

@bye


Re: XeTeX @image support

2016-01-06 Thread Masamichi HOSODA
>>>>>> Here's a file that I ran with pdftex and with luatex: both worked.
>>>>>> If this looks right, the code can be moved into texinfo.tex.
>>>> 
>>>> \ifx\XeTeXrevision\thisisundefined
>>>> \else
>>>> \XeTeXinputencoding "bytes"
>>>> \fi
>>>> 
>>>> although I haven't been able to test this.
>>> 
>>> I've tried the attached file.
>>> Both pdfTeX and XeTeX, it works fine in my environment.
>>> Thank you.
>> 
>> I've noticed that XeTeX with texinfo.tex can not use @image.
>> Here is a patch that I've tried to make.
> 
> It seems that \XeTeXpdffile changes input encoding.
> So I add ``\XeTeXinputencoding "bytes"''.

I was mistaken.
\XeTeXpdffile does not change the input encoding.
Adding ``\XeTeXinputencoding "bytes"'' is unnecessary.
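
For testing the @image path with XeTeX, a minimal document along the
following lines should do; the file name "figure" and the 3cm height are
made up for this sketch, and the byte-encoding prologue mirrors the test
file attached to the 2016-01-04 mail further below.

```
\ifx\XeTeXrevision\thisisundefined
\else
\XeTeXinputencoding "bytes"
\fi

\input texinfo.tex

@documentencoding UTF-8

@image{figure,,3cm}

@bye
```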



Re: XeTeX @image support

2016-01-05 Thread Masamichi HOSODA
>>>>> Here's a file that I ran with pdftex and with luatex: both worked.
>>>>> If this looks right, the code can be moved into texinfo.tex.
>>> 
>>> \ifx\XeTeXrevision\thisisundefined
>>> \else
>>> \XeTeXinputencoding "bytes"
>>> \fi
>>> 
>>> although I haven't been able to test this.
>> 
>> I've tried the attached file.
>> Both pdfTeX and XeTeX, it works fine in my environment.
>> Thank you.
> 
> I've noticed that XeTeX with texinfo.tex can not use @image.
> Here is a patch that I've tried to make.

It seems that \XeTeXpdffile changes input encoding.
So I add ``\XeTeXinputencoding "bytes"''.
--- texinfo.tex.org	2016-01-05 22:26:04.245558200 +0900
+++ texinfo.tex	2016-01-06 00:50:37.963751000 +0900
@@ -1449,6 +1449,48 @@
   \let\pdfmakeoutlines = \relax
 \fi  % \ifx\pdfoutput
 
+\ifx\XeTeXrevision\thisisundefined
+\else
+  \def\doxeteximage#1#2#3{%
+\def\xeteximagewidth{#2}\setbox0 = \hbox{\ignorespaces #2}%
+\def\xeteximageheight{#3}\setbox2 = \hbox{\ignorespaces #3}%
+%
+\let\xeteximgext=\empty
+\def\xeteximgpdf{0}
+\begingroup
+  \openin 1 #1.pdf \ifeof 1
+\openin 1 #1.PDF \ifeof 1
+  \openin 1 #1.png \ifeof 1
+\openin 1 #1.jpg \ifeof 1
+  \openin 1 #1.jpeg \ifeof 1
+\openin 1 #1.JPG \ifeof 1
+  \errmessage{Could not find image file #1 for XeTeX}%
+\else \gdef\xeteximgext{JPG}%
+\fi
+  \else \gdef\xeteximgext{jpeg}%
+  \fi
+\else \gdef\xeteximgext{jpg}%
+\fi
+  \else \gdef\xeteximgext{png}%
+  \fi
+\else \gdef\xeteximgext{PDF} \gdef\xeteximgpdf{1}%
+\fi
+  \else \gdef\xeteximgext{pdf} \gdef\xeteximgpdf{1}%
+  \fi
+  \closein 1
+\endgroup
+%
+\ifnum\xeteximgpdf=1
+  \XeTeXpdffile "#1".\xeteximgext ""
+\else
+  \XeTeXpicfile "#1".\xeteximgext ""
+\fi
+\ifdim \wd0 >0pt width \xeteximagewidth \fi
+\ifdim \wd2 >0pt height \xeteximageheight \fi \relax
+%
+\XeTeXinputencoding "bytes"
+  }
+\fi
 
 \message{fonts,}
 
@@ -9078,10 +9120,14 @@
   \ifpdf
 \dopdfimage{#1}{#2}{#3}%
   \else
-% \epsfbox itself resets \epsf?size at each figure.
-\setbox0 = \hbox{\ignorespaces #2}\ifdim\wd0 > 0pt \epsfxsize=#2\relax \fi
-\setbox0 = \hbox{\ignorespaces #3}\ifdim\wd0 > 0pt \epsfysize=#3\relax \fi
-\epsfbox{#1.eps}%
+\ifx\XeTeXrevision\thisisundefined
+  % \epsfbox itself resets \epsf?size at each figure.
+  \setbox0 = \hbox{\ignorespaces #2}\ifdim\wd0 > 0pt \epsfxsize=#2\relax \fi
+  \setbox0 = \hbox{\ignorespaces #3}\ifdim\wd0 > 0pt \epsfysize=#3\relax \fi
+  \epsfbox{#1.eps}%
+\else
+  \doxeteximage{#1}{#2}{#3}%
+\fi
   \fi
   %
   \ifimagevmode


XeTeX @image support (was Re: luatex problems with texinfo.tex)

2016-01-05 Thread Masamichi HOSODA
>>>> Here's a file that I ran with pdftex and with luatex: both worked.
>>>> If this looks right, the code can be moved into texinfo.tex.
>> 
>> \ifx\XeTeXrevision\thisisundefined
>> \else
>> \XeTeXinputencoding "bytes"
>> \fi
>> 
>> although I haven't been able to test this.
> 
> I've tried the attached file.
> Both pdfTeX and XeTeX, it works fine in my environment.
> Thank you.

I've noticed that XeTeX with texinfo.tex can not use @image.
Here is a patch that I've tried to make.
--- texinfo.tex.org	2016-01-05 22:26:04.245558200 +0900
+++ texinfo.tex	2016-01-06 00:15:56.681485800 +0900
@@ -1449,6 +1449,46 @@
   \let\pdfmakeoutlines = \relax
 \fi  % \ifx\pdfoutput
 
+\ifx\XeTeXrevision\thisisundefined
+\else
+  \def\doxeteximage#1#2#3{%
+\def\xeteximagewidth{#2}\setbox0 = \hbox{\ignorespaces #2}%
+\def\xeteximageheight{#3}\setbox2 = \hbox{\ignorespaces #3}%
+%
+\let\xeteximgext=\empty
+\def\xeteximgpdf{0}
+\begingroup
+  \openin 1 #1.pdf \ifeof 1
+\openin 1 #1.PDF \ifeof 1
+  \openin 1 #1.png \ifeof 1
+\openin 1 #1.jpg \ifeof 1
+  \openin 1 #1.jpeg \ifeof 1
+\openin 1 #1.JPG \ifeof 1
+  \errmessage{Could not find image file #1 for XeTeX}%
+\else \gdef\xeteximgext{JPG}%
+\fi
+  \else \gdef\xeteximgext{jpeg}%
+  \fi
+\else \gdef\xeteximgext{jpg}%
+\fi
+  \else \gdef\xeteximgext{png}%
+  \fi
+\else \gdef\xeteximgext{PDF} \gdef\xeteximgpdf{1}%
+\fi
+  \else \gdef\xeteximgext{pdf} \gdef\xeteximgpdf{1}%
+  \fi
+  \closein 1
+\endgroup
+%
+\ifnum\xeteximgpdf=1
+  \XeTeXpdffile "#1".\xeteximgext ""
+\else
+  \XeTeXpicfile "#1".\xeteximgext ""
+\fi
+\ifdim \wd0 >0pt width \xeteximagewidth \fi
+\ifdim \wd2 >0pt height \xeteximageheight \fi \relax
+  }
+\fi
 
 \message{fonts,}
 
@@ -9078,10 +9118,14 @@
   \ifpdf
 \dopdfimage{#1}{#2}{#3}%
   \else
-% \epsfbox itself resets \epsf?size at each figure.
-\setbox0 = \hbox{\ignorespaces #2}\ifdim\wd0 > 0pt \epsfxsize=#2\relax \fi
-\setbox0 = \hbox{\ignorespaces #3}\ifdim\wd0 > 0pt \epsfysize=#3\relax \fi
-\epsfbox{#1.eps}%
+\ifx\XeTeXrevision\thisisundefined
+  % \epsfbox itself resets \epsf?size at each figure.
+  \setbox0 = \hbox{\ignorespaces #2}\ifdim\wd0 > 0pt \epsfxsize=#2\relax \fi
+  \setbox0 = \hbox{\ignorespaces #3}\ifdim\wd0 > 0pt \epsfysize=#3\relax \fi
+  \epsfbox{#1.eps}%
+\else
+  \doxeteximage{#1}{#2}{#3}%
+\fi
   \fi
   %
   \ifimagevmode


Re: luatex problems with texinfo.tex

2016-01-04 Thread Masamichi HOSODA
>>> Here's a file that I ran with pdftex and with luatex: both worked.
>>> If this looks right, the code can be moved into texinfo.tex.
> 
> \ifx\XeTeXrevision\thisisundefined
> \else
> \XeTeXinputencoding "bytes"
> \fi
> 
> although I haven't been able to test this.

I've tried the attached file.
With both pdfTeX and XeTeX, it works fine in my environment.
Thank you.
\ifx\XeTeXrevision\thisisundefined
\else
\XeTeXinputencoding "bytes"
\fi

\input texinfo.tex

@documentencoding UTF-8

† ‡ § ¶

@bye


Re: luatex problems with texinfo.tex

2016-01-04 Thread Masamichi HOSODA
>> This appears to be the code to get the byte values into LuaTeX:
>>
>> http://wiki.luatex.org/index.php/Process_input_buffer#Latin-1
>>
>> It needs a bit more background knowledge before it can be copied and
>> pasted into texinfo.tex.
> 
> Here's a file that I ran with pdftex and with luatex: both worked. If
> this looks right, the code can be moved into texinfo.tex.

It worked for me, too.
Thank you.

But other errors still occur while building the LilyPond documents.
The attached files --- macro-html.texi and macro-quotation.texi --- are
minimal examples.

With pdfTeX, both of them work fine.
With LuaTeX, both fail as follows:

```
$ PDFTEX=luatex texi2pdf -b macro-html.texi
[...snip...]
Runaway argument?

./macro-html.texi:34: File ended while scanning use of \doignoretext.
 
@par 
@scanmacro ...atspaces }@scantokens {#1@texinfoc }
  @aftermacro 
l.34 @divId{aaa}
  
)
! Emergency stop.
<*> ...\let~\normaltilde  \input ./macro-html.texi
  
!  ==> Fatal error occurred, no output PDF file produced!
```

```
$ PDFTEX=luatex texi2pdf -b macro-quotation.texi
[...snip...]
./macro-quotation.texi:30: This command can appear only outside of any envir
onment, not in environment @quotation.
@badenverr ...temp , not @inenvironment @thisenv }
  
@checkenv ...@ifx @thisenv @temp @else @badenverr 
  @fi 
@\node #1->@checkenv {}
   @donode #1 ,@finishnodeparse 
l.30 @node bbb

[1{/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}])
(\end occurred inside a group at level 1)

### semi simple group (level 1) entered at line 1 (@begingroup)
### bottom level
```
\ifx\luatexversion\thisisundefined
\else
\directlua{
local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub

local function convert_char (char)
  return utf8_char(byte(char))
end

local function convert_line (line)
  return gsub(line, ".", convert_char)
end

callback.register("process_input_buffer", convert_line)
}
\fi

\input texinfo.tex

@documentencoding UTF-8

@macro divId {ID}
@html
<div id="\ID\">
@end html
@end macro

@macro divEnd
@html
</div>
@end html
@end macro

@divId{aaa}

bbb

@divEnd

@bye
\ifx\luatexversion\thisisundefined
\else
\directlua{
local utf8_char, byte, gsub = unicode.utf8.char, string.byte, string.gsub

local function convert_char (char)
  return utf8_char(byte(char))
end

local function convert_line (line)
  return gsub(line, ".", convert_char)
end

callback.register("process_input_buffer", convert_line)
}
\fi

\input texinfo.tex

@documentencoding UTF-8

@macro aaa
@quotation
aaa
@end quotation
@end macro

@aaa

@node bbb

bbb

@bye