Re: [XeTeX] ifcat changed?

2017-04-28 Thread Apostolos Syropoulos
Hello,
I have studied the source code of XeTeX and luaTeX a bit. XeTeX defines a
procedure that is used to insert primitive commands into a hash table. The
procedure's declaration is as follows:
procedure primitive(@!s:str_number;@!c:quarterword;@!o:halfword);
This is not exactly Pascal, but I think one can see what is going on. Now
luaTeX inserts the TeX primitives with the following macro:
#  define primitive_tex(a,b,c,d)primitive((a),(b),(c),(d),tex_command)
where function primitive is declared as follows:
extern void primitive(const char *ss, quarterword c, halfword o, halfword off,
  int cmd_origin);
XeTeX uses catcodes to compare commands, while luaTeX has special command
codes, which are different from catcodes. I think this approach is better.
However, it is my understanding that the problem that started this thread cannot
be solved trivially.
A.S.

--
Apostolos Syropoulos
Xanthi, Greece







 

--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] ifcat changed?

2017-04-16 Thread Apostolos Syropoulos
>As far as I can tell from the sources, the bug likely was there from the
>start, and only affects \span, \cr and \crcr.  Basically, their
>character code is too small.  This can be fixed by changing
>"special_char" from 65537 to 1114112 or so, to make the values of
>"span_code", "cr_code", "cr_cr_code" be above "biggest_usv".
Exactly! This is the difference between XeTeX and luaTeX. The code that follows
is from xetex.web:

@d special_char=65537 {|biggest_char+2|}
@d span_code=special_char {distinct from any character}
@d cr_code=span_code+1 {distinct from |span_code| and from any character}
@d cr_cr_code=cr_code+1 {this distinguishes \.{\\crcr} from \.{\\cr}}
and this code is from luaTeX's align.h:
#  define span_code 1114114 /*  {|biggest_char+3|} */
#  define cr_code (span_code+1) /* distinct from |span_code| and from any character */
#  define cr_cr_code (cr_code+1) /* this distinguishes \.{\\crcr} from \.{\\cr} */
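
To make the difference concrete, here is a small C sketch (not taken from either source tree) that derives the three alignment codes from a given span_code and checks whether all of them lie above the largest Unicode scalar value. The names align_codes, make_align_codes and all_above_usv are made up for illustration, and BIGGEST_USV is assumed to be 0x10FFFF, i.e. the value the quoted message calls "biggest_usv".

```c
#include <stdbool.h>

/* Largest Unicode scalar value; assumed to match XeTeX's biggest_usv. */
#define BIGGEST_USV 0x10FFFF

/* Hypothetical container for the three alignment codes. */
struct align_codes {
    long span_code;
    long cr_code;
    long cr_cr_code;
};

/* Derive cr_code and cr_cr_code from span_code, as both sources do. */
struct align_codes make_align_codes(long span_code)
{
    struct align_codes c;
    c.span_code = span_code;
    c.cr_code = span_code + 1;
    c.cr_cr_code = c.cr_code + 1;
    return c;
}

/* True when none of the codes can be mistaken for a character code. */
bool all_above_usv(struct align_codes c)
{
    return c.span_code > BIGGEST_USV
        && c.cr_code > BIGGEST_USV
        && c.cr_cr_code > BIGGEST_USV;
}
```

Under these assumptions, all_above_usv(make_align_codes(65537)) is false for the xetex.web values, while the luaTeX base value 1114114 and the proposed 1114112 both give true.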


A.S.
--
Apostolos Syropoulos
Xanthi, Greece
 




Re: [XeTeX] ifcat changed?

2017-04-16 Thread Bruno Le Floch
Filed https://sourceforge.net/p/xetex/bugs/138/ with a text essentially
identical to my message below explaining the bug's origin and how to fix it.

On 04/16/2017 06:50 AM, Julian Bradfield wrote:
> On 2017-04-16, Zdenek Wagner  wrote:
>> 2017-04-16 10:08 GMT+02:00 Julian Bradfield :
> 
>>> Definitely a bug. The TeXbook defines the behaviour of \if and \ifcat,
>>> and all control sequences are considered to have character code 256
>>> and category code 16, unless \let equal to a non-active character, in
>>> which case they have the value of that character.
>>>
>> Not all control sequences but primitives. Unlike \ifx, \if and \ifcat
>> perform full expansion.
> 
> (a) Yes, they do perform expansion. That's irrelevant to the point at
> hand, since expansion happens before the comparison.
> (b) All control sequences, not just primitives:
> 
> \ifcat\noexpand\foo\noexpand\baz true\else false \fi
> 
> \ifcat\noexpand\foo\halign true\else false \fi
> 
> As Philip pointed out, I was reporting Knuth's words, which are by
> definition authoritative.

As far as I can tell from the sources, the bug likely was there from the
start, and only affects \span, \cr and \crcr.  Basically, their
character code is too small.  This can be fixed by changing
"special_char" from 65537 to 1114112 or so, to make the values of
"span_code", "cr_code", "cr_cr_code" be above "biggest_usv".

The test \ifcat and \if use to distinguish control sequences from
normal/active characters is

(cur_cmd>active_char)or(cur_chr>biggest_usv)

Most tokens that are not character tokens have "cur_cmd" greater than
"active_char".  All exceptions are primitives, among which \relax,
\span, \cr, \crcr.  For these primitives, Knuth made sure that "cur_chr"
was bigger than 255, but some cases were not increased enough when
switching to Unicode in XeTeX.  I think I went through all cases and
only "span_code", "cr_code", "cr_cr_code" need to be changed, although I
think it makes sense to also increase "special_char" (used as a
\noexpand marker).
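
For readers who want to see the effect of that test, below is a rough C model of it. This is only a sketch under stated assumptions, not the actual XeTeX code: the command codes (relax=0, car_ret=5, active_char=13) are the tex.web values as I recall them, the struct token and the three function names are hypothetical, and expansion is assumed to have happened already.

```c
#include <stdbool.h>

/* Command codes as recalled from tex.web; treat them as assumptions. */
enum { RELAX = 0, TAB_MARK = 4, CAR_RET = 5, LETTER = 11, ACTIVE_CHAR = 13 };

#define BIGGEST_USV 0x10FFFF
#define TOO_BIG_USV (BIGGEST_USV + 1)

/* Hypothetical token representation: the (cur_cmd, cur_chr) pair. */
struct token {
    int cmd;
    long chr;
};

/* The test quoted above: true when the token is NOT a character token. */
bool is_control_sequence(struct token t)
{
    return t.cmd > ACTIVE_CHAR || t.chr > BIGGEST_USV;
}

/* Category that \ifcat compares; RELAX stands in for "control sequence". */
int ifcat_category(struct token t)
{
    return is_control_sequence(t) ? RELAX : t.cmd;
}

/* Simplified \ifcat on two tokens that are already fully expanded. */
bool ifcat(struct token a, struct token b)
{
    return ifcat_category(a) == ifcat_category(b);
}
```

In this model, \relax as (RELAX, TOO_BIG_USV) against \cr as (CAR_RET, 65538) compares as false, because 65538 is not above BIGGEST_USV and \cr is then classified by its command code 5. With a cr_code such as 1114115 the same comparison comes out true.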

On a related note, I think "define(p,relax,256)" should be
"define(p,relax,too_big_usv)" but I'm not quite following the code there
so don't trust me.  Namely, I don't see how the XeTeX code ends up
correctly giving TRUE in \chardef\foo=123\ifx\relax\foo TRUE\fi.

Best,

Bruno




Re: [XeTeX] ifcat changed?

2017-04-16 Thread Julian Bradfield
On 2017-04-16, Zdenek Wagner  wrote:
> 2017-04-16 10:08 GMT+02:00 Julian Bradfield :

>> Definitely a bug. The TeXbook defines the behaviour of \if and \ifcat,
>> and all control sequences are considered to have character code 256
>> and category code 16, unless \let equal to a non-active character, in
>> which case they have the value of that character.
>>
> Not all control sequences but primitives. Unlike \ifx, \if and \ifcat
> perform full expansion.

(a) Yes, they do perform expansion. That's irrelevant to the point at
hand, since expansion happens before the comparison.
(b) All control sequences, not just primitives:

\ifcat\noexpand\foo\noexpand\baz true\else false \fi

\ifcat\noexpand\foo\halign true\else false \fi

As Philip pointed out, I was reporting Knuth's words, which are by
definition authoritative.




Re: [XeTeX] ifcat changed?

2017-04-16 Thread Philip Taylor


Zdenek Wagner wrote:
> Not all control sequences but primitives.

Again, I would respectfully suggest that Knuth's own words are the best 
guidance here :

> *\if*
>
> TeX will expand macros  following *\if* until two unexpandable tokens are 
> found.  If either token is a control sequence, TeX considers it to have 
> character code 256 and category code 16, unless the current equivalent of 
> that control sequence has been *\let* equal to a non-active character token 
> ...
>
> *\ifcat*
>
> This is just like *\if*, but it tests the category code, not the character 
> code ...

Philip Taylor





Re: [XeTeX] ifcat changed?

2017-04-16 Thread Zdenek Wagner
2017-04-16 10:08 GMT+02:00 Julian Bradfield :

> On 2017-04-15, Bruno Le Floch  wrote:
> > The primitive conditional "\ifcat\relax\cr true\else false\fi" gives
> > "true" in pdfTeX, LuaTeX, (e)(u)pTeX, and XeTeX from some time ago
> > (could be years), but "false" in XeTeX 0.6
>
> Definitely a bug. The TeXbook defines the behaviour of \if and \ifcat,
> and all control sequences are considered to have character code 256
> and category code 16, unless \let equal to a non-active character, in
> which case they have the value of that character.
>
Not all control sequences but primitives. Unlike \ifx, \if and \ifcat
perform full expansion.
Try the following code:

\def\a{$A$}
\def\b{hello}
\def\c{world}
\ifcat\a\b\else\c\fi

The output will be world because $ and A have different category codes.

Similarly, \ifcat\relax\a will compare \relax with $.


Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz





Re: [XeTeX] ifcat changed?

2017-04-16 Thread Apostolos Syropoulos
>Definitely a bug. The TeXbook defines the behaviour of \if and \ifcat,
>and all control sequences are considered to have character code 256
>and category code 16, unless \let equal to a non-active character, in
>which case they have the value of that character.
After comparing the relevant code in 
texlive/source/texk/web2c/luatexdir/tex/conditional.w (function void 
conditional(void))
and 
texlive/source/texk/web2c/xetexdir/xetex.web (@;)
I think they are identical. Note that both pieces of code process the \if and
\ifcat commands.

A.S.
--
Apostolos Syropoulos
Xanthi, Greece






Re: [XeTeX] ifcat changed?

2017-04-16 Thread Julian Bradfield
On 2017-04-15, Bruno Le Floch  wrote:
> The primitive conditional "\ifcat\relax\cr true\else false\fi" gives
> "true" in pdfTeX, LuaTeX, (e)(u)pTeX, and XeTeX from some time ago
> (could be years), but "false" in XeTeX 0.6

Definitely a bug. The TeXbook defines the behaviour of \if and \ifcat,
and all control sequences are considered to have character code 256
and category code 16, unless \let equal to a non-active character, in
which case they have the value of that character.




Re: [XeTeX] ifcat changed?

2017-04-15 Thread Ulrike Fischer
Am Sat, 15 Apr 2017 07:26:38 -0400 schrieb Bruno Le Floch:

> The primitive conditional "\ifcat\relax\cr true\else false\fi" gives
> "true" in pdfTeX, LuaTeX, (e)(u)pTeX, and XeTeX from some time ago
> (could be years),

Looks like *many* years. I get the wrong output with TeX Live 2012.

> It would be useful for me to know which of \ifcat, \relax, and \cr
> changed, 

It looks like a problem with \cr to me, but it is difficult to be
sure.

-- 
Ulrike Fischer 
http://www.troubleshooting-tex.de/





Re: [XeTeX] ifcat changed?

2017-04-15 Thread Jonathan Kew
This sounds like a bug. Offhand, I don't know what changed to cause 
this, but it probably shouldn't have!


Filing an issue at https://sourceforge.net/projects/xetex/ would be 
useful, to help us keep track.


JK

On 15/04/2017 12:26, Bruno Le Floch wrote:

Dear all,

The primitive conditional "\ifcat\relax\cr true\else false\fi" gives
"true" in pdfTeX, LuaTeX, (e)(u)pTeX, and XeTeX from some time ago
(could be years), but "false" in XeTeX 0.6

It would be useful for me to know which of \ifcat, \relax, and \cr
changed, to determine whether I should just special-case \cr in my
package, or use some other tool than \ifcat.

Best regards,

Bruno

