Re: [luatex] LuaMetaTeX ";" syntax in \numexpr

2024-03-25 Thread Hans Hagen

On 3/25/2024 6:22 PM, Joseph Wright wrote:

On 25/03/2024 17:20, Hans Hagen wrote:

On 3/25/2024 6:05 PM, Joseph Wright wrote:

Hello all,

The current LuaMetaTeX docs say that \numexpr is extended to include
division with truncation using ":". They don't, however, mention that
";" will be interpreted as mod. The new \numexpression includes mod
functionality using "%", although it also seems to accept ";".

Before I make some non-trivial macro changes to allow for this
(https://github.com/latex3/latex3/issues/1518), can I check if ";" is
intended to work but is simply undocumented, or if it's been left as an
oversight and might therefore be dropped from the code  (in favour of
"%")?

as luatex is frozen, what's there will stay


I was talking specifically about LuaMetaTeX :)


Ha, so now i need to check if it's even in luatex ... well ...

That said, as we use it at least once in lmtx it will stay there too 
(and we won't impose \numexpression 10 mod 3\relax on you).
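
For readers comparing the operators, here is a minimal Python sketch of the three integer operations under discussion. It is an illustration only, not engine code: the function names are made up, the rounding rule shown is that of the classic e-TeX "/", and the remainder is deliberately limited to non-negative operands because the thread does not say how ";" or "%" treat negative values.

```python
# Sketch only: plain Python stand-ins for the operators discussed above.
# The function names are invented for this illustration.

def div_rounded(a: int, b: int) -> int:
    """Division rounded to the nearest integer, halves away from zero,
    which is how the classic e-TeX "/" operator behaves."""
    sign = -1 if (a < 0) != (b < 0) else 1
    a, b = abs(a), abs(b)
    return sign * ((2 * a + b) // (2 * b))


def div_truncated(a: int, b: int) -> int:
    """Division with the fractional part dropped (the documented ":")."""
    sign = -1 if (a < 0) != (b < 0) else 1
    return sign * (abs(a) // abs(b))


def mod_positive(a: int, b: int) -> int:
    """Remainder for non-negative operands only (the ";" or "%" case)."""
    if a < 0 or b <= 0:
        raise ValueError("sign behaviour for negative operands is not assumed here")
    return a % b


print(div_rounded(11, 3), div_truncated(11, 3), mod_positive(11, 3))
```

The point of the comparison is that 11 divided by 3 gives 4 with rounding but 3 with truncation, and only the truncated quotient pairs up with the remainder 2.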


Hans


-----
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-



Re: [luatex] LuaMetaTeX ";" syntax in \numexpr

2024-03-25 Thread Hans Hagen

On 3/25/2024 6:05 PM, Joseph Wright wrote:

Hello all,

The current LuaMetaTeX docs say that \numexpr is extended to include
division with truncation using ":". They don't, however, mention that
";" will be interpreted as mod. The new \numexpression includes mod
functionality using "%", although it also seems to accept ";".

Before I make some non-trivial macro changes to allow for this
(https://github.com/latex3/latex3/issues/1518), can I check if ";" is
intended to work but is simply undocumented, or if it's been left as an
oversight and might therefore be dropped from the code  (in favour of "%")?

as luatex is frozen, what's there will stay

Hans




Re: [luatex] Synchronize ERROR: unknown node type 11

2024-03-20 Thread Hans Hagen

On 3/20/2024 2:21 PM, Jérôme LAURENS wrote:




On 20 March 2024 at 08:20, Hans Hagen wrote:

because in that case the mathnode *is* a glue node, that is: its glue fields 
are used (and they have the same layout as a glue node) so synctex should just 
accept a math node as valid glue in luatex


This is not a good reason and not a good idea. What you are asking for is to 
duplicate the synctex information for math nodes with glue: once as math node 
and once as glue node. With this kind of duplication there is no gain in 
synchronization accuracy. On the contrary, the cost of synchronization is 
bigger (bigger write, bigger read, bigger parse and bigger search). So either 
use `synctexmath` or `synctexhorizontalruleorglue` but not both on the same 
node.

Another solution is to add a (multi) conditional do nothing chunk of code in 
`synctexhorizontalruleorglue`, just to fit LuaTeX code convenience, and update 
the synctex code documentation accordingly… That does not sound really right.

One more solution, in pdflistout.c, line 816 (or so), replace
```
synctexhorizontalruleorglue(p, this_box);
```
with
```
if (type(p) != math_node) { /* synctexmath has already been called for math nodes */
   synctexhorizontalruleorglue(p, this_box);
}
```
So simple and easy. Moreover, the synctex library is then used the proper way.

we won't change anything: just accept the math node as glue because 
that's what it is there: a glue + math state


Hans




Re: [luatex] Synchronize ERROR: unknown node type 11

2024-03-20 Thread Hans Hagen

On 3/19/2024 9:09 PM, Jérôme LAURENS wrote:

I could not find the exact location of the problem, but here are some hints 
that may help a lot to solve it.

`Synchronize ERROR: unknown node type 11` is a bit misleading, it should be 
`unexpected node type 11` instead.
11 corresponds to `math_node` in LuaTeX (according to `texnodes.h` line 412). 
This error message comes from the `synctexhorizontalruleorglue` function, when 
it is called on a node with the wrong type. It means that 
`synctexhorizontalruleorglue` is called on a math node, which is not expected. 
So, somewhere in LuaTeX, `synctexhorizontalruleorglue` is used instead of the 
dedicated `synctexmath`. Also, the problem is somehow related to 
`\mathsurroundskip`.

I hope this helps.


because in that case the mathnode *is* a glue node, that is: its glue 
fields are used (and they have the same layout as a glue node) so 
synctex should just accept a math node as valid glue in luatex



On 3/19/2024 5:12 PM, Jérôme LAURENS wrote:

Hi
I suspect this might not be the best location for a bug report, but I could not 
get mantisBT to work.
This bug has been around since 2019
(https://github.com/jlaurens/synctex/issues/30)
Source file `bug.tex`
```
\mathsurroundskip=1cm
$y$%
\bye
```
## Command
```
luatex --synctex=1 bug.tex
```
In the terminal output below all the `Synchronize ERROR` lines are unexpected, 
they are not in the log file:
```
This is LuaTeX, Version 1.18.0 (TeX Live 2024)
  restricted system commands enabled.
(./bug.tex [1
Synchronize ERROR: unknown node type 11


these are math nodes and such nodes can have glue set (they also have synctex 
fields) so afaiks it's not a bug in luatex

(fwiw, glue as alternative for surround kerns has been in luatex for quite a 
while so i suppose that older luatex versions give the same error)


Synchronize ERROR: unknown node type 11
{.../TeXLive/Master/texmf-dist/fonts/map/pdftex/updmap/
pdftex.map}
Synchronize ERROR: unknown node type 11
Synchronize ERROR: unknown node type 11
])<.../TeXLive/Master/texmf-dist/fonts/type1/public/amsfo
nts/cm/cmmi10.pfb><.../TeXLive/Master/texmf-dist/fonts/type1/public/
amsfonts/cm/cmr10.pfb>
Output written on bug.pdf (1 page, 15421 bytes).
SyncTeX written on bug.synctex.gz.
Transcript written on bug.log.
```
MacOS 13, fresh TeXLive updated today from svn
TIA
Please notice that I am not a member of the list





Re: [luatex] Synchronize ERROR: unknown node type 11

2024-03-19 Thread Hans Hagen

On 3/19/2024 5:12 PM, Jérôme LAURENS wrote:

Hi

I suspect this might not be the best location for a bug report, but I could not 
get mantisBT to work.
This bug has been around since 2019
(https://github.com/jlaurens/synctex/issues/30)

Source file `bug.tex`
```
\mathsurroundskip=1cm
$y$%
\bye
```
## Command
```
luatex --synctex=1 bug.tex
```
In the terminal output below all the `Synchronize ERROR` lines are unexpected, 
they are not in the log file:
```
This is LuaTeX, Version 1.18.0 (TeX Live 2024)
  restricted system commands enabled.
(./bug.tex [1
Synchronize ERROR: unknown node type 11


these are math nodes and such nodes can have glue set (they also have 
synctex fields) so afaiks it's not a bug in luatex


(fwiw, glue as alternative for surround kerns has been in luatex for 
quite a while so i suppose that older luatex versions give the same error)



Synchronize ERROR: unknown node type 11
{.../TeXLive/Master/texmf-dist/fonts/map/pdftex/updmap/
pdftex.map}
Synchronize ERROR: unknown node type 11

Synchronize ERROR: unknown node type 11
])<.../TeXLive/Master/texmf-dist/fonts/type1/public/amsfo
nts/cm/cmmi10.pfb><.../TeXLive/Master/texmf-dist/fonts/type1/public/
amsfonts/cm/cmr10.pfb>
Output written on bug.pdf (1 page, 15421 bytes).
SyncTeX written on bug.synctex.gz.
Transcript written on bug.log.
```
MacOS 13, fresh TeXLive updated today from svn

TIA

Please notice that I am not a member of the list








Re: [luatex] LuaTeX precompilation

2023-12-22 Thread Hans Hagen

On 12/22/2023 1:21 AM, Reinhard Kotucha via luatex wrote:

On 2023-12-21 at 10:27:48 +0100, Hans Hagen wrote:

  > I stop here as I've written plenty about performance the last
  > decades in various docs,

Hi Hans,
please let me mention this though:

A few months ago I uncompressed the lualatex format file and
compressed it with lz4.  The lz4-compressed file was smaller than the
original format file.  Though I know that luatex doesn't use maximum
gzip compression, I didn't really expect this.

  26702648  lualatex.fmt-uncompressed
  12201683  lualatex.fmt
   9926586  lualatex.fmt.lz4

But what I absolutely didn't expect is that de-compression of lz4
files is about 7.5 times as fast as de-compression of the gzip'ed
format file.

In order to determine speed I directed the de-compressed output to
/dev/null because when writing to disk results differ much more
between successive runs and a format file is extracted in memory and
not written to disk.
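
To put the sizes quoted above in proportion, the following small Python sketch turns the three byte counts into ratios relative to the uncompressed format. The numbers are copied from the listing; the labels and the helper name `ratio` are only there for the illustration.

```python
# Byte counts copied from the listing above.
SIZES = {
    "lualatex.fmt-uncompressed": 26_702_648,
    "lualatex.fmt (gzip, as shipped)": 12_201_683,
    "lualatex.fmt.lz4": 9_926_586,
}


def ratio(compressed: int, original: int) -> float:
    """Compressed size as a fraction of the original size."""
    return compressed / original


original = SIZES["lualatex.fmt-uncompressed"]
for name, size in SIZES.items():
    print(f"{name:34s} {size:>11,d}  {ratio(size, original):6.1%}")
```

With these figures the shipped gzip format is a bit under half of the uncompressed size and the lz4 file a bit over a third.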

Maybe it's worthwhile to keep an eye on lz4.

Ok, it's a bit off topic but ... in luametatex (context lmtx) we don't 
zip the format at all. One reason is that it is not really needed because 
the format file is smaller, and another is the little overhead that is 
involved (actually endian swapping also takes time but in a thread on 
this list a while ago that was not deemed relevant).


Anyway, I played with (and interfaced to) zip (gz), lzo, lz4 and zstd for 
various purposes (one is that i can have really fast uncompression of 
texlive files when i need to see what is in these xz files, esp when it 
comes to fonts).


cont-en.fmt:

            uncompressed  gzip -3    format
luametatex: 19,387,682    6,483,137  (we don't compress)
luatex:     19,819,371    6,281,775  11,433,340

(there is more code in luametatex but we save some elsewhere in tex)

So, actually luatex doesn't compress aggressively (11 vs 6 MB). I have no 
clue how large other macro package format files are (compressed or 
uncompressed).


(but as mentioned, in earlier threads about performance, when i 
mentioned formats it was concluded that they had little impact)


it's very unlikely that we will change luatex in this respect,

Hans





Re: [luatex] LuaTeX precompilation

2023-12-21 Thread Hans Hagen

On 12/21/2023 10:01 AM, Henrik Mannerström wrote:
On Thu, Dec 21, 2023 at 10:40 AM luigi scarso <mailto:luigi.sca...@gmail.com>> wrote:


I think we should  start from

https://tex.stackexchange.com/questions/324559/guidelines-for-using-mylatexformat-with-luatex

True! The goal behind my question was a bit more ambitious: To dump 
(serialise) the whole state of the luatex engine into a restartable 
blob. Or even better, to have a tex-server that never leaves memory. 
This way I can do my type-recompile loop without interrupting my flow.


On modern machines and os's tex actually never leaves memory; the binary 
and format are likely cached. Just compare two times making a format 
after a machine startup.


As a service one would have to reset some states ... which ones? How to 
reset bits of lua without restarting it?



(precompiled format is a bit misleading -- a format is already
a set of compiled macros. I prefer custom format, as the link says)

It does not have to be a custom format in a strict TeX-sense, in that it 
is saved somewhere permanently. I usually work on only one document at a 
time, so I'm totally happy with one lengthy setup and then fast iterations.


You can only partially save the state of lua. In tex there's also the 
backend to keep in mind which has its own states. And here's the 
multiple run issue. It gets really complex.


What you could try to do is start N tex instances and let them sit idle 
till triggered and once done start it again, but then the controller 
program needs to know which one is ready.
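
A rough Python sketch of such a controller follows. It is an illustration under assumptions, not a tested TeX setup: the class name `WarmPool` is made up, and it assumes the child is a program that starts up, then blocks until it gets one line on stdin (whether a given TeX engine can usefully be parked like that is not verified here, so the demo uses a Python one-liner as a stand-in child).

```python
import subprocess
import sys
from collections import deque


class WarmPool:
    """Keep `size` child processes started and idle, waiting on stdin."""

    def __init__(self, argv, size):
        self.argv = list(argv)
        self.idle = deque(self._spawn() for _ in range(size))

    def _spawn(self):
        # The child is expected to block until it receives a line on stdin.
        return subprocess.Popen(
            self.argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )

    def run(self, first_line):
        """Trigger the oldest idle child, then start a replacement for it."""
        proc = self.idle.popleft()
        self.idle.append(self._spawn())
        output, _ = proc.communicate(first_line + "\n")
        return proc.returncode, output

    def close(self):
        """Stop the children that were never used."""
        while self.idle:
            proc = self.idle.popleft()
            proc.kill()
            proc.communicate()


if __name__ == "__main__":
    # Stand-in child: echoes its first input line in upper case.
    child = [sys.executable, "-c",
             "import sys; print(sys.stdin.readline().strip().upper())"]
    pool = WarmPool(child, size=2)
    print(pool.run("hello"))
    pool.close()
```

The queue answers the "which one is ready" question in the simplest way: every process in it is idle by construction, and a used process is never returned to it.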


What does have some impact is large cpu caches, fast memory (here ddr4 
can beat ddr5), not having some zoom session (trying to improve a fuzzy 
real time image) or 25 browser tabs open (taking gigs of mem) with a 
music app running in the background.


Hans




Re: [luatex] LuaTeX precompilation

2023-12-21 Thread Hans Hagen

On 12/21/2023 8:55 AM, Henrik Mannerström wrote:
On Wed, Dec 20, 2023 at 9:57 PM Hans Hagen <mailto:j.ha...@xs4all.nl>> wrote:


Did you time how long loading these packages takes? Things like fonts
always need to be loaded (and an 8 bit font is way smaller than a 32 bit
wide font with features). You could try to use latex/luatex with 8 bit
fonts.


Without a precompiled format, pdflatex takes 1.2s for my four page 
example. That lualatex takes 2.5s is not that big a difference. What my 
question is about is precompiled formats: If I put everything in the 
preamble and precompile it, pdflatex takes only 0.56s. /Here/ is the big 
difference. When I'm typing a typical technical note, I have over the 
years developed a workflow where I use latex + editor as a 
wysiwyg-setup. With two windows: one for the pdf and one for the source, 
recompile almost once for every sentence or equation. /It is here/ where 
the difference adds up. Judging by the other posts sparked by my 
question, I'm not the only one that feels the difference.


When I'm in such trial and error mode I normally work on chapters and 
processing some 20-30 pages is fast (sub 2 sec). But of course nothing 
can compete with pdftex here, and actually dvi beats pdf here.


I readily admit that luatex is a wonderful piece of software and I 
really appreciate the work you have done. Could we make it even better? 
From a technical point of view, how hard would it be to have something 
akin to a precompiled format in luatex?


The format contains:

- the state of registers (many more in luatex)
- allocated nodes (often can be neglected)
- macro definitions (depends on the macro package)

and in pdftex

- hyphenation patterns
- fonts when preloaded

In pdftex fonts are always small. In luatex the wide fonts are not 
preloaded, nor are patterns. Actually when in pdftex you emulate utf8 
there can be some impact. Making a context format:


pdftex (with mkii) : 7 sec, 4.1 sec when output piped to file
luatex (with mkiv) : 4 sec, 3.2 sec when output piped to file
luametatex (with mkxl) : 2 sec, 1.6 sec when output piped to file

these are rough timings and the non piped are a bit higher when we use 
a 4K display and full screen console. Keep in mind that terminal output 
(as well as runtime enabled synctex when processing a file) has impact 
on your run. Actually, in luametatex/mkxl (aka lmtx) we could avoid a 
format at the cost of 1 sec extra runtime but that would not work well 
on networks.


Because luatex is 32 bit its format file is larger (just run gzip -d on 
a format file to see its real size). Every token in a linked list (macro 
body) takes 8 bytes, so you can do your math here wrt preloading more. A 
node can be even more. Of course it can save some startup time but it 
also adds to the unzipping pipeline.


There is a bit of optimization in luatex for storing registers but zeros 
zip well anyway. Explaining the reasons for luametatex being faster is 
beyond this thread as it's only used in context.


Adding patterns to the format is possible but I bet that preloading some 
15 of them will have a negative impact (there's a reason we delay 
loading; keep in mind that when we developed luatex we also spent quite 
some time on eliminating impact on performance).


Adding fonts to a format would be complex when there's also lua code 
involved (much lives in lua memory in rather complex structures). A no-go.


Another topic that comes up regularly is to keep tex in memory but 
adding save state and revert to pristine would add a lot of complexity to a 
macro package (what to reset and forget) and in the end reloading a 
format is faster.


One thing you should keep in mind is that with ssd and proper os file 
caching, there's often little to gain by preloading. TeX is very fast in 
parsing input and expanding macros but of course it does depend on what 
gets loaded. You can gain on some expansion, string macro bodies, and 
checking and so, but then it makes sense to time that per component to 
get an idea (for instance loading tikz runtime has quite some impact but 
so might having it in the format).


I stop here as I've written plenty about performance the last decades in 
various docs,


Hans




Re: [luatex] LuaTeX precompilation

2023-12-20 Thread Hans Hagen

On 12/20/2023 10:07 PM, blum...@fi.muni.cz wrote:


What I'll probably do is to continue to use pdftex until the book is
done, and then generate the final version with luatex and better fonts.


It could be interesting to see (and report) if indeed that gives better 
results. If you use LM then basically the T1 fonts are comparable 
(shapewise) and it might even be that luatex does math a little less 
well due to opentype math fonts having issues.


Hans




Re: [luatex] LuaTeX precompilation

2023-12-20 Thread Hans Hagen

On 12/20/2023 6:44 PM, blum...@fi.muni.cz wrote:

Hans Hagen wrote:

On 12/20/2023 3:48 PM, Axel Kittenberger wrote:

Sorry, but I have to disagree here, for me the performance
differences were indeed a dealbreaker for me to push lualatex as a
general purpose replacement in my department.


In documents of average complexity i normally get 30 page/second
performance. On more complex documents pdftex can be slower. I have a
2017 laptop so more modern hardware will give lower numbers


I agree with Axel here. Performance is the only reason I haven't
switched to lualatex for all my documents yet.

For instance, I get the numbers

    8 s  for pdflatex
   12 s  for lualatex with the same TFM fonts
   18 s  for lualatex using luaotfload and fontspec instead

This was for a math heavy document of about 600 pages using large OTF fonts

Do you need otf fonts? Maybe t1 will do the job as well.


This performance difference is very noticeable. 8 s compile times are
fine, but having to wait 18 s interrupts my workflow.


In that case: run chapters when you work on them and not whole books ...


Note that I did not spend time to figure out what causes the slowdown.
It might be harfbuzz, but it also might be fontspec, which wouldn't be
luatex's fault.

mostly being 32 bit which is more memory demanding and memory access is 
a bottleneck


Hans





Re: [luatex] LuaTeX precompilation

2023-12-20 Thread Hans Hagen

On 12/20/2023 6:09 PM, Henrik Mannerström wrote:

The following file (currently not "approved by the community") compiles in 2.2 
seconds with lualatex and in 0.3 seconds with pdflatex with a precompiled format. 
That is over seven times slower. When I add four pages of meaningful content (text 
and equations) the times are 2.5s and 0.5s, still five times slower. This is meant 
to be an illustration only, but not so small to be meaningless. The minimal file: 
\documentclass{minimal}  \begin{document} Hello World! \end{document} still takes 
0.7s with lualatex.

\documentclass{article}

\usepackage{lmodern}
\usepackage[protrusion=true,expansion=false]{microtype}
\usepackage{geometry}
\usepackage{mathtools}
\usepackage{babel}
\usepackage{csquotes}
\usepackage{parskip}
\usepackage{enumitem}
\usepackage{biblatex}

\begin{document}
Hello World!
\end{document}

What if you replace the Hello World by \null and disable page numbers?

So,

pdftex : 0.3 sec -> 0.5 sec -> delta 0.2 sec
luatex : 2.2 sec -> 2.5 sec -> delta 0.3 sec

which means that luatex needs 0.1 sec more when we do real pages?

- 32 bit fonts with features
- unicode math
- 32 bit patterns
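
The arithmetic behind that question can be spelled out in a few lines of Python. This is only a restatement of the timings quoted in the thread, splitting each run into a startup part (the near-empty document) and the extra time the four pages of content add; the helper name `split_cost` is made up.

```python
# Timings in seconds as quoted in the thread:
# (near-empty document, same preamble plus four pages of content).
RUNS = {
    "pdflatex (precompiled format)": (0.3, 0.5),
    "lualatex": (2.2, 2.5),
}


def split_cost(empty: float, with_content: float) -> tuple[float, float]:
    """Return (startup cost, extra cost of the content), rounded to 0.1 s."""
    return round(empty, 1), round(with_content - empty, 1)


for engine, (empty, full) in RUNS.items():
    startup, content = split_cost(empty, full)
    print(f"{engine}: startup ~{startup} s, content ~{content} s")
```

Seen this way, nearly all of the gap between the two engines sits in the startup part, while the cost of the actual pages differs by about 0.1 s.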

now, what really can slow down is protrusion and expansion (how useful 
and needed are they?) actually expansion in luatex is done a bit 
differently (more granular, less font instances, etc) but I'm not sure if 
that adds much


Hans




Re: [luatex] LuaTeX precompilation

2023-12-20 Thread Hans Hagen

On 12/20/2023 6:19 PM, Henrik Mannerström wrote:

This thread turned into a debate over what is slow, what packages you should 
use and how old computers you can use. I’m re-posting my original question:

How hard would it be to make a luatex compatible precompiled format? One that 
basically serialises the whole luatex state to a file and could re-use
it? Would it speed up the compilation of short documents? Are there any plans or 
projects related to this?


Did you time how long loading these packages takes? Things like fonts 
always need to be loaded (and an 8 bit font is way smaller than a 32 bit 
wide font with features). You could try to use latex/luatex with 8 bit 
fonts.


Hans





Re: [luatex] LuaTeX precompilation

2023-12-20 Thread Hans Hagen

On 12/20/2023 3:48 PM, Axel Kittenberger wrote:



On Wed, Dec 20, 2023 at 2:31 PM Hans Hagen <mailto:j.ha...@xs4all.nl>> wrote:


in practice one can neglect the performance drop because computers
likely have become (more than) 3 times faster since 2005, when luatex
showed up, and at that time pdftex performance was considered okay


Sorry, but I have to disagree here, for me the performance differences 
were indeed a dealbreaker for me to push lualatex as a general 
purpose replacement in my department. (remember the discussion with one 
patch https://tug.org/pipermail/luatex/2023-June/007824.html but with my 
general impetus to improve runtime performance I was eventually told to 
stop when profiling into the Lua part proved hard).


I don't consider these ipsum tests real tests. Who knows how much cpu 
cache is used for a test that does basically nothing. If zip impact is 
an issue, then you can run with compresslevel 0. You can also disable synctex 
if enabled.


A useless test with context:

\null / \dorecurse{1000}{\tufte\par} % 233 pages

0.6   / 1.1 seconds pdftex
0.7   / 2.7 seconds luatex
0.5   / 2.6 seconds luametatex
0.8   / 5.5 seconds xetex

the time includes the runner script (so for instance for luametatex the 
real run has < .5 sec startup time). xetex might be slow because of the 
binary (not sure how optimized it is).


In documents of average complexity i normally get 30 page/second 
performance. On more complex documents pdftex can be slower. I have a 
2017 laptop so more modern hardware will give lower numbers


(I do regular performance tests so by now i know pretty well where 
bottlenecks in tex can be)


While it may be true that lualatex may now run in less time on the same 
document than pdflatex ran 20 years ago, it's still a tall ask for 
someone to switch from a compiler that uses 150s for a complicated 
document to one that uses 210s, in this case just for 
compatibility/simplicity reasons, having to wait a minute longer? Sorry 
deal breaker. That's why I stuck to pdflatex as default and use lualatex 
only when one of its more advanced features is absolutely necessary, and 
to my impression this seems to be a widespread notion.


i can't remember the last time when i needed 150 sec for a run, and can 
live with 10 sec for a 350 page document (if that becomes an issue i 
have to upgrade hardware)


So in this sense, yes you are right, when you need one of lualatex 
advanced features it's to be considered okay, as in pdflatex was okay 20 
years ago, if you do not specifically need it though, then no, stick 
with pdflatex.

indeed. i suppose that most latex users can just use pdftex, because 
after all the selling point is often 'articles' and such and those 
styles are (i assume) pretty stable and when the language is english 
there is little to gain from luatex (even 8 bit fonts are okay then)


just use what works best (pdftex will be around for ages); i assume 
latex will become faster over time so maybe in a few years your users 
won't notice a move to luatex


Hans




Re: [luatex] LuaTeX precompilation

2023-12-20 Thread Hans Hagen

On 12/20/2023 12:08 PM, Henrik Mannerström wrote:
According to my very limited understanding, luatex is generally slower 
than pdftex because the font loading is more involved [1,2,3,4]. In 
pdftex, compilation can be sped up by precompiling a preamble to a 
format, which loads much faster. To my understanding precompiled formats 
are not that useful in luatex due to mixing of tex- and lua-code. [5]


As I'm probably not the only one that writes many more short 
math-oriented notes than complete books, the startup speed of luatex is 
an issue. I resisted the switch from pdftex to luatex very long because 
of this.


How hard would it be to make a luatex compatible precompiled format? One 
that basically serialises the whole luatex state to a file and could 
re-use it? Would it speed up the compilation of short documents? Are 
there any plans or projects related to this?

you don't specify what 'slower is', depending on what you do pdftex is 
some 3 times faster on simple things (only paragraphs of text) but the 
difference becomes smaller on more complex documents where lua kicks in 
to help with otherwise time consuming tasks


in practice one can neglect the performance drop because computers 
likely have become (more than) 3 times faster since 2005, when luatex 
showed up, and at that time pdftex performance was considered okay


anyway, it all depends on what macros you use, maybe try to 
lean-and-mean that bit, so there is nothing that the engine can do for you


(if you run many small docs you can run them in parallel)

Hans




Re: [luatex] Performace comparison

2023-11-30 Thread Hans Hagen

On 11/30/2023 7:32 PM, Andreas Scherer wrote:

Dear all,

While working on next year's 'knuth-pdf' package 
(https://ctan.org/pkg/knuth-pdf) I have noticed that LuaTeX is the one 
PDF engine that revs up the fan of my laptop more than the others.


Running 'time makeall' (see 
https://mirrors.ctan.org/web/pwebmac/makeall) gives the following 
results (second run each):


$ time makeall -npt xetex
real    1m2,738s
user    1m7,745s
sys 0m8,641s
[$ wc -c *.pdf -> 30355436 total]

$ time makeall -npt pdftex
real    0m53,242s
user    0m53,042s
sys 0m2,140s
[$ wc -c *.pdf -> 40045111 total]

$ time makeall -npt luatex
real    1m15,286s
user    1m13,728s
sys 0m3,615s
[$ wc -c *.pdf -> 44176632 total]

Just an observation.

So (just a remark),

pdftex  53 sec  8 bit input, 8 bit fonts
xetex   63 sec  utf input, wide fonts
luatex  75 sec  utf input, wide fonts

Given the 53 sec which is quite some for a pdftex run, luatex is not 
doing that bad. On pure text runs luatex with open type fonts can be 
three times slower, and depending on circumstances xetex can be faster 
than luatex (so that metric looks ok compared to the others).


Of course there are scenarios where luatex can be faster. Anyway, 
unicode and wide fonts have impact on the parsing of input, memory 
usage, font handling, language processing, backend etc.


Hans




Re: [luatex] Identifying why LuaTeX 2023 is slower than LuaTeX 2022

2023-08-30 Thread Hans Hagen

On 8/30/2023 9:44 AM, Max Chernoff wrote:


None of these patches should have any performance implications, yet
"sys" is ~40% slower than either "2023" or "2023-initial", which makes
me think that the LuaTeX currently in TL23 ("sys") was erroneously
compiled with "-O0" or something.


i'm a bit puzzled now:

so the tl 2023 bin is slower but when you compile fresh the performance 
is the same? if so ... then why waste time on it if generating a new bin 
solves the problem?


i can indeed imagine different compile flags being used, although i 
wonder if that will give a 50% performance gain .. i did observe small 
differences between gcc versions and a few options but we're talking low 
percentages here



Is the same
fontloader used for every test (which assume stability over 2017 - 2023)?


Yes.


Is the luatex build.sh used or some tex live one?


I used the LuaTeX "build.sh" for all the "20xx" binaries, albeit with a
few tiny patches. Very detailed build instructions here:

https://tug.org/~mseven/luatex.html#binary-details

"2023-initial" would have been built with the standard TL build script.
"sys" was probably built with the TL script, but it's a special ad-hoc
release so something different might have happened.


i'm not sure if there was a reason for that (afaik texlive goes for a 
more conservative older compilation just to make sure it works on older 
os versions)



Benchmark 3: PATH=/tmp/texlive-testing/2023-initial/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
  Time (mean ± σ):      3.958 s ±  0.039 s    [User: 3.933 s, System: 0.024 s]
  Range (min … max):    3.908 s …  4.031 s    10 runs

Benchmark 4: PATH=/tmp/texlive-testing/sys/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
  Time (mean ± σ):      5.519 s ±  0.054 s    [User: 5.489 s, System: 0.029 s]
  Range (min … max):    5.464 s …  5.589 s    10 runs

All 4 tests use the exact same TL23 texmf trees, so the TeX code, Lua
code, and all the other binaries are identical; only the "luahbtex" and
"luatex" binaries are different. The only difference between "2023-
initial" and "sys" binaries should be the socket and debug/popen
patches; they (probably) used the same compilers and the source should
be otherwise identical.
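
As a quick check on the size of the gap, the two mean times from Benchmark 3 and Benchmark 4 above can be turned into a relative slowdown. This Python sketch only redoes that division; the constant and function names are made up for the illustration.

```python
# Mean wall-clock times in seconds from Benchmark 3 and Benchmark 4 above.
MEAN_2023_INITIAL = 3.958
MEAN_SYS = 5.519


def relative_slowdown(baseline: float, candidate: float) -> float:
    """How much longer `candidate` takes, as a fraction of `baseline`."""
    return (candidate - baseline) / baseline


slowdown = relative_slowdown(MEAN_2023_INITIAL, MEAN_SYS)
print(f"sys vs 2023-initial: {slowdown:.1%} slower")
```

For these two runs the result lands just under 40%, which matches the "~40% slower" figure quoted earlier in the message.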


These are as you mentioned irrelevant there.


Yet the current TL23 LuaTeX binary ("sys") is 50% slower than the
initially-released TL23 LuaTeX binary ("2023-initial"), so something
weird is definitely going on. My guess would be that the current LuaTeX
binaries were compiled with -O0 while all the other binaries use -O3, or
something similar. Just a guess though.


Factorial does little (not spread all over token space). Some mem access 
for registers, a little amount of macro tokens that likely sit in the 
cpu cache. Plus making a macro that gets larger body every iteration (so 
that is actually the bottleneck as it involved copying tokens). As you 
start ini tokens are not scattered that much.


So, do you see the same 50 % drop with the current luatex when you 
compile without O3 ?


One thing I can imagine is that there is less inlining applied at the lua 
end, but older compilers (doesn't tl still use gcc 7?) were not that 
aggressive in that anyway. (fwiw, link time optimization at most gains a 
few % on tex but that's even more recent.)


In factorial the bottleneck is likely in \the, which (because we're 
a utf engine) has a couple of calls that might benefit from inlining, 
but imo that is unlikely to account for the 50% drop.


Hans





Re: [luatex] Identifying why LuaTeX 2023 is slower than LuaTeX 2022

2023-08-30 Thread Hans Hagen

On 8/30/2023 1:16 AM, Reinhard Kotucha wrote:

On 2023-08-29 at 15:38:35 +0200, Hans Hagen wrote:

  > So guess what: as i mentioned before there has been a discussion about
  > portable fmt files (iirc a latex request) and the normal luatex build
  > script has --no-dump-share but afaiks tex live doesn't do that (any
  > longer) so then we get 0.020 sec per run more (2+ sec on 100 runs).
  >
  > (plain has a very small format so one can wonder if it's the same there
  > as in other macropackages)

Hi Hans,
as far as plain TeX is concerned, I don't think that the endianmess is
the main problem.  As of 2016 TeX Live loads the file UnicodeData.txt
(1.9 MB) into the format file.


i didn't say that it was for plain, although one should realize that a 
format file is compressed so the real one is a bit bigger .. reports 
about 'performance' here are mixed: plain (basically only measuring 
start up time) as well as reports on latex performance drops (which 
involves loading more i guess)



If you remove the line

   \input load-unicode-data.tex

from luatex.ini you probably save much more time than with
--disable-dump-share.


well, i don't have or load that file at all .. we have a plain setup 
right from the beginning of the project that we use; fyi: format file 
size 361K compressed, 1.44 M uncompressed. And that's what i measured with.


the fact that you mention some large file being loaded (maybe that's an 
inefficient one indeed) already indicates that we're talking macro 
package specifics here, which for me makes further tests irrelevant



  > Of course there can be other factors in a non context universe like the
  > time needed to load a file database (wasn't there a upper / lowercase
  > checking change recently?), fonts etc.

There are currently more than 200,000 files in texmf-dist.  I've
written a script based on strace(1) which creates a tiny subset of TeX
Live in another directory containing only the files needed to process
that particular document.  It's invoked like this:


I don't have that many files, although plenty of fonts.


tinytex luatex myfile.tex

When I compile a plain TeX document this way the tiny TeX distribution
created by my script usually contains less than 30 files.


Plain doesn't need much.


Kpathsea needs a bit more time to locate files within a system
containing 200,000 files than within a system containing only 30 files.
The difference is measurable but negligible.  Kpathsea is extremely
fast.


Good that this has been made better because in the past a full tex live 
installation was good for adding seconds to startup time. SSD's and OS 
file caching probably have been of help here too.


btw, the reports here are for linux, windows is a bit different because 
there using a native vs a cross compiled binary makes a difference but 
let's not dive into that



  > Then when we talk macro packages,

We shouldn't.  If we are interested in the speed of an *engine* we
have to avoid macro packages and LaTeX by all means.  LaTeX introduces
too many rooms of freedom and is a moving target.


well, now you assume all plains being the same but there is no standard 
plain here and i haven't seen timings that are pure engine based apart 
from startup time



For benchmarking it's best to stick with plain TeX.  It's pretty easy
to create a 500+ pages lorem ipsum with a halfways reasonable text
editor.

Well, that's what I did: 1000 ipsums. But using a plain that is not that 
different from a few years back. And 0.04 sec start up difference is no 
big deal given possible changes in libraries, maybe kpse, the compiler, 
etc. In the early days of luatex we even noticed that sometimes adding 
something to the engine made for a change in performance (cache hits, 
which was then solved by upgrading hardware) so ... also a bit of a gamble.


Hans




Re: [luatex] Why does LuaTeX show the middle dot instead of ano teleia?

2023-08-29 Thread Hans Hagen

On 8/29/2023 8:59 PM, Heiko Oberdiek via luatex wrote:

Hello,

On 2023-08-29 20:29, Joseph Wright wrote:

On 29/08/2023 19:27, Heiko Oberdiek via luatex wrote:


using LuaTeX to review the glyphs of a font, I discovered an oddity 
about U+0387 ANO TELEIA. LuaTeX shows U+00B7 MIDDLE DOT instead.



 \symbol{"00B7}% MIDDLE DOT
 \symbol{"0387}% ANO TELEIA



 From UnicodeData.txt:

 0387;GREEK ANO TELEIA;Po;0;ON;00B7;;;;N;;;;;

so it looks like it's a simple normalisation.


Start of the UnicodeData.txt format description 
(https://www.unicode.org/reports/tr44/#UnicodeData.txt):

   [0] Code value
   [1] Character name
   [2] General category
   [3] Canonical combining classes
   [4] Bidirectional category
   [5] Character decomposition
   ...
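
To see which field carries the decomposition, the record can be split on semicolons. The following is a minimal sketch in plain Lua, not part of the original thread: the helper name split_fields is made up for illustration, and the record is written out with all 15 fields as in the published UnicodeData.txt.

```lua
-- Split one UnicodeData.txt record into its semicolon-separated fields.
-- Empty fields are kept, so the indices line up with the format description
-- (Lua tables are 1-based, so field [5] of the description is fields[6]).
local function split_fields(line)
  local fields = {}
  for field in (line .. ";"):gmatch("([^;]*);") do
    fields[#fields + 1] = field
  end
  return fields
end

local record = "0387;GREEK ANO TELEIA;Po;0;ON;00B7;;;;N;;;;;"
local fields = split_fields(record)

print("code value:    " .. fields[1])
print("name:          " .. fields[2])
print("decomposition: " .. fields[6])
```

The decomposition field has no compatibility tag in angle brackets, which marks it as a canonical mapping to U+00B7.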

In the LuaTeX manual, I found:

| Normalization of the Unicode input is on purpose not built-in and
| can be handled by a macro package during callback processing.
| We have made some practical choices and the user has to
| live with those.

The TeX input above, however, is plain ASCII. Therefore, any 
normalization of the file contents should not matter.


Of course, I do not want to have any decomposition that replaces
the glyph with a different character. That would make reviewing
the original glyph impossible.
Indeed it's not a luatex issue. Rendering in context gives two different 
glyphs with the mentioned unicodes.


Hans




Re: [luatex] Identifying why LuaTeX 2023 is slower than LuaTeX 2022

2023-08-29 Thread Hans Hagen
/ luatex test.tex
1.01 ± 0.05 times faster than 
PATH=/tmp/texlive-testing/2019/bin/x86_64-linux:/bin/ luatex test.tex
1.02 ± 0.04 times faster than 
PATH=/tmp/texlive-testing/2020/bin/x86_64-linux:/bin/ luatex test.tex
1.08 ± 0.05 times faster than 
PATH=/tmp/texlive-testing/2023-initial/bin/x86_64-linux:/bin/ luatex test.tex
1.12 ± 0.06 times faster than 
PATH=/tmp/texlive-testing/2018/bin/x86_64-linux:/bin/ luatex test.tex
1.13 ± 0.08 times faster than 
PATH=/tmp/texlive-testing/2017/bin/x86_64-linux:/bin/ luatex test.tex
1.42 ± 0.06 times faster than 
PATH=/tmp/texlive-testing/sys/bin/x86_64-linux:/bin/ luatex test.tex


The "20xx" binaries are freshly-built versions of the LuaTeX source from
that year, "2023-initial" is the LuaTeX binary initially included with
TL23, and "sys" is the current TL23 LuaTeX binary (there was a mid-year
update this year). All the tests used an up-to-date TL23 tree, with only
the LuaTeX binaries modified. I've attached the full test script.

"2023" (1.01 ± 0.05), "2023-initial" (1.08 ± 0.05), and
"sys" (1.42 ± 0.06) were all compiled from the same sources (mostly) and
with the same compilers (I think?), so I'm not sure why the current
binaries have such a large regression. Compiler flags maybe? I think
that Karl built the current TL23 binaries ("sys"), so he might know
more.

Thanks,
-- Max









Re: [luatex] Identifying why LuaTeX 2023 is slower than LuaTeX 2022

2023-08-28 Thread Hans Hagen

On 8/28/2023 5:43 PM, Paulo Roberto Massa Cereda wrote:

Thanks for the feedback, Hans! Apologies for the "vague" context, I
was in a hurry. I will try to provide more details as follows:

My initial test was:

8<
\documentclass{article}

\usepackage{fontspec}

\setmainfont[
 Extension=.ttf,
 Path=./fonts/,
 UprightFont=*-Regular,
 BoldFont=*-Bold,
 ItalicFont=*-Italic,
 BoldItalicFont=*-BoldItalic
]{JetBrainsMono}

\begin{document}

Hello world.

\end{document}


So that is mostly measuring start up time.

There is no really big difference between jetbrains mono regular and, for 
instance, dejavu sans mono:


1000 lorem paragraphs (standard 12pt context layout):

% jetbrain : 200 pages 1.4 sec
% dejavu   : 200 pages 1.2 sec

if we just process the font (hbox)

% jetbrain : 0.39 sec (0.15) sec for 200 pages
% dejavu   : 0.36 sec (0.15) sec for 200 pages

in base mode:

% jetbrain : 0.15 sec for 200 pages
% dejavu   : 0.15 sec for 200 pages

(windows 10, luametatex with context, 2017 laptop with Intel(R) Xeon(R) 
CPU E3-1505M v6 @ 3.00GHz)


Just to give you some comparison.


8<

JetBrains Mono was obtained from here: https://www.jetbrains.com/lp/mono/

By "cold cache" (and apologies in advance for probably using the wrong
term for describing), I was referring to deleting

texmf-var/luatex-cache

before each benchmark run. I did this with both TL 2022 and TL 2023.

Then I tried with an actual document. I had Lorem Ipsum paragraphs
replicated more than 1000 times:

➜ wc -l test.tex
2499 test.tex

➜ hyperfine -L year 2022,2023 --warmup 2 'texlive/{year}/bin/x86_64-linux/lualatex test.tex'
Benchmark 1: texlive/2022/bin/x86_64-linux/lualatex test.tex
   Time (mean ± σ):      2.881 s ±  0.021 s    [User: 2.771 s, System: 0.108 s]
   Range (min … max):    2.848 s …  2.907 s    10 runs

Benchmark 2: texlive/2023/bin/x86_64-linux/lualatex test.tex
   Time (mean ± σ):      5.814 s ±  0.059 s    [User: 5.675 s, System: 0.128 s]
   Range (min … max):    5.760 s …  5.910 s    10 runs


hm, so startup time differences can be neglected


Summary
   texlive/2022/bin/x86_64-linux/lualatex test.tex ran
 2.02 ± 0.03 times faster than
texlive/2023/bin/x86_64-linux/lualatex test.tex

  For HarfBuzz, I had the OTF font example above with:


That could be changes in that library, right?


\setmainfont[
 Extension=.ttf,
 Path=./fonts/,
 UprightFont=*-Regular,
 BoldFont=*-Bold,
 ItalicFont=*-Italic,
 BoldItalicFont=*-BoldItalic,
 Renderer=HarfBuzz
]{JetBrainsMono}

For the normalize thing one, I had:

\setmainfont[
 Extension=.ttf,
 Path=./fonts/,
 UprightFont=*-Regular,
 BoldFont=*-Bold,
 ItalicFont=*-Italic,
 BoldItalicFont=*-BoldItalic,
 RawFeature=-normalize
]{JetBrainsMono}

Marcel suggested me to update TL2022 luaotfload-tool to the latest
version as well, so I did:

➜ texlive/2022/bin/x86_64-linux/luaotfload-tool --version | grep "luaotfload-tool version"
luaotfload-tool version: "3.24"

➜ texlive/2023/bin/x86_64-linux/luaotfload-tool --version | grep "luaotfload-tool version"
luaotfload-tool version: "3.24"

➜ hyperfine -L year 2022,2023 --warmup 2 'texlive/{year}/bin/x86_64-linux/lualatex test.tex'
Benchmark 1: texlive/2022/bin/x86_64-linux/lualatex test.tex
   Time (mean ± σ):     689.6 ms ±   2.0 ms    [User: 588.9 ms, System: 100.1 ms]
   Range (min … max):   687.3 ms … 693.6 ms    10 runs

Benchmark 2: texlive/2023/bin/x86_64-linux/lualatex test.tex
   Time (mean ± σ):      1.264 s ±  0.007 s    [User: 1.151 s, System: 0.112 s]
   Range (min … max):    1.254 s …  1.277 s    10 runs

Summary
   texlive/2022/bin/x86_64-linux/lualatex test.tex ran
 1.83 ± 0.01 times faster than
texlive/2023/bin/x86_64-linux/lualatex test.tex

I also tried with \usepackage[T1]{fontenc}:

➜ hyperfine -L year 2022,2023 --warmup 2 'texlive/{year}/bin/x86_64-linux/lualatex test.tex'
Benchmark 1: texlive/2022/bin/x86_64-linux/lualatex test.tex
   Time (mean ± σ):     457.4 ms ±   3.1 ms    [User: 380.7 ms, System: 76.3 ms]
   Range (min … max):   452.8 ms … 463.6 ms    10 runs

Benchmark 2: texlive/2023/bin/x86_64-linux/lualatex test.tex
   Time (mean ± σ):     893.9 ms ±   8.4 ms    [User: 805.5 ms, System: 88.0 ms]
   Range (min … max):   886.8 ms … 915.7 ms    10 runs

Summary
   texlive/2022/bin/x86_64-linux/lualatex test.tex ran
 1.95 ± 0.02 times faster than
texlive/2023/bin/x86_64-linux/lualatex test.tex

I can create a GitHub repository with some tests for reproducibility,
if it helps.

I will try to compile the sources. Will report soon.

best try that first because it might be that you have a suboptimal 
binary, or get bins from


https://dl.contextgarden.net/build/luatex/

(these are generated in the tex live setup so basically for older 
linuxes in order to be compatible)


Hans


Re: [luatex] Identifying why LuaTeX 2023 is slower than LuaTeX 2022

2023-08-28 Thread Hans Hagen

On 8/28/2023 4:49 PM, Paulo Roberto Massa Cereda wrote:

Dear friends,

I beseech your wisdom. :) In my tests, I noticed LuaTeX 2023 is
significantly slower than the 2022 counterpart. Here's a MWE:

8<
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit
esse cillum dolore eu fugiat nulla pariatur. Excepteur
sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.
\bye
8<

Benchmark 1: texlive/2022/bin/x86_64-linux/luatex test.tex
   Time (mean ± σ):     177.7 ms ±   1.4 ms    [User: 140.9 ms, System: 36.6 ms]
   Range (min … max):   175.8 ms … 180.3 ms    16 runs

Benchmark 2: texlive/2023/bin/x86_64-linux/luatex test.tex
   Time (mean ± σ):     286.7 ms ±   2.3 ms    [User: 246.1 ms, System: 40.3 ms]
   Range (min … max):   284.0 ms … 291.2 ms    10 runs

Summary
   texlive/2022/bin/x86_64-linux/luatex test.tex ran
 1.61 ± 0.02 times faster than texlive/2023/bin/x86_64-linux/luatex test.tex

I initially had tried with LuaLaTeX and got similar results. Ulrike
Fischer and Marcel Krüger kindly helped me find out what was going on,
but we could not find anything relevant. I've tried:

- loading a custom OTF font: 2022 was 1.53 ± 0.02 (cold cache) and
1.83 ± 0.03 (existing cache) times faster
- loading the default OTF font: 2022 was 1.68 ± 0.02 (cold cache) and
2.02 ± 0.03 (existing cache) times faster
- HarfBuzz: 2022 was 1.57 ± 0.02 times faster (cold cache) and 1.84 ±
0.05 times faster (existing cache)
- raw feature (normalize): 2022 was 1.81 ± 0.03 times faster
- updated luaotfload-tool in TL2022: 2022 was 1.83 ± 0.01 times faster
- no OTF: 2022 was 1.95 ± 0.02 times faster

I was wondering if someone could shed some light into this. :) Thanks!

I can't test it here (no 2022 installed and i'm also not going to do 
that) but you need to define


- cold cache
- custom OTF
- default OTF
- raw feature

as i have no clue what that means here. Also, a simple single paragraph 
is no real test. How does it look for 50 pages? What about a more complex 
document of say 300 pages? In a simple test startup time (whatever needs 
to be initialized) kicks in. So, what if no fonts are loaded at all? If 
you use plain, what fonts get loaded by default?


I can't really compare (laptop windows 10) but using my plain version it 
needs .5 sec for a single li, 1.2 seconds for 100 pages of it, 4.48 
seconds for 500 pages.


As a comparison, context needs 5.2 sec for 500 pages in node mode and 
3.65 sec in base mode (luametatex: 5.15 and 3.45 with 1.6 sec spent in 
the backend). Anyway, over 80 pps for a simple document like this, which 
will of course drop down to 20-40 pps for a more complex document.


Some 0.1 seconds loss between versions is not dramatic unless it 
multiplies with larger runs.


(btw, i'd expect windows to be a bit more sensitive between years 
depending on native vs crosscompiled and tighter mem protection between 
versions.)


So, a question is: what is the fontless baseline? What if you compile a 
binary for your system yourself? Is this only plain or also other macro 
packages?


Another thing I can think of is that format loading time changed maybe 
because of a change in endian related storage. I remember some 
discussion about that (portable formats) and on intel byte swapping then 
kicks in. But again that's only startup time related.


Hans





Re: [luatex] Unicode support

2023-08-13 Thread Hans Hagen

On 8/13/2023 1:12 PM, Werner LEMBERG wrote:



Do you use the `uniParse.tcl` script, probably

   https://opensource.apple.com/source/tcl/tcl-14/tcl/tools/uniParse.tcl.auto.html ,

to regenerate `slnudata.c` as soon as a new Unicode version gets
published?


Having a closer look it seems that this file only handles the BMP
(i.e., the range U+0000 to U+FFFF), which is, well, not optimal.
Wouldn't it thus make sense to switch to an actively maintained UTF-8
library like 'luautf8'?

   https://github.com/starwing/luautf8

basically the slunicode is obsolete (but kept) ... one can just write 
lua helpers instead (there are some traversers etc in the string 
namespace)
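
As an illustration of such a helper (not from the original mail): current LuaTeX is built on Lua 5.3, whose standard utf8 library is not limited to the BMP. This is a minimal sketch, assuming that library is available in the binary; the function name codepoints is made up here.

```lua
-- Collect the code points of a UTF-8 string with the Lua 5.3 utf8 library.
local function codepoints(s)
  local result = {}
  for _, cp in utf8.codes(s) do
    result[#result + 1] = cp
  end
  return result
end

-- "a", U+0387 GREEK ANO TELEIA, and U+1D400 (outside the BMP)
local sample = "a" .. utf8.char(0x0387, 0x1D400)

for _, cp in ipairs(codepoints(sample)) do
  print(string.format("U+%04X", cp))
end
```

The traversers in the string namespace that the reply mentions serve the same purpose inside LuaTeX.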


Hans





Re: [luatex] runtime performance of lua(la)tex incl. gprof

2023-06-22 Thread Hans Hagen
tex). I occasionally run a test that goes from 
simple to more complex that makes luametatex run faster than pdftex. So, 
don't stare too much at these numbers.


Over the last few years I managed to speed up quite some parts of the 
engine but measuring significant gain in some specific area by running 
millions of iterations becomes less impressive once the real number of 
expansions (etc) is in the tens of thousands or less. Say that 
luametatex is some 50% faster than luatex (of which we lose 10-15% on 
the more advanced and demanding backend). That is nice but inefficient 
macro code (could come from anywhere) quickly makes that gain go away 
because it touches those improved areas less than some core expansion 
stuff.


So, to come back to your 8 times slower luatex than pdftex ... if 
context on simple files is < 2.5 times slower and often less, then you 
need to look at the macros (or lua code if used) and not at the engine. 
There's little to gain there.


(You can try to run with luajittex which has a faster virtual machine.)

Hans




Re: [luatex] runtime performance of lua(la)tex incl. gprof

2023-06-21 Thread Hans Hagen

On 6/21/2023 1:41 PM, Axel Kittenberger wrote:

On Wed, Jun 21, 2023 at 12:04 PM Taco Hoekwater  wrote:


Using dummy data doesn’t help I believe, because you then need to check
the memory pointer against the dummy pointer instead of against NULL. That
should not be helpful.



Albeit the idea was to be branch less in the function that is a bottleneck,
you are right, I threw something like this together (buggy, it will crash
on an unknown symbol) albeit it compiled the test that on the other hand
did indeed not help at all, even very slightly slower (likely due to
initialization time of the dummy variables)

I checked the HIGH being array patch and indeed it gains max about 1% on 
the pure text ipsum test; in practice pages are of course not that full 
so i see no real gain on a regular 400 page document. Other 
optimizations probably make that it's not more, but in a more realistic 
test it's also more likely that there are more cache misses anyway (if 
the cache is sort of small).


But even little gain is some gain so I suppose we can patch it indeed.

Hans




Re: [luatex] runtime performance of lua(la)tex incl. gprof

2023-06-20 Thread Hans Hagen
ring. I bet that the 350 page manual 
will be processed in 4 seconds, which is ok for me.


ps. context spends > 50% of its time in lua so it gets harder to gain 
runtime in the frontend (but it does happen)


ps. don't take context as reference; i can imagine that latex and plain 
process faster, you mention latex:


   pdflatex: user 0m 1.920s (3.1 MB result)
   lualatex: user 0m17.565s (3.8 MB result)

   8 times slower.

which is indeed somewhat suspicious, but it might be due to 8 bit vs 32 
bit input and fonts. More realistic is the plain comparison:


   pdftex: user 0m1.053s (2.9 MB result)
   luatex: user 0m1.943s (3.1 MB result)

but i guess both use 8 bit fonts here (luatex in base mode) so .9 sec 
for an engine with more features is okay i think, esp given that 
machines got a bit faster since latex showed up.


Hans





Re: [luatex] asking for style/code critique

2022-12-18 Thread Hans Hagen

On 12/18/2022 8:58 PM, Werner LEMBERG wrote:


attached is my first serious attempt to write a 'hyphenate' callback
(to be used for LilyPond's Texinfo documentation).  It is actually my
first Lua program ever.  Fortunately, initial tests indicate that it
works as expected :-)

Please have a quick look and check whether you can see any big
problems – mainly stylewise (besides using two spaces for indentation
instead of three) but perhaps codewise, too.
it's not my habit to comment publicly on coding styles so i limit myself 
to suggesting to make your shorthands local, so


local DISC = node.id("disc")

just to avoid a clash with someone else's code
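
A slightly larger sketch of the same idea, added here for illustration: it needs LuaTeX because it uses the node library, and the helper count_discs is hypothetical. Everything a file needs is bound to a local name once at the top, so nothing leaks into the global namespace.

```lua
-- Keep lookups and shorthands local to the file, so they cannot clash with
-- globals of the same name defined by other packages.
local node_id  = node.id
local traverse = node.traverse

local DISC = node_id("disc")

-- Hypothetical helper: count the discretionary nodes in a node list.
local function count_discs(head)
  local n = 0
  for item in traverse(head) do
    if item.id == DISC then
      n = n + 1
    end
  end
  return n
end
```

As a side effect, local variables are also cheaper to access than globals, which helps in code that runs for every paragraph.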

Hans




Re: [luatex] adjusting `hyphenate` callback for single function

2022-12-08 Thread Hans Hagen

On 12/8/2022 11:10 AM, Werner LEMBERG wrote:


Folks,


being a complete newbie who wants to experiment a little with LuaTeX,
I ask for some help to improve `texinfo.tex` for generating the
documentation of LilyPond.

Within the `@code` macro, I want to do two things.

(a) Hook into the `hyphenate` callback to insert hyphenation
 discretionaries based on some constraints.

(b) Hook into `pre_linebreak_filter` (or rather `contribute_filter`?)
 to inhibit line breaks before/after a single character if
 preceded/followed by a (single) space.  In other words, this space
 should be replaced with `@tie{}`.

How can I do that?  I guess I have to do the following.

(1) Write hooks `code_hyphenate` and `code_pre_linebreak_filter`.

(2) At the beginning of `@code`, append the `code_hyphenate` hook.

(3) At the end of `@code`, remove the `code_hyphenate` hook.

Assuming this is the correct approach I have some questions.

* It's not written explicitly in the LuaTeX manual but I guess that
   the argument to the `hyphenate` hook is a single word.

* I couldn't find information in the manual about the order of hooks.
   It seems that I have to save the original hook by myself and call it
   before/after, right?

* Is there already some code available that I could use as a starting
   point?

talking of luatex, if you want to influence hyphenation, you can best 
plug into the pre_linebreak_filter and hpack_filter callbacks and just 
do some interpretation of the node lists (glyph nodes) and, based on what 
you want to do, inject or replace them by disc nodes that then control 
the hyphenator later on as well as the linebreak routine


it's all up to the macro package how to organize and order things and 
they all do things differently
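
A minimal sketch of that approach, added for illustration only. It assumes plain LuaTeX with direct use of callback.register (a format may wrap callback registration differently), and the rule "allow a break after every underscore that is followed by another glyph" is an invented example, as are the names BREAK_AFTER and add_breakpoints.

```lua
local GLYPH = node.id("glyph")
local DISC  = node.id("disc")

-- Hypothetical rule: allow a line break after every underscore.
local BREAK_AFTER = { [string.byte("_")] = true }

local function add_breakpoints(head)
  local current = head
  while current do
    local nxt = current.next
    if current.id == GLYPH and BREAK_AFTER[current.char]
       and nxt and nxt.id == GLYPH then
      -- A disc node with empty pre, post and replace lists marks a
      -- possible break that inserts no hyphen character.
      local disc = node.new(DISC)
      disc.penalty = tex.hyphenpenalty
      head = node.insert_after(head, current, disc)
    end
    current = nxt
  end
  return head
end

callback.register("pre_linebreak_filter", function(head)
  return add_breakpoints(head)
end)
callback.register("hpack_filter", function(head)
  return add_breakpoints(head)
end)
```

How this coexists with other code that uses the same callbacks depends on the macro package, as noted above.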


Hans





Re: [luatex] Upgraded luatex sub/super scripts

2022-11-19 Thread Hans Hagen

On 11/19/2022 11:20 PM, Orders wrote:

I run luatex for plain TeX using the miktex installation under linux
mint 21.  I upgraded the miktex installation today, and single-character
subscripts and superscripts cause an error.


For example, $x_1$ produces the following error

! \textfont-1 is undefined (character 49).
l.2 The incorrect variable is $x_1$

Enclosing the sub/superscript in braces ($x_{1}$) fixes the problem.

A minimal test program is as follows:

\input luaotfload.sty
The incorrect variable is $x_1$.  The correct variable is $x_{1}$.
\end

Is there any change which might cause this?  Is this a miktex problem?

maybe you're using the experimental version, can you try

\variablefam=-1 % or 99 or so

if that works, you just have to wait till a pending 'default value' 
patch bubbles up


Hans






Re: [luatex] token.expand: how to use it?

2022-10-23 Thread Hans Hagen

On 10/23/2022 12:34 AM, Eduardo Ochs wrote:

Hi list,

how do I use the function token.expand, that is (very) briefly
mentioned here?

   http://mirrors.ibiblio.org/CTAN/systems/doc/luatex/luatex.pdf#page=224

I've put this in a .tex file

   \def\foo{FOO}
   \def\bletch#1{BL#1ETCH}

and entered a Lua REPL. All the "print"s below give meaningful
results,

   print(token.get_macro("foo"))
   print(token.get_macro("bletch"))
   print(token.get_meaning("foo"))
   print(token.get_meaning("bletch"))
   print(token.create("bletch"))
   print(token.create("bletch").expandable)

and the last one says "true"... but I've tried to call token.expand
with several kinds of arguments, and in all cases I got this error,

   ! Undefined control sequence.
   \repl ->\directlua { print(); run_repl2_now() }

where \repl is the tex macro that I use to run my Lua REPL...
token.expand is rather useless and a left-over from earlier mechanisms 
that have been replaced (it's now kind of hard to make an example).


Anyway, you can best push a token into the input with tex.print:

\def\bletch#1{BL#1ETCH}

\directlua{tex.print(token.create("bletch"))}x
\directlua{tex.print(token.create("bletch"))}{xxx}xx

Hans




Re: [luatex] extreme slow down using pre release TL 2022 with lualatex

2022-03-18 Thread Hans Hagen

On 3/18/2022 1:11 AM, Nasser M. Abbasi wrote:


After adding more problems, I see the slow down.


could be a memory issue (if lua is used, maybe the garbage collector)
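
One way to check the garbage collector hypothesis is to log Lua's memory use at a few points in the run. This is a minimal sketch in plain Lua, added for illustration; the function name report_lua_memory is made up, and it only reports memory managed by Lua, not TeX's own node and token memory.

```lua
-- Report the memory currently used by Lua; collectgarbage("count")
-- returns the size in kilobytes.
local function report_lua_memory(label)
  local kb = collectgarbage("count")
  print(string.format("%s: %.1f KB of Lua memory in use", label, kb))
  return kb
end

local before = report_lua_memory("before")
collectgarbage("collect")   -- force a full collection cycle
local after = report_lua_memory("after full collection")
```

If the reported figure keeps growing from page to page, the Lua side is a candidate for the slowdown.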


May be because I am running this inside Ubuntu under windows 10
Linux subsystem this slow down happens. I do not know. I only have
one PC now. I need to go buy a new PC and install
Linux on it and try again to see if this is the cause.

should be no problem, but it can depend on what variant you use:

wsl 1 : fast disk access across os boundaries
wsl 2 : faster disk access inside vm, much slower across

(so if you access files via /mnt/c then use a wsl 1 machine)

but it is more likely that the slowdown relates to (your) macros or 
approach to the problem because there have been no changes in the engine 
that affect memory, node / token consumption, or logging, but it can be 
you just cross some critical border now.


Hans



Re: [luatex] Processing every letter in math

2022-03-15 Thread Hans Hagen

On 3/15/2022 6:16 PM, Javier Bezos wrote:

How can I process every single letter in math? Processing
glyph nodes in text is more or less trivial, and to some
extent even in-line math (surrounded by 'math' type nodes),
but processing math, including displays, sub/superscripts,
fractions, radicals. etc.? From the manual I think the
solution must be the node.mlist_to_hlist and the
corresponding callback, and this is what I’ve achieved
so far:

---
\directlua{

function math_tweak(head, d, p)
   head = node.mlist_to_hlist(head, d, p)
   for item in node.traverse(head) do
     if item.id == node.id'glyph' then
       item.char = 88
     end
   end
   return head, d, p
end

callback.register('mlist_to_hlist', math_tweak)

}

$a$ ${b\over c}$ $\sqrt{d}$ $e$

$$f$$

\bye


Here, only ‘a’, ‘e’ and ‘f’ are converted to ‘X’ (= 88).

With a branch for item.id == 0 calling recursively the
function sub/superscripts are processed, but that’s not
quite correct.

just print(item.id) and you'll probably see vlists as well so you need 
to also process item.id == 1; you might also run into composed shapes 
(extensibles) .. it also depends on what a macro package does (\frac, 
\sqrt, which are not standardized and therefore less predictable)
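
A sketch of what that recursion could look like, added for illustration and not taken from the thread. It needs LuaTeX; the name tweak_glyphs is made up, and replacing every glyph by slot 88 is only kept from the test above as a visible marker.

```lua
local GLYPH = node.id("glyph")
local HLIST = node.id("hlist")   -- id 0
local VLIST = node.id("vlist")   -- id 1

-- Visit every glyph, descending into nested horizontal and vertical boxes.
local function tweak_glyphs(head)
  for item in node.traverse(head) do
    local id = item.id
    if id == GLYPH then
      item.char = 88               -- slot 88, as in the test above
    elseif (id == HLIST or id == VLIST) and item.head then
      tweak_glyphs(item.head)
    end
  end
end

local function math_tweak(head, display_type, need_penalties)
  -- Let the engine convert the math list first, then walk the result.
  local result = node.mlist_to_hlist(head, display_type, need_penalties)
  tweak_glyphs(result)
  return result
end

callback.register("mlist_to_hlist", math_tweak)
```

Glyphs that belong to extensible constructions, such as the pieces of a large radical, are visited too, so a real filter would probably need to check the font or the character before touching them.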


Hans



Re: [luatex] extreme slow down using pre release TL 2022 with lualatex

2022-03-14 Thread Hans Hagen

On 3/14/2022 10:07 AM, Nasser M. Abbasi wrote:


What can I do to find why it is taking so long? or
what is it doing?

- first check if it's a latex issue
- compare smaller files (with \tracingall)

afaik there have not been fundamental changes in the engine

Hans




Re: [luatex] portable fmt between 32-bit/64-bit machines

2022-02-08 Thread Hans Hagen

On 2/8/2022 3:23 AM, Michal Vlasák wrote:

On Tue Feb 8, 2022 at 1:58 AM CET, Michal Vlasák wrote:

I posted a few more details on github:

https://github.com/latex3/latex2e/issues/775#issuecomment-1032099002


An update to the proposed patch (more apparent error checking, correct
for both LuaJIT/5.1 and 5.3). And hopefully a solution to the
portability issue at hand (at least for some architectures).

https://github.com/latex3/latex2e/issues/775#issuecomment-1032138520

Hans, Luigi, what do you think?


as we have a patched lib in luatex anyway it's ok i guess


Of course there is a runtime cost, though negligible and it brings space
savings, which is a trade off the Lua guys have already made in 5.4.


hard to say if it saves something


I also think that the byte swapping for big/little endian could be done.
If the saved format is little endian there should be no cost for most
architectures (unlike the dump sharing on TeX side).


less important i think (and it has a quite a penalty on native windows 
binaries but maybe that's because of these allocations not being optimized)



Kind of related:

https://mailman.ntg.nl/pipermail/dev-luatex/2021-July/006501.html
whatever we do it should be a compile time option (if only to not give 
people yet another reason to complain that luatex is so slow)


Hans



Re: [luatex] portable fmt between 32-bit/64-bit machines

2022-02-07 Thread Hans Hagen

On 2/7/2022 9:15 AM, luigi scarso wrote:

Hi,
here
https://github.com/latex3/latex2e/issues/775

there is an issue about the portability of the lua bytecode .
Suggestions ?
In any case, the patches are  not for TeXLive 2022 .

Even with a patch (not sure which one, i only see akira mentioning the 
size of a blob) it is not guaranteed to be portable and formats are 
already not portable (endian wise) so why bother ... just make a new 
format, right? We've dealt with it for > 15 years so ... anyway, Karl 
doesn't mind if nothing changes so we're not really in a hurry ... 
indeed not for upcoming.


Btw, can't windows 32 be compiled with 64 bit pointers? Does the same 
issue occur with the mingw binaries?


Hans



Re: [luatex] Bad skips with display math

2022-01-13 Thread Hans Hagen

On 1/13/2022 5:46 PM, Javier Bezos wrote:



 It didn’t occur to me to think of the old good bidi features
of etex and I was just playing with \...dir’s. After some simple
tests it seems to do the trick.


After some additional tests, setting \predisplaydirection=0
doesn’t seem to be the solution.

Any additional suggestions on how to make luatex usable for
serious math typesetting with RTL languages will be welcome.
This includes the possibility of manipulations at the lua
level (for example, I’m wondering if something similar to the
macro \lastlinelength in sec. 7.7.2 can be used to catch and
fix the spacing).

I'm just looking for a solution. After three weeks of fighting
I’m somewhat desperate.

you can use callbacks to loop back over lists and see if some above glue 
was added and then go back till you find a line that you can then 
'repack' for testing to get the natural width


anyway, i have a patch for the r2l last line issue but as i'm in the 
middle of something else it has to wait for a few days to get included 
(maybe the weekend as i have to enter luatex mode)


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Bad skips with display math

2022-01-13 Thread Hans Hagen

On 1/13/2022 12:20 PM, luigi scarso wrote:



On Thu, Jan 13, 2022 at 11:58 AM Javier Bezos <mailto:jbez...@gmail.com>> wrote:



Frozen— So I presume the following bug won’t get fixed and
we have to find some workaround? 

https://tug.org/pipermail/luatex/2019-April/007126.html


We fix bugs and make small improvements, but stability of the engine is 
the main goal and it's possible that some bugs become features --- the 
issues should be managed at the format level.

and if we fix it, the fix might only be 'active' when a mode parameter is set 
(likely in this case) in order not to break something out of our scope


Hans


-
      Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Bad skips with display math

2022-01-12 Thread Hans Hagen

On 1/12/2022 7:00 PM, Javier Bezos wrote:

Hello,

Consider the following document, typeset with 1.13.2 (TeX Live 
2021/W32TeX):


==

Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
$$(a+b)^2 = a^2+2ab+b^2 \eqno (1.10)$$
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
$$(a+b)^2 = a^2+2ab+b^2 \leqno (1.10)$$
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

\textdir TRT \pardir TRT \bodydir TRT \pagedir TRT

Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
$$(a+b)^2 = a^2+2ab+b^2 \eqno (1.10)$$
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
$$(a+b)^2 = a^2+2ab+b^2 \leqno (1.10)$$
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

\bye


\predisplaydirection=0 might help here (defaults to -1)

that code is very unlikely to be changed (already tricky enough)


===

The result is http://www.texnia.com/archive/luamathskip.pdf

As you can see, the labels are reversed, but the calculations for
the vertical skips, based on the previous line, are not. Either
the labels shouldn’t be reversed (which makes sense for me,
because conceptually we are in math and not in a paragraph) or
the calculations have to be fixed.

Please, fix also the long standing bug which misplaces the label
with leqno when \pardir (but no other dirs) is set to TRT.

I'll look into it (in luatex maybe under some flag/mode control because 
we are frozen).


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] luatex doesn't know `TFMFONTS` environment variable

2022-01-09 Thread Hans Hagen

On 1/9/2022 10:49 AM, Werner LEMBERG wrote:


It seems that luatex only knows how to handle the `TEXFONTS` environment
variable for TFM files but not `TFMFONTS` – I did a test adding a
directory tree with

   TFMFONTS=".../foo/bar//;"

and the TFM files were not found.  Doing the same with `TEXFONTS` it
works as expected.

If this is intentional please document it.

\startitem
When kpathsea is used to find files, \LUATEX\ uses the \type {ofm} file
format to search for font metrics. In turn, this means that 
\LUATEX\ looks at
the \type {OFMFONTS} configuration variable (like \OMEGA\ and 
\ALEPH) instead
of \type {TFMFONTS} (like \TEX\ and \PDFTEX). Likewise for virtual 
fonts
(\LUATEX\ uses the variable \type {OVFFONTS} instead of \type 
{VFFONTS}).

\stopitem


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] [Dev-luatex] features request 2

2021-09-02 Thread Hans Hagen

On 9/2/2021 3:15 PM, Linas Stonys wrote:

Hi,
from time to time I face the problem where I need to do something inside 
tex group (simple group or box or table cell)

or to mark up some content with tex4ht tags ( or tagging pdf )
and sometimes \aftergroup is not enough.   So I would like to ask about 
new token or token list which would

be inserted before current group end (\beforeendgroup).


EXAMPLE:
\showboxdepth=\maxdimen
\showboxbreadth=\maxdimen
b
\hbox{x\aftergroup d  x}
c
\showlists

LOG OUTPUT:
\tenrm b
\glue(\spaceskip) 3.3 plus 1.6 minus 1.1
\hbox(4.30554+0.0)x13.88893, direction TLT
.\tenrm x
.\glue(\spaceskip) 3.3 plus 1.6 minus 1.1
.\tenrm x
\tenrm d
\glue(\spaceskip) 3.3 plus 1.6 minus 1.1
\tenrm c


COULD BE:
b
\hbox{x\beforeendgroup d  x}
c
\showlists

LOG OUTPUT:
\tenrm b
\glue(\spaceskip) 3.3 plus 1.6 minus 1.1
\hbox(4.30554+0.0)x13.88893, direction TLT
.\tenrm x
.\glue(\spaceskip) 3.3 plus 1.6 minus 1.1
.\tenrm x
.\tenrm d
\tenrm c

Of course, one needs to think about how several such macros in the same 
group should queue their content.

In luametatex we have

\starttext

{[test\atendofgroup{]}}

\stoptext

(plus more)

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Adding a callback before trailing spaces are removed from a line of input

2021-08-29 Thread Hans Hagen

On 8/29/2021 6:38 PM, Vítek Novotný wrote:


And yet, you are included in the copyright line of the Markdown package,
since you contributed to the lunamark parser:

 $ docker run --rm -i witiko/markdown markdown-cli -v
 markdown-cli.lua (Markdown) 2.10.0-64-ge9b5180
 Copyright (C) 2009-2016 John MacFarlane, Hans Hagen
 Copyright (C) 2016-2021 Vít Novotný
 License: LPPL 1.3c

sure, i remember that i helped make that one faster and more efficient; 
i probably have some code lying around from when i looked into it but 
when something is supported in context that doesn't mean i'm a user 
myself (probably most of what i add to context etc is not used by 
myself, it's often about challenges and fun)


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Adding a callback before trailing spaces are removed from a line of input

2021-08-29 Thread Hans Hagen

On 8/29/2021 12:43 AM, Vítek Novotný wrote:

Hello all,

in Knuth's TeX, trailing spaces are removed very early on when a line is
being put to the input buffer. [1]  According to Eijkhout's TeX by
Topic, this is because "these spaces are hard to see in an editor" [2].

  [1]: https://texdoc.org/serve/tex.pdf/0#page=15
  [2]: http://mirrors.ctan.org/info/texbytopic/TeXbyTopic.pdf

I develop and maintain the Markdown package [3] for plain TeX, ConTeXt,
and LaTeX. The package makes it possible to use the lightweight markup
of markdown [4] in TeX documents. In markdown, a hard line break can be
inserted by ending a line with two or more spaces. However, since
trailing spaces are removed by TeX, hard breaks are only recognized when
we are inserting an external markdown file, not when markdown is typed
in the top-level document. This deficiency is known and documented [5],
but I am hoping we could resolve it with LuaTeX.


i wonder how these double spaces work out in practice, for instance one 
needs to 'visualize' them in the editor to see them and also make sure 
that the editor is not in 'prune space at the end of line' mode



  [3]: https://github.com/witiko/markdown
  [4]: https://daringfireball.net/projects/markdown
  [5]: https://mirrors.ctan.org/macros/generic/markdown/markdown.pdf#page=20

In LuaTeX, the `process_input_buffer` callback [6] can be used to
intercept the text coming *out* of the input buffer. However, the
trailing spaces have already been removed by this point.

By adding a callback right after a line has entered the input buffer
[1], we could either replace the trailing space characters with tabs,
or place a character such as the zero-width non-joiner (U+200C) to the
right of the trailing spaces.


i fear that this will introduce a performance hit


  [6]: https://www.pragma-ade.com/general/manuals/luatex.pdf#page=176

Is this something you would consider---if not for LuaTeX then perhaps
for LuaMetaTeX?

for a while i had an endofline handler but i removed it because i never 
used it and it made no sense (it's all too unpredictable)
(i wanted to backport it but it doesn't really fit in now that luatex 
also has some special \par handling added)


concerning context, how is \startmarkdown defined? I'm pretty sure that 
this issue can be handled without adding callbacks (i never needed/use(d) 
markdown myself so i can only guess here) but we can discuss the needs 
off-luatex-list
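
One way to keep the hard-break information without a new callback would be to mark such lines before TeX ever reads them, for instance in a Lua preprocessing step over the raw file. The following is only a sketch under that assumption: the function name mark_hard_break is hypothetical, the U+200C marker is borrowed from the proposal quoted above, and this is not how the Markdown package or ConTeXt actually handle it.

```lua
-- Hypothetical marker: U+200C ZERO WIDTH NON-JOINER as UTF-8 bytes.
local ZWNJ = "\226\128\140"

-- Append the marker to a raw input line that ends in two or more spaces,
-- so the hard break survives TeX's removal of trailing spaces.
local function mark_hard_break(line)
  -- pattern: one space, then one or more spaces, anchored at end of line
  if line:find("  +$") then
    return line .. ZWNJ
  end
  return line
end

print(#mark_hard_break("a hard break  "))   -- 3 bytes longer than the input
print(#mark_hard_break("no hard break "))   -- unchanged length
```

A line with a single trailing space is left alone, so ordinary text is not affected; the marker would still have to be interpreted on the TeX side.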


Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] how many bytes for fontdimens?

2021-08-07 Thread Hans Hagen

On 8/6/2021 3:23 AM, Reinhard Kotucha wrote:

On 2021-08-05 at 10:01:18 +0200, jfbu wrote:

  > I had accidentally TEXMFCNF set in my environment (from previous tests).
  >
  > Testing some more with this environment variable of TeXLive unset, I
  > get completely different results.

Hello Jean-François,
as Hans already pointed out, the size of the texmf tree matters.  IMO
it's not a good idea to change TeX Live's configuration.  You can't
achieve a significant improvement this way.  And please avoid
TeX-related environment variables whenever possible.

of course with ssd's and plenty of memory (so the OS can cache lots of 
files) things became less of an issue the last decade


Hans

-
      Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] how many bytes for fontdimens?

2021-08-05 Thread Hans Hagen

On 8/5/2021 10:01 AM, jfbu wrote:



On 5 August 2021 at 09:50, jfbu <j...@free.fr> wrote:


% pdftex
with (total): 5.34964pt
without (total): 5.45686pt
with minus without: -0.10722pt

with (total): 5.55394pt
without (total): 5.61536pt
with minus without: -0.06142pt

with (total): 5.35501pt
without (total): 5.3827pt
with minus without: -0.0277pt

% luatex
with (total): 6.08586pt
without (total): 6.30107pt
with minus without: -0.21521pt

with (total): 6.06969pt
without (total): 6.35312pt
with minus without: -0.28343pt

with (total): 6.11238pt
without (total): 6.37778pt
with minus without: -0.2654pt



I had accidentally TEXMFCNF set in my environment (from previous tests).

Testing some more with this environment variable of TeXLive unset, I
get completely different results.

% pdftex
with (total): 5.46095pt
without (total): 5.43422pt
with minus without: 0.02673pt

% luatex
with (total): 6.10898pt
without (total): 6.03908pt
with minus without: 0.0699pt


And re-setting it (to "$(pwd)") I reproduce again typically the above 
results...


i.e. with TEXMFCNF set it is advantageous to use \romannumeral,
both with pdftex and with luatex.

with TEXMFCNF not set, it is disadvantageous...

... that’s highly baffling to me !

I repeated quite a few times to make sure timings are consistent,
it seems with luatex it is, with pdftex less so

ok, I don’t want to induce people to lose time on ill-conceived test


one thing that you need to keep in mind is the size of the texmf tree

in context we don't use kpse so we don't suffer much from the size of 
the tree (it was actually one of the first things i did in mkiv/luatex, 
some 15 years ago: replace kpse with a lua variant)


when you use kpse (pdftex, luatex etc with latex) these lsr files are 
loaded and hashed, that takes time


then, when you add a local tree (e.g. home) that one gets scanned every run

so, in a tex live setup: the more files, the longer the start up time

a context installation (from the garden) is rather small compared to 
texlive, and an lmtx installation even smaller (most is documentation 
and fonts) .. over the years i've spent a lot of time making the user 
experience such that on the average a run is ok (on my 2013 laptop lmtx 
start up time is < .5 sec which includes dealing with all (some) 500 lua 
modules)


i have no clue about latex overhead as i never run that (one of the 
persistent narratives is that latex is way faster than context but i'm 
not really sure about that)


most tex performance tests are rubbish ... one can test "test\par" or 
"test\page" a thousand times but what does it say ...


\starttext \testfeatureonce{1000}{test\par} \stoptext

system  > feature test done: 1000 steps, 0.090 seconds, 
0.89939 per step
mkiv lua stats  > runtime: 0.542 seconds, 25 processed pages, 25 shipped 
pages, 46.125 pages/second



\starttext \testfeatureonce{1}{test\par} \stoptext

system  > feature test done: 1 steps, 0.907 seconds, 
0.90726 per step
mkiv lua stats  > runtime: 1.370 seconds, 244 processed pages, 244 
shipped pages, 178.067 pages/second


\starttext \testfeatureonce{1}{\null\page} \stoptext

system  > feature test done: 1 steps, 12.509 seconds, 
0.001250932 per step
mkiv lua stats  > runtime: 13.487 seconds, 1 processed pages, 1 
shipped pages, 741.441 pages/second


\starttext \testfeatureonce{1000}{\samplefile{tufte}\page} \stoptext

system  > feature test done: 1000 steps, 4.290 seconds, 
0.004290208 per step
mkiv lua stats  > runtime: 4.815 seconds, 1000 processed pages, 1000 
shipped pages, 207.703 pages/second


now all this is not realistic: kick in some color, fonts, graphics, 
advanced structuring and in the end 20-30 pps is what one gets


(maybe on a modern machine twice that)

add some tikz and one can drink a coffee

in the end using an ssd and more memory (caching) had more impact than cpu 
boosts


(ok, i made the luametatex mem footprint much smaller than the luatex 
one but that's more for raspberry pi and such ... on that we run some 3 
times slower than on my intel laptop .. more a pet project)


Hans


-----
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] how many bytes for fontdimens?

2021-08-03 Thread Hans Hagen

On 8/3/2021 3:26 PM, jfbu wrote:


On 3 August 2021 at 12:49, Hans Hagen wrote:

On 8/3/2021 9:45 AM, jfbu wrote:

Hi,

On 3 August 2021 at 09:09, Hans Hagen <j.ha...@xs4all.nl> wrote:

On 8/2/2021 9:37 PM, jfbu wrote:

forgot to mention that I am aware a \fontdimen is limited to 2**30 strictly 
anyhow
but my question is whether such « arrays » are stored 32bits or 64bits itemwise

it happens to be an array of 32 bit integers (that grows on demand) but such 
implementation details are unspecified (could as well have been a sparse array 
in which case each entry that is actually set has more

also, the fact that it grows is a sort of side effect of the fact that tfm fonts 
can have 7 or more, but 7 are used, for text up to more for math fonts

so, i wouldn't rely on these properties too much

Thanks!
Reason I asked is because I contributed an Eratosthenes Prime sieve to a github 
site comparing a whole bunch of languages and it is asked there to specify 
whether the « arrays » use 1bit, 8bits, 32bits, or 64bits (or « unknown ») per 
(potential) prime.
https://github.com/PlummersSoftwareLLC/Primes/blob/drag-race/CONTRIBUTING.md#flag-storage


ha, i've seen that one a few weeks ago (after some yt video) and to be honest 
it's one of these useless speed comparisons (can be fun, but useless as one 
compares languages with different objectives and doing some prime stuff is 
hardly representative for usage)



It seems the C++ solutions listed as fastest do quite a part of the job at 
compile time not at execution time...


that's why comparing script languages and compiled ones makes no sense 
(one can argue for jit but in e.g. luatex that doesn't pay off)



The chosen topic (basic sieve and possibly some refinements like « wheel » 
algorithms) seems to test the capacity of the language to access successive 
memory addresses in the most efficient way and/or perhaps to move bit patterns 
around.


but that is not made clear -)


Also, it stops at having the array done, but does not address how you use this 
after the fact.


indeed


If the final aim is to print out primes to a file the bottleneck may be rather 
into the conversion of the bit pattern into explicit ascii bytes...


then lua might perform better because it's all about strings (although 
printing numbers is a bad example as they all differ here)



For example, if I produce using lualatex a pdf file of all primes less than 
999,999,999, most of the time by far is consumed by the typesetting phase 
(something like 1350 pages, 10 columns per page)


sure because quite some processing kicks in there (fonts, backend, 
columnization) but also the way that big array gets serialized


but 1350 pages for 999.999.999 is not that bad because here 10.000.000 
in 12 columns and an 8pt monospaced font takes 955 pages and indeed some 
runtime (using context that is, latex is supposed to be faster)



This is what I do except for one auxiliary array in the « wheel » because I 
felt it would be cheating to allocate the 480 slots to start with. I could 
however have allocated say 1155 = 2310/2 (2310 = 2 * 3 * 5 * 7 *11) slots, 
wasting some.


even in lua you'd waste them and if one goes hash there is more overhead 
(linked list) than in an array



But this is executed only once anyhow, at loading time of the « sieve library » 
independently of the number of passes done during benchmark. Maybe I will do 
the change for some small gain.


I will thus modify the « bit count » tag of my « solution » from unknown to 
32bits, thanks to your answer, knowing though that this remains officially 
unspecified. But the Dockerfile which I was asked to include, and which their 
benchmarking uses, pulls a texlive-minimal based image dating back to 2018.
Perhaps someone here will be interested into contributing a genuine luatex 
(i.e. using Lua) solution (my code uses only Knuth TeX; there is also a LaTeX3 
code also on the github site).


a lua solution in luatex is just a lua solution -)


There is already at least one Lua contribution. I don’t know if a genuine 
luatex would have to be categorized under « PrimeTeX » or « PrimeLua » ...
... in particular a LuaTeX genuine solution may have a way to use an « array » 
not based on font dimension parameters.


mixing lua and tex will also introduce lua call overhead so there is no gain 
there (maybe let lua do the sqrt but then you can well do all in lua)

my guess is that the sqrt is the bottleneck

fontdimens are actually not that slow because they are (1) 
global so no save stack overhead, and (2) directly accessible because they are 
part of the font structure (so no tex dimen access overhead)

also, using etex \dimexpr is also slower than the simple operators


Not to mention that etex division rounds which sometimes is more inconvenient 
than truncating


(it's why in luametatex we have : for inte
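
The difference between a rounding and a truncating integer division, which is what the remark above is about, can be illustrated in plain Lua. This is only a sketch: the helper names div_round and div_trunc are made up for the example, and the rounding rule used here (nearest integer, halves away from zero) is assumed to match what e-TeX's \numexpr does with "/".

```lua
-- Truncating division: drop the fractional part (round toward zero).
local function div_trunc(a, b)
  local q = a / b
  if q >= 0 then
    return math.floor(q)
  end
  return math.ceil(q)
end

-- Rounding division: nearest integer, halves away from zero
-- (assumed here to mirror \numexpr a/b\relax in e-TeX).
local function div_round(a, b)
  local q = a / b
  if q >= 0 then
    return math.floor(q + 0.5)
  end
  return math.ceil(q - 0.5)
end

print(div_round(7, 2), div_trunc(7, 2))
print(div_round(-7, 2), div_trunc(-7, 2))
```

For 7 divided by 2 the rounding variant gives 4 where the truncating one gives 3, which is the inconvenience when a quotient is used as an array index or loop bound.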

Re: [luatex] how many bytes for fontdimens?

2021-08-03 Thread Hans Hagen

On 8/3/2021 9:45 AM, jfbu wrote:

Hi,

On 3 August 2021 at 09:09, Hans Hagen <j.ha...@xs4all.nl> wrote:


On 8/2/2021 9:37 PM, jfbu wrote:
forgot to mention that I am aware a \fontdimen is limited to 2**30 
strictly anyhow
but my question is whether such « arrays » are stored 32bits or 
64bits itemwise
it happens to be an array of 32 bit integers (that grows on demand) 
but such implementation details are unspecified (could as well have 
been a sparse array in which case each entry that is actually set has more


also, the fact that it grows is a sort of side effect of the fact that 
tfm fonts can have 7 or more, but 7 are used, for text up to more for 
math fonts


so, i wouldn't rely on these properties too much


Thanks!

Reason I asked is because I contributed an Eratosthenes Prime sieve to a 
github site comparing a whole bunch of languages and it is asked there to 
specify whether the « arrays » use 1bit, 8bits, 32bits, or 64bits (or 
« unknown ») per (potential) prime.


https://github.com/PlummersSoftwareLLC/Primes/blob/drag-race/CONTRIBUTING.md#flag-storage 


ha, i've seen that one a few weeks ago (after some yt video) and to be 
honest it's one of these useless speed comparisons (can be fun, but 
useless as one compares languages with different objectives and doing 
some prime stuff is hardly representative for usage)


I am using luatex for the benchmark because even setting pdftex’s 
font_mem_size at its maximum TeXLive setting, the memory is at risk of 
being exhausted on current personal computers  from the condition that 
the benchmark must iterate at least until a duration of 5 seconds and 
each iteration re-allocates a \fontdimen « array » (in the case at 
hand about 500,000 entries are needed to sieve up to 1,000,000 and 
memory will get exhausted before the 300th pass)


Also it seems luatex runs comparatively faster once the sieving range is 
large enough (the instantiation step which requires extending 
dynamically does take some time).


hm, just allocate the largest fontdimen you need first, that will make 
the fontdimen array grow at the beginning only (instead of at each step)
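
As a rough illustration of "allocate everything first, then sieve", here is an odds-only sieve in plain Lua. It is a sketch and not the TeX code discussed in this thread: a Lua table stands in for the \fontdimen array, the function name sieve is made up, and every slot is filled once before the sieving loop so that the table only grows at the beginning.

```lua
-- Odds-only sieve of Eratosthenes: slot i stands for the odd number 2*i + 1.
-- Returns the number of primes <= limit and the number of slots used.
local function sieve(limit)
  local slots = math.floor((limit - 1) / 2)
  local composite = {}
  -- allocate all slots up front, so growth happens only here
  for i = 1, slots do
    composite[i] = false
  end
  local i = 1
  while (2 * i + 1) * (2 * i + 1) <= limit do
    if not composite[i] then
      local p = 2 * i + 1
      -- p*p sits in slot 2*i*(i+1); a step of 2p in numbers is p in slots
      for j = 2 * i * (i + 1), slots, p do
        composite[j] = true
      end
    end
    i = i + 1
  end
  local count = (limit >= 2) and 1 or 0   -- the prime 2 has no slot
  for k = 1, slots do
    if not composite[k] then
      count = count + 1
    end
  end
  return count, slots
end

print(sieve(1000000))
```

Sieving up to 1,000,000 this way needs 499,999 slots, in line with the "about 500,000 entries" mentioned above.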


I will thus modify the « bit count » tag of my « solution » from unknown 
to 32bits, thanks to your answer, knowing though that this remains 
officially unspecified. But the Dockerfile which I was asked to include, 
and which their benchmarking uses, pulls a texlive-minimal based image 
dating back to 2018.


Perhaps someone here will be interested into contributing a genuine 
luatex (i.e. using Lua) solution (my code uses only Knuth TeX; there is 
also a LaTeX3 code also on the github site).


a lua solution in luatex is just a lua solution -)

There is already at least one Lua contribution. I don’t know if a 
genuine luatex would have to be categorized under « PrimeTeX » or 
« PrimeLua » ...


... in particular a LuaTeX genuine solution may have a way to use an 
« array » not based on font dimension parameters.


mixing lua and tex will also introduce lua call overhead so there is no 
gain there (maybe let lua do the sqrt but then you can well do all in lua)


my guess is that the sqrt is the bottleneck

fontdimens are actually not that slow because they are 
(1) global so no save stack overhead, and (2) directly accessible 
because they are part of the font structure (so no tex dimen access 
overhead)


also, using etex \dimexpr is also slower than the simple operators

One particular point I don’t know is whether LuaTeX would allow a 
« faithful » solution: this seems to mean roughly a class-encapsulated 
one (it is hard to understand what they precisely mean in their 
guidelines), which I could not really emulate in my code due to global 
nature of fontdimen assignments.


hm, do you really need local?

if you use csnames, then you can also consider using \chardef's for 
numbers (these obey grouping)


(I also experimented with  a csname based approach but never could reach 
comparable speed to fontdimen arrays ; and this required extending other 
parts of the memory)


in luatex csname is costly because of the serialization (pdftex is 
probably faster because there is no utf related overhead)


Here is a link to how the various implementations sort out currently on 
one specific machine:


https://plummerssoftwarellc.github.io/PrimeView/?sc=dt=True=30 
<https://plummerssoftwarellc.github.io/PrimeView/?sc=dt=True=30>
the lua solution they post is not only somewhat slow but also makes some 
(imo wrong, but who am i to claim) assumptions about how lua stores data 
so it was not that hard to make a variant that was over 200 times faster


because i have a relatively old laptop i can't compare with the numbers 
for e.g. c there (of course lua will be slower) but as i consider these 
shootouts useless anyway, i didn't want to spend more time on it (all 
that docker stuff 

Re: [luatex] how many bytes for fontdimens?

2021-08-03 Thread Hans Hagen

On 8/2/2021 9:37 PM, jfbu wrote:

forgot to mention that I am aware a \fontdimen is limited to 2**30 strictly 
anyhow

but my question is whether such « arrays » are stored 32bits or 64bits itemwise

it happens to be an array of 32 bit integers (that grows on demand) but 
such implementation details are unspecified (could as well have been a 
sparse array in which case each entry that is actually set has more


also, the fact that it grows is a sort of side effect of the fact that 
tfm fonts can have 7 or more, but 7 are used, for text up to more for 
math fonts


so, i wouldn't rely on these properties too much

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] luatex doesn't set /Smask in pdf 2.0

2021-06-30 Thread Hans Hagen

On 6/30/2021 5:58 PM, luigi scarso wrote:



On Wed, Jun 30, 2021 at 4:55 PM Ulrike Fischer <mailto:lua...@nililand.de>> wrote:


I don't think that anything changed compared to 1.7. I didn't check
the full reference, but the SMask entry simply says

SMask stream (Optional; PDF 1.4) 

I guess that the test was added to handle PDF <1.4 and
simply wasn't adapted when the major version was added.


The point is always the same: without a public 2.0 reference, we can 
make patches here and there

while waiting for the next failure.

We can probably gamble that version 2 supports it so we could add some 
"or version > 2" to the test but indeed 'no standard on our machines' 
means 'no testing' and no recent acrobat on our machines means 'no 
testing if a patch works either' ... no way we're going to subscribe to 
some monthly adobe service for tools we don't really need nor can afford 
to pay for the rest of our life just for the sake of maintaining free 
software.


Hans




-
      Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] [EXT] Iteration over the dictionary of a stream

2021-03-31 Thread Hans Hagen

On 3/31/2021 5:41 PM, Andreas Matthias wrote:

On Wed, Mar 31, 2021 at 5:10 PM Philip Taylor  wrote:


for k, v in pairs (pdfe.dictionarytotable (doc.Pages [1])) do


Here, you are iterating over a /Page dictionary (doc.Pages[1]), which
is a real dictionary.
No issues when iterating over real dictionaries with pdfe.dictionarytotable().

But the /Contents entry of this dictionary refers to a stream. And the
first part of
a stream object is a dictionary. But you cannot use pdfe.dictionarytotable() in
this case.

   local doc = pdfe.open ('h.pdf')
   local page = doc.Pages[1]
   local a = pdfe.dictionarytotable(page)
   print("page",a)
   local b = a.Contents
   print("contents",b[1],b[2],b[3])
   local c, d, e = pdfe.getfromreference(b[2])
   print("stream",c,d,e)
   local f = pdfe.dictionarytotable(e)
   print("whatever",f)

a stream object is a referenced object with a stream and a dictionary

-
      Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Accessing external Lua libraries

2021-03-25 Thread Hans Hagen

On 3/25/2021 1:12 PM, Faheem Mitha wrote:


So, a couple of questions.

First, is it necessary to add stuff to the search path at the TeX end 
(kpse or whatever)? My experience is that adding stuff at the Lua end 
suffices for my needs. But of course my experience does not cover all 
use cases, and I think I've avoided using luarocks.


you also need to match the lua version (iirc luarocks is overkill and can 
interfere badly)


Second, is there some reason not to include the functionality of 
luapackageloader by default in LuaTeX? Then the Lua paths would work as 
expected. It might still make sense to have the functionality as a 
separate package, in case people wanted to disable that functionality 
for some reason. Though I can't imagine why they would want to.


see below ... demands differ and we don't impose a strategy

I originally became aware of LuaTeX around 2012, but at that time I 
could not figure out a way to load external packages, though since I 
don't recall asking for help at the time, I probably did not pursue it 
very hard. But if this had worked out of the box for me then (and as 
stated I can't see any reason why it shouldn't), I would have started 
using LuaTeX much earlier.
it has always been possible to load libraries, luatex has some 
protection against loading but that's real low level inhibition


dealing with this is up to the macro package because tex has rules about 
where it looks for files and might want to limit choices for all kind of 
reasons (like security or interference); there is a long tradition of 
dealing with files of any kind in tds and that's not going to change; 
much in tex that has to do with files also relates to long term stability


so, for stability you might want to put lua files and binary modules in 
the tex binary tree but lua files can be put elsewhere too; maybe you 
want some specific version over some other; maybe you don't want 
interference with what other programs put in the lua lookup paths; you 
have to match the lua version in luatex anyway; there is no universal 
approach to this


there is no need to change anything as a macro package can put whatever 
loader it wants on top of what is already there; of course it also 
depends on what paths are defined in the configuration file (when you 
use kpse); if it doesn't work then it's probably because no one bothered 
to deal with it (which is an indication of something hardly being 
used) so you can try to get support for your approach in the macro 
package that you use
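
To illustrate what "putting a loader on top of what is already there" can look like, here is a sketch in plain Lua (5.2 or later, where package.searchers and package.searchpath exist). It is an assumption about one possible approach, not what any macro package actually does: make_searcher is a made-up helper and "./lua-extra/?.lua" is a hypothetical location.

```lua
-- Build a searcher that looks for modules using a path template
-- of the same form as package.path ("?" is replaced by the module name).
local function make_searcher(path_template)
  return function(name)
    local filename, err = package.searchpath(name, path_template)
    if not filename then
      return err   -- a string: require adds it to its error message
    end
    local loader, load_err = loadfile(filename)
    if not loader then
      error(load_err)
    end
    return loader, filename
  end
end

-- Appended at the end, so the searchers that are already installed
-- keep priority over the extra location.
table.insert(package.searchers, make_searcher("./lua-extra/?.lua"))
```

Because the new searcher is appended rather than prepended, existing lookups behave as before and the extra path is only consulted when everything else fails; the loaded modules still have to match the Lua version of the engine.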


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Ligatures

2021-03-19 Thread Hans Hagen

On 3/19/2021 11:09 AM, Philip Taylor wrote:

Hans Hagen wrote:

it looks ok here; the font has no dflt script so you need to activate the 
latn script for ligatures to work


Thank you Hans, but I am surprised that it looked OK at your end.  At 
this end, and following your suggestion, all of the lines in the 
following example now display the correct ligatures, but in my earlier 
example, with no 'script=latn', only the first (Adobe Minion Pro) 
produced ligatures.  Are you able to explain why LuaTeX does the right 
thing with Adobe Minion Pro in the absence of a 'script=latn' but does 
not with the other three fonts tried ?

because there is a dflt/dflt script/language entry in the features table 
of those fonts and features are driven by script/language combinations 
(while the other font has latn/dflt ... it could be that you'd have to 
set latn/eng because not all latin scripts might like these ligatures)


(i can't speak for plain/latex but context uses different heuristics 
when dealing with these issues which can be why its users observe 
different results)


Hans



Re: [luatex] Ligatures

2021-03-19 Thread Hans Hagen

On 3/19/2021 10:21 AM, Philip Taylor wrote:
I asked a question on the XeTeX list concerning support for coloured 
fonts yesterday, and it was suggested that LuaTeX might provide a 
solution to my problem.  Indeed it does, but it introduces a new one — 
while LuaTeX appears perfectly happy to form (for example) an "fi" 
ligature from the separate letters "f" & "i" in the font Minion Pro, it 
does not do the same with the font Adobe Caslon Pro, whether or not 
"+liga" and/or "+dlig" is specified, nor with the font Garamond and 
probably others.  The following is my test-bed, which produces correct 
ligatures using XeTeX but not using LuaTeX or LuaHbTeX.  Can anyone 
advise, please ?  UNIV: TeX Live 2021 (pre-test), Windows 7 64-bit.


Philip Taylor

\ifcsname directlua\endcsname \input luaotfload.sty \fi


\font \bodyfont = "Minion Pro"

\centerline {\bodyfont Minion is a fit subject}

\font \bodyfont = "Adobe Caslon Pro"

\centerline {\bodyfont Caslon is a fit subject}

\font \bodyfont = "Adobe Caslon Pro:+liga;+dlig"

\centerline {\bodyfont Caslon is a fit subject}


it looks ok here; the font has no dflt script so you need to activate the 
latn script for ligatures to work


Hans




Re: [luatex] UTF-16 with pdfe.getstring()

2021-03-17 Thread Hans Hagen

On 3/17/2021 5:30 PM, Andreas Matthias wrote:

I'm having a hard time with pdfe.getstring(). What am I supposed to do if it
returns an UTF-16 encoded string? How to convert it to UTF-8?

Here is what I'm actually trying to do: I'm reading the /Contents of a
Text-Annotation with pdfe.getstring(). The returned string happens to be
UTF-16 encoded. Now I want to use this string to create a pdf_annot whatsit.
Of course this doesn't work:

This is LuaTeX, Version 1.13.0 (TeX Live 2021/dev)
  restricted system commands enabled.
(./test.tex
! String contains an invalid utf-8 sequence.
l.17 }

I've attached an example to replicate this issue.

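  -- write the bytes of the (utf-16 encoded) string as a pdf hex string,
  -- so that no conversion to utf-8 is needed
  contents = annot.Contents
  local t = { }
  for c in string.gmatch(contents,".") do
      -- two uppercase hex digits per byte
      t[#t+1] = string.format("%02X",string.byte(c))
  end
  contents = table.concat(t)
  local str = '/Subtype/Text/Contents <' .. contents .. '>'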
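
The snippet above sidesteps the conversion by passing the bytes on as a hex string. If the text is really needed as UTF-8 at the Lua end, the conversion could be sketched as follows. This assumes Lua 5.3 (string.unpack, utf8.char), treats input without a byte order mark as big endian, and uses a hypothetical function name; it is not part of the pdfe library.

```lua
-- Sketch: convert a UTF-16 byte string to UTF-8 (Lua 5.3).
local function utf16_to_utf8(s)
    local fmt, pos = ">I2", 1          -- default: big endian, no BOM
    local bom = s:sub(1, 2)
    if bom == "\xFE\xFF" then
        pos = 3                        -- big endian BOM, skip it
    elseif bom == "\xFF\xFE" then
        fmt, pos = "<I2", 3            -- little endian BOM
    end
    local out = { }
    while pos + 1 <= #s do
        local unit
        unit, pos = string.unpack(fmt, s, pos)
        -- a high surrogate followed by a low surrogate forms one code point
        if unit >= 0xD800 and unit <= 0xDBFF and pos + 1 <= #s then
            local low = string.unpack(fmt, s, pos)
            if low >= 0xDC00 and low <= 0xDFFF then
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
                pos = pos + 2
            end
        end
        out[#out + 1] = utf8.char(unit)
    end
    return table.concat(out)
end
```

A trailing odd byte is ignored and unpaired surrogates are passed through as they are, which may or may not be what one wants.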




Re: [luatex] [NTG-context] Diagnositics in LuaTeX

2020-12-28 Thread Hans Hagen

On 12/28/2020 3:54 PM, Henri Menke wrote:

Dear lists,

Recently I've been working a bit on PGF/TikZ and many times I wished I had
certain metrics that are available in regular programming languages.  In
particular I am missing:

- A callback that is triggered when a macro is defined.  This would allow me to
trap when a cs is defined in the wrong place.  That could probably be done by
overriding `\def` but there are some gotchas with prefixes, e.g. `\long\def`.


if you intercept for that reason you need to make the callback function 
situation aware, which means that you can just as well test things at the 
tex end, which is way more efficient



- A callback that is triggered when a token is expanded (or executed in the case
of primitives).  In conjunction with the previous request, this would allow me
to measure code coverage by comparing which macros are defined and which ones
are used.


given the amount of expansion going on in the engine that would mean, 
for a reasonably large document, or for some complex macro package like 
pgf, many millions of callback calls


Also, what exactly is 'expanded'? Some tokens are handled directly, some 
enter an expansion call, others are macros that themselves use expansion 
to even get started. Basically you want to check each token that is read 
or accessed from internal storage (lists). Adding granularity means many 
callbacks (each one checking if it's defined, then setting up lua, calling 
the function, and going back to tex). There's also a lot of pushback going on.


If you just hook into 'picking up a token', here are some numbers if we 
hook into that:


making context lmtx format: 4M (million)
simple tufte document with minimal font setup: 500K
loading tikz (with patterns lib): 3M
drawing simple pattern in tikz: 100K (1 cm circle with pdf pattern)
290 page luametatex manual: 105M

(ok, latex is probably much more efficient than context but it gives an 
idea)
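
To get a feeling for what these counts mean, the following sketch multiplies them by a cost per callback. The token counts are the ones listed above; the one microsecond per call is purely an assumption for the sake of the arithmetic, not a measurement.

```lua
-- Rough estimate of the extra run time caused by a per-token callback.
local token_counts = {
    { "context lmtx format",   4e6   },
    { "simple tufte document", 500e3 },
    { "loading tikz",          3e6   },
    { "simple tikz pattern",   100e3 },
    { "luametatex manual",     105e6 },
}

local function extra_seconds(tokens, seconds_per_call)
    return tokens * seconds_per_call
end

local assumed_cost = 1e-6 -- assumption: one microsecond per callback

for _, entry in ipairs(token_counts) do
    print(string.format("%-24s %8.1f sec extra",
        entry[1], extra_seconds(entry[2], assumed_cost)))
end
```

Under that assumption the manual alone would spend well over a minute and a half in the callback, and any real work done inside the callback comes on top of that.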



- Switches to trap on certain pathological events.  For example I want to trap
when TeX inserts a frozen \relax because a number couldn't be read.  Another
thing I want to trap is `Missing character: There is no  in font !`.
There are probably more silent TeX errors that I currently don't have in mind.


there is a missing glyph callback and with some juggling the 'no number' 
can be intercepted at the tex end ... you can set an error callback and 
look at the message



Are these things possible in LuaTeX right now or could they be made possible in
the future?

all is possible but this will not happen ... enabling callbacks like that 
would make luatex unusably slow (which is probably not what you want 
because pgf is already kind of slow) (even checking for the callback 
being set takes time)


Hans




Re: [luatex] fio library byte order

2020-06-29 Thread Hans Hagen

On 6/29/2020 7:45 PM, Reinhard Kotucha wrote:


96MiB per file.  Processing means to apply a lookup table and a 3×3
color matrix, quite inexpensive operations.  What takes most of the
time is to extract single bytes with string.sub() and to convert them
to integers.  Finally I have to convert everything back to uint16.

last mail as it's way off topic ...

loading 96 MB string and converting 2 bytes to an unsigned: 1.7 sec
idem plus a lookup: 2.1 sec
idem but then also going back to a string with 2 byte unsigned: 5.0 sec

add a few sec for some calculations in between, so still doable in lua 
(half that time when using luajit)


(measured on my 8 year old laptop so on a modern machine maybe less than 
half the mentioned time which is okay compared to some pure c approach i 
guess)
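
For reference, the three steps that were timed (2 bytes to unsigned, lookup, back to a 2 byte string) could look roughly like this in Lua 5.3. It is a sketch, not the code that was measured: the function name is hypothetical, the samples are assumed to be big endian, and for a 96 MB string one would process slices instead of collecting everything in one table.

```lua
-- Sketch: apply a lookup table to big endian uint16 samples in a string.
-- 'lut' maps a sample value (0..65535) to a new value in the same range;
-- values without an entry are kept as they are.
local function map_samples(data, lut)
    local out = { }
    for i = 1, #data - 1, 2 do
        local hi, lo = string.byte(data, i, i + 1)
        local value = hi * 256 + lo          -- 2 bytes -> unsigned
        value = lut[value] or value          -- the lookup
        out[#out + 1] = string.char(value >> 8, value & 0xFF) -- and back
    end
    return table.concat(out)
end
```

The color matrix step from the original question would go between the lookup and the conversion back to bytes.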


Hans



Re: [luatex] fio library byte order

2020-06-29 Thread Hans Hagen

On 6/29/2020 10:30 AM, Taco Hoekwater wrote:




On 28 Jun 2020, at 03:26, Reinhard Kotucha  wrote:



As far as I understand it's sufficient that the relevant
functions read{cardinal,integer}{2,4} obey a flag which tells
them whether byte re-ordering is necessary.  The flag has to be
set if host and file byte orders are different.  I don't know
whether we have to consider 64 bit integers too.


that adds passing parameters and checking them for each call
... you can then as well use lua's 'read' function and convert with
string.byte/char which is then about equally fast


This is what I actually did.  It took 14 s to process a PNM file, way
too much if I have to process hundreds or thousands files.  I ported
the script to C and could process the file within 270 ms.  I can't
imagine that obeying a variable in C can slow down everything so much.


The quick way to fix this without noticeable overhead would be to add
a few extra function definitions for read{cardinal,integer}{2,4} in
the byte permutations that are likely to actually happen. There are
not *that* many of those.

that was indeed the idea ... (i'll do it this week)
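
The "few extra function definitions" idea can be mimicked at the Lua end, which might help until the readers are in the engine. The names below follow the pattern mentioned in this thread but are hypothetical local helpers, not the fio functions themselves; they work on any Lua file handle opened in binary mode.

```lua
-- Sketch: one reader per byte order, so nothing has to be checked per call.
local function read_bytes(f, n)
    local s = f:read(n)
    if s and #s == n then
        return string.byte(s, 1, n)
    end
    -- returns nothing at end of file or on a short read
end

local function readcardinal2(f)        -- big endian
    local a, b = read_bytes(f, 2)
    if a then return a * 0x100 + b end
end

local function readcardinal2_le(f)     -- little endian
    local a, b = read_bytes(f, 2)
    if a then return b * 0x100 + a end
end

local function readcardinal4(f)        -- big endian
    local a, b, c, d = read_bytes(f, 4)
    if a then return ((a * 0x100 + b) * 0x100 + c) * 0x100 + d end
end

local function readcardinal4_le(f)     -- little endian
    local a, b, c, d = read_bytes(f, 4)
    if a then return ((d * 0x100 + c) * 0x100 + b) * 0x100 + a end
end
```

The caller picks the reader once, for instance after looking at the byte order mark of a TIFF file, so different files can be read interleaved with different byte orders.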

Hans



Re: [luatex] fio library byte order

2020-06-28 Thread Hans Hagen

On 6/28/2020 3:26 AM, Reinhard Kotucha wrote:


  > (btw, the format file used to normalize to hig endian but that was
  > dropped long ago already: formats are no longer portable, which in fact
  > was already dropped before that)

I don't understand.  All format files in TeX Live work on all systems.
They have a distinct byte order and are portable among all systems
supported by TeX Live.  What do you mean if you say "formats are no
longer portable"?


i can only speak for luatex but indeed, there we don't juggle the bytes 
of the format file (actually, as most users use a le system it ended up 
always juggling) ... one reason is that a format can have stored 
bytecode or whatever code which can be system dependent so ...



  > that adds passing parameters and checking them for each call
  > ... you can then as well use lua's 'read' function and convert with
  > string.byte/char which is then about equally fast

This is what I actually did.  It took 14 s to process a PNM file, way
too much if I have to process hundreds or thousands files.  I ported
the script to C and could process the file within 270 ms.  I can't
imagine that obeying a variable in C can slow down everything so much.


how big a file ... also, i bet you do more than just reading, you don't 
define what 'process' is  (270 ms for 100K files is still not fast I guess)



I'm not very familiar with C programming.  You say that it's expensive
to pass arguments to a function.  What I had in mind is that functions
obey a global variable at runtime which denotes whether byte order
conversion is necessary or not.


passing variables in c is no issue (also because compilers are smart 
enough to deal with it)


a global variable would not work because one can read several files at 
the same time interleaved with different properties


i'm talking of picking up some optional argument passed by lua (passed 
on stack, checking needed, etc)


anyway, there's nothing wrong with writing and using a c program if that 
is more suitable esp when you need to process that many files ... 
opening closing in lua is slower than in c, as is storing all your read 
bytes in lua variables (and i'm not even talking about the fact that a 
file metatable has to be looked up and type being checked for every 
read) plus some garbage collection every now and then


as you can compile c, you can also write a dedicated library and add 
that to luatex (assuming you need to do this runtime from luatex)


(you could consider using ffi)

I downloaded the 3.7 GB texlive iso and read integers from that one

-- 360 sec : one  byte integers + counting
-- 224 sec : two  byte integers + counting
-- 166 sec : four byte integers + counting (160 no counting)

But that's a lot of lua calls.

Then I downloaded the tug logo from the website

-- string : .55 sec for 1000 times (including opening / loading)
-- file   : .67 sec for 1000 times (including opening / loading)

So, that's about half a millisecond per file.

Finally I processed the 3414 files in the 268M context distribution and 
read 2 byte integers from those till end of file which took 15 seconds 
for the lot. So, no complaints from my end.


I think it's not the file handling that is your bottleneck.

Hans



Re: [luatex] fio library byte order

2020-06-28 Thread Hans Hagen

On 6/28/2020 9:57 AM, Philip Taylor wrote:

Reinhard Kotucha wrote:


I'm not very familiar with C programming.  You say that it's expensive
to pass arguments to a function.


Then it is not fit for purpose, something I thought when I was first 
exposed to it and something I continue to think to this day.  I think 
that we now have conclusive evidence to support that assertion.

hm, no code seen, no tests seen, so no proof of anything

Hans



Re: [luatex] fio library byte order

2020-06-27 Thread Hans Hagen

On 6/27/2020 2:56 AM, Reinhard Kotucha wrote:

On 2020-06-20 at 10:25:33 +0200, Hans Hagen wrote:

  > On 6/19/2020 11:16 PM, Reinhard Kotucha wrote:
  > > Hi,
  > > it's nice that with the fio library LuaTeX can now process binary
  > > files.  What I'm missing is the ability to specify the byte order
  > > (little vs. big endian).
  > >
  > > I didn't find any hint, neither in the manual nor in the sources.
  > >
  > > Without being able to specify the byte order usage of the library is
  > > quite limited.  The byte order of a particular file is not necessarily
  > > the same as that of your system.
  > >
  > > Some file formats have a certain byte order (PNM, for instance) and
  > > others precede binary data with a byte order mark (TIFF).  In any case
  > > it's necessary to specify the byte order before reading binary stuff.
  > >
  > > Is there a chance to provide a switch?
  >
  > When I have time I'll backport a couple of the additional
  > [integer|cardinal]*_le ones that we have in luametatex (I though that
  > i'd already done that).

Hi Hans,
I must admit that I don't know anything about luametatex.  I just
looked into liolibext.c .

IMO there are a few things to consider.

The current code extracts single bytes from a file.

  | static int readcardinal2(lua_State *L) {
  | FILE *f = tofile(L);
  | int a = getc(f);
  | int b = getc(f);
  |

This, and even the extraction of short strings, is extremely slow.
It's much more efficient to read data blocks of 8192 bytes, for
instance, into memory and to process these data blocks.  I'm not
convinced that reading a complete file into memory is a good idea,
despite its simplicity.


that would add all kinds of overhead (buffer underrun, adapting to seek 
etc and therefore reload) and we can assume that the operating system 
also buffers



Processing the content of a file with the fio library is then similar
to processing a string with the sio library, with the exception that
endianness has to be considered when files are involved.


it depends on what one does, sometimes a full load and using sio is 
faster but that also has its overhead (pseudo seek)


as usual i did lots of (performance) tests and there is not that much to 
gain on either end (several variants were played with)



The host byte order must always be determined automatically, either
with Luigi's approach or probably more easily with ntohs(3) if this
function is available on Windows too.  The file byte order has to be
specified by the user because it depends on the file format.


the lib is meant for usage in known scenarios (known, documented file 
formats), not arbitrary, depending on architecture or implementation


(btw, the format file used to normalize to big endian but that was 
dropped long ago already: formats are no longer portable, which in fact 
was already dropped before that)



If a particular file format has a BOM in its header, the BOM can be
evaluated by the user, for instance with fio.readline().  This means
that a user should be able to specify the endianness at any time, not
necessarily in advance.


sure but a few extra readers would solve that


As far as I understand it's sufficient that the relevant functions
read{cardinal,integer}{2,4} obey a flag which tells them whether byte
re-ordering is necessary.  The flag has to be set if host and file
byte orders are different.  I don't know whether we have to consider
64 bit integers too.


that adds passing parameters and checking them for each call ... you can 
then as well use lua's 'read' function and convert with string.byte/char 
which is then about equally fast



If you intend to go this way the number of functions in liolibext.c
can be halved because there is no significant difference between a
buffer and a string.  Only very few functions have to be aware of
endianness.


halved in calls to simple functions, enlarged by more checking ... more 
pain than gain



There is one difference though.  A string is always complete while a
buffer contains only a part of a file.  If there are not enough
bytes at the end of a buffer in order to fulfill a request, the
missing bytes can be loaded from the file and appended to the buffer.
This has no significant impact on speed because it happens quite
rarely.  It's similar to the example in PIL, chapter 'The complete I/O
Model', section 'A small performance trick'.

If the user doesn't specify a byte order we can assume host byte
order.  I can't imagine any reasonable use case right now, except
if a temporary file is read by the same process that created it.

as we have lua 5.3 you can consider using the string.unpack function
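
A short sketch of what that looks like; string.unpack is standard Lua 5.3, where ">" and "<" select big and little endian and "I2", "I4", "i2" give unsigned and signed integers of that size. The sample bytes and the helper name are made up for the illustration.

```lua
local data = "\x12\x34\x56\x78\xFF\xFE"

local be16  = string.unpack(">I2", data)      -- big endian, unsigned
local le16  = string.unpack("<I2", data)      -- little endian, unsigned
local be32  = string.unpack(">I4", data)      -- four bytes, big endian
local int16 = string.unpack(">i2", data, 5)   -- signed, starting at byte 5

print(string.format("%04X %04X %08X %d", be16, le16, be32, int16))

-- Walk over a whole string; unpack also returns the next position.
local function read_uint16_be(s)
    local values, pos = { }, 1
    while pos + 1 <= #s do
        local v
        v, pos = string.unpack(">I2", s, pos)
        values[#values + 1] = v
    end
    return values
end
```

So the byte order is chosen per call by the format string, without any flag in the reader.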

Hans



Re: [luatex] fio library byte order

2020-06-20 Thread Hans Hagen

On 6/19/2020 11:16 PM, Reinhard Kotucha wrote:

Hi,
it's nice that with the fio library LuaTeX can now process binary
files.  What I'm missing is the ability to specify the byte order
(little vs. big endian).

I didn't find any hint, neither in the manual nor in the sources.

Without being able to specify the byte order usage of the library is
quite limited.  The byte order of a particular file is not necessarily
the same as that of your system.

Some file formats have a certain byte order (PNM, for instance) and
others precede binary data with a byte order mark (TIFF).  In any case
it's necessary to specify the byte order before reading binary stuff.

Is there a chance to provide a switch?

When I have time I'll backport a couple of the additional 
[integer|cardinal]*_le ones that we have in luametatex (I thought that 
i'd already done that).


Hans



Re: [luatex] LuaTeX picky about internal PDF encoding, breaks self-hosted embedded documents

2020-03-27 Thread Hans Hagen

On 3/27/2020 1:16 PM, Johannes Hielscher wrote:


You are 100% right. That's why I did not call it a bug in the first
place, because everyone does their job right, and nothing has to be
fixed. I have found this out the hard way, and just wanted to leave
it somewhere: it might be helpful for someone else scratching their
heads about the sparse evidence of pdftex being less pedantic about
buggy PDFs than luatex.


That's indeed the danger of programs being tolerant. Mupdf, qpdf, xpdf 
all have some (different) strategies in loading files, sometimes they 
ignore the xref, sometimes they fix and recover, etc. The problem with 
such heuristics is that bad pdf stays around (if one already knows what 
gets 'ignored'). There are also similar tricks for dealing with bad fonts 
or font embedding and annotations and such. So one never really knows 
if the pdf file one makes is ok (also because validators don't check 
everything).



As already stated, no mercy for people who have their PDF encoding/
xref tables not under control, and even a bit less in luatex (which
is not necessarily a bad thing!). Fall-out wrt. hard to detect edge
cases in high-level environments included.


Indeed. The most one can expect is a message that something is wrong. Of 
course there can be real bugs in the inclusion, which then need to be 
solved, but we've tested with many thousands of files so it looks ok so 
far.


Hans



Re: [luatex] LuaTeX picky about internal PDF encoding, breaks self-hosted embedded documents

2020-03-27 Thread Hans Hagen

On 3/27/2020 10:57 AM, Johannes Hielscher wrote:

tl;dr: The {filecontents} environment only writes UTF-8 encoded files,
so differently encoded PDFs cannot be referenced by luatex, being more
pedantic about PDF byte offsets (xref etc.) than pdftex.




Hi LuaTeX devs,

Strange issue here, for which I needed some time to find out where the
“error” (if any) is. LuaTeX is a lot more pedantic about PDFs adhering
to standards than pdfTeX. This is not necessarily a bad thing, as nobody
expects broken documents (bad xref tables/stream lengths) to work well
with any program. I totally don't want to request that luatex imitate
pdftex's liberal interpreter (don't make it too easy for folks like me
to manually edit PDF files). But I seemingly found a corner case where
this indeed makes a difference, and I just want to make sure that I'm not
totally on the wrong track with that:

The {filecontents} environment “embeds” plaintext documents into LaTeX
and writes them into a new file. I tried to use this for shipping some
self-contained PDFs for use within the document. This did work for pdftex,
but not for luatex. It turned out that {filecontents} always writes UTF-8
files, but in the copy-pasted (unpacked, so largely ASCII) PDFs, there
is an "%âãÏÓ" encoding safeguard (?) comment (second line, see attachment).
This has a different length in UTF-8 than luatex expected from
the iso8859-1 original internal encoding of the PDF file (that's, just
to make things worse, invisible in some “intelligent” editors and diff
tools). So, LuaTeX will die with:

internal error: unknown image type
!  ==> Fatal error occurred, no output PDF file produced!

Since there is no way (at least I didn't come up with one) to manually
specify the output encoding of {filecontents}, or to trick the PDF input
drivers of luatex into reading PDFs with a different encoding than
usual, this makes embedding self-hosted PDFs in LuaTeX impossible,
given that they have been created in another encoding than UTF-8. Most
probably there are other cases of encoding-sensitive data that someone
might embed via {filecontents} as it has worked in pdftex for ages (at
least in conjunction with filecontents.sty).

Find attached a tarball with a MWE, with PDFs for two \includegraphics
and the third one created during translation of the document. Obviously,
pdftex can cope with the “right” latin1 and the “wrong” UTF-8 PDFs
but luatex cannot. Which engine is closer to “ideal” behaviour? Or did
I overlook something important?

these are two separate issues:

- The utf8 pdf file is wrong in the sense that the xref table is made 
for single byte characters, afaik it counts each multibyte utf 
character as one byte. The pdf library in luatex assumes a correct xref 
table and does no magic in reconstructing (read: gambling). If you want 
bad files to be read you can consider feeding them into some external 
program that fixes them.


- When you embed some pdf stream in the source file you depend on your 
macro package for dealing with how that input results in something 
useable. Luatex is an utf engine and assumes utf input. I don't know 
what that environment does but that's the level at which one has to deal 
with it, as the pdf library is not involved in that.
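
The first point can be made concrete with a few lines of plain Lua 5.3. The sketch below re-encodes a tiny, made-up file header from latin1 to utf8 (which is in effect what happened to the embedded file) and shows how the byte offset of the first object moves, while an xref table would still carry the old offset. The helper name is hypothetical.

```lua
-- Sketch: byte offsets shift when latin1 bytes are re-encoded as utf8.
local function latin1_to_utf8(s)
    -- every byte above 0x7F becomes a two byte utf8 sequence
    return (s:gsub("[\x80-\xFF]", function(c)
        return utf8.char(c:byte())
    end))
end

local comment  = "%\xE2\xE3\xCF\xD3\n"              -- the binary marker line
local original = "%PDF-1.4\n" .. comment .. "1 0 obj\n"
local broken   = latin1_to_utf8(original)

local offset_original = original:find("1 0 obj", 1, true)
local offset_broken   = broken:find("1 0 obj", 1, true)

print(#comment, #latin1_to_utf8(comment))   -- 6 bytes versus 10 bytes
print(offset_original, offset_broken)       -- the object moved by 4 bytes
```

Every object after the comment line is off by those four bytes, so a reader that trusts the xref table looks in the wrong place.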


Hans





Re: [luatex] What changed in today's pretest update?

2020-03-14 Thread Hans Hagen

On 13/03/20, 16:08, Herbert Schulz wrote:



PS: I didn't have to use --shell-escape before.

We also brought loadlib (+the related package loader of require) under 
escape control .. this happens in the preloaded lua stub (which was 
introduced years ago to avoid the need to patch lua itself).


Hans



Re: [luatex] Null bite doesn't write to file

2020-03-05 Thread Hans Hagen

On 3/5/2020 1:07 PM, Phelype Oleinik wrote:

Hi all,

When writing a null byte to a file, as in `\immediate\write\test{abc^^@def}`,
the tokens after the null byte are not written (because of C's \0 string
termination, I suppose?). The code:

\newwrite\test
\immediate\openout\test=testfile.tex
\catcode`\^^@=12
\immediate\write\test{abc   def}
\immediate\write\test{abc^^@def}
\closeout\test
\bye

when run with LuaTeX produces testfile.tex with:

abc def
abc

whereas pdfTeX and XeTeX it produces:

abc def
abc^^@def

(or something like that, depending on the -8bit flag).

Is it a bug? Can it be reasonably fixed?

it's more a side effect of luatex being a mix that uses c strings in 
many places (and some internals can optionally go through callbacks)


anyway, i looked at it and can make it work but this change in behaviour 
will have to wait till *after* the tex live code freeze (which then 
gives folks a year to adapt to it)
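
The difference between the two kinds of strings can be shown in plain Lua. A Lua string carries its length, so a null byte is just another byte, while code that treats the same bytes as a c string stops at the first null. The helper below only models that c view; it is a sketch, not what the engine does internally.

```lua
-- Sketch: a Lua string versus the c string view of the same bytes.
local line = "abc\0def"

-- models strlen/fputs style handling: everything up to the first null byte
local function as_c_string(s)
    return (s:match("^[^\0]*"))
end

print(#line)                 -- the Lua string still has all its bytes
print(as_c_string(line))     -- the c view ends before the null byte
```

This matches the observed output, where only "abc" ends up in the file.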


Hans




Re: [luatex] glyph_nodes / character id instead of unicode value

2020-02-05 Thread Hans Hagen

On 2/4/2020 4:18 PM, Patrick Gundlach wrote:


so far (at least if I understand my own code correctly), I create glyph nodes 
and insert unicode values into the char field to get the desired output. Can I 
put glyph ids there instead? Do I need to specify something somewhere else?

In e.g. pdftex (8 bit font engine) the char index actually is the index 
in a font, although you can cheat a bit by using an encoding vector in 
the backend. This is why in the 8 bit engines one speaks of font 
encoding, etc. The char field is then also related to the hyphenation 
mechanism (patterns).


In luatex it is entirely up to you what you use the field for. If you use 8 
bit fonts, then they can be indexed (as in pdftex), but normally you 
will consider them unicode slots. They are just numbers. When using the 
built in font handler, they can be indices or whatever goes into the 
font's character table, where each entry can also have an index field 
that then maps into the glyph indices vector of a font. Of course when 
you replace some character (maybe using lua) then the reference can be 
something else than unicode.


So, for the frontend it's just numbers, for the backend it depends 
on the presence of an index field or, when you use t1 fonts, the mapping 
in the encoding vector. The backend also looks at an optional tounicode 
field.
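
As a plain Lua model of the above (not LuaTeX's own code): the number in the char field is only a key into the font's character table, and the entry decides which glyph index the backend uses. The font data, the index values and the function name are hypothetical; falling back to the number itself when there is no index field is an assumption of this sketch.

```lua
-- Hypothetical font data shaped like the character table described above.
local font = {
    characters = {
        [0x0066]  = { index = 71 },                           -- "f"
        [0xFB01]  = { index = 192, tounicode = "00660069" },  -- "fi" ligature
        [0xF0000] = { index = 305 },                          -- private slot
        [0x0041]  = { },                                      -- no index field
    },
}

-- What a backend could do with the number found in a glyph node.
local function resolve(font, char)
    local entry = font.characters[char]
    if not entry then
        return nil                     -- no such character in this font
    end
    return entry.index or char, entry.tounicode
end
```

So a private slot like 0xF0000 is as valid a char value as a unicode slot, as long as the character table has an entry for it.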


So, one answer can be: it's anything you like it to be.

In traditional tex there was a distinction between character and glyph 
nodes, in luatex it's all glyphs. The subtype can be used (and is used 
when you leave everything to tex) to flag such a node as being processed 
(by the lig / kern handlers), for instance in order to prevent duplicate 
processing.


Hans



Re: [luatex] glue fill order

2019-10-09 Thread Hans Hagen

On 10/9/2019 12:26 PM, David Carlisle wrote:

On Wed, 9 Oct 2019 at 10:27, luigi scarso  wrote:




On Sat, Oct 5, 2019 at 10:42 PM David Carlisle  wrote:


Is there any chance that luatex could change here to be compatible
with other systems?



we are  discussing it, but there is not an easy solution.


thanks, yes  I'm not sure my "-1" suggestion is an improvement
actually, it was just a suggestion to
think of a possible change.


As is now in luatex, \gluestretchorder is consistent (integer always greater 
than zero and ordered), something else breaks consistency and backward 
compatibility (this is a quite old macro).



Yes my suggestion aligns 0,1,2,3 with pdftex so a test of
\gluestretchorder=1 works the same way but
tests of \gluestretchorder>0 to mean "stretchy" then break.  It may be
that the best can be done is just document it in the manual.
It's a bit weird but it's not the only place where luatex and pdftex
differ (and as far as I can tell it's always been this way in luatex
without
anyone noticing or complaining before:-)


we'll add \eTeXglue[stretch|shrink]order for your purpose which returns 
the -1 so we keep the 1..4 range (nicer in an ifcase) in the normal 
primitives so it's then up to latex folks to decide to alias it in latex 
or not
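
The mapping discussed here can be written down as a small Lua function. It follows what is said in this thread (luatex orders 1..4 for fi, fil, fill, filll; the compatible variant reports fi as -1 and the rest as 1..3) and is a sketch of the intended values, not code taken from the engine. The function name is hypothetical.

```lua
-- Sketch: luatex glue order -> etex compatible glue order.
--   luatex: 0 = normal, 1 = fi, 2 = fil, 3 = fill, 4 = filll
--   etex  : 0 = normal,         1 = fil, 2 = fill, 3 = filll
local function etex_glue_order(order)
    if order == 1 then
        return -1            -- fi has no etex equivalent
    elseif order > 1 then
        return order - 1
    else
        return order         -- 0 stays 0
    end
end
```

With this a test like \gluestretchorder=1 means fil in both worlds, at the price that "greater than zero" no longer covers fi.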


at the lua end nothing will change; after all, node types are also 
different from the etex ones; if needed, you can always overload 
tex.setglue etc. and the interfaces are kind of frozen anyway



David



If it is not possible to change the default behaviour would it be
possible to have a flag settable in the format so that  fi fil fill
filll were (say) -1,1,2,3 not 1,2,3,4?
both for \gluestretchorder and tex.setglue ?



At format level one can patch it, the point is at engine level.


--
luigi





Re: [luatex] \openin does not find files with no extension

2019-07-29 Thread Hans Hagen

On 7/29/2019 6:19 PM, Harald Hanche-Olsen wrote:

From: Hans Hagen (mailto:j.ha...@xs4all.nl)
Date: 29 July 2019 at 17:40:22


but afaik it doesn't remove a suffix first so if an engine explicitly
adds one (we no longer do now in the openin case but pdftex does) one
cannot open a file without suffix


Perhaps pdftex did so in the past, but it doesn't now,
unless I misunderstand what you're saying:

⬥ cat bar.tex
\newread\foo
\openin\foo=foo
\ifeof\foo
   \message{No foo!}
\else
   \read\foo to \bar
   \bar
\fi
\end
⬥ cat foo
\message{This is foo (no extension)}
⬥ pdftex bar
This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) (preloaded 
format=pdftex)
  restricted \write18 enabled.
entering extended mode
(./bar.tex This is foo (no extension) )
No pages of output.
Transcript written on bar.log.


maybe it changed then as

https://www.tug.org/svn/texlive/trunk/Build/source/texk/web2c/pdftexdir/pdftex.web?view=markup

shows in line 32270

if cur_ext="" then cur_ext:=".tex";

(i'm not sure if that's the right source)

for \openin ... anyhow, i don't use pdftex now so i'm not going to 
investigate what really happens deep down (it would be kind of weird if 
that code is still there and the suffix is ignored)


Hans



Re: [luatex] \openin does not find files with no extension

2019-07-29 Thread Hans Hagen

On 7/27/2019 11:18 PM, Reinhard Kotucha wrote:

On 2019-07-27 at 21:49:54 +0200, Hans Hagen wrote:

  > Keep in mind that adding a .tex in case of no suffix being
  > there makes it impossible to open files without suffix.

There is more to consider:  Kpathsea searches each texmf tree for a
file without an extension and if none is found it appends '.tex' to
the name and searches again within the same tree.


but afaik it doesn't remove a suffix first so if an engine explicitly 
adds one (we no longer do now in the openin case but pdftex does) one 
cannot open a file without suffix


so,

\input foo  : search for 'foo' and if not found 'foo.tex'
\input foo.tex  : search for 'foo.tex'
\input foo.bar  : search for 'foo.bar' and 'foo.bar.tex' when enabled


It does *not* search all trees for a file without extension and
restart the search again if none is found.


so, \openin foo, with .tex being appended automatically in the engine 
(not kpse), will never find 'foo'


(in luatex no suffix is appended any more in the engine now)
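
A toy model of the two lookup strategies discussed here, written as plain Lua so it runs without kpathsea. The `files` table stands in for the texmf tree and all three function names are made up for illustration. The real kpse lookup has more cases (for instance the optional 'foo.bar.tex' fallback mentioned above, which is ignored here), so this is only a sketch of the argument, not of the library.

```lua
-- does the name already carry a suffix such as ".tex" or ".bar"?
local function has_suffix(name)
    return name:match("%.[^./]+$") ~= nil
end

-- kpse-like: try the name as given, then name.tex if there was no suffix
local function kpse_like_find(name, files)
    if files[name] then
        return name
    end
    if not has_suffix(name) and files[name .. ".tex"] then
        return name .. ".tex"
    end
    return nil
end

-- an engine that appends ".tex" itself before asking for the file
local function engine_appends_first(name, files)
    if not has_suffix(name) then
        name = name .. ".tex"
    end
    return kpse_like_find(name, files)
end

-- only a suffix-less file 'foo' exists
local files = { ["foo"] = true }
print(kpse_like_find("foo", files))        -- finds 'foo'
print(engine_appends_first("foo", files))  -- nil: 'foo' is never tried
```

In this model the second call can only ever see 'foo.tex', which is the point made above about files without a suffix.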


This is intended behavior.  So people who need files without
extensions should keep this in mind and be very careful.

i didn't look up the specs of the two flags in cnf so i might be wrong

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] \openin does not find files with no extension

2019-07-27 Thread Hans Hagen

On 7/27/2019 7:23 PM, Ulrike Fischer wrote:


yes, the luatex syntax has been extended here (which is fine). That
pdftex doesn't know this syntax is documented and not a problem.
Yes but the same 'add suffix when not set' code is used as in pdftex, 
but luatex did that only in the non-braced case (which pdftex doesn't 
have) while actually checking with a .tex suffix when not found is a 
kpse feature controlled by some flags in the cnf so even for pdftex one 
can wonder what is the right way (support for {} shouldn't be too hard 
for pdftex). Keep in mind that adding a .tex in case of no suffix being 
there makes it impossible to open files without suffix.


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Identical languages

2019-03-29 Thread Hans Hagen

On 3/29/2019 6:32 PM, Javier Bezos wrote:

Does luatex do any kind of internal memory optimization
when there are two languages with exactly the same
hyphenation patterns? For example, if they share a
pointer to a single pattern list.

no

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Core dump error

2019-03-17 Thread Hans Hagen

On 3/17/2019 3:44 PM, Robert Alessi wrote:

On Sun, Mar 17, 2019 at 03:21:58PM +0100, luigi scarso wrote:

intercepting this 'deleted or not present object' is ok, but best also
check your image as there is a reference to an object not present here
(probably harmless here but if the missing object is a font resource
then it could be a problem)

Hans




Ok...this refer to another bug, unrelated... :-)


I didn't check at first, but yes, this is a different issue.
the page stream refers to a state (shade probably) that doesn't exist so 
in the end you depend on the tolerance of the viewer


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Core dump error

2019-03-17 Thread Hans Hagen

On 3/17/2019 2:30 PM, Robert Alessi wrote:

On Sun, Mar 17, 2019 at 01:55:53PM +0100, Ulrike Fischer wrote:

Am Sun, 17 Mar 2019 13:40:24 +0100 schrieb Robert Alessi:


With the update today the crash is gone.
  

At least not on my side: I got the update this morning, but compiling:



I'm on windows and luatex --credits says that I have now

Development id: 7116


Thank you.  On Linux, I still have 7110.  I will check again when the
next update comes.
intercepting this 'deleted or not present object' is ok, but best also 
check your image as there is a reference to an object not present here 
(probably harmless here but if the missing object is a font resource 
then it could be a problem)


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] behavior of string.format

2019-03-11 Thread Hans Hagen

On 3/11/2019 12:48 AM, Norbert Preining wrote:

On Sun, 10 Mar 2019, Reinhard Kotucha wrote:

continue the discussion at the bonfire...


Oh I **hate** you all for being able to go to the "bonfire" ... ;-)


wasting time on discussing 'the assumption that %d will take your 
float and nicely truncate it anywhere' over 'use floor instead' could 
actually waste a nice bonfire


in fact, auto conversion from 'number to string' and 'string to number' 
isn't really worth a bonfire discussion either
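
For readers who missed the thread, a small sketch of the 'use floor instead' point. It assumes Lua 5.3 semantics, where %d refuses a float that has no exact integer value; older Lua versions may behave differently on the first call, which is why it is wrapped in pcall and no particular result is claimed for it.

```lua
local x = 3.7

-- Under Lua 5.3 a float without an exact integer value is rejected by %d,
-- so the call is wrapped in pcall to show the outcome without aborting.
local ok, err = pcall(string.format, "%d", x)
print(ok, err)

-- Being explicit about the rounding works regardless of the Lua version.
local s = string.format("%d", math.floor(x))
print(s)
```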


so ... i might as well skip the bonfire this year

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Luatex 1.09.0 announcement (following)

2018-11-30 Thread Hans Hagen

On 11/30/2018 8:41 AM, Werner LEMBERG wrote:


About a month ago I wrote:


We have a new pdf parser (pplib from Paweł Jackowski) that replaces
poppler.  It is much smaller, a bit faster and it's written in
pure C [...]


Is there a project page for pplib?  The source code of this library
contained in TeXLive is very, very uncommented – in particular, a
description of the API is completely missing, AFAICS.  It also comes
with overly long lines and extremely densely written C code; it
almost feels as if the original source has been written with cweb or
something like that.


I would be glad if someone could answer my question.


During bachotex 2018 Pawel Jackowski (son of Jacko -- tex gyre project) 
showed me some code and after looking at it we realized that it could be 
used as drop in for poppler.


In luatex the pdf library is actually not used that much: it can open 
a pdf file and traverse the object tree. It has no further role in the 
backend which copies and creates objects itself. So, a lightweight drop 
in basically was considered doable quite well. Pawel explicitly limited 
the functionality to a bare minimum: opening a file and traversing 
objects. (But it's quite advanced as for instance we can also access 
password protected files).


So, basically it went this way: pawel wrote the code, I replaced the 
inclusion code and rewrote the pdf access library (so that one got a 
different interface but the old one was way more complex and even has 
issues; we're not compatible here). Then luigi spent quite some time on 
integrating the library in the luatex source tree.


The final integration involved dealing with cross platform issues. 
Especially the arm platform with different alignment rules took some 
work (luigi and pawel sorted that out eventually). We had some feedback 
from context testers (it's also always debatable to what extent one 
should support fuzzy cases, bad documents etc).


There might still be corner cases to cover but we expect all to be ready 
in time for tex live 2019. The biggest advantage is that we got rid of a 
c++ dependency and that the code (which is unlikely to change much) is 
part of the luatex code base. So it's in fact a library specially made 
for luatex originating in the tex community.


I hope that explains it a bit (there is not much more to tell i guess; 
normally this kind of progress gets reported in status articles),


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] PDF outlines in Lua mode

2018-10-21 Thread Hans Hagen

On 10/20/2018 8:27 PM, Patrick Gundlach wrote:


I was starting with a PDF destination set with a whatsit node:

d = node.new("whatsit","pdf_dest")
d.named_id = 0
d.dest_type = 0
d.dest_id = 1234
node.write(d)


the same as on the tex end

foo = pdf.reserveobj()

d.objnum = foo


this writes, if I remember correctly, a destination object such as "[3 0 R/XYZ 
133.768 707.016 null]"

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Accented characters on Windows / lfs

2018-09-28 Thread Hans Hagen

On 9/28/2018 9:25 PM, Reinhard Kotucha wrote:

On 2018-09-28 at 14:17:20 -0400, maxwell wrote:

  > On 2018-09-28 07:07, Harald Hanche-Olsen wrote:
  > > From: Hans Hagen 
  > > Date: 28 September 2018 at 12:07:03
  > >
  > > afaik windows has no utf filenames, so when i save a file with that
  > > name
  > > i get
  > >
  > > cöw.txt
  > >
  > > (internally i think names become unicode16 and display depends on
  > > the  code page)  ...  (But I am not a windows user myself, nor do
  > > I know much about windows, so I have nothing to contribute other
  > > than this reference. Sorry if it is off the mark or irrelevant.)
  >
  > I think this is fundamentally correct, but just in case: Windows
  > supports Unicode UTF-16 in file names in NTFS-based file systems
  > (but not in the earlier FATxx file systems).  NTFS was introduced
  > in Windows NT in 1993, and became a part of consumer-based Windows
  > systems with Windows 2000: https://en.wikipedia.org/wiki/NTFS If
  > you're getting weird characters (like in the line quoted above),
  > it's likely that you're viewing them in a non-UTF16 application.
  > So yes, in such applications the display depends on the code
  > page--although code pages themselves are largely deprecated in
  > modern versions of Windows, in favor of Unicode:
  >
  > 
https://en.wikipedia.org/wiki/Windows_code_page#Problems_arising_from_the_use_of_code_pages

It's not sufficient to declare code pages deprecated as long as they
are unavoidable.  The default code page of the CLI is CP850 in Western
Europe.  According to Phil Taylor it's possible to switch to UTF-8
with

   chcp 65001

but this only works if the font used in the terminal window is "Lucida
Console".  I can't imagine why it depends on a particular font but I
tried and it obviously works.


i've been using dejavu mono for many years with success with utf in the 
console


(btw, the console code in recent windows 10 is rewritten and also much 
faster)


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] callback for image loading?

2018-09-28 Thread Hans Hagen

On 9/28/2018 3:19 PM, Patrick Gundlach wrote:

Hello all,

is there a callback for image loading (reading)?

I use

img.scan{filename = "some filename" }

which (if I understand it correctly) calls find_image_file (with kpathsea 
disabled) but no reader callback.

Why I am asking: I have an image caching mechanism for images got via http(s) 
and I'd like to serve these images from memory.
no, because luatex parses the image (random access) and buffering huge 
bitmaps is not really an option


what i do in such a case is cache on disk
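
A minimal sketch of such a disk cache in plain Lua. The names `cache_name` and `cached_image` are made up for this example, and `fetch` is a caller-supplied function (url in, image bytes out) standing in for whatever http(s) code is already in place. The file-name scheme is deliberately naive and could collide for similar URLs.

```lua
-- Turn a URL into a file name that is safe on most file systems.
local function cache_name(cachedir, url)
    local safe = url:gsub("[^%w%.%-]", "_")
    return cachedir .. "/" .. safe
end

-- Return a local path for the image, fetching it only when not cached yet.
-- 'fetch' is a caller-supplied function: url -> string with the image bytes.
local function cached_image(url, cachedir, fetch)
    local path = cache_name(cachedir, url)
    local f = io.open(path, "rb")
    if f then
        f:close()
        return path              -- cache hit, nothing to download
    end
    local data = assert(fetch(url), "fetch failed for " .. url)
    f = assert(io.open(path, "wb"))
    f:write(data)
    f:close()
    return path
end
```

The returned path can then be passed as the filename to img.scan, as in the question, so luatex still gets a real file to parse with random access.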

Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Accented characters on Windows / lfs

2018-09-28 Thread Hans Hagen

On 9/28/2018 9:19 AM, Patrick Gundlach wrote:

Hello all,

I have a problem with Windows, accented characters and lfs.

My code is something like this:


for entry in lfs.dir(dir) do
   ...
end

and I have a file named 'cöw.pdf'

(LATIN SMALL LETTER O WITH DIAERESIS, U+00F6)

and the "entry" variable above has the bytes

63 F6 77 2E 70 64 66
c  ö  w  .  p  d  f


So the ö is encoded as F6.

Is it possible to get utf8 encoding there? Or do I need a mapping such as:

filename on disk -> utf8 -> filename on disk (for file access)?

Any advice on this topic?
afaik windows has no utf filenames, so when i save a file with that name 
i get


 cöw.txt

(internally i think names become unicode16 and display depends on the 
code page)


so, if you see

63 F6 77 2E 70 64 66

that's just bytes ... so you need to recode

i'll mail you a solution
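
The mailed solution is not part of this thread. As a rough illustration of what recoding could look like, here is a sketch that assumes the bytes are ISO-8859-1, where each byte value equals its code point. The Windows code page 1252 differs from that in the range 0x80-0x9F, so a real solution would need a mapping table for those bytes, plus the reverse conversion for file access. The function name is made up.

```lua
-- Convert a string of ISO-8859-1 bytes to UTF-8.
local function latin1_to_utf8(s)
    return (s:gsub("[\128-\255]", function(c)
        local b = c:byte()
        -- code points 0x80..0xFF need two UTF-8 bytes: 110xxxxx 10xxxxxx
        return string.char(0xC0 + math.floor(b / 64), 0x80 + b % 64)
    end))
end

local name = "c\246w.pdf"   -- 63 F6 77 2E 70 64 66, as in the question
local utf = latin1_to_utf8(name)
print(utf:byte(1, -1))      -- 99 195 182 119 46 112 100 102
```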

Hans




-
      Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Subscript drop too large

2018-09-27 Thread Hans Hagen

On 9/27/2018 12:37 PM, Henri Menke wrote:

On 9/27/18 7:50 PM, Hans Hagen wrote:

On 9/27/2018 7:23 AM, Henri Menke wrote:

Dear list,

When using a Unicode math font, the subscript drop increases by a lot.
It does not seem to be font dependent, so it's likely an engine issue.
Is there any parameter which I could tweak?


all kind of math parameters .. like

$ \the\Umathsubshiftdrop\textstyle\quad {gt}_{gt} $
$ \Umathsubshiftdrop\textstyle.2ex  {gt}_{gt} $


This partly answers the question, but it seems that there is something
wrong with subscripts after boxes, see MWE below.  With XeTeX it looks
okay and the drop is basically the same with and without a Unicode math
font, but with LuaTeX the drop after the box is much larger with a
Unicode math font whereas the drop after a regular atom is the same.

\ifdefined\directlua
   \input luaotfload.sty
\fi
\font\math="Latin Modern Math" at 10pt
$\hbox{gt}_{gt} gt_{gt}$
\textfont0=\math
$\hbox{gt}_{gt} gt_{gt}$
\bye
the drop shifts kick in after non-characters so if you don't want that 
you can just set them to zero (instead of the value set by the font)


(afaiks it looks the same as in pdftex)

Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Subscript drop too large

2018-09-27 Thread Hans Hagen

On 9/27/2018 12:37 PM, Henri Menke wrote:

On 9/27/18 7:50 PM, Hans Hagen wrote:

On 9/27/2018 7:23 AM, Henri Menke wrote:

Dear list,

When using a Unicode math font, the subscript drop increases by a lot.
It does not seem to be font dependent, so it's likely an engine issue.
Is there any parameter which I could tweak?


all kind of math parameters .. like

$ \the\Umathsubshiftdrop\textstyle\quad {gt}_{gt} $
$ \Umathsubshiftdrop\textstyle.2ex  {gt}_{gt} $


This partly answers the question, but it seems that there is something
wrong with subscripts after boxes, see MWE below.  With XeTeX it looks
okay and the drop is basically the same with and without a Unicode math
font, but with LuaTeX the drop after the box is much larger with a
Unicode math font whereas the drop after a regular atom is the same.

\ifdefined\directlua
   \input luaotfload.sty
\fi
\font\math="Latin Modern Math" at 10pt
$\hbox{gt}_{gt} gt_{gt}$
\textfont0=\math
$\hbox{gt}_{gt} gt_{gt}$


- fwiw i only test in context (i don't know what luaotfload does with math)

- afaiks your \math font is not set up as math

Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Subscript drop too large

2018-09-27 Thread Hans Hagen

On 9/27/2018 7:23 AM, Henri Menke wrote:

Dear list,

When using a Unicode math font, the subscript drop increases by a lot.
It does not seem to be font dependent, so it's likely an engine issue.
Is there any parameter which I could tweak?


all kind of math parameters .. like

$ \the\Umathsubshiftdrop\textstyle\quad {gt}_{gt} $
$ \Umathsubshiftdrop\textstyle.2ex  {gt}_{gt} $

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] How to insert mathkerns?

2018-09-15 Thread Hans Hagen

On 9/15/2018 7:50 AM, Henri Menke wrote:

Dear list,

I'm trying to add mathkerns to a font on the fly.  I adapted the code
from ConTeXt in good-mth.lua but nothing happens.  How can I make this
work?  MWE below.
N.B. I'm using 1.09.0 svn6938 with Ulrike's fontloader 2018-09-01

Cheers, Henri

---

\input luaotfload.sty
\directlua{
function kern_right_fence(tfmdata)
 if tfmdata.mathparameters then
 local characters = tfmdata.characters
 if characters[0x1D44E] then % we have at least an italic a
 print("PATCHING FONT " .. tfmdata.psname)
 characters[0x1D453].mathkerns = {
 force = true,
 bottomright = { { kern = 1000 } },
 }
 end
 end
end
%
luatexbase.add_to_callback("luaotfload.patch_font",
kern_right_fence,
"kern right fence")
}

\font\lmmath="Latin Modern Math:script=math;" at 10pt
\textfont0=\lmmath

\Umathcodenum`e="1D452
\Umathcodenum`f="1D453
\Umathcodenum`g="1D454

$efg$


I don't know how the plugs in otfload work but it's more something like

characters[0x1D453].math =
  { kerns = { bottomright = { { kern = 1000 } } } }

where of course you need to make sure that when there is already a math 
table you add instead of replace; eventually that becomes something scaled


characters[0x1D453].mathkern.bottom_right = ...

don't confuse a higher level context interface with low level font 
properties
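
The 'add instead of replace' remark can be sketched in plain Lua as below. The field names `mathkern` and `bottom_right` are taken from the line above; the helper name `add_mathkern` and the sample table are made up, and whether an entry needs more fields than `kern`, or an already scaled value, is something to check against the luatex manual.

```lua
-- Append one kern entry to a corner of a character's mathkern table,
-- creating the tables only when they are missing.
local function add_mathkern(character, corner, kern)
    local mathkern = character.mathkern or {}
    local list = mathkern[corner] or {}
    list[#list + 1] = { kern = kern }
    mathkern[corner] = list
    character.mathkern = mathkern
end

-- stand-in for tfmdata.characters[0x1D453], with an existing entry
local f = { mathkern = { top_right = { { kern = 50 } } } }
add_mathkern(f, "bottom_right", 1000)
```

The existing top_right entry is kept and the new corner is added next to it.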


Hans


-----
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] new commands in luatex 1.08/9

2018-09-01 Thread Hans Hagen

On 9/1/2018 10:46 AM, David Carlisle wrote:

While checking the latex interfaces for the luatex update, the
following primitives which don't seem to be in the manual came up


 + compoundhyphenmode
 + endlocalcontrol
 + fixupboxesmode
 + gtoksapp
 + gtokspre
 + mathflattenmode
 + mathrulethicknessmode
 + xtoksapp
 + xtokspre

from the names and a quick look at the sources, we can see what some
of them do, but are these experimental additions that may go, or are
they new primitives that just missed the manual in the current
revision.  Either way some hints of what they do would be useful,
thanks:-)
once i've tested them (read: let them test in production by ctx users) 
they will become official in 1.10 (although for instance fixupboxesmode 
and endlocalcontrol might stay experimental for quite a while as they 
are really experimental)


(thanks for checking, i'll add them to the todo list)

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] "binary" marker missing in luatex 1.09, Development id: 6884

2018-08-12 Thread Hans Hagen

On 8/12/2018 10:20 AM, Patrick Gundlach wrote:




Am 12.08.2018 um 10:00 schrieb Hans Hagen :

On 8/11/2018 5:28 PM, Ulrike Fischer wrote:



In the pdf there is the "binary" marker on the line after the pdf
version missing. The uncompressed pdf starts like this:
%PDF-1.5
3 0 obj
<< /Length 449 >>
instead of
%PDF-1.5
%ÐÔÅØ
3 0 obj
<>

It's on purpose.


At least PDF A-1 (ISO 19005-1:2005(E)) requires these binary markers.

"6.1.2 File header
The % character of the file header shall occur at byte offset 0 of the file.

The file header line shall be immediately followed by a comment consisting of a 
% character followed by at
least four characters, each of whose encoded byte values shall have a decimal value 
greater than 127."
ok, i'll add a suitable blob then ... (i don't have these iso standards 
as i'm not going to spend money on iso standards)
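
For checking a generated file against the clause quoted above, a small plain-Lua sketch. The function name is made up, the header pattern only accepts a single-digit x.y version, and it only looks at LF or CRLF line ends, so treat it as an approximation of the rule rather than a validator.

```lua
-- True when the line after the %PDF-x.y header is a comment whose first
-- four bytes all have a value greater than 127.
local function has_binary_marker(pdf)
    local rest = pdf:match("^%%PDF%-%d%.%d\r?\n(.*)$")
    if not rest then
        return false             -- no header at byte offset 0
    end
    local comment = rest:match("^%%([^\r\n]*)")
    if not comment or #comment < 4 then
        return false             -- next line is not a long enough comment
    end
    for i = 1, 4 do
        if comment:byte(i) <= 127 then
            return false
        end
    end
    return true
end

print(has_binary_marker("%PDF-1.5\n%\208\212\197\216\n3 0 obj\n"))
print(has_binary_marker("%PDF-1.5\n3 0 obj\n"))
```

The first string mimics the old output with the four high bytes, the second the header reported at the start of this thread.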


Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] "binary" marker missing in luatex 1.09, Development id: 6884

2018-08-12 Thread Hans Hagen

On 8/11/2018 5:28 PM, Ulrike Fischer wrote:

I just got the newest luatex 1.09, development id: 6884, from
w32tex.org and ran some tests.

In the pdf there is the "binary" marker on the line after the pdf
version missing. The uncompressed pdf starts like this:

%PDF-1.5
3 0 obj
<< /Length 449 >>

instead of

%PDF-1.5
%ÐÔÅØ
3 0 obj
<>
It's on purpose. (Btw, there were cases when whole blobs of e.g. 
executable code were put there.) The proper way to identify a file 
doesn't depend on that piece of obscured "PTEX" but on metadata and such.


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Inspect last node without removing

2018-08-03 Thread Hans Hagen

On 8/3/2018 2:47 AM, Henri Menke wrote:

Dear list,

Sometimes I want to break out into Lua during typesetting and just 
inspect the last node at the current point.  LuaTeX gives me the 
opportunity to get the last node using node.last_node().  However this 
pops the node from the list, i.e. it disappears from the output stream. 
I don't want this.  I simply want to inspect the node and maybe insert 
something in the current list.  I'm not even able to reinsert the node 
because node.write(node.copy(last)) confuses the parbuilder.  Also, how 
can I get the current list?  MWE below.


the confusion is because node.write appends immediately so probably in the 
wrong spot


anyway, you can do

a\directlua{print(tex.nest[tex.nest.ptr].tail)}

assuming of course that you ask for it at the right time and level

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Re-order pages

2018-08-01 Thread Hans Hagen

On 8/1/2018 2:41 PM, Patrick Gundlach wrote:

Hi David,


I'm not sure you can do it in the not-yet finalised pdf but if you
just write out your table of contents pages to the end of the document
there are several pdf utilities that can re-order the pages


I'd like to keep away from external tools as they a) tend to require quite an 
installation (jre, python or such - I'd like to stay with the pure LuaTeX 
binary) and b) the functionality was already tested with pdfTeX/LuaTeX. So this 
might not be too far away.
that experimental pdftex page divert functionality of over a decade ago 
involved all kind of messy primitives ... that won't come back


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Re-order pages

2018-08-01 Thread Hans Hagen

On 8/1/2018 2:18 PM, Patrick Gundlach wrote:


pdf is capable of rendering the pages in an order that is not necessarily the 
order in which they were written. Let's say you write 10 pages into the pdf 
file, you can tell afterwards that the logical order is

1 7 2 3 4 5 6 8 9 10

Now I'd like to have an interface to that.

The application is (as I have written in the first post) writing a table of 
contents. When I am at the end of a document, I know all the entries for the 
table of contents including the page numbers they appear on. Now instead of 
writing this information on the harddrive and reading it back in the second 
run, I could create another page (toc) and insert this as the real table of 
contents.

In the example above I write the table on page 7 and tell the PDF that it 
should display page number 7 after the first one. Of course, the visible page 
numbers should be in order, but this is not what I am asking.

My documents are sometimes really big (20 minutes to 40 minutes) and this would 
speed up things dramatically.
for that you need to adapt the page tree to flush a different order and 
i have been playing with that idea a while ago but didn't like messing 
up the backend for that (it would involve some caching and such)


i can think of a dirty hack where we have a callback that fetches the 
required page object number for each entry in the page tree and as one 
can ask for the object number that could work ok, think of (probably 
wrong but you get the idea):


local noftocpages = 2
local nofdocpages = 3

function gimmethepageobject(pagenumber)
    if pagenumber > noftocpages then
        return pdf.getpageref(pagenumber - noftocpages)
    else
        return pdf.getpageref(nofdocpages + pagenumber)
    end
end

So think about this:

1   123 doc->  126
2   124 doc->  127
3   125 doc->  123
4   126 toc->  124
5   127 toc->  125

no checking of errors, no checking for valid numbers ... just a simple hack

(i can add such a callback if needed)
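
The mapping in the table above can be checked outside luatex with a stub. Here `pageref` and the `written` table are stand-ins for pdf.getpageref and the real object numbers, so this only verifies the arithmetic of the hack, not the callback itself (which does not exist yet).

```lua
local noftocpages = 2
local nofdocpages = 3

-- object numbers in the order the pages were written: 3 doc pages, 2 toc pages
local written = { 123, 124, 125, 126, 127 }

-- stand-in for pdf.getpageref, which needs a running luatex
local function pageref(n)
    return written[n]
end

local function gimmethepageobject(pagenumber)
    if pagenumber > noftocpages then
        return pageref(pagenumber - noftocpages)
    else
        return pageref(nofdocpages + pagenumber)
    end
end

-- collect the object number for every entry in the page tree
local order = {}
for page = 1, noftocpages + nofdocpages do
    order[page] = gimmethepageobject(page)
end
print(table.concat(order, " "))   -- 126 127 123 124 125
```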

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Misplaced letter

2018-07-31 Thread Hans Hagen

On 7/31/2018 3:17 PM, Ulrike Fischer wrote:

Am Wed, 25 Jul 2018 19:38:26 +0200 schrieb Hans Hagen:


or you can try to add this to

nodes.simple_font_handler

  if not direction then
  direction = tex.get("textdir") -- experiment
  end


Is there a reason why this isn't there yet by default?
Could it break somewhere?
because normally it gets the direction of the encapsulating box (while 
align cells don't have that)


anyway, direction specific issues are also macro package dependent and 
context and latex might (and probably do) approach things differently so 
we can best play safe there


Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Misplaced letter

2018-07-26 Thread Hans Hagen

On 7/26/2018 3:26 PM, Javier Bezos wrote:


\font\at= [c:/Windows/Fonts/arabtype.ttf]:%
    mode=node;script=arab;+init;+medi;+fina;+rlig;
how about also adding: 'kern' 'mark' 'mkmk' 'curs' as well as 'analyze' 
and maybe 'liga', 'calt', 'clig' and more


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Misplaced letter

2018-07-25 Thread Hans Hagen

On 7/25/2018 11:53 PM, Ulrike Fischer wrote:


does it still panic if two object are written?



i don't know .. i've never done that .. but i would not be surprised as
one can mess up in many ways (and adding all kind of checks for what are
programming errors at the lua end


Well I didn't do it on purpose and it wasn't at the lua end


i changed the error to:

! error:  (pdf backend): scheduled object is already used

so we still quit as we do with other errors, but with a less dramatic 
PANIC message.


(fwiw, we have lots of different warnings and errors than pdftex if only 
because the backend code evolves differently)


Hans
-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Misplaced letter

2018-07-25 Thread Hans Hagen

On 7/25/2018 11:53 PM, Ulrike Fischer wrote:


Robert's example was a simple plain tex example:

https://tug.org/pipermail/luatex/2018-May/006818.html

As far as I can see \letterspacefont doesn't like virtual fonts
anymore.
i think i found the reason (wrong packet size), so i have no abort now 
(i have no clue if the result is ok because i only tested loading the 
tfm/vf pair)


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Misplaced letter

2018-07-25 Thread Hans Hagen

On 7/25/2018 8:29 PM, Ulrike Fischer wrote:

Am Wed, 25 Jul 2018 19:38:26 +0200 schrieb Hans Hagen:



The spacing on the left of the left lines looks a bit wrong. And the
numbers should be perhaps read 1234. But I don't know enough about
arabic fonts to be able to setup a comparision with context and with
the luatex-plain code. And so I can't decide if the problem is with
luaotfload/the fontloader or some missing feature in the font
definition. Perhaps you could try too and show your output?


in such a low level halign the direction is not known so one should
either add a \textdir TRT in the cell or you can try to add this to


When I added \textdir to the cell the content disappeared ;-).
(Perhaps it is outside the text margin or whatever).


how about

{\textdir ... }

does it still panic if two object are written?


i don't know .. i've never done that .. but i would not be surprised as 
one can mess up in many ways (and adding all kind of checks for what are 
programming errors at the lua end will not happen i.e. only trivial 
tests that don't hit performance) .. it's a side effect of opening up 
that one can introduce conflicts



Did you find the source for Roberts problem with the "! error:
(vf): invalid DVI command"?
i didn't look into something like that; is there a simple test i can do 
without installing a ton of code? btw, if it's a bad font then such a 
message sounds ok to me ... using for instance a non-pdf vf special in a 
virtual font is invalid in pdf output mode as is a vf pdf special in dvi 
mode


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Misplaced letter

2018-07-25 Thread Hans Hagen

On 7/25/2018 6:12 PM, Ulrike Fischer wrote:

Am Sat, 21 Jul 2018 22:13:01 +0200 schrieb Hans Hagen:



afaik arabic works ok in context (i'd have gotten complaints from users
otherwise) so maybe you don't enable the right features


Javier posted an example at

https://github.com/lualatex/luaotfload/issues/423

I posted there also the output I get with a current fontloader and
the patches we made to luaotfload.

The spacing on the left of the left lines looks a bit wrong. And the
numbers should be perhaps read 1234. But I don't know enough about
arabic fonts to be able to setup a comparision with context and with
the luatex-plain code. And so I can't decide if the problem is with
luaotfload/the fontloader or some missing feature in the font
definition. Perhaps you could try too and show your output?


in such a low level halign the direction is not known so one should 
either add a \textdir TRT in the cell or you can try to add this to


nodes.simple_font_handler

if not direction then
direction = tex.get("textdir") -- experiment
end

or whatever variant you use .. both work ok

how numbers should read is not up to the fontloader code


Be aware that Ulrike takes recent version from the context distribution
so it might not be a drop in for texlive.


If Javier needs a newer fontloader we will have to push the patched
files and the newer fontloader to CTAN. luaotfload will break with
luatex 1.09 so at some time this will have to be done anyway. The
main point of my current tests is to be prepared for such a
situation ...


Isn't it more the reverse? That a more recent fontloader expects 1.09+
(it also depends on what compatibility hacks you add)?

(Btw, not that much has changed in the font code recently, only 
that generic lua font lookup stuff you wanted.)


(Also, @ version 1.10 luatex is more or less complete, apart from low 
level improvements nothing fundamental will change / be added.)


Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] Misplaced letter

2018-07-21 Thread Hans Hagen

On 7/21/2018 11:24 AM, Javier Bezos wrote:

Still struggling, after several months, with Arabic fonts :-(.
This is the only thing I need to release a new babel with support
for Arabic, but I'm stuck.

Hans:

     > I'll send Ulrike something to test.


afaik arabic works ok in context (i'd have gotten complaints from users 
otherwise) so maybe you don't enable the right features



Ulrike:

    With the patched function (added to a fontloader from 2017-12-12)
    the output of both examples looks ok (that means that the paragraphs
    are identical, I have no idea if they are correct ;-))


Please, could you post the fix here or send it to me?


Be aware that Ulrike takes recent version from the context distribution 
so it might not be a drop in for texlive.



By the way, I've attempted to use the context loader with:

---
[run]
fontloader = context;
---


with 'context' being ... actually the generic fontloader uses the fact 
that 'context' is undefined but i assume that it never was an issue in 
the wrapped code



in luaotfload.conf and with the help of fontloader-luaotfload,
but it doesn't work, with the error:

-
luaotfload | db : Reload initiated (formats: otf,ttf,ttc); reason: "File not found: lmroman10-regular.".
(load luc: C:/Aplicaciones/TeXLive/texmf-var/luatex-cache/generic/fonts/otl/lmroman10-regular.luc)
! Font \TU/lmr/m/n/10=[lmroman10-regular]:+tlig; at 10pt not loadable: metric data not found or bad.
-

A fix for this error would be also most welcome. The following
loads the fonts, but Arabic is still buggy:

---
[run]
fontloader  = fontloader-reference-2017-08-18.lua;
---

--
Javier







Re: [luatex] loading new language patterns

2018-05-20 Thread Hans Hagen

On 5/19/2018 10:27 PM, Luis Rivera wrote:

I'm trying to understand how to load a new set of language patterns in
plain luatex. So far, I gather I have to do

%%%
\directlua{
grc = lang.new()


grc = lang.new(123)


pattfile = io.open(kpse.find_file('grchyph.tex'), 'r')
lang.patterns(grc, pattfile:read('*all'))
pattfile:close()
}

\def\greek{\language\grc}


\def\greek{\language123 }



\greek χαῖρε
\bye
%%%

But then I don't know how to access the patterns with a language declaration.

I'm sorry, but I fear the documentation is far too terse on this topic
for a non-programmer. Any hints are welcome.

it assumes that you know tex (programming)

Hans
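
Putting the two inline corrections above together (lang.new(123) and
\language123 ) gives something like the following. This is only a minimal
sketch for plain LuaTeX: the id 123 and the file name grchyph.tex come from
the thread, while the assumption that the file contains bare patterns (and
no TeX macros such as \patterns) is mine, because lang.patterns takes a
plain pattern string.

```tex
% Plain LuaTeX sketch. Assumption: grchyph.tex holds bare patterns only.
% TeX strips these percent comments before Lua sees the code.
\directlua{
  local grc  = lang.new(123) % explicit language id
  local path = assert(kpse.find_file('grchyph.tex'), 'pattern file not found')
  local file = assert(io.open(path, 'r'))
  lang.patterns(grc, file:read('*all'))
  file:close()
}

% Select the patterns through the same numeric id.
\def\greek{\language123 }

% A font with Greek glyphs must also be loaded for visible output.
\greek χαῖρε
\bye
```

The Lua variable grc is not visible at the TeX end, which is why the macro
uses the number 123 rather than \grc.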



Re: [luatex] luatex crash when \hyphenation contains a discretionary

2018-05-19 Thread Hans Hagen

On 5/19/2018 6:01 PM, Ulrike Fischer wrote:

Am Mon, 7 May 2018 12:38:23 +0200 schrieb Ulrike Fischer:


When I compile this document with context + luatex 1.08 luatex
crashes:


[...]

The crash no longer happens with a luatex 1.09.

But now I wonder, is there/should there be a difference between

\hyphenation{mul-ti{-}{}{-}word{-}{}{-}boun-daries}

and

\hyphenation{mul-ti=word=boun-daries}

?  In my tests they give the same output.

if you input

multi-word-boundaries

that becomes

multi<disc>word<disc>boundaries

and when matching, the = is just a skip operation, so we jump over the 
<disc> and it stays

as {-}{}{-} injects a similar disc, indeed you get the same

Hans
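
A small plain LuaTeX test file for comparing the two exception forms from
this thread could look as follows. It is a sketch only: the narrow \hsize
and the warning-silencing settings are my own assumptions, chosen to
provoke line breaks so that any difference between the forms would show up
in the output.

```tex
% Plain LuaTeX sketch for comparing the two exception forms.
\parindent=0pt
\hsize=1pt                       % assumption: narrow measure to provoke breaks
\hbadness=10000 \hfuzz=\maxdimen % silence the expected box warnings

\hyphenation{mul-ti=word=boun-daries}
% \hyphenation{mul-ti{-}{}{-}word{-}{}{-}boun-daries}

multi-word-boundaries

\bye
```

Run it once as written and once with the commented \hyphenation line
swapped in, then compare the two results.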



Re: [luatex] this can't happen error with Umiddle (luatex 1.07/texlive 2018)

2018-05-17 Thread Hans Hagen

On 5/17/2018 4:08 PM, David Carlisle wrote:

On 17 May 2018 at 10:36, Hans Hagen <j.ha...@xs4all.nl> wrote:

On 5/16/2018 10:16 AM, David Carlisle wrote:


The following plain tex example


$\left( A \middle|  \right)$ %OK



$\Uleft( A \Umiddle class 2 | B \Uright)$ %OK


$\Uleft( A \Umiddle class 2 | \Uright)$ %oops

\bye

produces


! This can't happen (mathspacing).
l.10 $\Uleft( A \Umiddle class 2 | \Uright)$
 %oops
!  ==> Fatal error occurred, no output PDF file produced!



I'd actually meant to have the A|B but the error message for my typo
seemed a bit harsh!

Having an empty math list to the right of a class 2  \Umiddle might be
odd but shouldn't be an error I assume?


it's a side effect of checking math spacing for noad pairs ... as we default
to zero for (traditionally unknown) combinations anyway, we can make that
check less stringent (so luatex 1.09+ will not abort)

Hans


Thanks. I  guessed as much.

Actually that leads to another question which I don't think is quite
clear from the manual.
If you use class 2 (or 3) with any of \Uleft/right/middle at the start
or end of a math list
do they lose their mathbin/mathrel status and become \mathord, as a
classic \mathbin{} does?


that was indeed the assumption (that these combinations could not happen)


I'm guessing that the issue is that \Umiddle class 2 _doesn't_ lose
its mathbin status at the end of the list
and so put you (or me:-) in the previously impossible situation of a
class 2 atom at the end of the mathlist?

although we tag middle fences in luatex, we follow the old approach, which 
treats a middle fence as a left one and therefore assumes that there is 
something at the right of it (which makes sense: why use a middle and not 
a right otherwise?), and because in luatex we have more spacing 
combinations we ran into this special spacing case


Hans




Re: [luatex] this can't happen error with Umiddle (luatex 1.07/texlive 2018)

2018-05-17 Thread Hans Hagen

On 5/16/2018 10:16 AM, David Carlisle wrote:

The following plain tex example


$\left( A \middle|  \right)$ %OK



$\Uleft( A \Umiddle class 2 | B \Uright)$ %OK


$\Uleft( A \Umiddle class 2 | \Uright)$ %oops

\bye

produces


! This can't happen (mathspacing).
l.10 $\Uleft( A \Umiddle class 2 | \Uright)$
%oops
!  ==> Fatal error occurred, no output PDF file produced!



I'd actually meant to have the A|B but the error message for my typo
seemed a bit harsh!

Having an empty math list to the right of a class 2  \Umiddle might be
odd but shouldn't be an error I assume?

it's a side effect of checking math spacing for noad pairs ... as we 
default to zero for (traditionally unknown) combinations anyway, we can 
make that check less stringent (so luatex 1.09+ will not abort)


Hans



Re: [luatex] pre_output_filter: How to get the vertical list of the full page?

2018-05-03 Thread Hans Hagen

On 5/3/2018 11:35 AM, Henri Menke wrote:

On 05/03/2018 07:13 PM, Hans Hagen wrote:

On 5/1/2018 11:50 PM, Henri Menke wrote:

On 02/05/18 03:17, Hans Hagen wrote:

On 5/1/2018 12:16 PM, Henri Menke wrote:

Dear list,

It seems as if the pre_output_filter only gives you the nodes in the
body block but without the \headline and \footline.  I would have
expected that those are included because \plainoutput performs

     \shipout\vbox{\makeheadline\pagebody\makefootline}

and I would have expected to get the vlist defined by the above \vbox
inside the callback.  Am I doing it wrong or am I using the wrong
callback?

you can for instance use vpack_filter


But how do I detect whether what is being packed is the main vertical
list?  According to the manual `groupcode` should be empty for the main
vertical list but that just never happens.  With the following MWE

  \directlua{
  callback.register("vpack_filter",
    function(head, groupcode)
      print('"' .. groupcode .. '"')
      return head
    end)
  }

  Hello World!

  \bye


buildpage_filter


Sorry for being such a pain in the neck :(

How can I access the material to be pushed from within this callback?
Box 255 (the shipout box) is nil.  The reason I am asking this is
because I want to implement visual debugging like in ConTeXt.  Therefore
I want to walk the main vertical list on shipout and sprinkle it with
whatits to draw rules for boxes, glues, kerns, etc.

\directlua{
callback.register("buildpage_filter",
   function() print(tex.box[255]) end)
}

Hello World!

just hook code into the shipout routine at the tex end: take the box, 
pass it to some lua code and walk over its content (there is no shipout 
callback, as it's not needed)


Hans
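
The suggestion above (hook into the shipout routine at the TeX end, take
the box, hand it to Lua, walk its content) could be sketched for plain
LuaTeX as follows. The names \pagebox and show_list are hypothetical, and
the \output definition is my own adaptation of \plainoutput, not code from
the thread.

```tex
\newbox\pagebox

\directlua{
  % TeX strips these percent comments before Lua sees the code.
  % Recursively log the node types found in a list.
  function show_list(head, depth)
    for n in node.traverse(head) do
      texio.write_nl(string.rep(". ", depth) .. node.type(n.id))
      if n.id == node.id("hlist") or n.id == node.id("vlist") then
        show_list(n.list, depth + 1)
      end
    end
  end
}

% Like \plainoutput, but the page is built in a box first, so that Lua
% can inspect headline, body and footline before the box is shipped out.
\output={%
  \setbox\pagebox\vbox{\makeheadline\pagebody\makefootline}%
  \directlua{show_list(tex.box[\number\pagebox].list, 0)}%
  \shipout\box\pagebox
  \advancepageno
  \ifnum\outputpenalty>-20000 \else\dosupereject\fi}

Hello World!
\bye
```

For visual debugging, the body of show_list is where rules or whatsits
would be inserted instead of writing the node types to the log.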




Re: [luatex] pre_output_filter: How to get the vertical list of the full page?

2018-05-03 Thread Hans Hagen

On 5/1/2018 11:50 PM, Henri Menke wrote:

On 02/05/18 03:17, Hans Hagen wrote:

On 5/1/2018 12:16 PM, Henri Menke wrote:

Dear list,

It seems as if the pre_output_filter only gives you the nodes in the
body block but without the \headline and \footline.  I would have
expected that those are included because \plainoutput performs

    \shipout\vbox{\makeheadline\pagebody\makefootline}

and I would have expected to get the vlist defined by the above \vbox
inside the callback.  Am I doing it wrong or am I using the wrong 
callback?

you can for instance use vpack_filter


But how do I detect whether what is being packed is the main vertical 
list?  According to the manual `groupcode` should be empty for the main 
vertical list but that just never happens.  With the following MWE


     \directlua{
     callback.register("vpack_filter",
       function(head, groupcode)
         print('"' .. groupcode .. '"')
         return head
       end)
     }

     Hello World!

     \bye


buildpage_filter


