On 11/30/18 9:45 PM, Hans Hagen wrote:
> On 11/30/2018 8:41 AM, Werner LEMBERG wrote:
>>
>> About a month ago I wrote:
>>
>>>> We have a new pdf parser (pplib from Paweł Jackowski) that replaces
>>>> poppler. It is much smaller, a bit faster and it's written in
>>>> pure C [...]
>>>
>>> Is there a project page for pplib? The source code of this library
>>> contained in TeXLive is very, very uncommented – in particular, a
>>> description of the API is completely missing, AFAICS. It also comes
>>> with overly long lines and extremely densely written C code; it
>>> almost feels as if the original source has been written with cweb or
>>> something like that.
>>
>> I would be glad if someone could answer my question.
>
> During bachotex 2018 Pawel Jackowski (son of Jacko -- tex gyre project)
> showed me some code and after looking at it we realized that it could be
> used as drop in for poppler.
>
> In luatex, the pdf library is actually not used that much: it can open
> a pdf file and traverse the object tree. It has no further role in the
> backend which copies and creates objects itself. So, a lightweight drop
> in basically was considered doable quite well. Pawel explicitly limited
> the functionality to a bare minimum: opening a file and traversing
> objects. (But it's quite advanced as for instance we can also access
> password protected files).
>
> So, basically it went this way: pawel wrote the code, I replaced the
> inclusion code and rewrote the pdf access library (so that one got a
> different interface but the old one was way more complex and even has
> issues; we're not compatible here). Then luigi spent quite some time on
> integrating the library in the luatex source tree.
>
> The final integration involved dealing with cross platform issues.
> Especially the arm platform with different alignment rules took some
> work (luigi and pawel sorted that out eventually).
> We had some feedback
> from context testers (it's also always debatable to what extent one
> should support fuzzy cases, bad documents etc).

That's definitely a sign of code smell. The code should not depend on the
memory layout of the platform (unless you write an operating system or a
compiler).

> There might still be corner cases to cover but we expect all to be ready
> in time for tex live 2019. The biggest advantage is that we got rid of a
> c++ dependency and that the code (which is unlikely to change much) is
> part of the luatex code base. So it's in fact a library specially made
> for luatex originating in the tex community.
>
> I hope that explains it a bit (there is not much more to tell i guess;
> normally this kind of progress gets reported in status articles),
>
> Hans
>
> -----------------------------------------------------------------
> Hans Hagen | PRAGMA ADE
> Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
> tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
> -----------------------------------------------------------------
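
To make the alignment remark above concrete, here is a minimal sketch of
how a parser can read multi-byte values from a byte buffer without relying
on platform alignment rules. It is not taken from pplib; the function names
(load_u32_host, load_u32_be, alignment_demo) are hypothetical and only
illustrate the general technique. Casting an unaligned `unsigned char *` to
`uint32_t *` and dereferencing it is undefined behaviour in C and may fault
on some ARM targets, while the two variants below are valid for any address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value in host byte order from any
   address. memcpy has no alignment requirement, and compilers usually
   turn it into a single load where the hardware allows that. */
static uint32_t load_u32_host(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: assemble a big-endian 32-bit value byte by byte.
   The result depends neither on alignment nor on host endianness. */
static uint32_t load_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Small self-check: both loads start at odd offsets inside their buffers,
   so the addresses are generally not 4-byte aligned. Returns 0 on success. */
int alignment_demo(void)
{
    unsigned char buf[8] = {0x00, 0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde};
    unsigned char raw[5];
    uint32_t v = 0xdeadbeefu;

    /* Bytes 1..4 of buf are 12 34 56 78. */
    assert(load_u32_be(buf + 1) == 0x12345678u);

    /* Store v at an odd offset, then read it back in host order. */
    memcpy(raw + 1, &v, sizeof v);
    assert(load_u32_host(raw + 1) == v);
    return 0;
}
```

The byte-wise variant is usually the better fit for file formats, since it
also fixes the byte order; the memcpy variant is enough when the data was
written by the same program on the same machine.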
