Re: [Wikisource-l] Budget for Wikisource

2017-06-30 Thread Alex Brollo
Oops... I only *presume* that _djvu.xml is bugged; really, I only examined the whole text file (derived, I think, from the _djvu.xml file). I'll take a deeper look, also examining the searchable PDF. Alex 2017-06-30 12:20 GMT+02:00 Alex Brollo : > Take a look at this case:

Re: [Wikisource-l] Budget for Wikisource

2017-06-30 Thread Alex Brollo
Take a look at this case: https://archive.org/details/GiacomoRacioppiLAgiografiaDiSanLaverioDel1162Images Here the OCR (as you can see from the _djvu.xml file) seems severely bugged, and obviously the djvu file built by the IA Upload tool can't be better than its source. Please, Aubrey, keep notifying me of any case of

Re: [Wikisource-l] Budget for Wikisource

2017-06-30 Thread Andrea Zanni
Unfortunately, it happens only sometimes, and apparently it's not related to the Google cover page (at least, I removed a page from one book and it doesn't have the problem, while another book is indeed misaligned without removing the cover). Look at this:

Re: [Wikisource-l] Budget for Wikisource

2017-06-30 Thread Sam Wilson
This is indeed a bug! I can't replicate it, though. Does it happen for every book for you, or only sometimes? Do you know what is different about the ones that fail? Is it related to removing (or not removing) the Google cover page? I think I can find time this weekend to work on this. On Fri, 30 Jun

Re: [Wikisource-l] Budget for Wikisource

2017-06-30 Thread Andrea Zanni
Hello everyone, before talking about this again, let me say that I think we have a "major" bug in IA-upload: sometimes the OCR is not aligned with the pages, meaning you have the right OCR but it's shown for the following page... Aubrey On Thu, May 11, 2017 at 1:30 AM, Sam Wilson