Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-23 Thread Andreas Lehmkuehler
On 23.03.22 at 05:28, Tilman Hausherr wrote: I have created two issues on parsing exceptions, and it's not PDFBOX-5283. Maybe it's the same, maybe not. Re text extraction, I looked at one of the files (414724.pdf) and there's also a parsing warning, so maybe that is related too, so let's just

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-22 Thread Tilman Hausherr
I have created two issues on parsing exceptions, and it's not PDFBOX-5283. Maybe it's the same, maybe not. Re text extraction, I looked at one of the files (414724.pdf) and there's also a parsing warning, so maybe that is related too, so let's just wait. Tilman On 22.03.2022 at 18:21,

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-22 Thread Tilman Hausherr
I don't have much time right now, but I just tested 077867.pdf and 392443.pdf and it's definitely a regression. I wonder if it was PDFBOX-5283. The files in content_diffs_no_exceptions.xls where the T column is non-empty are suspicious and need more investigation. Tilman On 22.03.2022

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-22 Thread Tim Allison
Reports are here: https://corpora.tika.apache.org/base/reports/tika-2.3-vs-2.4-pdfs.tgz It looks like no significant changes. Some diffs on a few files, but this was run on ~800k PDFs. There are a couple of cases where a file is now being detected as rfc822 instead of PDF. We have to fix that

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-21 Thread Andreas Lehmkuehler
On 21.03.22 at 12:21, Tim Allison wrote: I'm happy to run the tests today if that would be of any interest. Yes, please. TIA Andreas On Sun, Mar 20, 2022 at 5:01 PM Andreas Lehmkuehler wrote: On 13.03.22 at 14:20, Tim Allison wrote: From Tika's perspective, there's no rush. We're

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-21 Thread Tim Allison
I'm happy to run the tests today if that would be of any interest. On Sun, Mar 20, 2022 at 5:01 PM Andreas Lehmkuehler wrote: > > On 13.03.22 at 14:20, Tim Allison wrote: > > From Tika's perspective, there's no rush. We're waiting for a bug fix > > in POI (TIKA-3699). > > > > Please let me

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-20 Thread Andreas Lehmkuehler
On 13.03.22 at 14:20, Tim Allison wrote: From Tika's perspective, there's no rush. We're waiting for a bug fix in POI (TIKA-3699). Please let me know if/when I should run the regression tests. Thanks for the offer. Do we need to run the tests before cutting the release? Most of the tickets

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-13 Thread Andreas Lehmkuehler
Due to a possible issue in ToUnicodeWriter.writeTo (see dev@) I'm going to postpone the release for a week. I'd like to have a look at the issue and the proposed solution first. IMHO we should solve that issue ASAP to ensure that PDFs created with PDFBox follow the specs. Andreas On 10.03.22

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-13 Thread Tim Allison
From Tika's perspective, there's no rush. We're waiting for a bug fix in POI (TIKA-3699). Please let me know if/when I should run the regression tests. Thank you, all! Cheers, Tim On Sat, Mar 12, 2022 at 5:29 AM Andreas Lehmkuehler wrote: > > On 11.03.22 at 08:30, Tilman

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-12 Thread Andreas Lehmkuehler
On 11.03.22 at 08:30, Tilman Hausherr wrote: On 11.03.2022 at 08:19, Andreas Lehmkuehler wrote: On 10.03.22 at 20:16, Tilman Hausherr wrote: I'd agree but that might mean PDFBOX-5384 wouldn't be fixed. It has been there for quite some time and it seems to be a rare corner case. IMHO it can wait

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-10 Thread Tilman Hausherr
On 11.03.2022 at 08:19, Andreas Lehmkuehler wrote: On 10.03.22 at 20:16, Tilman Hausherr wrote: I'd agree but that might mean PDFBOX-5384 wouldn't be fixed. It has been there for quite some time and it seems to be a rare corner case. IMHO it can wait if we don't find a solution before Monday.

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-10 Thread sahy...@fileaffairs.de
On Friday, 11.03.2022 at 08:19 +0100, Andreas Lehmkuehler wrote: > On 10.03.22 at 20:16, Tilman Hausherr wrote: > > I'd agree but that might mean PDFBOX-5384 wouldn't be fixed. > It has been there for quite some time and it seems to be a rare corner > case. IMHO it > can wait if we don't find a

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-10 Thread Andreas Lehmkuehler
On 10.03.22 at 20:16, Tilman Hausherr wrote: I'd agree but that might mean PDFBOX-5384 wouldn't be fixed. It has been there for quite some time and it seems to be a rare corner case. IMHO it can wait if we don't find a solution before Monday. WDYT? Andreas Tilman On 10.03.2022 at 19:05

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-10 Thread Tilman Hausherr
I'd agree but that might mean PDFBOX-5384 wouldn't be fixed. Tilman On 10.03.2022 at 19:05, Andreas Lehmkuehler wrote: On 09.03.22 at 17:07, Tim Allison wrote: All, I've been out of the office for a bit and haven't caught up yet. Apologies if I've missed the discussion. Are there plans

Re: 2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-10 Thread Andreas Lehmkuehler
On 09.03.22 at 17:07, Tim Allison wrote: All, I've been out of the office for a bit and haven't caught up yet. Apologies if I've missed the discussion. Are there plans for a 2.0.26 release? We're probably a few weeks out How about cutting the release next Monday? Andreas from starting

2.0.26 release? WAS: JBIG2 3.0.4 release?

2022-03-09 Thread Tim Allison
All, I've been out of the office for a bit and haven't caught up yet. Apologies if I've missed the discussion. Are there plans for a 2.0.26 release? We're probably a few weeks out from starting our next 1.x and 2.x releases on Tika, and it would be great to incorporate 2.0.26. No problem at

Re: JBIG2 3.0.4 release?

2022-03-04 Thread Tilman Hausherr
On 24.02.2022 at 07:41, Andreas Lehmkuehler wrote: On 22.02.22 at 07:49, Andreas Lehmkuehler wrote: Hi, I'm planning to cut a new JBIG2 release next week. There aren't that many changes, but I think the fixes are worth releasing. [1] I'm going to cut the release next weekend, if nobody

Re: JBIG2 3.0.4 release?

2022-02-23 Thread Andreas Lehmkuehler
On 22.02.22 at 07:49, Andreas Lehmkuehler wrote: Hi, I'm planning to cut a new JBIG2 release next week. There aren't that many changes, but I think the fixes are worth releasing. [1] I'm going to cut the release next weekend, if nobody objects. Once it is done we should think about a

Re: JBIG2 3.0.4 release?

2022-02-21 Thread Tilman Hausherr
+1 Tilman On 22.02.2022 at 07:49, Andreas Lehmkuehler wrote: Hi, I'm planning to cut a new JBIG2 release next week. There aren't that many changes, but I think the fixes are worth releasing. [1] WDYT? Andreas [1]

JBIG2 3.0.4 release?

2022-02-21 Thread Andreas Lehmkuehler
Hi, I'm planning to cut a new JBIG2 release next week. There aren't that many changes, but I think the fixes are worth releasing. [1] WDYT? Andreas [1]