Just a friendly reminder: I'm planning to cut the release today, about
10 hours from now.
Andreas
On 30.12.24 at 10:43, Andreas Lehmkühler wrote:
Hi,
IMHO it is time to cut another 2.0.x release.
I'm planning to do so two weeks from now.
Any objections?
Andreas
P.S.: I'd like to cut
On 12.01.25 at 13:58, sahy...@fileaffairs.de wrote:
On Sunday, 12.01.2025 at 13:24 +0100, Andreas Lehmkühler wrote:
On 08.01.25 at 04:56, Tilman Hausherr wrote:
On 07.01.2025 15:00, Tilman Hausherr wrote:
- mysterious: govdocs1/838/838013.pdf has "ion: 4 | name: 4 |
creatinga: 3 |
On Sunday, 12.01.2025 at 13:24 +0100, Andreas Lehmkühler wrote:
>
>
> On 08.01.25 at 04:56, Tilman Hausherr wrote:
> > On 07.01.2025 15:00, Tilman Hausherr wrote:
> > > - mysterious: govdocs1/838/838013.pdf has "ion: 4 | name: 4 |
> > > creatinga: 3 | ram: 3 | anand: 2 | jec: 2 | message:
On 08.01.25 at 04:56, Tilman Hausherr wrote:
On 07.01.2025 15:00, Tilman Hausherr wrote:
- mysterious: govdocs1/838/838013.pdf has "ion: 4 | name: 4 |
creatinga: 3 | ram: 3 | anand: 2 | jec: 2 | message: 2 | oc: 2 | ons:
2 | 0or: 1", "creatinga" and "anand" DO NOT APPEAR in ordinary text
e
On 07.01.2025 15:00, Tilman Hausherr wrote:
- mysterious: govdocs1/838/838013.pdf has "ion: 4 | name: 4 |
creatinga: 3 | ram: 3 | anand: 2 | jec: 2 | message: 2 | oc: 2 | ons:
2 | 0or: 1", "creatinga" and "anand" DO NOT APPEAR in ordinary text
extractions, not even with Tika from the command li
On 07.01.2025 14:10, Tilman Hausherr wrote:
latest:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.32_vs_2.0.33-6.tar.xz
So this is pretty good now. Here's what I found:
- superscript degradation ("1 coupled" becomes "1coupled"): annoying,
but should be solved separately some day with
latest:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.32_vs_2.0.33-6.tar.xz
On 06.01.2025 10:19, Tilman Hausherr wrote:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.32_vs_2.0.33-4.tar.xz
This is starting to look much better. I'm doing another "B" run that
includes the last change from PDFBOX-5920. It does not yet include the
revert from PDFBOX-5384.
ht
On 05.01.2025 18:17, Tilman Hausherr wrote:
The last commit also fixes the changes in 2 Italian files, e.g.
P5ZY75DGZMAP3VYLGTWRWTIOZN7IPXJO. However it turns out that their
text extraction was terrible before. The "La storia" part below the
blue bar in fine print is also difficult to read f
On 05.01.2025 18:17, Tilman Hausherr wrote:
The last commit also fixes the changes in 2 Italian files, e.g.
P5ZY75DGZMAP3VYLGTWRWTIOZN7IPXJO. However it turns out that their text
extraction was terrible before. The "La storia" part below the blue
bar in fine print is also difficult to read for
The last commit also fixes the changes in 2 Italian files, e.g.
P5ZY75DGZMAP3VYLGTWRWTIOZN7IPXJO. However it turns out that their text
extraction was terrible before. The "La storia" part below the blue bar
in fine print is also difficult to read for a human.
I'm gonna start another "B" run so
Hi,
thanks for running the tests.
I'm going to have a look as well.
Andreas
On 05.01.25 at 13:47, Tilman Hausherr wrote:
On 04.01.2025 21:26, Tilman Hausherr wrote:
After that, I'll do another "B" run but with ActualText disabled,
because this is responsible for most of the differences. S
On 04.01.2025 21:26, Tilman Hausherr wrote:
After that, I'll do another "B" run but with ActualText disabled,
because this is responsible for most of the differences. Some are
improvements, some are not.
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.32_vs_2.0.33-3.tar.xz
Currently
After that, I'll do another "B" run but with ActualText disabled,
because this is responsible for most of the differences. Some are
improvements, some are not.
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.32_vs_2.0.33-3.tar.xz
On 04.01.2025 14:27, Tilman Hausherr wrote:
On 04.01.2025 10:46, Tilman Hausherr wrote:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.32_vs_2.0.33.tar.xz
definitely some work to do; one exception in CMap and some content
differences / losses.
I'm currently doing another "B" run to
On 04.01.2025 10:46, Tilman Hausherr wrote:
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.32_vs_2.0.33.tar.xz
definitely some work to do; one exception in CMap and some content
differences / losses.
I'm currently doing another "B" run to be sure that there are no further
exceptions
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.32_vs_2.0.33.tar.xz
definitely some work to do; one exception in CMap and some content
differences / losses.
The 2.0.33 snapshot version is from 31.12, I forgot to build 2.0.
However, there have been no changes related to parsing or text extra
+1
I'm setting myself a reminder to start regression tests for 2.0 on 4.1.
Tilman
On 30.12.2024 10:43, Andreas Lehmkühler wrote:
Hi,
IMHO it is time to cut another 2.0.x release.
I'm planning to do so two weeks from now.
Any objections?
Andreas
P.S.: I'd like to cut the next 3.0.x release