Sean and Rogério,

It's easiest for everyone if the difference between upstream and packages
is as small as possible, so I've been working on removing files that are
problematic for Debian.

In recent releases I have been removing all files that were previously in
Files-Excluded, except for:
- pikepdf:docs/images/save-pike.jpg - public domain image of a sign likely
  produced by a government agency in Ireland
- ocrmypdf:docs/logo/logo - as we previously discussed, the .svg is now the
  master version of the logo, and can be edited by open source tools.

In ocrmypdf, there are no new test resources since 9.8. I believe that the
patch that drops a test in tests/test_metadata.py can also be removed. The
test previously used a resource with problematic copyright status, which is
probably why the patch was added.

In pikepdf, there are a few synthetic files I generated, and
pikepdf:tests/resources/jbig2global.pdf is a PDF'd copy of
ocrmypdf:tests/resources/typewriter.png. disable-test_icc_extract.patch
<https://sources.debian.org/patches/pikepdf/1.13.0+dfsg-2/disable-test_icc_extract.patch/>
can also be dropped, since the resource that test used has been replaced
with an image I generated.

I updated debian/copyright in both projects at the HEAD revision (not a
tagged release). These files should reflect the current status.

I believe this means the updates shouldn't be too difficult, and also that
the -dfsg version tag could be dropped from both packages. (pikepdf is now
powerful enough that I can usually synthesize problematic constructs
instead of adding another test resource.)

James

On Sat, Jul 18, 2020 at 12:06 PM Sean Whitton <spwhit...@spwhitton.name>
wrote:
>
> Hello Rogério,
>
> On Mon 15 Jun 2020 at 09:13AM -03, Rogério Brito wrote:
>
> > A new major upstream version (10.0.1) of ocrmypdf was released a few
> > days ago and it is *so much faster* than the previous versions 8.x, 9.x,
> > especially during the (painful) initial step of "Scanning".
> >
> > I installed it via pip in a virtual environment and it works very well
> > and many hours of users will be saved if this new version is made
> > available for users of Debian in general.
>
> Thank you for letting me know about the speed improvements.
>
> The main thing blocking updating both pikepdf and ocrmypdf -- which I
> try to do together since upstream is the same -- is updating d/copyright
> for all the new test resources which are included.
>
> This often requires looking up licenses on commons.wikimedia.org, and
> adding new files to Files-Excluded:.
>
> Perhaps you would be interested in helping out?
>
> What you would need to do is something like `git diff --name-status
> --diff-filter=ADR v1.13.0..v1.17.2` (versions are for pikepdf) and then
> work on a patch to d/copyright.
>
> All the other parts of the packaging, including actually applying
> Files-Excluded:, I can deal with easily myself.
>
> --
> Sean Whitton
