Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf

2017-05-07 Thread Peter Marschall
Hi,

Follow-up to my answer from yesterday.

On Thursday, 20 April 2017, 18:56:30 CEST, Jeff wrote:
> Please refactor the patches so that the tests are updated in the same
> patch that affects the code that is tested.
Done, please see 
https://github.com/marschap/gscan2pdf/tree/updates_to_1.8.0_cleaned

> Please also refactor your changes to pass perlcritic -p t/perlcriticrc
Ditto (as far as possible)

Best
Peter

-- 
Peter Marschall
pe...@adpm.de



Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf

2017-04-20 Thread Jeff
Thanks for the revised patches.

Please refactor the patches so that the tests are updated in the same
patch that affects the code that is tested.

Please also refactor your changes to pass perlcritic -p t/perlcriticrc

I am unable to convince tesseract to output hocr with textangle values
(and therefore see the full value of the text rotation changes). Can you
give me a suitable test image?

Which version of tesseract are you using?





Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf

2017-04-16 Thread Peter Marschall
Hi Jeff,

On Wednesday, 5 April 2017, 22:10:51 CEST, Jeff wrote:
> Thanks for your patches. I have committed everything up to "Canvas.pm:
> fix box offsets & text rotation" and pushed the commits, plus some of my
> own to the repo on Sourceforge.
Saw them in 1.8.0.
Thanks!

> At the moment, there are no unit tests for Gscan2pdf::Canvas, and as I
> prefer to work test-driven, I would like a test which would fail without
> this patch.
I updated the remaining patches, fixed the existing unit tests that
failed with my changes, and added some additional unit tests.

You can find them in my github branch
https://github.com/marschap/gscan2pdf/tree/updates_to_1.8.0

> I am not 100% sure what problem your patch fixes, so would you mind
> constructing such a unit test? I would then have no problem committing
> the patch.
Unfortunately I did not get around to writing a unit test for the patch
"Canvas.pm: fix box offsets & text rotation", as it fixes graphical glitches:
- in the OCR it draws bounding boxes of hOCR elements where they belong
- in the OCR tab it draws left-rotated text correctly rotated

Simply test with tesseract and rotated text with and without my patches to see 
the difference.

If you have an idea how to check for this kind of issue in a unit test,
please let me know.

> The unit test would probably have to define a page with some boxes
> before calling Gscan2pdf::Canvas->new() followed by canvas2hocr() and
> check the hocr output.
See my additions to 17_Canvas.t

Hoping to get the patches included
Peter

-- 
Peter Marschall
pe...@adpm.de



Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf

2017-04-05 Thread Jeff
Hi Peter,

Thanks for your patches. I have committed everything up to "Canvas.pm:
fix box offsets & text rotation" and pushed the commits, plus some of my
own to the repo on Sourceforge.

At the moment, there are no unit tests for Gscan2pdf::Canvas, and as I
prefer to work test-driven, I would like a test which would fail without
this patch.

I am not 100% sure what problem your patch fixes, so would you mind
constructing such a unit test? I would then have no problem committing
the patch.

The unit test would probably have to define a page with some boxes
before calling Gscan2pdf::Canvas->new() followed by canvas2hocr() and
check the hocr output.
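
A minimal sketch of the kind of check described above. It only shows how
the bbox and textangle properties of an hOCR title attribute could be
compared in a test; the helper parse_title() and the sample hOCR fragment
are hypothetical, and in a real test the string would be whatever
canvas2hocr() returns.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper: extract bbox and textangle from an hOCR title attribute.
sub parse_title {
    my ($title) = @_;
    my %prop;
    if ( $title =~ /\bbbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/xsm ) {
        $prop{bbox} = [ $1, $2, $3, $4 ];
    }
    if ( $title =~ /\btextangle\s+(-?\d+(?:[.]\d+)?)/xsm ) {
        $prop{textangle} = $1;
    }
    return \%prop;
}

# Illustrative fragment only; assumed to stand in for the canvas2hocr() output.
my $hocr =
  q{<span class='ocr_line' title='bbox 10 20 110 45; textangle 90'>text</span>};
my ($title) = $hocr =~ /title='([^']*)'/xsm;
my $prop = parse_title($title);

is_deeply( $prop->{bbox}, [ 10, 20, 110, 45 ], 'bounding box preserved' );
is( $prop->{textangle}, 90, 'text rotation preserved' );
```

A test along these lines would fail if the box offsets or the rotation
were lost between building the canvas and writing the hOCR.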

Whilst you are at it, Gscan2pdf::Canvas has quite a few
ProhibitMagicNumbers Perl::Critic overrides. If you know what the
numbers mean, I would appreciate a patch replacing the numbers with
descriptive Readonly variables.
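
As an illustration of that request, a sketch of replacing a magic number
with a Readonly variable. The names $FULL_CIRCLE and normalise_angle() are
hypothetical and not taken from Gscan2pdf::Canvas.

```perl
use strict;
use warnings;
use Readonly;

# Hypothetical constant; the real names and values depend on Canvas.pm.
Readonly my $FULL_CIRCLE => 360;    # degrees in a full rotation

# Before: return $angle % 360;    (trips ProhibitMagicNumbers)
# After: the number is named once and reused.
sub normalise_angle {
    my ($angle) = @_;
    return $angle % $FULL_CIRCLE;
}

print normalise_angle(450), "\n";    # prints 90
```

Numbers that appear in a Readonly declaration are exempt from
ProhibitMagicNumbers, so the override can then be dropped.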

Thanks for your efforts

Regards

Jeff







Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf

2017-03-26 Thread Peter Marschall
Package: gscan2pdf
Version: 1.7.3-1
Severity: normal
Tags: patch

Hi Jeffrey,

I have tried to fix some issues that I have with gscan2pdf
in a series of patches at github:
  https://github.com/marschap/gscan2pdf/tree/updates_to_1.7.3

They comprise a series of fixes & improvements, e.g.:
- correct placement of bbox'es surrounding OCR'ed text
- correct orientation of OCR'ed text
- support for another option for unpaper: --no-mask-center
- new brightness & contrast tool (more flexible than the Threshold)
- new config setting to replace blanks with underscores when saving
- XDG-compliant config-file placement (incl. migration)
- fixed dealing with baselines
- wider borders & better aligned action buttons in dialogs
- ...

It would be cool if they (or at least a subset) made it into
upstream gscan2pdf.

Thanks for gscan2pdf
Peter


-- System Information:
Debian Release: 9.0
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages gscan2pdf depends on:
ii  imagemagick                            8:6.9.7.4+dfsg-2
ii  imagemagick-6.q16 [imagemagick]        8:6.9.7.4+dfsg-2
ii  libconfig-general-perl                 2.63-1
ii  libdate-calc-perl                      6.4-1
ii  libfilesys-df-perl                     0.92-6+b1
ii  libgoo-canvas-perl                     0.06-2+b3
ii  libgtk2-ex-simple-list-perl            0.50-2
ii  libgtk2-imageview-perl                 0.05-2+b3
ii  libhtml-parser-perl                    3.72-3
ii  libimage-magick-perl                   8:6.9.7.4+dfsg-2
ii  liblist-moreutils-perl                 0.416-1+b1
ii  liblocale-gettext-perl                 1.07-3+b1
ii  liblog-log4perl-perl                   1.48-1
ii  libossp-uuid-perl [libdata-uuid-perl]  1.6.2-1.5+b4
ii  libpdf-api2-perl                       2.030-1
ii  libproc-processtable-perl              0.53-2
ii  libreadonly-perl                       2.050-1
ii  librsvg2-common                        2.40.16-1+b1
ii  libsane-perl                           0.05-2+b4
ii  libset-intspan-perl                    1.19-1
ii  libtiff-tools                          4.0.7-5
ii  libtry-tiny-perl                       0.28-1
ii  sane-utils                             1.0.25-3pm1

Versions of packages gscan2pdf recommends:
ii  djvulibre-bin              3.5.27.1-7
ii  gocr                       0.49-2+b1
ii  libgtk2-ex-podviewer-perl  0.18-1
ii  sane                       1.0.14-12
ii  tesseract-ocr              3.04.01-5
ii  unpaper                    6.1-2+b1
ii  xdg-utils                  1.1.1-1

gscan2pdf suggests no packages.

-- no debconf information