Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf
Hi,

Follow-up to my answer from yesterday.

On Thursday, 20 April 2017, 18:56:30 CEST, Jeff wrote:
> Please refactor the patches so that the tests are updated in the same
> patch that affects the code that is tested.

Done, please see
https://github.com/marschap/gscan2pdf/tree/updates_to_1.8.0_cleaned

> Please also refactor your changes to pass perlcritic -p t/perlcriticrc

Ditto (as much as possible).

Best
Peter

-- 
Peter Marschall
pe...@adpm.de
Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf
Thanks for the revised patches.

Please refactor the patches so that the tests are updated in the same
patch that affects the code that is tested.

Please also refactor your changes to pass perlcritic -p t/perlcriticrc

I am unable to convince tesseract to output hocr with textangle values
(and therefore see the full value of the text rotation changes). Can you
give me a suitable test image? Which version of tesseract are you using?
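
For reference: hOCR records a rotated line as a `textangle` property inside the `title` attribute of the line element. A minimal Python sketch for checking whether a given hOCR file contains any such values could look like the following. The function name, class name, and sample markup are hypothetical illustrations and are not taken from gscan2pdf or from actual tesseract output.

```python
import re
from html.parser import HTMLParser


class TextAngleParser(HTMLParser):
    """Collect (element id, textangle) pairs from hOCR markup."""

    def __init__(self):
        super().__init__()
        self.angles = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        title = attrs.get("title") or ""
        # hOCR packs its properties into the title attribute, separated by ';'
        for prop in title.split(";"):
            match = re.match(r"\s*textangle\s+(-?\d+(?:\.\d+)?)", prop)
            if match:
                self.angles.append((attrs.get("id"), float(match.group(1))))


def find_textangles(hocr):
    """Return a list of (id, angle) for every element carrying a textangle."""
    parser = TextAngleParser()
    parser.feed(hocr)
    return parser.angles


# Hypothetical sample markup: one rotated line, one upright line.
SAMPLE = (
    "<div class='ocr_page' title='bbox 0 0 800 600'>"
    "<span class='ocr_line' id='line_1' "
    "title='bbox 10 20 40 300; textangle 90'>rotated</span>"
    "<span class='ocr_line' id='line_2' "
    "title='bbox 50 20 400 60'>upright</span>"
    "</div>"
)

print(find_textangles(SAMPLE))
```

An empty result for a scan that visibly contains rotated text would suggest that the OCR run did not emit `textangle` at all, which is the situation described above.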
Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf
Hi Jeff,

On Wednesday, 5 April 2017, 22:10:51 CEST, Jeff wrote:
> Thanks for your patches. I have committed everything up to "Canvas.pm:
> fix box offsets & text rotation" and pushed the commits, plus some of my
> own to the repo on Sourceforge.

Saw them in 1.8.0. Thanks!

> At the moment, there are no unit tests for Gscan2pdf::Canvas, and as I
> prefer to work test-driven, I would like a test which would fail without
> this patch.

I updated the remaining patches, updated existing unit tests that failed
with my changes and added some additional unit tests.
You can find them in my github branch
https://github.com/marschap/gscan2pdf/tree/updates_to_1.8.0

> I am not 100% sure what problem your patch fixes, so would you mind
> constructing such a unit test? I would then have no problem committing
> the patch.

Unfortunately I did not get around to writing a unit test for the
"Canvas.pm: fix box offsets & text rotation" patch, as it fixes
graphical glitches:
- in the OCR it draws bounding boxes of hOCR elements where they belong
- in the OCR tab it draws left-rotated text correctly rotated

Simply test with tesseract and rotated text, with and without my patches,
to see the difference.

If you have an idea how to check for such kind of issues in a unit test,

> The unit test would probably have to define a page with some boxes
> before calling Gscan2pdf::Canvas->new() followed by canvas2hocr() and
> check the hocr output.

See my additions to 17_Canvas.t

Hoping to get the patches included
Peter

-- 
Peter Marschall
pe...@adpm.de
Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf
Hi Peter,

Thanks for your patches. I have committed everything up to "Canvas.pm:
fix box offsets & text rotation" and pushed the commits, plus some of my
own to the repo on Sourceforge.

At the moment, there are no unit tests for Gscan2pdf::Canvas, and as I
prefer to work test-driven, I would like a test which would fail without
this patch.

I am not 100% sure what problem your patch fixes, so would you mind
constructing such a unit test? I would then have no problem committing
the patch.

The unit test would probably have to define a page with some boxes
before calling Gscan2pdf::Canvas->new() followed by canvas2hocr() and
check the hocr output.

Whilst you are at it, Gscan2pdf::Canvas has quite a few
ProhibitMagicNumbers Perl::Critic overrides. If you know what the
numbers mean, I would appreciate a patch replacing the numbers with
descriptive Readonly variables.

Thanks for your efforts

Regards

Jeff
Bug#858767: gscan2pdf: fixes & improvements to gscan2pdf
Package: gscan2pdf
Version: 1.7.3-1
Severity: normal
Tags: patch

Hi Jeffrey,

I have tried to fix some issues that I have with gscan2pdf in a series
of patches at github:
https://github.com/marschap/gscan2pdf/tree/updates_to_1.7.3

They comprise a series of fixes & improvements, such as:
- correct placement of bbox'es surrounding OCR'ed text
- correct orientation of OCR'ed text
- support for another option for unpaper: --no-mask-center
- new brightness & contrast tool (more flexible than the Threshold)
- new config setting to replace blanks with underscores when saving
- XDG-compliant config-file placement (incl. migration)
- fixed dealing with baselines
- wider borders & better aligned action buttons in dialogs
- ...

It would be cool if they (or at least a subset) made it into upstream
gscan2pdf.

Thanks for gscan2pdf
Peter

-- System Information:
Debian Release: 9.0
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages gscan2pdf depends on:
ii  imagemagick                            8:6.9.7.4+dfsg-2
ii  imagemagick-6.q16 [imagemagick]        8:6.9.7.4+dfsg-2
ii  libconfig-general-perl                 2.63-1
ii  libdate-calc-perl                      6.4-1
ii  libfilesys-df-perl                     0.92-6+b1
ii  libgoo-canvas-perl                     0.06-2+b3
ii  libgtk2-ex-simple-list-perl            0.50-2
ii  libgtk2-imageview-perl                 0.05-2+b3
ii  libhtml-parser-perl                    3.72-3
ii  libimage-magick-perl                   8:6.9.7.4+dfsg-2
ii  liblist-moreutils-perl                 0.416-1+b1
ii  liblocale-gettext-perl                 1.07-3+b1
ii  liblog-log4perl-perl                   1.48-1
ii  libossp-uuid-perl [libdata-uuid-perl]  1.6.2-1.5+b4
ii  libpdf-api2-perl                       2.030-1
ii  libproc-processtable-perl              0.53-2
ii  libreadonly-perl                       2.050-1
ii  librsvg2-common                        2.40.16-1+b1
ii  libsane-perl                           0.05-2+b4
ii  libset-intspan-perl                    1.19-1
ii  libtiff-tools                          4.0.7-5
ii  libtry-tiny-perl                       0.28-1
ii  sane-utils                             1.0.25-3pm1

Versions of packages gscan2pdf recommends:
ii  djvulibre-bin              3.5.27.1-7
ii  gocr                       0.49-2+b1
ii  libgtk2-ex-podviewer-perl  0.18-1
ii  sane                       1.0.14-12
ii  tesseract-ocr              3.04.01-5
ii  unpaper                    6.1-2+b1
ii  xdg-utils                  1.1.1-1

gscan2pdf suggests no packages.

-- no debconf information