Yeah, I think that’s possible.  Let me check.  I’m not sure if it would replace 
all of ImageMagick.  That might be asking too much 😊

From: Tim Allison <[email protected]>
Sent: Wednesday, January 13, 2021 8:35 AM
To: [email protected]
Subject: Re: Rotation script

Wait, that _is_ tess4j: an 8MB jar with bundled tessdata and win32 binaries.  Can 
we somehow liberate jdeskew so we don't package all of tess4j?

On Wed, Jan 13, 2021 at 7:15 AM Tim Allison 
<[email protected]> wrote:
Peter,
  If you have a chance, can you see if that tess4j module has the same 
functionality as what we're getting with ImageMagick?  It'd be great to knock 
out 2 external dependencies with native Java if possible.

   Cheers,

         Tim

On Wed, Jan 13, 2021 at 7:14 AM Tim Allison 
<[email protected]> wrote:
>But does the same thing apply in this case? It's not really using all of 
>tess4j. Just 1 package from it

Sorry, I should have checked on this exact point before responding.  If it 
isn't massive and has no native libraries, yes, let's go for it.  Let me look 
into it a bit today.

On Tue, Jan 12, 2021 at 10:42 PM Peter Kronenberg 
<[email protected]> wrote:
I sort of see your reasons for not using Tess4j to replace the current command 
line calls to Tesseract. But does the same thing apply in this case? It's not 
really using all of tess4j, just 1 package from it.


________________________________
From: Tim Allison <[email protected]>
Sent: Tuesday, January 12, 2021 9:11:58 PM
To: [email protected]
Subject: Re: Rotation script

I really like the idea of moving to pure Java for deskewing. We chose not to 
use tess4j earlier as a Java binding for tesseract because it requires native 
code....[1]

If we can do it with another call to tesseract from the command line or if 
there is a fairly lightweight pure Java, ASL 2.0 friendly image library that 
works well, that would be great.

[1]
https://issues.apache.org/jira/browse/TIKA-2293

On Tue, Jan 12, 2021 at 8:28 PM Peter Kronenberg 
<[email protected]> wrote:
I'd been meaning to ask why you calculate the rotation with a Python script.  
As far as I can tell, that is the only reason for the Python dependency, which 
adds a little (a lot?) more complexity to the whole project, plus who knows 
how much extra overhead for making the Python call. (Not to mention, it took 
me practically a whole day last week to get all the dependencies working on a 
Linux system in order to be able to run Rotation.py.)

But now, I have a more important reason to question this.  The Rotation script 
does not work very well.  I ran it on the attached files.  I started with the 
straight file and rotated it using IrfanView to produce the others (in the 
file names, 15 = 1.5 degrees, 25 = 2.5 degrees).
Rotation.py returns 0 for the 1 and 1.5 degree files, and it returns 1 for the 
2 degree file.  It also seems to always return an integer, or at least a value 
rounded to an integer.


Here is a simple Java routine which does the same thing, and it appears to be 
far more accurate.  It uses Tess4j:

<dependency>
  <groupId>net.sourceforge.tess4j</groupId>
  <artifactId>tess4j</artifactId>
  <version>4.5.4</version>
</dependency>


import com.recognition.software.jdeskew.ImageDeskew;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

public class GetAngle {

    public static void main(String[] args) throws IOException {
        BufferedImage bi = ImageIO.read(new File("c:\\Testfiles\\Dickens_skew25.png"));
        ImageDeskew id = new ImageDeskew(bi);
        double imageSkewAngle = id.getSkewAngle(); // determine skew angle
        System.out.println(imageSkewAngle);
    }
}

I've been poking around this code and might actually do the change we discussed 
about not doing the rotation when the angle is 0, as well as allowing rotation 
even if you're not doing the pre-processing.  I'd be glad to take a look at 
this as well if you think it's a worthwhile direction.
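
For illustration, the "skip the rotation when the angle is 0" idea could be 
sketched with JDK-only classes roughly as below. This is a sketch, not Tika's 
actual code: the class name DeskewSketch, the MIN_ANGLE_DEGREES threshold, and 
the white background fill are all hypothetical choices, and negating the angle 
assumes that the value reported by getSkewAngle() has to be inverted to 
straighten the page.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

public class DeskewSketch {

    // Hypothetical threshold: smaller angles are treated as "no skew".
    static final double MIN_ANGLE_DEGREES = 0.05;

    // True when the measured skew is large enough to be worth correcting.
    static boolean needsRotation(double skewAngleDegrees) {
        return Math.abs(skewAngleDegrees) >= MIN_ANGLE_DEGREES;
    }

    // Returns the source image untouched when no rotation is needed;
    // otherwise returns a new image rotated about its center.
    static BufferedImage deskew(BufferedImage src, double skewAngleDegrees) {
        if (!needsRotation(skewAngleDegrees)) {
            return src; // skip the rotation entirely
        }
        BufferedImage dst = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // Fill with white so the corners exposed by the rotation are not black.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, dst.getWidth(), dst.getHeight());
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            // Assumption: rotating by the negated skew angle straightens the page.
            g.rotate(Math.toRadians(-skewAngleDegrees),
                    src.getWidth() / 2.0, src.getHeight() / 2.0);
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

A caller would pass in the angle from whatever detector is used, for example 
DeskewSketch.deskew(bi, imageSkewAngle) with the variables from GetAngle above. 
Keeping the output the same size as the input clips the corners slightly, which 
seems acceptable for skews of a few degrees.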
