Y, jdeskew only imports java.awt... no other dependencies.  We can
copy/paste that source into our codebase and remove rotation.py.

On Wed, Jan 13, 2021 at 8:35 AM Tim Allison <[email protected]> wrote:

> Wait, that _is_ tess4j, 8MB jar with bundled tessdata and win32 binaries.
> Can we somehow liberate jdeskew so we don't package all of tess4j?
>
> On Wed, Jan 13, 2021 at 7:15 AM Tim Allison <[email protected]> wrote:
>
>> Peter,
>>   If you have a chance, can you see if that tess4j module has the same
>> functionality as what we're getting with ImageMagick?  It'd be great to
>> knock out 2 external dependencies with native Java if possible.
>>
>>    Cheers,
>>
>>          Tim
>>
>> On Wed, Jan 13, 2021 at 7:14 AM Tim Allison <[email protected]> wrote:
>>
>>> >But does the same thing apply in this case? It's not really using all
>>> of tess4j. Just 1 package from it
>>>
>>> Sorry, I should have checked on this exact point before responding.  If
>>> it isn't massive and has no native libraries, y, let's go for it.  Let me
>>> look into it a bit today.
>>>
>>> On Tue, Jan 12, 2021 at 10:42 PM Peter Kronenberg <
>>> [email protected]> wrote:
>>>
>>>> I sort of see your reasons for not using Tess4j to replace the current
>>>> command line calls to Tesseract. But does the same thing apply in this
>>>> case? It's not really using all of tess4j. Just 1 package from it.
>>>>
>>>>
>>>> ------------------------------
>>>> *From:* Tim Allison <[email protected]>
>>>> *Sent:* Tuesday, January 12, 2021 9:11:58 PM
>>>> *To:* [email protected] <[email protected]>
>>>> *Subject:* Re: Rotation script
>>>>
>>>> I really like the idea of moving to pure Java for deskewing. We chose
>>>> not to use tess4j earlier as a Java binding for tesseract because it
>>>> requires native code....[1]
>>>>
>>>> If we can do it with another call to tesseract from the command line or
>>>> if there is a fairly lightweight pure Java, ASL 2.0 friendly image library
>>>> that works well, that would be great.
>>>>
>>>> [1]
>>>> https://issues.apache.org/jira/browse/TIKA-2293
>>>>
>>>> On Tue, Jan 12, 2021 at 8:28 PM Peter Kronenberg <
>>>> [email protected]> wrote:
>>>>
>>>> I'd been meaning to ask why you calculate the rotation with a Python
>>>> script.  As far as I can tell, that is the only reason for the Python
>>>> dependency, which adds a little (lot?) more complexity to the whole
>>>> project, plus who knows how much extra overhead for each Python call.
>>>> (Not to mention, it took me practically a whole day last week to get all
>>>> the dependencies working on a Linux system in order to be able to run
>>>> Rotation.py.)
>>>>
>>>> But now, I have a more important reason to question this.  The Rotation
>>>> script does not work very well.  I ran it on the attached files.  I started
>>>> with the straight file and rotated it using Irfanview (15 = 1.5 degrees,
>>>> 25 = 2.5 degrees).
>>>> Rotation.py returns 0 for the 1 and 1.5 degree files, and it returns 1
>>>> for the 2 degree file.  It also seems to always return an integer, or at
>>>> least a value rounded to an integer.
>>>>
>>>> Here is a simple Java routine which does the same thing and it appears
>>>> to be far more accurate.  It uses Tess4j:
>>>>
>>>> <dependency>
>>>>   <groupId>net.sourceforge.tess4j</groupId>
>>>>   <artifactId>tess4j</artifactId>
>>>>   <version>4.5.4</version>
>>>> </dependency>
>>>>
>>>>
>>>> import com.recognition.software.jdeskew.ImageDeskew;
>>>>
>>>> import javax.imageio.ImageIO;
>>>> import java.awt.image.BufferedImage;
>>>> import java.io.File;
>>>> import java.io.IOException;
>>>>
>>>> public class GetAngle {
>>>>
>>>>     public static void main(String[] args) throws IOException {
>>>>         BufferedImage bi = ImageIO.read(
>>>>                 new File("c:\\Testfiles\\Dickens_skew25.png"));
>>>>         ImageDeskew id = new ImageDeskew(bi);
>>>>         double imageSkewAngle = id.getSkewAngle(); // determine skew angle
>>>>         System.out.println(imageSkewAngle);
>>>>     }
>>>> }
>>>>
>>>> I've been poking around this code and might actually do the change we
>>>> discussed about not doing the rotation when the angle is 0, as well as
>>>> allowing rotation even if you're not doing the pre-processing.  I'd be glad
>>>> to take a look at this as well if you think it's a worthwhile direction.
>>>>
>>>>
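For reference, the change discussed above (not rotating when the detected
angle is 0) could look roughly like the sketch below. It is only an
illustration using java.awt, not code from Tika or jdeskew. The class name
DeskewSketch, the method names, and the 0.05 degree threshold are all
hypothetical, and the sign convention (rotating by the negative of the
detected angle) is an assumption that would need to be checked against
whatever getSkewAngle() actually returns.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

final class DeskewSketch {

    // Hypothetical cutoff: angles smaller than this are treated as "no skew".
    static final double MIN_ANGLE_DEGREES = 0.05;

    // True when the detected skew is large enough to be worth correcting.
    static boolean needsRotation(double skewAngleDegrees) {
        return Math.abs(skewAngleDegrees) >= MIN_ANGLE_DEGREES;
    }

    // Returns the source image untouched when the skew is negligible,
    // otherwise a new image rotated about its center.
    static BufferedImage deskew(BufferedImage src, double skewAngleDegrees) {
        if (!needsRotation(skewAngleDegrees)) {
            return src; // skip the rotation entirely
        }
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // Fill with white so the exposed corners do not come out black.
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, w, h);
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            // Assumption: undo the skew by rotating in the opposite direction.
            g.rotate(Math.toRadians(-skewAngleDegrees), w / 2.0, h / 2.0);
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

The angle passed in would come from something like id.getSkewAngle() in the
routine quoted above. Returning the original image for a near-zero angle
avoids both the rotation cost and the slight blur that resampling adds.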
