[ https://issues.apache.org/jira/browse/TIKA-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258822#comment-17258822 ]
Nick Burch commented on TIKA-3260:
----------------------------------
If we can make the script valid under both Python 2 and Python 3, that'd be
ideal! Otherwise, these days switching to Python 3 seems best to me.
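
As a rough illustration of the "valid under both" option, here is a minimal sketch of a replacement for the removed matplotlib {{rms_flat}} helper mentioned in the issue description. It assumes rotation.py only needs the root mean square of a flattened array and that numpy is available; the function name is kept only for familiarity, and the actual call sites in rotation.py are not shown here.

```python
# Sketch only: intended to run unchanged on Python 2.7 and Python 3.
from __future__ import division, print_function

import numpy as np


def rms_flat(values):
    """Return the root mean square of all elements of `values`, flattened.

    Stand-in for the helper that used to live in matplotlib.mlab.
    Casting to float avoids integer overflow on uint8 image data.
    """
    arr = np.asarray(values, dtype=float)
    return float(np.sqrt(np.mean(np.abs(arr) ** 2)))


if __name__ == "__main__":
    # A tiny 2x2 "image" whose pixels all have magnitude 1.
    print(rms_flat([[1, -1], [1, -1]]))
```

Defining the helper locally would drop the dependency on a particular matplotlib version for this one function, whichever Python ends up being used.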
Something else might want Python too (maybe one of the NLP bits?), so it might
be good to make the Python path a general setting rather than a
Tesseract-specific one.
ImageMagick might be able to give info on a few formats that Tika can't, so
there's something to be said for configuring its path generally too, so that a
future external parser can reuse it, as we do for ffmpeg and video.
(This would probably need a medium-sized bit of work to have a way to tell Tika
a number of different external programs you want, including Tesseract and the
current external ffmpeg/exiftool stuff, and give warnings if they aren't found,
so maybe a future 2.1 thing for a keen newbie!)
On most Linux distros you just run {{convert}}, not {{ImageMagick convert}}, so
that may be the source of some confusion about the executable name.
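
To make the "tell Tika which external programs you want and warn if they aren't found" idea a bit more concrete, here is a small Python 3 sketch of the lookup logic. It is not Tika code or a proposed Tika API: the tool table, the function name and the {{search_path}} parameter are all hypothetical, and a real implementation would live on the Java side and read its list from Tika's configuration.

```python
import shutil
import warnings

# Hypothetical table: logical tool name -> executable names to try, in order.
# ImageMagick is usually "convert" on Linux; ImageMagick 7 also ships "magick".
EXTERNAL_TOOLS = {
    "tesseract": ["tesseract"],
    "ffmpeg": ["ffmpeg"],
    "exiftool": ["exiftool"],
    "imagemagick": ["convert", "magick"],
    "python": ["python3", "python"],
}


def locate_tools(tools=EXTERNAL_TOOLS, search_path=None):
    """Map each logical tool name to the first executable found, or None.

    `search_path` is an optional PATH-style string; None means use the
    PATH of the current environment. A warning is emitted for every tool
    that cannot be found under any of its candidate names.
    """
    found = {}
    for name, candidates in tools.items():
        for exe in candidates:
            path = shutil.which(exe, path=search_path)
            if path:
                found[name] = path
                break
        else:
            # No candidate matched: record the miss and tell the user.
            found[name] = None
            warnings.warn(
                "external program not found: %s (tried %s)"
                % (name, ", ".join(candidates))
            )
    return found


if __name__ == "__main__":
    for tool, location in sorted(locate_tools().items()):
        print("%-12s %s" % (tool, location or "NOT FOUND"))
```

Trying several executable names per tool is one way to cope with the {{convert}} naming confusion, and a single table keeps the path handling general rather than Tesseract-specific.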
> Update rotation.py to work with python3 and a more modern matplotlib
> --------------------------------------------------------------------
>
> Key: TIKA-3260
> URL: https://issues.apache.org/jira/browse/TIKA-3260
> Project: Tika
> Issue Type: Improvement
> Reporter: Tim Allison
> Priority: Major
> Attachments: apache-tika-8408777197187584954.png,
> skewed5_image_text.png
>
>
> When I tried to work with rotation.py, I found that we should allow python to
> be python3 (not require an alias), and I found that rms_flat (once
> deprecated) has actually been removed in recent versions of matplotlib.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)