On Tue, 5 Jun 2018, John Jason Jordan wrote:
> For subtitles in .idx/.sub format (bitmaps) I normally use vobsub2srt,
> which uses Tesseract to convert them to .srt format (text). However,
> suddenly I have .idx/.sub files for Hebrew, and vobsub2srt pukes them
> up, bitching that Tesseract can't grok them:
>
>   Error opening data file /usr/share/tesseract-ocr/tessdata/heb.traineddata
>   Please make sure the TESSDATA_PREFIX environment variable is set to
>   the parent directory of your "tessdata" directory.
>   Failed loading language 'heb'
>   Tesseract couldn't load any languages!
>   Failed to initialize tesseract (OCR).
>
> Since I don't speak Hebrew I can't grok them either, which makes
> solving this problem difficult. However, I do know that Hebrew is
> written with an alphabet, so converting the .idx/.sub files to .srt
> must be possible.

Do you have the tesseract-ocr-heb package (or its equivalent for your
distribution) installed?
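
If it helps, here is a rough sketch of one way to check whether the
Hebrew data file is where Tesseract is looking. The path is the one from
the error message quoted above. The helper names (find_traineddata,
candidate_dirs) are made up for illustration, and the TESSDATA_PREFIX
handling is an assumption: the error says it should be the parent of
"tessdata", but some Tesseract versions treat it as the tessdata
directory itself, so the sketch tries both.

```python
import os
from pathlib import Path


def find_traineddata(lang, search_dirs):
    """Return the first existing <lang>.traineddata path, or None."""
    for d in search_dirs:
        candidate = Path(d) / f"{lang}.traineddata"
        if candidate.is_file():
            return candidate
    return None


def candidate_dirs():
    """Directories worth checking (hypothetical helper, not a Tesseract API)."""
    dirs = []
    prefix = os.environ.get("TESSDATA_PREFIX")
    if prefix:
        # Assumption: depending on the Tesseract version, the variable
        # names either the parent of tessdata/ or tessdata/ itself.
        dirs.append(Path(prefix) / "tessdata")
        dirs.append(Path(prefix))
    # Directory named in the error message from the original post.
    dirs.append(Path("/usr/share/tesseract-ocr/tessdata"))
    return dirs


if __name__ == "__main__":
    hit = find_traineddata("heb", candidate_dirs())
    if hit:
        print(f"found {hit}")
    else:
        print("heb.traineddata not found; the Hebrew language package "
              "is probably not installed")
```

If the file turns up missing, installing the language package should put
it in place. Running "tesseract --list-langs" afterwards is another way
to confirm that heb is available.
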
--
Paul Heinlein
45°38' N, 122°6' W