I have output to hOCR and TSV, but I still get all the text without hard
returns or any separation between paragraphs.
Is there an hOCR tool that can export to Microsoft Word?
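One possible way to recover paragraph breaks is to use the `par_num` column of Tesseract's TSV output (for example from `tesseract page.png out tsv`, where `page.png` and `out` are placeholder names). The sketch below is only an illustration, not a tool from this thread: the function name `tsv_to_paragraphs` is made up, and it assumes the standard TSV header (`level`, `page_num`, `block_num`, `par_num`, `line_num`, `word_num`, ..., `text`), with word rows at level 5.

```python
import csv
import io


def tsv_to_paragraphs(tsv_text):
    """Group the words of a Tesseract TSV dump into one string per paragraph.

    Hypothetical helper: assumes the standard Tesseract TSV columns.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # OCR text may contain quote characters
    )
    paragraphs = {}  # dicts keep insertion order, so reading order is preserved
    for row in reader:
        if row.get("level") != "5":  # level 5 rows are individual words
            continue
        word = (row.get("text") or "").strip()
        if not word:
            continue
        # A paragraph is identified by page, block, and paragraph number.
        par_key = (row["page_num"], row["block_num"], row["par_num"])
        lines = paragraphs.setdefault(par_key, {})
        lines.setdefault(row["line_num"], []).append(word)
    # Join the words of each line, then the lines of each paragraph.
    return [
        " ".join(" ".join(words) for words in lines.values())
        for lines in paragraphs.values()
    ]
```

Joining the returned list with blank lines (`"\n\n".join(...)`) gives plain text with visible paragraph separation, which can then be pasted into a word processor. How well this works depends on how accurately Tesseract segmented the page.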
The original document is in PDF format. It's actually an official document.
First, I ran ImageMagick and got a
Thanks @Shreeshrii
So do the following commands both recognize Arabic/English text?
tesseract AE.jpg AE1 -l ara+eng
tesseract AE.jpg AE2 -l script/Arabic
On Thursday, 19 March 2020 at 6:42:19 PM UTC+2, Essam Zaky wrote:
>
> Hi Dears
>
> What is the difference between script *.traineddata and normal
>
Yes, and the results of the two commands could be different.
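To see how the two results differ in practice, the text outputs can be diffed. This is a rough sketch only: the helper names `tesseract_cmd` and `diff_texts` are invented for illustration, and it assumes the two commands quoted in this thread have already written `AE1.txt` and `AE2.txt`.

```python
import difflib


def tesseract_cmd(image, outbase, lang):
    """Build the argument list for a command such as the ones in this thread."""
    return ["tesseract", image, outbase, "-l", lang]


def diff_texts(text_a, text_b, name_a="AE1.txt", name_b="AE2.txt"):
    """Return unified-diff lines between two OCR outputs (empty if identical)."""
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


# Example use, assuming both output files already exist:
#   from pathlib import Path
#   a = Path("AE1.txt").read_text(encoding="utf-8")
#   b = Path("AE2.txt").read_text(encoding="utf-8")
#   print("\n".join(diff_texts(a, b)))
```

An empty diff means both models produced the same text for that image; any `-`/`+` lines show where `ara+eng` and `script/Arabic` disagree.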
On Fri, Mar 20, 2020, 17:43 Essam Zaky wrote:
> Thanks @Shreeshrii
>
> So the following commands recognize Arabic/English text
> tesseract AE.jpg AE1 -l ara+eng
> tesseract AE.jpg AE2 -l script/Arabic
>
>
>
> On Thursday, 19 March,
Take a look at gImageReader, which uses Tesseract. It has the options you
are looking for.
On Fri, Mar 20, 2020, 17:55 Dayton wrote:
> I have output to hOCR and TSV, but I still get all the text without hard
> returns or any separation between paragraphs.
>
> Is there an HOCR tool which allows
Dear all,
I need a fast way to detect the regions of a page without knowing the
text inside them.
I want to use Tesseract from the command line.
How can I do this?
Which config values and parameters are useful for layout analysis?
The task should be fast; for this reason I want to
Thanks, shree. I'll have a look at gImageReader. It looks promising.
On Friday, 20 March 2020, 13:27:22 (UTC+1), shree wrote:
> Take a look at gImageReader, which uses Tesseract. It has the options you
> are looking for.
>
> On Fri, Mar 20, 2020, 17:55 Dayton wrote:
>
>> I