My goal is to recognize Arabic (with numbers and punctuation) plus English.
Some English lines contain Arabic words,
and some Arabic lines contain English words.
I did some page layout analysis, split the text into lines, and am trying to
detect the language of each word based on word geometry
The issue with Arabic is related to RTL processing and how punctuation and
digits are handled. If your training text does not have them, you will have
greater success.
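One way to act on this is to check which lines of the training text contain digits or punctuation before training. The following is only a minimal sketch using Python's standard unicodedata module; the function names has_digits_or_punctuation and split_training_lines are hypothetical helpers for illustration and are not part of Tesseract or tesstrain.

```python
import unicodedata


def has_digits_or_punctuation(line: str) -> bool:
    """Return True if the line contains any digit or punctuation character."""
    for ch in line:
        category = unicodedata.category(ch)
        # "Nd" is a decimal digit (ASCII and Arabic-Indic digits alike);
        # categories starting with "P" are punctuation (includes "،" and "؟").
        if category == "Nd" or category.startswith("P"):
            return True
    return False


def split_training_lines(lines):
    """Split lines into (plain, mixed): without and with digits/punctuation."""
    plain, mixed = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        if has_digits_or_punctuation(text):
            mixed.append(text)
        else:
            plain.append(text)
    return plain, mixed
```

Calling split_training_lines on the lines of a training text file gives a rough idea of how much of the text mixes digits or punctuation into RTL lines, which may help decide what to keep for a first experiment.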
On Wed, Mar 25, 2020, 15:32 Essam Zaky wrote:
> Thanks @Lorenzo and @Shree
> I will give fine-tuning a try, and if the result
Thanks @Lorenzo and @Shree
I will give fine-tuning a try, and if the result is still not satisfactory I will
switch back to building from scratch.
On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote:
>
> Hi all,
>
> I would like to build *.traineddata from scratch specifically for English and
>
I think fine-tuning may work very well in this case, so there is no need to train from
scratch. Training from scratch does not guarantee better results,
especially if you don't do it correctly.
I suggest trying fine-tuning first to see whether the results are good enough
for you. In this way you get comfortable
@Lorenzo
I need to do that because the accuracy of the current Arabic model is not as
good as English, and I have a lot of fonts that need to be added to the Arabic model.
Adding them by fine-tuning will affect the model, so I need to build from
scratch and make the model more generalized.
So I need to know what is
AFAIK Ray is involved in other projects at Google. Unlikely to get a reply
from him.
See https://github.com/tesseract-ocr/tesstrain/wiki for training done
by @stweil on a similar scale for Fraktur. The pages list the hardware
requirements, time taken, etc.
Please check that you have enough
Thanks @shreeshrii
Could you answer the questions based on your experience?
Also, is it possible to get help from Ray?
On Tuesday, March 24, 2020 at 10:05:03 PM UTC+2, Essam Zaky wrote:
>
> Hi all,
>
> I would like to build *.traineddata from scratch specifically for English and
> Arabic
>
> So