Have you looked through the archives to check for the people working on
Farsi? They would have a good idea how to solve this problem.
Arsalan Ghasrsaz ghasr...@googlemail.com
https://github.com/reza1615/PersianOcr
--Sven
On Sat, Jan 19, 2013 at 7:31 AM, gold snake huangjin...@gmail.com wrote:
My training failed; the final result looks very bad. Maybe that is because I
don't know how to handle the same character in different positions.
What you see looks like this: م , ئما , تىم , مور
What I actually write looks like this: م , ئما , تىم , مور
Can you see the one character that looks like an O? It's the same character, but
If I find a way to create a cube solution for my language, I must use it. Thanks
anyway. That result is important
On Friday, January 18, 2013 at 6:41:28 AM UTC+8, Patrick Questembert wrote:
Yes, cube remains a mystery for the common mortals ... I am experimenting
with it within ScanBizCards and here are my findings so far
Hi Tesseract folks (it's nice to be back),
On Wed, Jan 16, 2013 at 01:34:25PM -0600, Sven Pedersen wrote:
Cube means combining different languages.
Really? I don't think this is correct. Certainly using eng+grc works
to combine English and Ancient Greek recognition, despite grc having
no cube
OK, the fact that cube is something different than combining languages is a
major revelation to me. However, huangjingshe, I don't think you need the
cube feature for what you're doing. I believe the problem you're having is
something else. I would solve the other issues first and then maybe try
Arabic and English scripts are very different in some ways.
In an English font, if you input a+b, the result is: ab
But in an Arabic font, if you input ئ+ا, the result is ئا. If that is not
clear, copy ئا and add a space in the middle. You will find that when you
input 2 different characters, the result is a new shape
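A minimal sketch of the point made above, using only Python's standard `unicodedata` module: the joined pair is still two separate code points, and the positional shapes of one letter all fold back to a single base letter. The helper name `code_points` is made up for this illustration, and whether this has any bearing on how the training text should be prepared is an assumption, not something stated in this thread.

```python
import unicodedata

def code_points(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# The pair from the message above: two code points, drawn joined by the font.
pair = "\u0626\u0627"  # ئ followed by ا
for cp, name in code_points(pair):
    print(cp, name)

# Unicode also has legacy "presentation form" code points, one per positional
# shape of a letter. Compatibility normalization (NFKC) folds them back to the
# single base letter, here MEEM (U+0645).
meem_forms = ["\uFEE1", "\uFEE2", "\uFEE3", "\uFEE4"]  # isolated, final, initial, medial
for form in meem_forms:
    base = unicodedata.normalize("NFKC", form)
    print(unicodedata.name(form), "->", f"U+{ord(base):04X}")
```

The text itself stores one character per letter; the different shapes come from the font at rendering time.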
Yes, glyph handling and combining is important -- if you search the
archives you'll see how people have dealt with it for Asian languages --
mainly Indian / Indic scripts. You need to specify the component parts in
your training. I sent you 2 links about the right to left support (RTL) in
Regarding cube:
- there is no more public information about cube than the 92 hits at
the forum I mentioned already (+ source code ;-))
- there is no information on how to create cube data files (OK, some of
them are text files...)
So you can:
1. try to use/train tesseract without
Really ;-)? I got 93 results. E.g.:
https://groups.google.com/forum/#!msg/tesseract-ocr/0msQtTB_XrI/D1noel9GpPgJ
https://groups.google.com/d/topic/tesseract-ocr/tyV5_z65XMk/discussion
https://groups.google.com/d/msg/tesseract-ocr/R7UCx0oV3PA/GE7KJ_76kS0J
Please honor the time of people on this
I can't find any answer to my question at this link.
Can you just talk to me? Is it necessary to bully a rookie?
Please...
On Wednesday, January 16, 2013 at 4:02:25 PM UTC+8, zdenop wrote:
Really ;-)? I got 93 results. E.g.:
https://groups.google.com/forum/#!msg/tesseract-ocr/0msQtTB_XrI/D1noel9GpPgJ
The reason why Arabic has those files and your language does not is that
Arabic is set up to use the cube feature to combine it with other
languages, so you can do -l ara+eng and OCR a document with both Arabic
and English. That training is harder, and not necessary if you mainly want
to do
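The `-l ara+eng` invocation mentioned above can be sketched as follows. This only builds and prints the command line in the usual `tesseract imagename outputbase -l lang` form. The file names `page.png` and `page` and the helper `build_tesseract_command` are placeholders made up for this example, and it assumes the `ara` and `eng` traineddata files are installed.

```python
def build_tesseract_command(image_path, output_base, languages):
    """Build a tesseract argv list; languages are joined with '+' for -l."""
    return ["tesseract", image_path, output_base, "-l", "+".join(languages)]

# Hypothetical input image and output base name.
cmd = build_tesseract_command("page.png", "page", ["ara", "eng"])
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` once the image exists and tesseract is installed.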
So you mean cube exists only so that users can combine a language with other
languages, which means I don't need it (because my language is not Arabic).
Thanks. Maybe my English is not good. I just can't understand what cube is or
what it is used for, and I can't find an introduction.
and that means cube and my result is left
On Wed, Jan 16, 2013 at 3:34 PM, Sven Pedersen sven.peder...@gmail.com wrote:
The reason why Arabic has those files and your language does not is that
Arabic is set up to use the cube feature to combine it with other
languages, so you can do -l ara+eng and OCR a document with both Arabic
and
Cube means combining different languages. There is not much documentation
on it -- Google developed it internally. But I don't think you need it. The
list of files you sent is related to the cube feature, so you don't need to
create them. For right to left, search the archives for right to left --
Thanks again, but I still have the same question. If cube is used just for
combining with other languages when training, why can we choose cube mode
when we read a document, just like Sven said?
Do you mean we can combine with another language using -l [lang] because
it has cube files? If there is no cube
My language is somewhat special. It is like Arabic script, but compared with
Arabic it has some differences, really only in the shapes of the letters. It is
also written right to left.
I'm using the standard tutorial
: https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
but when I finish and
Search the archive of the tesseract forums for cube.
Zdenko
On Tue, Jan 15, 2013 at 2:16 PM, gold snake huangjin...@gmail.com wrote:
My language is somewhat special. It is like Arabic script, but compared with
Arabic it has some differences, really only in the shapes of the letters. It is
written right to left
I can't find anything. Come on
On Tuesday, January 15, 2013 at 10:38:42 PM UTC+8, zdenop wrote:
Search the archive of the tesseract forums for cube.
Zdenko
On Tue, Jan 15, 2013 at 2:16 PM, gold snake huang...@gmail.com wrote:
My language is somewhat special. It is like Arabic script, but compared with Arabic it has