Thank you, Dmitri. I emailed him today; I think you already got the CC.

Mostafa

On May 19, 10:19 pm, Dmitri Silaev <[email protected]> wrote:
> Mostafa should try to contact Ray directly, seriously.
> Things may have changed over time.
>
> --
> Dmitri
>
> 2011/5/19 zdenko podobny <[email protected]>:
> > 2011/5/19 Mostafa <[email protected]>
> >> Hi again,
> >>
> >> Seems nobody knows where it is hiding.
> >> Should I contact a CIA agent? lol
> >
> > If somebody is really interested, she/he can find the answer ;-). Within 1
> > minute ;-) ([1] [2] [3]). BTW: there is a Developers forum.
> >
> >> But I am kinda serious about the data.
> >
> > There were several requests for training data (in the forum, in issues). I
> > made one too. There was no official reply to such requests. AFAIK Google is
> > not obliged to release them. So I guess they have a reason for not providing
> > them.
> > On the other hand, this could be an opportunity for the tesseract community
> > :-): to create an alternative training set. As Ray mentioned ([3]), they use
> > a "more automated training process based on rendering text from fonts", so
> > training based on "real world" scanned documents could be interesting (but
> > more difficult).
> >
> > Zdenko
> >
> > [1] http://code.google.com/p/tesseract-ocr/people/list
> > [2] http://code.google.com/p/tesseract-ocr/source/list
> > [3] http://groups.google.com/group/tesseract-dev/msg/1cdf3ebe8743d935
> >
> >> Mostafa
> >>
> >> On May 18, 2:43 am, Илья <[email protected]> wrote:
> >> > He needs a table that contains all supported alphabetic characters.
> >> > Also, parts of scanned books could not be protected by copyright.
> >> >
> >> > Can you give any contacts for the "jpn.traindata" dev team?
> >> >
> >> > --
> >> > Best regards,
> >> > Ilia.
> >> >
> >> > On Tue, 17/05/2011 at 18:24 +0200, zdenko podobny wrote:
> >> > > On Tue, May 17, 2011 at 5:01 PM, Илья <[email protected]> wrote:
> >> > > > IMHO alphabets can't be protected by copyright.
> >> > >
> >> > > Mostafa did not ask for an alphabet. He asked for 'all the tif
> >> > > files that were used for creating...', and the content of a tiff file
> >> > > (e.g. scanned books) could be protected by copyright.
> >> > >
> >> > > > --
> >> > > > Best regards,
> >> > > > Ilia.
> >> > > >
> >> > > > On Tue, 17/05/2011 at 09:24 -0400, Dmitri Silaev wrote:
> >> > > > > I think copyright issues are preventing the dev team from
> >> > > > > publishing these source files. However, you can try to contact
> >> > > > > this forum's moderator directly - he can probably make the
> >> > > > > decision to share.
> >> > > > >
> >> > > > > --
> >> > > > > Dmitri
> >> > > > >
> >> > > > > On Tue, May 17, 2011 at 4:58 AM, Mostafa <[email protected]> wrote:
> >> > > > > > Hi,
> >> > > > > >
> >> > > > > > I am interested in getting all the tif files that were used for
> >> > > > > > creating the jpn.traindata.
> >> > > > > > I just want to see how many characters are supported in that
> >> > > > > > file, because I have some other Japanese characters that can't
> >> > > > > > be recognized by the tesseract OCR.
> >> > > > > >
> >> > > > > > Does anybody know where those tif files are?
> >> > > > > >
> >> > > > > > Thanks
> >> > > > > >
> >> > > > > > --
> >> > > > > > You received this message because you are subscribed to the
> >> > > > > > Google Groups "tesseract-ocr" group.
> >> > > > > > To post to this group, send email to [email protected]
> >> > > > > > To unsubscribe from this group, send email to
> >> > > > > > [email protected]
> >> > > > > > For more options, visit this group at
> >> > > > > > http://groups.google.com/group/tesseract-ocr?hl=en

