Hi Tom,

Thanks much for the update. I'm new to Ocropus, and I have a question about 
running rtrain.

Do you know (or have an estimate of) how many lines of text the program 
needs for training before it starts giving reasonable results? Since it's 
neural network based, I'd hazard a guess that it takes more than a few 
thousand lines.

More details: I'm gathering labeled data for Bengali (Bangla) OCR, and I 
need an estimate of how many lines I'll have to transcribe to get started.

Regards,
Shibamouli



On Wednesday, December 17, 2014 2:40:11 PM UTC-5, Tom wrote:
>
> With the new recognizer, it should be pretty easy to train. We've trained 
> it for other scripts purely from generated data and gotten pretty good 
> results.
>
> I'll try to create some more documentation and some simpler training 
> scripts.
>
> Tom
>
> On Wednesday, December 17, 2014 5:36:34 AM UTC-8, 81+ yrsold wrote:
>>
>> Tom,
>> I am really happy that you have resumed the OCRopus project. I hope that 
>> this time the project will support Indic (Indian) languages.
>> With warmest regards,
>> sriranga(81+yrs) 
>>
>> On Wednesday, December 17, 2014 3:56:52 AM UTC+5:30, Tom wrote:
>>>
>>> I joined Google this year. Google permits me to spend time on the 
>>> OCRopus project and contribute. As part of this, I moved the project to 
>>> Github, because it's easier to maintain there.
>>>
>>> I just pushed out a new update of ocropy. This includes mainly 
>>> faster/smaller saving of models, as well as a C++ implementation of the 
>>> LSTM network. The C++ LSTM implementation is a pretty straightforward port 
>>> of the Python version and runs much faster. The C++ classes have been 
>>> wrapped as Python classes and are callable from Python. There are two new 
>>> top-level drivers, ocropus-ltrain and ocropus-lpred, for the C++ 
>>> implementation. The C++ implementation appears to be numerically close to 
>>> the Python implementation and yields good recognizers when trained, but it 
>>> requires more testing.
>>>
>>> As before, this is research-level software with minimal documentation 
>>> (do look at the iPython Notebooks, the .ipynb files, since they contain 
>>> significant information). Feel free to contribute patches, documentation, 
>>> etc. using the usual Github mechanisms of merge requests. I'll try to 
>>> incorporate them as time permits.
>>>
>>> Tom
>>>
>>
