No worries, I will play around and see what I can get working. For now I am 
using a simple replace in my script to handle the Æ.
How would I go about compiling tesseract 4.0 alpha using git and cmake? The 
wiki says the 4.0 alpha source code is available in the master branch of the 
repository, but I have yet to find it... The compiling part seems 
straightforward enough, but I need the source ;).
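
For reference, the clone-and-build sequence being asked about could be sketched 
roughly as below. This is only a sketch under assumptions: the repository URL 
(the main tesseract-ocr/tesseract repository on GitHub), the local directory 
names, and the helper names build_steps and run_steps are all illustrative and 
do not come from the wiki. It also assumes git, cmake, a C++ compiler, and the 
dependencies (such as Leptonica) are already installed. By default it only 
prints the commands instead of running them.

```python
import subprocess
from pathlib import Path

# Assumed repository URL; the wiki only says "the master branch of the repository".
REPO_URL = "https://github.com/tesseract-ocr/tesseract.git"


def build_steps(src_dir="tesseract", build_dir="build"):
    """Return (argv, working_directory) pairs for a clone + out-of-source cmake build."""
    src = Path(src_dir)
    build = src / build_dir
    return [
        # Fetch the master branch, where the 4.0 alpha source is said to live.
        (["git", "clone", "--branch", "master", REPO_URL, str(src)], None),
        # Configure from inside the build directory, pointing at the source tree.
        (["cmake", ".."], build),
        # Build with whatever generator cmake selected.
        (["cmake", "--build", ".", "--config", "Release"], build),
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, cwd in steps:
        print(" ".join(argv), f"(in {cwd})" if cwd else "")
        if not dry_run:
            if cwd is not None:
                cwd.mkdir(parents=True, exist_ok=True)
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_steps(build_steps(), dry_run=True)
```

Calling run_steps(build_steps(), dry_run=False) would actually execute the 
steps; whether the build succeeds on Windows depends on the compiler and 
dependency setup, which this sketch does not cover.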

I tried installing gimagereader, hoping that it would give me the DLL for 
tesseract 4.0, but no luck. 

On Monday, 9 January 2017 at 08:34:18 UTC+1, shree wrote:

> Sorry, I am not familiar with powershell and nuget.
>
> If you are on Windows, you can try the experimental binaries for 
> 4.0.0alpha for gimagereader, a GUI front-end to Tesseract-ocr. You can OCR a 
> PDF directly or load multiple images at the same time.
>
> - excuse the brevity, sent from mobile
>
> On 09-Jan-2017 12:49 PM, "Ludvig F Aarstad" <lud...@aarstad.org> wrote:
>
>> Thanks Shree :D. Really appreciate it. Will this work with v3.03 too? I 
>> am basing my code on this: 
>> https://github.com/jourdant/powershell-paperless and there is a script 
>> to initialize the environment that is getting the tesseract files from 
>> here: https://nuget.org/api/v2/package/tesseract-ocr. Would you be able 
>> to point me in the right direction on how to move this from 3.03 to the 
>> 4.0alpha?
>>
>>
>>
>> On Friday, 6 January 2017 at 13:50:38 UTC+1, shree wrote:
>>
>>> I have uploaded modified nor.traineddata at
>>>
>>> https://github.com/Shreeshrii/tessdata4alpha/blob/master/nor.traineddata
>>>
>>> See the attached log and info file for the commands used in training. It 
>>> took about 9 hours on my PC, for only about 1700 iterations, and then my PC 
>>> froze. I rebooted and created the traineddata for norlayer0.853_1615.lstm, 
>>> i.e. 0.853% character error rate at iteration number 1615.
>>>
>>>
>>> ShreeDevi
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>>> On Fri, Jan 6, 2017 at 5:59 PM, ShreeDevi Kumar <shree...@gmail.com> 
>>> wrote:
>>>
>>>> @Peter, Have you tried the 4.0.0alpha version yet?
>>>>
>>>> @Ludvig F. Aarstad - The "add a layer" training worked for adding 'Æ'. I 
>>>> will upload the new traineddata so that you can test. You will need the 
>>>> 4.0.0alpha version for testing.
>>>>
>>>> Here are a couple of the training TIFs and the OCRed text.
>>>>
>>>> ShreeDevi
>>>> ____________________________________________________________
>>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>>
>>>> On Fri, Jan 6, 2017 at 5:01 PM, Peter <pe...@peterkrantz.se> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thursday, 5 January 2017 at 04:39:01 UTC+1, shree wrote:
>>>>>>
>>>>>> Ray is planning to retrain the languages for the new 4.0.0 version 
>>>>>> sometime in January. So it would be helpful if you could open an issue 
>>>>>> on 
>>>>>> https://github.com/tesseract-ocr/langdata/issues with this 
>>>>>> information.
>>>>>>
>>>>>
>>>>> Is it possible to contribute training data for this effort? I realise 
>>>>> Swedish will not be at the top of the list, but I think it would be easy to 
>>>>> involve some of the research community here in contributing training data 
>>>>> if it could improve the language model.
>>>>>
>>>>> /Peter 
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "tesseract-ocr" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to tesseract-oc...@googlegroups.com.
>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/tesseract-ocr/9788db26-bb8a-4861-b29e-80db2b5a687f%40googlegroups.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/9788db26-bb8a-4861-b29e-80db2b5a687f%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>>
>>
>

