Re: [tesseract-ocr] Training for plotter file

2015-03-28 Thread Quan Nguyen
The basic image processing is only available for individual images loaded 
in the UI. The bulk or batch OCR does not have this support. Therefore, 
it's suggested that you perform the bulk image processing outside of 
VietOCR, using ImageMagick, GIMP, etc.
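
As a rough sketch of what that outside preprocessing could look like, the Python snippet below builds one ImageMagick command per image, applying the steps suggested further down in this thread (greyscale, a light blur, slightly lower brightness, 300 dpi). The helper names, folder layout, and option values are assumptions for illustration, not tested settings for plotter files. On ImageMagick 7 the executable may be `magick` rather than `convert`.

```python
import subprocess
from pathlib import Path


def build_preprocess_command(src, dst, dpi=300, blur_sigma=1.0, brightness=-10):
    """Return an ImageMagick command line (as a list) for one image.

    The default values are guesses to be tuned, not recommended settings.
    """
    return [
        "convert", str(src),
        "-colorspace", "Gray",                      # convert to greyscale
        "-blur", f"0x{blur_sigma}",                 # light Gaussian blur with the given sigma
        "-brightness-contrast", f"{brightness}x0",  # lower brightness, leave contrast alone
        "-units", "PixelsPerInch",
        "-density", str(dpi),                       # records the resolution; does not resample pixels
        str(dst),
    ]


def preprocess_folder(in_dir, out_dir, pattern="*.tif"):
    """Run the command on every matching image; needs ImageMagick on PATH."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob(pattern)):
        subprocess.run(build_preprocess_command(src, out_dir / src.name), check=True)
```

Calling `preprocess_folder("plots", "plots_clean")` (both folder names hypothetical) would write a cleaned copy of each TIFF, which can then be fed to the bulk OCR.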

Both the Java and .NET versions of VietOCR support command-line execution. 
The command syntax is similar to that of Tesseract:

vietocr vietsample.tif out -l vie
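
To run that command over a whole folder, a small wrapper along these lines might do. It is only a sketch: it assumes a `vietocr` launcher is on the PATH and that the second argument is an output base name, as in the example above. The function names and the `*.tif` pattern are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_ocr_command(image, out_base, lang="vie", exe="vietocr"):
    """Mirror the command shown above: vietocr <image> <output base> -l <lang>."""
    return [exe, str(image), str(out_base), "-l", lang]


def ocr_folder(in_dir, out_dir, lang="vie", pattern="*.tif"):
    """Run the OCR command once per matching image in in_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for image in sorted(Path(in_dir).glob(pattern)):
        # Use the image name without its extension as the output base.
        subprocess.run(build_ocr_command(image, out_dir / image.stem, lang), check=True)
```

From .NET, the same command line could presumably be launched as an external process in a similar loop.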

On Sunday, March 22, 2015 at 3:55:50 PM UTC-5, Dennis wrote:

 I just tried the bulk option, and I see that it also outputs the location 
 of the text it OCRed, which is what I wanted,
 but it does not have an option to do a smooth or textbox around the text I 
 want to OCR.  There is no way to automate these things?

 Also, how do I run vietocr with commandline?  preferably through .NET 
 rather than Java.

 Thank you for the help,
 Dennis

 On Sunday, March 22, 2015 at 11:20:18 AM UTC-7, shree wrote:

 vietocr has bulkocr and batch options. 

 ShreeDevi
 
 भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

 On Sun, Mar 22, 2015 at 6:39 AM, Dennis denni...@gmail.com wrote:

 I'm using the latest version of tesseract: 3.02.

 I successfully OCRed with vietocr gui.  If I set it as screenshot mode, 
 apply a smooth filter, and use the textbox to select each line one by one, 
 I get a 100% correct OCR.

 Now I am wondering, how can I automate this process?  I want to be able 
 to create a program or execute a command so that I give it the image and it 
 does the above things automatically and outputs the OCR and the location of 
 the OCRed text in the image file.

 Thank you,
 Dennis Gahm

 On Sunday, October 19, 2014 at 7:02:05 PM UTC-7, shree wrote:

 Which version of tesseract are you using?

 Try changing to 300/600 dpi, apply a blur/soften filter, decrease 
 brightness, convert to greyscale.

 I tried with the vietocr gui: 
 the zero with the line across it gets recognized as @, the rest comes out ok.

 If you will not have @ in your plots, you could just substitute @ by 
 zero in post-processing.
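
A minimal sketch of that substitution in Python, assuming the OCR output is available as a string and that @ never legitimately appears in the plots (the function name is hypothetical):

```python
def fix_slashed_zero(text):
    """Replace every '@' with '0'; only safe if '@' never really occurs in the plots."""
    return text.replace("@", "0")


# Example: a misread measurement such as "1@.5" becomes "10.5".
cleaned = fix_slashed_zero("1@.5")
```

If the location data is written to the same file, the replacement should be limited to the text fields so that nothing else is altered.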

 ShreeDevi
 
 भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

 On Sun, Oct 19, 2014 at 3:21 AM, Dennis denni...@gmail.com wrote:

 Hello,

 I am trying to recognize the characters from a plot file (attached).  
 The characters are composed of lines and are not fonts.

 I've tried training, but I was unsuccessful (I probably did something 
 wrong).

 Can anyone help?

 Thank you,
 Dennis

 -- 
 You received this message because you are subscribed to the Google 
 Groups tesseract-ocr group.
 To unsubscribe from this group and stop receiving emails from it, send 
 an email to tesseract-oc...@googlegroups.com.
 To post to this group, send email to tesser...@googlegroups.com.
 Visit this group at http://groups.google.com/group/tesseract-ocr.
 To view this discussion on the web visit 
 https://groups.google.com/d/msgid/tesseract-ocr/7b3aa12c-0927-4f27-9b31-178b0c234d5e%40googlegroups.com.
 For more options, visit https://groups.google.com/d/optout.




Re: [tesseract-ocr] Training for plotter file

2015-03-22 Thread Dennis
I'm using the latest version of tesseract: 3.02.

I successfully OCRed with vietocr gui.  If I set it as screenshot mode, 
apply a smooth filter, and use the textbox to select each line one by one, 
I get a 100% correct OCR.

Now I am wondering, how can I automate this process?  I want to be able to 
create a program or execute a command so that I give it the image and it 
does the above things automatically and outputs the OCR and the location of 
the OCRed text in the image file.

Thank you,
Dennis Gahm



Re: [tesseract-ocr] Training for plotter file

2015-03-22 Thread ShreeDevi Kumar
vietocr has bulkocr and batch options.

ShreeDevi

भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com



Re: [tesseract-ocr] Training for plotter file

2015-03-22 Thread Dennis
I just tried the bulk option, and I see that it also outputs the location 
of the text it OCRed, which is what I wanted,
but it does not have an option to do a smooth or textbox around the text I 
want to OCR.  There is no way to automate these things?

Also, how do I run vietocr with commandline?  preferably through .NET 
rather than Java.

Thank you for the help,
Dennis
