Hi W. K. Lo,

Did you ever solve this problem? I have run into the same segmentation 
issue recently.

On Wednesday, February 27, 2013 4:44:20 PM UTC+8, W. K. LO wrote:
>
> Dear all,
>  
> I would like to know whether there are any options for controlling the 
> segmentation process during OCR.
>  
> I am experimenting with Chinese OCR and have found that the segmentation of 
> a character is affected by its neighbouring characters. I would like to 
> adjust the relevant parameters/options to control this process. An example 
> follows:
>  
> test01
> ======
> test01.tif [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_anJKQXN4RTlmLXc/edit]
> command: tesseract test01.tif test01 -l chi makebox
> result: 4th, 5th, 8th characters are broken apart
> test01.box [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_aFQ2ekpVMy0wTWM/edit]
> screenshot of test01 segmentation [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_UmlUVFNLT0paZjA/edit]
>
> test02
> ======
> test02.tif [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_NVF6ZzhvSXBQZnc/edit]
> edit: remove first 3 characters of test01.tif
> command: tesseract test02.tif test02 -l chi makebox
> result: all characters are correctly segmented (only mixed up a 
> punctuation mark)
> test02.box [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_N04zM1V2T2xvNWs/edit]
> screenshot of test02 segmentation [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_QXNKcGNzU3NxMDg/edit]
>
> test03
> ======
> test03.tif [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_cjZCbVE3ZVNWOUU/edit]
> edit: replace the 2nd last character of test02.tif
> command: tesseract test03.tif test03 -l chi makebox
> result: 1st, 2nd and 5th characters are broken apart
> test03.box [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_TVRzMXpmTDlwR00/edit]
> screenshot of test03 segmentation [
> https://docs.google.com/file/d/0Bz99K1Qj2HQ_UDlyOGIxU091SnM/edit]
> It seems that the combination of characters in test02 happens to suit 
> tesseract's default settings. I would like to know whether there are 
> parameters/options I can adjust to control the segmentation process.
>  
>  
> Thanks.
>  
> Regards,
> W. K. Lo
>
>  
>
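
One way to compare the three box files objectively is to flag boxes that are
much narrower than the typical box, since a Chinese character that has been
broken apart usually shows up as two or more narrow fragments. The following
is a minimal Python sketch under that assumption. It relies on the usual box
file layout of one "<symbol> <left> <bottom> <right> <top> <page>" entry per
line. The function names and the 0.6 ratio are illustrative choices, and the
sample coordinates are made up rather than taken from the linked files.

```python
from statistics import median


def parse_box_lines(lines):
    """Parse box-file lines of the form
    '<symbol> <left> <bottom> <right> <top> <page>'.
    Lines that do not have six fields are skipped."""
    boxes = []
    for line in lines:
        parts = line.strip().rsplit(" ", 5)
        if len(parts) != 6:
            continue
        symbol = parts[0]
        left, bottom, right, top, page = (int(p) for p in parts[1:])
        boxes.append((symbol, left, bottom, right, top, page))
    return boxes


def flag_narrow_boxes(boxes, ratio=0.6):
    """Return indices of boxes whose width is below ratio * median width.
    The ratio is an arbitrary starting point, not a tesseract setting."""
    widths = [right - left for _, left, _, right, _, _ in boxes]
    if not widths:
        return []
    typical = median(widths)
    return [i for i, w in enumerate(widths) if w < ratio * typical]


# Hypothetical sample: the 2nd and 3rd entries mimic one character
# that was split into two narrow fragments.
sample = [
    "我 10 20 50 60 0",
    "女 55 20 70 60 0",
    "子 72 20 95 60 0",
    "你 100 20 140 60 0",
    "他 145 20 185 60 0",
]
boxes = parse_box_lines(sample)
suspects = flag_narrow_boxes(boxes)
for i in suspects:
    print(i, boxes[i][0], boxes[i][3] - boxes[i][1])
```

To run it on a real file, pass open("test01.box", encoding="utf-8") to
parse_box_lines instead of the sample list. Punctuation marks are naturally
narrow and will be flagged too, and the median becomes unreliable if most
characters on the line are broken, so treat the output as a list of
candidates to inspect rather than a verdict.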

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
