On Sat, May 18, 2013 at 2:57 PM, sdk <[email protected]> wrote:

> Hi,
>
> I have used QT Box Editor 1.10 on Windows 7. It works fine on .png files
> (does not open .tif files).
>
QT Box Editor (QBE) uses QT4 functionality to load images, and QT4 does not
support multipage tif. I have not experienced problems with tiff on my
Windows XP, and I have no way to test it on Win 7. Anyway, I plan to use
leptonica to import images (at least for tiff ;-) ), which would also bring
multipage tiff support, but I have no time for this development (you know,
it works for me at this stage, and there is a bunch of other tasks...)


> I had a question regarding its import / export feature.
>
> I have generated a box file using QT Box Editor with Hindi traineddata and
> want to fix the errors in it as some areas boxes are being marked
> erroneously.
>
> 1. If I export the text from box file 'Line by Line', is there a way to
> import it back? I am getting the error that the number of boxes doesn't match.
>
Maybe it is possible. The import feature is quite simple:

   1. QBE expects the number of boxes (already in the table view) to be equal
   to the number of symbols, excluding spaces and linebreaks (\n). Otherwise
   you get an error (there are more symbols than boxes, or there are more
   boxes than symbols).
   2. QBE expects this import format: one box = one symbol
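
To see in advance whether an import will be rejected, the counting rule above
can be checked outside QBE. This is only a rough sketch, not QBE's actual
code: the function names are made up for illustration, and it assumes the box
file has one box per non-empty line and that a "symbol" is one Unicode code
point. For scripts like Devanagari one visible glyph can be several code
points, and how QBE counts those is not stated here.

```python
def count_symbols(text: str) -> int:
    """Count characters, excluding spaces and line breaks (\\n)."""
    return sum(1 for ch in text if ch not in (" ", "\n"))


def count_boxes(box_lines: list[str]) -> int:
    """Count boxes, assuming one box per non-empty line of the box file."""
    return sum(1 for line in box_lines if line.strip())


def check_import(text: str, box_lines: list[str]) -> str:
    """Hypothetical pre-check that mirrors the rule: boxes == symbols."""
    symbols = count_symbols(text)
    boxes = count_boxes(box_lines)
    if symbols > boxes:
        return f"more symbols ({symbols}) than boxes ({boxes})"
    if boxes > symbols:
        return f"more boxes ({boxes}) than symbols ({symbols})"
    return "ok"
```

If the check reports a mismatch, the text file probably has a symbol added or
removed compared to the boxes already loaded in the table view.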

> 2. It seems from the little reading of the source that I have done, that
> there is an option for box files which handle a line at a time and segment
> at word level. Is there some feature like that?

It is not clear to me what you need (or expect): a tesseract box file can
have one box per line. There is no information about words.
QBE offers an export where it tries to identify words based on the space
between boxes (6pt, this can be adjusted in settings). It is not perfect, but
it works in most situations (there could be problems with historical
documents with inconsistent spacing).
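
The idea behind the word detection can be sketched roughly like this. It is
not the QBE implementation, only an illustration under assumptions: the names
`Box`, `parse_box_line` and `group_into_words` are hypothetical, the boxes are
assumed to belong to a single text line in reading order, and the gap
threshold of 6 is taken in the same units as the box coordinates.

```python
from typing import NamedTuple


class Box(NamedTuple):
    symbol: str
    left: int
    bottom: int
    right: int
    top: int


def parse_box_line(line: str) -> Box:
    # Tesseract 3 box line: "<symbol> <left> <bottom> <right> <top> <page>"
    parts = line.split()
    left, bottom, right, top = (int(v) for v in parts[1:5])
    return Box(parts[0], left, bottom, right, top)


def group_into_words(boxes: list[Box], max_gap: int = 6) -> list[str]:
    """Start a new word when the horizontal gap to the previous box exceeds max_gap."""
    words: list[str] = []
    current = ""
    prev = None
    for box in boxes:
        if prev is not None and box.left - prev.right > max_gap:
            words.append(current)
            current = ""
        current += box.symbol
        prev = box
    if current:
        words.append(current)
    return words
```

For example, boxes with gaps of 2, 13 and 2 between them would be split into
two words at the 13 gap. With inconsistent spacing a single threshold like
this will sometimes split or merge words wrongly.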


> Can it be used with QT Box editor?
>
Generally: QBE was built for my needs (Latin-based script, so I have no
clue how it behaves with other scripts). Improvements (code, patches) are
welcome ;-)


> Shree
>
>
>
>
> On Friday, November 16, 2012 6:02:34 PM UTC+5:30, zdenop wrote:
>>
>> QT Box Editor 1.10 was released. It is a multi-platform visual editor
>> for tesseract-ocr <http://code.google.com/p/tesseract-ocr/> box
>> files <http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3> (used
>> for OCR training) based on QT4 library <http://qt.nokia.com/products/>.
>>
>> Several problems were fixed so upgrade is recommended.
>>
>> New features:
>>
>>    - implemented bbox drag resizing (thanks to D. Silaev) e.g. user can
>>    change box rectangle on image with mouse
>>    - reload image, reload box file from disk
>>    - implemented 'regenerate box file'
>>    - implemented 'convert image to binary image' so user can see its
>>    image the way tesseract will use it in OCR process
>>    - implemented 'zoom in/out' with CTRL + mouse wheel
>>    - watch for modified boxfile outside of program
>>
>>
>> For windows users there are binary files (qt-box-editor-1.10.exe +
>> qt-box-editor-dependecies-1.09.zip) created with mingw32, QT 4.8.1,
>> leptonica 1.69 and tesseract 3.02 on Windows XP SP3 (32bit). For other
>> platforms you need to compile it from source.
>>
>> Homepage: http://zdenop.github.com/qt-box-editor/
>> Code: https://github.com/zdenop/qt-box-editor
>> Changelog: https://github.com/zdenop/qt-box-editor/blob/master/CHANGELOG
>>
>>
>> --
>> Zdenko
>>
>  --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
