On Sat, May 18, 2013 at 2:57 PM, sdk <[email protected]> wrote:

> Hi,
>
> I have used QT Box Editor 1.10 on Windows 7. It works fine on .png files
> (does not open .tif files).

QT Box Editor (QBE) uses QT4 functionality to load images, and QT4 does not support multipage TIFF. I did not experience any problem with TIFF on my Windows XP, and I have no possibility to test it on Win 7. Anyway, I plan to use leptonica to import images (at least for TIFF ;-) ), which would also bring multipage TIFF support, but I have no time for this development (you know, it works for me at this stage, and there is a bunch of other tasks...)

> I had a question regarding its import / export feature.
>
> I have generated a box file using QT Box Editor with Hindi traineddata and
> want to fix the errors in it, as in some areas boxes are being marked
> erroneously.
>
> 1. If I export the text from the box file 'Line by Line', is there a way to
> import it back? I am getting the error that the number of boxes doesn't match.

Maybe it is possible. The import feature is quite simple:

1. QBE expects that the number of boxes (already in the table view) is equal to the number of symbols, excluding spaces and line breaks (\n). Otherwise you get an error (there are more symbols than boxes, or there are more boxes than symbols).
2. QBE expects this import format: one box = one symbol.

> 2. It seems from the little reading of the source that I have done, that
> there is an option for box files which handle a line at a time and segment
> at word level. Is there some feature like that?

It is not clear to me what you need (or expect): a tesseract box file can have one box per line. There is no information about words. QBE offers an export where it tries to identify words based on the space between boxes (6pt; this can be adjusted in settings). It is not perfect, but it works in most situations (there could be a problem with historical documents with inconsistent spacing).

> Can it be used with QT Box editor?

Generally: QBE was built for my needs (Latin-based script, so I have no clue how it behaves with other scripts). Improvements (code, patches) are welcome ;-)

> Shree
>
> On Friday, November 16, 2012 6:02:34 PM UTC+5:30, zdenop wrote:
>>
>> QT Box Editor 1.10 was released. It is a multi-platform visual editor
>> for tesseract-ocr <http://code.google.com/p/tesseract-ocr/> box
>> files <http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3> (used
>> for OCR training) based on the QT4 library <http://qt.nokia.com/products/>.
>>
>> Several problems were fixed, so an upgrade is recommended.
>>
>> New features:
>>
>> - implemented bbox drag resizing (thanks to D. Silaev), i.e. the user can
>>   change the box rectangle on the image with the mouse
>> - reload image, reload box file from disk
>> - implemented 'regenerate box file'
>> - implemented 'convert image to binary image' so the user can see the
>>   image the way tesseract will use it in the OCR process
>> - implemented 'zoom in/out' with CTRL + mouse wheel
>> - watch for the box file being modified outside of the program
>>
>> For Windows users there are binary files (qt-box-editor-1.10.exe +
>> qt-box-editor-dependecies-1.09.zip) created with mingw32, QT 4.8.1,
>> leptonica 1.69 and tesseract 3.02 on Windows XP SP3 (32bit). For other
>> platforms you need to compile it from source.
>>
>> Homepage: http://zdenop.github.com/qt-box-editor/
>> Code: https://github.com/zdenop/qt-box-editor
>> Changelog: https://github.com/zdenop/qt-box-editor/blob/master/CHANGELOG
>>
>> --
>> Zdenko
>>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
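
P.S. To make the import rule above concrete, here is a rough Python sketch of the check (number of boxes must equal number of symbols, not counting spaces and "\n"; one box = one symbol). This is not QBE's actual code: the function name `import_symbols` and the tuple layout (symbol, left, bottom, right, top, page) are only assumptions for illustration, and it counts Unicode code points, which may differ from how QBE counts symbols in scripts such as Hindi.

```python
def import_symbols(boxes, text):
    """Assign the symbols from `text` to `boxes`, one symbol per box.

    `boxes` is a list of tuples (symbol, left, bottom, right, top, page);
    this layout is an assumption made for the sketch.
    Spaces and line breaks in `text` are ignored, as described in the mail.
    """
    # Keep every character except spaces and "\n".
    symbols = [ch for ch in text if ch not in (" ", "\n")]

    # The import only makes sense when the counts match exactly.
    if len(symbols) > len(boxes):
        raise ValueError(
            f"more symbols ({len(symbols)}) than boxes ({len(boxes)})")
    if len(symbols) < len(boxes):
        raise ValueError(
            f"more boxes ({len(boxes)}) than symbols ({len(symbols)})")

    # One box = one symbol: keep the coordinates, replace only the symbol.
    return [(sym,) + tuple(box[1:]) for sym, box in zip(symbols, boxes)]


if __name__ == "__main__":
    boxes = [("x", 0, 0, 5, 5, 0), ("y", 6, 0, 10, 5, 0), ("z", 14, 0, 18, 5, 0)]
    for box in import_symbols(boxes, "ab c\n"):
        print(*box)
```

If the exported text was edited so that a symbol was added or removed, the counts no longer match and the import fails with one of the two errors; the text has to be brought back to one symbol per box before importing.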

