But if you look at the .tr file, you will see that the font name is 20centsmarker.exp0. I guess this is not what you want.
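As a rough illustration of why the extra ".box" matters, here is a small shell sketch. It assumes the font name is whatever sits between the first and the last dot of the output base name passed to tesseract, which is consistent with the 20centsmarker.exp0 result above but is an assumption, not documented behaviour. The function name font_from_base is made up for this example, and the sketch expects a name with at least two dots.

```shell
# Hypothetical helper: mimics the ASSUMED rule that the font name is the
# text between the first and the last dot of the output base name.
font_from_base() {
  base=${1##*/}              # drop any leading directory part
  rest=${base#*.}            # drop everything up to and including the first dot
  printf '%s\n' "${rest%.*}" # drop the last dot and everything after it
}

font_from_base eng.20centsmarker.exp0.box   # prints 20centsmarker.exp0
font_from_base eng.20centsmarker.exp0       # prints 20centsmarker
```

Under that assumption, only a base name of the form lang.fontname.expN gives the bare font name.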
Tesseract takes some information from the filenames. If you go your own way with naming, you will run into problems (a crash). I remember there is a crash at some stage if the last part of the name is not "exp" + number... I know it is annoying, but... this is the stage Tesseract training is at.

Zdenko

On Sat, Feb 2, 2013 at 5:18 PM, Carlos Antunes <[email protected]> wrote:

> Zdenko,
>
> I've got you. I was using the wrong syntax for the file. I just used the
> file the way it was created and not the way it should be.
>
> The syntax that was not working:
>
> shapeclustering -F font_properties -U unicharset eng.20centsmarker.exp0.box.tr
>
> The syntax that worked:
>
> shapeclustering -F font_properties -U unicharset eng.20centsmarker.exp0.tr
>
> Basically, it does not like the .box.tr extension, but it works with the
> .tr extension.
>
> Thanks again mate!
>
> On Saturday, February 2, 2013 2:45:11 AM UTC-7, zdenop wrote:
>
>> Don't send gdb output - it is useless. Especially when you do not follow
>> the wiki:
>> you ran:
>> tesseract eng.20centsmarker.exp0.tif eng.20centsmarker.exp0.box nobatch box.train
>> and you should run:
>> tesseract eng.20centsmarker.exp0.tif eng.20centsmarker.exp0 nobatch box.train
>>
>> Zdenko
>>
>>
>> On Fri, Feb 1, 2013 at 11:26 PM, Carlos Antunes <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> I have generated the TIFF/Box pair from a font using 10pt and 0.05
>>> trailing spaces. It went really well when I did the tesseract training
>>> procedure and generated the .tr file as attached. Then I did the unicharset
>>> generation and it also went well.
>>>
>>> However, when I start the final pieces as per the Wiki, things do not
>>> work very well and it crashes. Attached is a zip file with all the stuff I
>>> was able to generate plus the tif/box pairs. Also attached is the GDB
>>> output of it.
>>>
>>> Here is what I have run and the message. Attached are the files. My
>>> tesseract system is 3.02 and it came with Ubuntu 12.10 which is this
>>> desktop.
>>>
>>> I am having a real hard time with this whole procedure and getting
>>> quite frustrated on trying to make it work.
>>>
>>> I would greatly appreciate any further pointers on this. Thanks in
>>> advance.
>>>
>>>
>>>
>>> :~/TrainingOCR/d$ shapeclustering -F font_properties -U unicharset eng.20centsmarker.exp0.box.tr
>>> Reading eng.20centsmarker.exp0.box.tr ...
>>> *** glibc detected *** shapeclustering: double free or corruption (out): 0x0000000002176b90 ***
>>> ======= Backtrace: =========
>>> /lib/x86_64-linux-gnu/libc.so.6(+0x7eb96)[0x7f4188c2db96]
>>> shapeclustering(_ZN13GenericVectorIiE5clearEv+0x9b)[0x409df3]
>>> shapeclustering(_ZN13GenericVectorIiED1Ev+0x2a)[0x409a30]
>>> /usr/lib/libtesseract.so.3(_ZN9tesseract17TrainingSampleSet14SetupFontIdMapEv+0x136)[0x7f4189a4fb88]
>>> /usr/lib/libtesseract.so.3(_ZN9tesseract17TrainingSampleSet22OrganizeByFontAndClassEv+0x24)[0x7f4189a4f67c]
>>> /usr/lib/libtesseract.so.3(_ZN9tesseract13MasterTrainer24ReplaceFragmentedSamplesEv+0x1f0)[0x7f4189a3e810]
>>> /usr/lib/libtesseract.so.3(_ZN9tesseract13MasterTrainer15PostLoadCleanupEv+0x47)[0x7f4189a3be13]
>>> shapeclustering[0x4074dc]
>>> shapeclustering(main+0x52)[0x405cae]
>>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f4188bd076d]
>>> shapeclustering[0x405b79]
>>> ======= Memory map: ========
>>> 00400000-0040f000 r-xp 00000000 08:05 8655084 /usr/bin/shapeclustering
>>> 0060e000-0060f000 r--p 0000e000 08:05 8655084 /usr/bin/shapeclustering
>>> 0060f000-00610000 rw-p 0000f000 08:05 8655084 /usr/bin/shapeclustering
>>> 0208b000-02193000 rw-p 00000000 00:00 0 [heap]
>>> 7f418726d000-7f418726f000 r-xp 00000000 08:05 15990993 /lib/x86_64-linux-gnu/libdl-2.15.so
>>> 7f418726f000-7f418746f000 ---p 00002000 08:05 15990993 /lib/x86_64-linux-gnu/libdl-2.15.so
>>> 7f418746f000-7f4187470000 r--p 00002000 08:05 15990993 /lib/x86_64-linux-gnu/libdl-2.15.so
>>> 7f4187470000-7f4187471000 rw-p 00003000 08:05 15990993 /lib/x86_64-linux-gnu/libdl-2.15.so
>>> 7f4187471000-7f418747c000 r-xp 00000000 08:05 8650918 /usr/lib/x86_64-linux-gnu/libjbig.so.0.0.0
>>> 7f418747c000-7f418767b000 ---p 0000b000 08:05 8650918 /usr/lib/x86_64-linux-gnu/libjbig.so.0.0.0
>>> 7f418767b000-7f418767c000 r--p 0000a000 08:05 8650918 /usr/lib/x86_64-linux-gnu/libjbig.so.0.0.0
>>> 7f418767c000-7f418767f000 rw-p 0000b000 08:05 8650918 /usr/lib/x86_64-linux-gnu/libjbig.so.0.0.0
>>> 7f418767f000-7f41876a0000 r-xp 00000000 08:05 15991023 /lib/x86_64-linux-gnu/liblzma.so.5.0.0
>>> 7f41876a0000-7f418789f000 ---p 00021000 08:05 15991023 /lib/x86_64-linux-gnu/liblzma.so.5.0.0
>>> 7f418789f000-7f41878a0000 r--p 00020000 08:05 15991023 /lib/x86_64-linux-gnu/liblzma.so.5.0.0
>>> 7f41878a0000-7f41878a1000 rw-p 00021000 08:05 15991023 /lib/x86_64-linux-gnu/liblzma.so.5.0.0
>>> 7f41878a1000-7f41878d5000 r-xp 00000000 08:05 8655082 /usr/lib/x86_64-linux-gnu/libwebp.so.2.0.0
>>> 7f41878d5000-7f4187ad4000 ---p 00034000 08:05 8655082 /usr/lib/x86_64-linux-gnu/libwebp.so.2.0.0
>>> 7f4187ad4000-7f4187ad5000 r--p 00033000 08:05 8655082 /usr/lib/x86_64-linux-gnu/libwebp.so.2.0.0
>>> 7f4187ad5000-7f4187ad6000 rw-p 00034000 08:05 8655082 /usr/lib/x86_64-linux-gnu/libwebp.so.2.0.0
>>> 7f4187ad6000-7f4187ad9000 rw-p 00000000 00:00 0
>>> 7f4187ad9000-7f4187b46000 r-xp 00000000 08:05 8657402 /usr/lib/x86_64-linux-gnu/libtiff.so.5.1.0
>>> 7f4187b46000-7f4187d46000 ---p 0006d000 08:05 8657402 /usr/lib/x86_64-linux-gnu/libtiff.so.5.1.0
>>> 7f4187d46000-7f4187d47000 r--p 0006d000 08:05 8657402 /usr/lib/x86_64-linux-gnu/libtiff.so.5.1.0
>>> 7f4187d47000-7f4187d4a000 rw-p 0006e000 08:05 8657402 /usr/lib/x86_64-linux-gnu/libtiff.so.5.1.0
>>> 7f4187d4a000-7f4187d52000 r-xp 00000000 08:05 8652644 /usr/lib/x86_64-linux-gnu/libgif.so.4.1.6
>>> 7f4187d52000-7f4187f51000 ---p 00008000 08:05 8652644 /usr/lib/x86_64-linux-gnu/libgif.so.4.1.6
>>> 7f4187f51000-7f4187f52000 r--p 00007000 08:05 8652644 /usr/lib/x86_64-linux-gnu/libgif.so.4.1.6
>>> 7f4187f52000-7f4187f53000 rw-p 00008000 08:05 8652644 /usr/lib/x86_64-linux-gnu/libgif.so.4.1.6
>>> 7f4187f53000-7f4187f92000 r-xp 00000000 08:05 8651145 /usr/lib/x86_64-linux-gnu/libjpeg.so.8.0.2
>>> 7f4187f92000-7f4188192000 ---p 0003f000 08:05 8651145 /usr/lib/x86_64-linux-gnu/libjpeg.so.8.0.2
>>> 7f4188192000-7f4188193000 r--p 0003f000 08:05 8651145 /usr/lib/x86_64-linux-gnu/libjpeg.so.8.0.2
>>> 7f4188193000-7f4188194000 rw-p 00040000 08:05 8651145 /usr/lib/x86_64-linux-gnu/libjpeg.so.8.0.2
>>> 7f4188194000-7f41881a4000 rw-p 00000000 00:00 0
>>> 7f41881a4000-7f41881c9000 r-xp 00000000 08:05 15990874 /lib/x86_64-linux-gnu/libpng12.so.0.49.0
>>> 7f41881c9000-7f41883c8000 ---p 00025000 08:05 15990874 /lib/x86_64-linux-gnu/libpng12.so.0.49.0
>>> 7f41883c8000-7f41883c9000 r--p 00024000 08:05 15990874 /lib/x86_64-linux-gnu/libpng12.so.0.49.0
>>> 7f41883c9000-7f41883ca000 rw-p 00025000 08:05 15990874 /lib/x86_64-linux-gnu/libpng12.so.0.49.0
>>> 7f41883ca000-7f41883e0000 r-xp 00000000 08:05 15991011 /lib/x86_64-linux-gnu/libz.so.1.2.7
>>> 7f41883e0000-7f41885df000 ---p 00016000 08:05 15991011 /lib/x86_64-linux-gnu/libz.so.1.2.7
>>> 7f41885df000-7f41885e0000 r--p 00015000 08:05 15991011 /lib/x86_64-linux-gnu/libz.so.1.2.7
>>> 7f41885e0000-7f41885e1000 rw-p 00016000 08:05 15991011 /lib/x86_64-linux-gnu/libz.so.1.2.7
>>> 7f41885e1000-7f41885f9000 r-xp 00000000 08:05 15990997 /lib/x86_64-linux-gnu/libpthread-2.15.so
>>> 7f41885f9000-7f41887f8000 ---p 00018000 08:05 15990997 /lib/x86_64-linux-gnu/libpthread-2.15.so
>>> 7f41887f8000-7f41887f9000 r--p 00017000 08:05 15990997 /lib/x86_64-linux-gnu/libpthread-2.15.so
>>> 7f41887f9000-7f41887fa000 rw-p 00018000 08:05 15990997 /lib/x86_64-linux-gnu/libpthread-2.15.so
>>> 7f41887fa000-7f41887fe000 rw-p 00000000 00:00 0
>>> 7f41887fe000-7f41889a6000 r-xp 00000000 08:05 8653061 /usr/lib/liblept.so.3.0.0
>>> 7f41889a6000-7f4188ba5000 ---p 001a8000 08:05 8653061 /usr/lib/liblept.so.3.0.0
>>> 7f4188ba5000-7f4188ba6000 r--p 001a7000 08:05 8653061 /usr/lib/liblept.so.3.0.0
>>> 7f4188ba6000-7f4188bae000 rw-p 001a8000 08:05 8653061 /usr/lib/liblept.so.3.0.0
>>> 7f4188bae000-7f4188baf000 rw-p 00000000 00:00 0
>>> 7f4188baf000-7f4188d64000 r-xp 00000000 08:05 15990995 /lib/x86_64-linux-gnu/libc-2.15.so
>>> 7f4188d64000-7f4188f63000 ---p 001b5000 08:05 15990995 /lib/x86_64-linux-gnu/libc-2.15.so
>>> 7f4188f63000-7f4188f67000 r--p 001b4000 08:05 15990995 /lib/x86_64-linux-gnu/libc-2.15.so
>>> 7f4188f67000-7f4188f69000 rw-p 001b8000 08:05 15990995 /lib/x86_64-linux-gnu/libc-2.15.so
>>> 7f4188f69000-7f4188f6e000 rw-p 00000000 00:00 0
>>> 7f4188f6e000-7f4188f83000 r-xp 00000000 08:05 15990810 /lib/x86_64-linux-gnu/libgcc_s.so.1
>>> 7f4188f83000-7f4189182000 ---p 00015000 08:05 15990810 /lib/x86_64-linux-gnu/libgcc_s.so.1
>>> 7f4189182000-7f4189183000 r--p 00014000 08:05 15990810 /lib/x86_64-linux-gnu/libgcc_s.so.1
>>> 7f4189183000-7f4189184000 rw-p 00015000 08:05 15990810 /lib/x86_64-linux-gnu/libgcc_s.so.1
>>> 7f4189184000-7f418927f000 r-xp 00000000 08:05 15991003 /lib/x86_64-linux-gnu/libm-2.15.so
>>> 7f418927f000-7f418947e000 ---p 000fb000 08:05 15991003 /lib/x86_64-linux-gnu/libm-2.15.so
>>> 7f418947e000-7f418947f000 r--p 000fa000 08:05 15991003 /lib/x86_64-linux-gnu/libm-2.15.so
>>> 7f418947f000-7f4189480000 rw-p 000fb000 08:05 15991003 /lib/x86_64-linux-gnu/libm-2.15.so
>>> 7f4189480000-7f4189481000 rw-p 00000000 00:00 0
>>> 7f4189481000-7f4189566000 r-xp 00000000 08:05 8650946 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
>>> 7f4189566000-7f4189765000 ---p 000e5000 08:05 8650946 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
>>> 7f4189765000-7f418976d000 r--p 000e4000 08:05 8650946 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
>>> 7f418976d000-7f418976f000 rw-p 000ec000 08:05 8650946 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.17
>>> 7f418976f000-7f4189784000 rw-p 00000000 00:00 0
>>> 7f4189784000-7f4189b92000 r-xp 00000000 08:05 8652613 /usr/lib/libtesseract.so.3.0.2
>>> 7f4189b92000-7f4189d91000 ---p 0040e000 08:05 8652613 /usr/lib/libtesseract.so.3.0.2
>>> 7f4189d91000-7f4189d9f000 r--p 0040d000 08:05 8652613 /usr/lib/libtesseract.so.3.0.2
>>> 7f4189d9f000-7f4189dad000 rw-p 0041b000 08:05 8652613 /usr/lib/libtesseract.so.3.0.2
>>> 7f4189dad000-7f4189eba000 rw-p 00000000 00:00 0
>>> 7f4189eba000-7f4189edc000 r-xp 00000000 08:05 15991321 /lib/x86_64-linux-gnu/ld-2.15.so
>>> 7f418a0b0000-7f418a0b9000 rw-p 00000000 00:00 0
>>> 7f418a0d9000-7f418a0dc000 rw-p 00000000 00:00 0
>>> 7f418a0dc000-7f418a0dd000 r--p 00022000 08:05 15991321 /lib/x86_64-linux-gnu/ld-2.15.so
>>> 7f418a0dd000-7f418a0df000 rw-p 00023000 08:05 15991321 /lib/x86_64-linux-gnu/ld-2.15.so
>>> 7fff12119000-7fff1213a000 rw-p 00000000 00:00 0 [stack]
>>> 7fff121ff000-7fff12200000 r-xp 00000000 00:00 0 [vdso]
>>> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
>>> Aborted (core dumped)
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> tesseract-oc...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.

