Hello, I am trying to fine-tune Tesseract with a new language to interpret the images posted below. I am able to create the box and .tr files with:

    tesseract NTS.tif NTS --psm6 nobatch box.train

and the unicharset file with:

    unicharset_extractor NTS.box

However, when I get to the step that builds the shapetable, I run:

    shapeclustering -F font_properties.txt -U unicharset -O NTS.unicharset NTS.font.exp0.tr
And I receive:

    Reading NTS.font.exp0.tr ...
    Building master shape table
    Computing shape distances... Stopped with 0 merged, min dist 999.000000
    Computing shape distances... Stopped with 0 merged, min dist 999.000000
    Computing shape distances... Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 Stopped with 0 merged, min dist 999.000000
    Computing shape distances... Stopped with 0 merged, min dist 999.000000
    Computing shape distances... Stopped with 0 merged, min dist 999.000000
    Computing shape distances... 0 1 2 3 4 5 6 7 8 9 10 Stopped with 0 merged, min dist 0.061111
    Master shape_table:Number of shapes = 11 max unichars = 1 number with multiple unichars = 0

However, when I examine the shapetable afterwards, I get a blank file that I cannot use for further steps. Does anyone have any advice?
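For anyone trying to reproduce the "blank file" symptom, here is a minimal Python sketch that reports a file's size in bytes and its first few bytes in hex. It only helps tell a truly empty (zero-byte) file apart from one that holds non-text data and merely looks blank in an editor. The file name shapetable and the helper describe_file are assumptions made for illustration; adjust the path to wherever shapeclustering wrote its output.

```python
from pathlib import Path


def describe_file(path):
    """Return (size_in_bytes, first_16_bytes_as_hex) for the file at `path`."""
    data = Path(path).read_bytes()
    return len(data), data[:16].hex(" ")


if __name__ == "__main__":
    # "shapetable" is an assumed output name in the current directory.
    target = Path("shapetable")
    if target.exists():
        size, head = describe_file(target)
        print(f"{target}: {size} bytes, starts with: {head or '(nothing)'}")
    else:
        print(f"{target} not found")
```

A size of 0 would mean nothing was written at all; a non-zero size with unreadable leading bytes would suggest the file is not plain text.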

