Hi Tom and others, I've been enjoying running and poking around in the system the last several weeks. It's a very nice system once you get it all running and figured out :-). I was especially glad to find "ocropus-showpsegs" and "ocropus-showlrecs." Those are very nice and make it very easy to see what's going on.
I've got a few questions, mostly related to the images that OCRopus produces during its processing.

When you do page segmentation, it writes an image named filename.pseg.png. If the original has a very simple layout, then this is a black-and-white image just like the original. If the layout is more complex, the text (and spaces) get colored in black, blue, green, and yellow. Is there any particular meaning to the colors? It looks like green is for titles or headlines, and yellow is for spaces? What's the distinction between black and blue?

Is there any way to output a page segmentation of the sort that you see when you run "ocropus-showpsegs" (i.e., red boxes around the line regions)? I see that you can save that segmentation to a file from within ocropus-showpsegs, but you first have to open it and so on, and the saved image is (I think) fairly low resolution and has an odd-looking background. If it's not currently possible, that would be a nice feature to add.

When running "ocropus-linerec", it creates two images, which the script says should be a raw and an aligned segmentation. For me, these look just like the original line (B&W). Are they supposed to be colored according to the segmentation (as happens in "ocropus-showlrecs")?

Cheers,
Ben

--
Benjamin Lambert
Ph.D. Student of Computer Science
Carnegie Mellon University
www.cs.cmu.edu/~belamber
Mobile: 617-869-1844
