Hello, I'm new to the great ocropus library. I'm trying to take a PDF file as input and output page segmentation data. So far, the steps I can see are as follows:
1. Convert each page of the PDF file to a 300 DPI PNG (or JPEG, PPM, etc.) image
2. Save the PNG to disk
3. Call iulib::read_image_gray(gray, filename); passing the filename of the PNG
4. Make a binarizer and call binarizer.binarize(bin, gray);
5. Make a page segmenter and call segmenter.segment(out, bin)
6. Use RegionExtractor to find the rectangular regions
7. Save the region data to a sqlite database for later use

The problem is that saving each PNG file to disk and then reading it back takes a lot of time, especially when a PDF has 100+ pages. I'm using Cairo graphics to render the PDF to images, and I'd like to know whether I can save time by passing an in-memory reference to the PNG-encoded data directly to the iulib::read_image method.

I'm new to C and C++ as well, so forgive me if I'm missing something obvious.

Thanks,

Regards,
Mridul
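
One possible way to avoid the disk round trip in steps 2 and 3 is to skip PNG encoding altogether and convert the pixels of the rendered Cairo image surface to grayscale in memory. The sketch below is only an illustration under stated assumptions: the function name rgb24_to_gray is hypothetical, the input is assumed to be laid out like a CAIRO_FORMAT_RGB24 surface (one native-endian 32-bit value per pixel, 0xXXRRGGBB, with a row stride that may include padding), and the page is assumed to have been painted on an opaque white background. Copying the result into an iulib bytearray is not shown, because the exact indexing and row-order convention there is something I have not verified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: convert a 32-bit-per-pixel buffer laid out like
// Cairo's CAIRO_FORMAT_RGB24 (native-endian 0xXXRRGGBB per pixel) into an
// 8-bit grayscale buffer in row-major order, top row first.
// `stride` is the number of bytes per row and may exceed width * 4.
std::vector<std::uint8_t> rgb24_to_gray(const unsigned char *data,
                                        int width, int height, int stride) {
    assert(data != nullptr && width > 0 && height > 0);
    assert(stride >= width * 4);

    std::vector<std::uint8_t> gray(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        const unsigned char *row = data + static_cast<std::size_t>(y) * stride;
        for (int x = 0; x < width; ++x) {
            std::uint32_t px;
            // memcpy avoids any alignment assumptions about the source buffer.
            std::memcpy(&px, row + 4 * static_cast<std::size_t>(x), sizeof px);
            std::uint32_t r = (px >> 16) & 0xFF;
            std::uint32_t g = (px >> 8) & 0xFF;
            std::uint32_t b = px & 0xFF;
            // Integer approximation of 0.299 R + 0.587 G + 0.114 B
            // (the weights 77 + 150 + 29 sum to 256).
            gray[static_cast<std::size_t>(y) * width + x] =
                static_cast<std::uint8_t>((77 * r + 150 * g + 29 * b) >> 8);
        }
    }
    return gray;
}
```

With Cairo, the arguments would presumably come from cairo_image_surface_get_data, cairo_image_surface_get_width, cairo_image_surface_get_height and cairo_image_surface_get_stride, after calling cairo_surface_flush on the surface. The resulting buffer would then still need to be copied into the gray image passed to binarizer.binarize(bin, gray), taking care to match whatever orientation iulib::read_image_gray produces.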
