Disk I/O can take a noticeable amount of time, but it is usually only a small fraction of the overall processing time.
If it is really a concern, just set up a RAM disk and read and write the images there. That way you only need a little scripting, mainly to clean up so that the RAM disk doesn't overflow.

Tom

On May 23, 6:37 pm, Mridul Kashatria <[email protected]> wrote:
> Hello,
>
> I'm a newbie to great ocropus library. What I'm doing here is to input a
> PDF file and output page segmentation data. So far I can see the steps
> to do this as follows,
>
> 1. Convert each page of PDF file to 300 DPI PNG (or jpeg, ppm etc) image
> 2. Save the PNG to disk
> 3. Call iulib::read_image_gray(gray, filename); passing filename of PNG
> 4. Make binarizer and call binarizer.binarize(bin, gray);
> 5. Make page segmenter and call segmenter.segment(out, bin)
> 6. Use RegionExtractor to find out the rectangular regions
> 7. Save the region data to a sqlite database for use later
>
> The problem is now that saving each PNG file to disk and then reading it
> from disk takes a lot of time, esp when there are 100+ pages in a PDF.
>
> I'm using Cairo graphics to render PDF to images, and want to know if
> there is a way I can save time by directly passing some in-memory
> reference of PNG encoded data to iulib::read_image method.
>
> I'm a newb to C, C++ as well, so forgive if I'm missing something
> obvious.
>
> Thanks
>
> --
> Regards
>
> Mridul
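
For illustration, the RAM disk suggestion could be sketched roughly as follows. This is only a sketch under assumptions, not how ocropus or iulib does it: it assumes a Linux system where /dev/shm is a RAM-backed tmpfs mount, and the names scratch_dir and ScratchPage are hypothetical helpers made up for this example. The write() method is just a stand-in for whatever the renderer uses to save the PNG.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Pick a scratch directory. ASSUMPTION: /dev/shm is a tmpfs (RAM-backed)
// mount, as on most Linux systems; otherwise use the normal temp directory.
inline fs::path scratch_dir() {
    std::error_code ec;
    const fs::path shm("/dev/shm");
    if (fs::is_directory(shm, ec)) return shm;
    return fs::temp_directory_path();
}

// Hypothetical helper: names one page image file and deletes it when the
// object goes out of scope, so the RAM disk does not fill up on long PDFs.
class ScratchPage {
public:
    ScratchPage(const fs::path& dir, int page_number)
        : path_(dir / ("page_" + std::to_string(page_number) + ".png")) {}

    ~ScratchPage() {
        std::error_code ec;
        fs::remove(path_, ec);  // best-effort cleanup, errors are ignored
    }

    ScratchPage(const ScratchPage&) = delete;
    ScratchPage& operator=(const ScratchPage&) = delete;

    const fs::path& path() const { return path_; }

    // Stand-in for the renderer's "save PNG" step: writes raw bytes.
    bool write(const std::vector<unsigned char>& bytes) const {
        std::ofstream out(path_, std::ios::binary | std::ios::trunc);
        out.write(reinterpret_cast<const char*>(bytes.data()),
                  static_cast<std::streamsize>(bytes.size()));
        out.flush();
        return static_cast<bool>(out);
    }

private:
    fs::path path_;
};
```

In a per-page loop, one would create a ScratchPage(scratch_dir(), n), have the renderer save the PNG to page.path() (with Cairo, presumably via cairo_surface_write_to_png), and then pass the same path string to the iulib::read_image_gray(gray, filename) call from the quoted message. The file disappears at the end of each loop iteration, which takes the place of the cleanup scripting mentioned in the reply.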
