Can this be done using standard C++ iostreams, rather than creating a new model? What about the Boost stream extensions?
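To make the iostreams question concrete, here is a minimal sketch of how progressive reading could sit behind a standard std::istream by deriving from std::streambuf. Everything named here (ChunkReader, ChunkedStreamBuf, ReadAllThroughStream) is hypothetical and made up for illustration; none of it is existing PoDoFo API, and the callback merely stands in for whatever would actually produce stream data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <istream>
#include <iterator>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

// Hypothetical callback: fills `buf` with up to `len` bytes and returns the
// number of bytes written; returning 0 signals end of stream.
using ChunkReader = std::function<std::size_t(char* buf, std::size_t len)>;

// Read-only streambuf that refills one small buffer on demand, so only
// `chunkSize` bytes of the data are held in memory at any time.
class ChunkedStreamBuf : public std::streambuf {
public:
    explicit ChunkedStreamBuf(ChunkReader reader, std::size_t chunkSize = 4096)
        : m_reader(std::move(reader)), m_buffer(chunkSize) {
        assert(chunkSize > 0);
        // Start with an empty get area so the first read calls underflow().
        char* p = m_buffer.data();
        setg(p, p, p);
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        const std::size_t n = m_reader(m_buffer.data(), m_buffer.size());
        if (n == 0)
            return traits_type::eof();
        setg(m_buffer.data(), m_buffer.data(), m_buffer.data() + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    ChunkReader m_reader;
    std::vector<char> m_buffer;
};

// Demo helper: serves `data` through the streambuf in `chunkSize` pieces and
// collects whatever a plain std::istream reads back.
std::string ReadAllThroughStream(const std::string& data, std::size_t chunkSize) {
    std::size_t pos = 0;
    ChunkReader reader = [&data, &pos](char* buf, std::size_t len) -> std::size_t {
        const std::size_t n = std::min(len, data.size() - pos);
        std::memcpy(buf, data.data() + pos, n);
        pos += n;
        return n;
    };
    ChunkedStreamBuf sb(reader, chunkSize);
    std::istream in(&sb);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A sketch like this only covers sequential reads; random I/O would additionally need seekoff() and seekpos() overrides, which only make sense for unfiltered data.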
Leonard

On Oct 23, 2008, at 3:25 PM, Craig Ringer wrote:
> Hi
>
> Before I start work on this, I just want to check to make sure I'm not
> missing anything obvious. There isn't currently any interface exposed to
> permit users to progressively read a filtered PDF stream or do random
> I/O in an unfiltered stream, is there?
>
> I'd like to provide a PdfInputStream-like interface for PdfStream, so
> that users can read huge streams in small segments. For streams that
> don't have any filters applied (be they file or memory based) it could
> also do random I/O.
>
> This is different from GetFilteredCopy(PdfOutputStream*), in that
> there's no need for the caller to implement a custom PdfOutputStream to
> do whatever work they need to do, and for file streams it doesn't have
> to allocate a temporary copy of the whole stream in RAM in order to
> filter it. It'd also be an easier interface to use for most work,
> especially where you might not even want to decode all the stream.
>
> The main use I have for this is in PoDoFoBrowser, where we really
> shouldn't have to allocate a whole stream in memory and possibly
> allocate another decompressed copy of it if it's flate filtered or
> similar. The same principle will apply to other programs processing big
> PDF streams (say, huge images) though.
>
> I'd like to preserve the existing interfaces in PdfStream, but rewrite
> GetCopy and GetFilteredCopy to use the underlying progressive reading
> interfaces. PdfStream would no longer make any assumption that a stream
> has an "internal buffer" that may be accessed; instead, it'll request
> data from the stream in small chunks and feed those to the output or to
> any required filter. The chunk size can be big enough that the (minimal)
> overhead of the function calls etc for the progressive reading should be
> basically undetectable, and concrete stream implementations can override
> the methods if they have a simpler way to do it anyway.
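For comparison, the chunked copy described in the paragraph above might look roughly like this if plain iostreams stood in for the stream classes. This is only a sketch under that assumption: CopyInChunks, RoundTrip, and the 64 KB default chunk size are made up for illustration and are not PoDoFo API.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Copies everything from `in` to `out` through one fixed-size buffer and
// returns the number of bytes copied. Memory use stays at `chunkSize`
// regardless of how large the source is.
std::size_t CopyInChunks(std::istream& in, std::ostream& out,
                         std::size_t chunkSize = 64 * 1024) {
    assert(chunkSize > 0);
    std::vector<char> chunk(chunkSize);
    std::size_t total = 0;
    while (in) {
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        const std::streamsize got = in.gcount();  // may be short on the last chunk
        if (got <= 0)
            break;
        // A filter stage could transform `chunk` here before it is written.
        out.write(chunk.data(), got);
        total += static_cast<std::size_t>(got);
    }
    return total;
}

// Demo helper: pushes a string through CopyInChunks and returns the result.
std::string RoundTrip(const std::string& data, std::size_t chunkSize) {
    std::istringstream in(data);
    std::ostringstream out;
    CopyInChunks(in, out, chunkSize);
    return out.str();
}
```

With a source that reads on demand, a loop of this shape is what would let a very large stream be written out while holding only one chunk in memory.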
>
> Once I've got the PdfStream interface adjustments done it should be
> possible to do something like extract and write a 100MB image from a PDF
> without using more than a few hundred kb of RAM.
>
> Sound good? If so, the next thing I'll want to do is write a variant on
> PdfFileStream that uses an external temp file instead of a view into the
> original PDF, so it's possible to edit a stream without having to load
> the whole thing into RAM at once. Again, I'm sure you can see uses
> outside the obvious ones in PoDoFoBrowser.
>
> --
> Craig Ringer
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Podofo-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/podofo-users
