Re: [pdf-devel] ASCII85Decode implementation

jemarch Sat, 08 Mar 2008 15:40:10 -0800

   >  We need to introduce a change in the stm module API and in the filters
   >  management: the memory should be allocated by the caller and should be
   >  possible to apply a filter using repeated calls. As in:
   >
   >   /* create a stm and install filters for read... */
   >   /* allocate a 10k buffer... */
   >   /* read 10k and check for eof... */
   >   /* allocate 10k... */
   >   /* read 10k and check for eof... */
   >   ...


   I don't understand this scheme - a stream tells you how long its input
   data segment is.   In the case of an ASCII85 or ASCIIHEX stream, you
   only need to read 5 and 2 bytes at a time to decode; the operating
   system will handle the file buffering.  For the decoded stream,
   ASCII85 and ASCIIHEX have predictable sizes; in fact you always know
   the length of the decoded ASCIIHEX and you can predict the ASCII85
   result to within 4 characters.

The ASCII85 and ASCIIHEX filters have a predictable output size. But
the size of the output of some filters (such as flate-decode) cannot
be determined before applying the filter to the entire data.
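
For the two predictable filters the arithmetic is simple. A minimal
sketch, assuming the encoded data contains no whitespace and, for
ASCII85, no 'z' shortcuts (the function names are made up for
illustration and are not part of the library):

```c
#include <assert.h>
#include <stddef.h>

/* Decoded size of ASCIIHEX data holding n_digits hex digits (the '>'
   end marker is not counted).  An odd trailing digit is treated as if
   it were followed by '0', so it still yields one byte.  */
size_t
asciihex_decoded_size (size_t n_digits)
{
  return (n_digits + 1) / 2;
}

/* Decoded size of ASCII85 data holding n_chars encoded characters
   (the "~>" end marker is not counted).  Assumes no 'z' shortcuts.  */
size_t
ascii85_decoded_size (size_t n_chars)
{
  size_t full = n_chars / 5;   /* complete 5-character groups */
  size_t rest = n_chars % 5;   /* characters in the final partial group */

  assert (rest != 1);          /* a lone trailing character is invalid */
  return full * 4 + (rest ? rest - 1 : 0);
}
```

Nothing comparable can be written for flate-decode, since its output
size depends on the compressed contents themselves.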

The idea is to let the client speak in terms of filtered data: if we
install some filters to decode a stream and ask the stm for 10k, we
are asking for 10k of filtered data.

This way the stm_read function will work much like fread.
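
To make the fread comparison concrete, here is a rough sketch of a
read function that fills a caller-supplied buffer with filtered data.
It is only an illustration: struct toy_stm and toy_stm_read are
hypothetical names, not the real stm API, and the only "filter" is a
simplified ASCIIHEX decoder over an in-memory string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stream: ASCIIHEX-encoded input held in memory.  */
struct toy_stm
{
  const char *src;   /* encoded (unfiltered) data */
  size_t len;        /* number of characters in src */
  size_t pos;        /* next unread encoded character */
};

/* Value of a hex digit, or -1 if c is not a hex digit.  */
static int
hex_value (char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Return the next hex digit, skipping any non-digit characters.
   Return -1 at the '>' end marker or at the end of the input.  */
static int
next_digit (struct toy_stm *stm)
{
  while (stm->pos < stm->len && stm->src[stm->pos] != '>')
    {
      int v = hex_value (stm->src[stm->pos++]);
      if (v >= 0)
        return v;
    }
  return -1;
}

/* Store up to n bytes of filtered (decoded) data in buf and return
   the number of bytes stored.  As with fread, a short count means
   the end of the data was reached.  */
size_t
toy_stm_read (struct toy_stm *stm, unsigned char *buf, size_t n)
{
  size_t out = 0;

  assert (stm != NULL && buf != NULL);
  while (out < n)
    {
      int hi = next_digit (stm);
      int lo;

      if (hi < 0)
        break;                 /* no more encoded data */
      lo = next_digit (stm);
      if (lo < 0)
        lo = 0;                /* odd digit count: assume a trailing 0 */
      buf[out++] = (unsigned char) (hi * 16 + lo);
    }
  return out;
}
```

The caller allocates one buffer of whatever size it likes and calls
toy_stm_read repeatedly until the returned count is smaller than the
count requested, without ever knowing how many encoded characters
were consumed.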

   Except for very large streams such as audio and video, I don't see
   a point in piecemeal memory allocation; that will only result in
   poor performance and a horribly fragmented memory table.

Streams in PDF files can be quite lengthy. Both audio and video
data can be encoded in a PDF stream. Since PDF 1.5 there are also
object streams.

I think that, like fread, stm_read should let the caller manage the
memory used to return the data.

   I also don't understand why the caller should allocate memory when
   in principle the caller should be ignorant of the details of the
   filter behavior.

I don't understand. Asking for 10k of filtered data is quite a good
way to hide the details of the filter behavior: we don't care about
the length of the unfiltered data.
