On Sat, 2008-03-15 at 16:03 +0530, Senaka Fernando wrote:
> Hi Manjula,
> 
> Please read my reply inline.
> 
> > Hi Senaka,
> >
> > I am confused here. I think you are taking the discussion back to the
> > beginning, because on the receiving side we read until the end of the
> > stream. Please see my first mail.
> 
> No, I'm not taking the discussion back to the starting point. I'm
> proposing an alternative implementation. Under this proposal we will
> still read till the end of the stream, but we will not buffer everything
> we read into memory. We will flush the buffer to a file once it exceeds a
> threshold. However, when we read beyond the buffer size, we will not copy
> the entire content directly to file without parsing it. Instead we will
> use our fixed-size buffer to hold the content temporarily, parse it, and
> then write it to file. Thus, the file will contain only the binary part.
> It will not contain the "--MIMEBoundary" statements etc. These, along
> with the file name(s), can be stored in the parsed attachment object that
> is created. Thus, the memory consumption will be limited to the size of
> the fixed buffer and we will use the file for storage. This mechanism
> gives us the added plus of not having to worry about re-parsing what is
> written to file, as it has already been parsed once. Please note that
> MIME parsing DOES NOT require us to store the entire content in memory.
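
A rough sketch of the threshold idea described above, written in Python rather than Axis2/C purely for illustration. The name `store_part`, the threshold and the chunk size are made up for this example and are not part of any Axis2 API. It assumes the stream already contains a single isolated part, so boundary detection is left out here.

```python
import io
import os
import tempfile

def store_part(stream, threshold=1024, chunk_size=256):
    """Read one already-isolated part to the end of `stream`.

    Returns ("memory", data) for small parts and ("file", path) once the
    buffered data exceeds `threshold` bytes.
    """
    buffered = bytearray()
    while len(buffered) <= threshold:
        chunk = stream.read(chunk_size)
        if not chunk:
            # Stream ended before the threshold was exceeded.
            return ("memory", bytes(buffered))
        buffered += chunk

    # Threshold exceeded: flush the buffer, then pump the rest chunk by chunk.
    fd, path = tempfile.mkstemp(suffix=".part")
    with os.fdopen(fd, "wb") as out:
        out.write(buffered)
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
    return ("file", path)

# A small part stays in memory.
kind, value = store_part(io.BytesIO(b"x" * 100))
```

In this sketch memory use never goes much beyond `threshold + chunk_size` bytes, whatever the size of the attachment.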

To me this is the same as what Thilina is saying. So again I need to ask
the question: what happens when the MIME boundary is split between two
reads?
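
One common way to cope with a boundary that straddles two reads is to hold back the last `len(boundary) - 1` bytes of what has been read so far and search again after the next read. The sketch below illustrates only that technique, in Python rather than Axis2/C. The function name `copy_until_boundary` and the sample data are hypothetical, and this is not claimed to be how Axis2/Java or Axis2/C does it.

```python
import io

def copy_until_boundary(stream, boundary, sink, chunk_size=8):
    """Copy bytes from `stream` to `sink` up to (not including) `boundary`.

    Returns the bytes already read past the boundary, or None if the
    stream ended without a boundary.
    """
    keep = len(boundary) - 1   # longest boundary prefix that may end a read
    pending = b""              # unflushed tail carried over between reads
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            sink.write(pending)
            return None
        pending += chunk
        pos = pending.find(boundary)
        if pos != -1:
            sink.write(pending[:pos])
            return pending[pos + len(boundary):]
        if len(pending) > keep:
            # Everything except the last `keep` bytes cannot be part of a
            # boundary that is still incomplete, so it is safe to flush.
            cut = len(pending) - keep
            sink.write(pending[:cut])
            pending = pending[cut:]

# With 8-byte reads the boundary below is split across several reads.
data = b"binary-payload\r\n--MIMEBoundary--\r\n"
src, sink = io.BytesIO(data), io.BytesIO()
rest = copy_until_boundary(src, b"\r\n--MIMEBoundary", sink, chunk_size=8)
```

The held-back tail is at most `len(boundary) - 1` bytes, so the memory needed stays at roughly one read buffer plus one boundary length.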

> 
> >
> > When sending, writing part by part to the stream is the same as
> > chunking, because when sending you should either specify a
> > Content-Length or mark the message as chunked.
> 
> No, it is not the same as chunking. What I meant here is that you need not
> read the entire content into memory at once and write it to the stream in
> a single step. Rather, we can read part by part, write each part to the
> stream, and repeat the process until the whole large file is written. Here
> you will still be using the Content-Length. Chunking is a whole different
> story where you can transmit data as blocks. Using chunking we can send
> data of arbitrary length, where the length is not pre-calculated. Now you
> might wonder how we calculate the Content-Length without reading the
> entire content into memory. Well, you can seek through the file and find
> out the size of the content to be written. Add to it the lengths of the
> standard header block and the MIME boundary demarcation strings and you
> will get the Content-Length. This is not at all an expensive operation, as
> the file seek will scan the file as a block without reading it into
> memory. The OS will manage its efficiency.
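
A minimal sketch of the Content-Length calculation described above, again in Python for illustration only. The helper names `content_length` and `write_body`, and the sample header and boundary strings, are assumptions made for this example. Here the size comes from `os.path.getsize` (a stat call) rather than an explicit seek, but in both cases the file content is not read.

```python
import io
import os
import tempfile

def content_length(head, file_path, tail):
    """Total body length: framing bytes plus the on-disk size of the file."""
    # os.path.getsize asks the OS for the size; the file content is not read.
    return len(head) + os.path.getsize(file_path) + len(tail)

def write_body(out, head, file_path, tail, chunk_size=4096):
    """Write head, the file in fixed-size pieces, then tail; return bytes written."""
    out.write(head)
    written = len(head)
    with open(file_path, "rb") as f:
        while True:
            block = f.read(chunk_size)   # only one block is in memory at a time
            if not block:
                break
            out.write(block)
            written += len(block)
    out.write(tail)
    return written + len(tail)

# Illustrative usage with a 10000-byte temporary file as the attachment.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 10000)
head = b"--MIMEBoundary\r\nContent-Type: application/octet-stream\r\n\r\n"
tail = b"\r\n--MIMEBoundary--\r\n"
out = io.BytesIO()
expected = content_length(head, tmp.name, tail)
sent = write_body(out, head, tmp.name, tail)
os.remove(tmp.name)
```

The length is known before the first byte is written, so it can go into the Content-Length header while the body is still sent piece by piece.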

Here also I think it is the same as what Thilina is saying.

Thanks,
-Manjula.

> 
> >
> > -Manjula.
> 
> Regards,
> Senaka
> 
> >
> > On Sat, 2008-03-15 at 13:39 +0530, Senaka Fernando wrote:
> >> >>>  BTW, this whole discussion is about the in path, that is, reading
> >> >>>  an incoming message. How about the out path? We have the same
> >> >>>  problems when sending attachments. Right now, we read the whole
> >> >>>  file into memory and only then send it over the wire.
> >> >> hmm... Why not write it in chunks.. Read a chunk from the file, then
> >> >> write it to the outstream.. Use the size of the file for the
> >> >> content-length calculation in case of non-chunking.. But mostly
> >> >> people will use chunking when using MTOM..
> >> >
> >> > No, chunking is not required. You also don't need to write the entire
> >> > data to be sent to the stream at once, because any HTTP Receiver will
> >> > pull from the stream until it sees a valid ending character sequence.
> >>
> >> It should rather read a length equal to content length. And the
> >> terminating sequence is for headers. Sorry for the confusion. Therefore,
> >> the HTTP Receiver will pull from the stream until it reads a content
> >> length or until an error occurs.
> >>
> >> >
> >> > I believe that you should be able to write part by part to the stream
> >> > and send it, then reuse the buffer and write part 2, and send, and so
> >> > on. This argument can be justified, because on the receiving end we
> >> > must read the multi-part data until we encounter the MIME boundary,
> >> > unlike an ordinary payload where it can be terminated by a valid
> >> > terminating character
> >>
> >> Same here. We'll be reading a length equal to content length.
> >>
> >> > sequence. We'll only have issues if we are to write large SOAP
> >> > payloads, which of course can be dealt with once we've implemented
> >> > Session in Axis2/C.
> >> >
> >> > Regards,
> >> > Senaka
> >> >
> >> >>
> >> >> thanks,
> >> >> Thilina
> >> >>
> >> >>
> >> >>>
> >> >>>  Samisa...
> >> >>>
> >> >>>
> >> >>>
> >> >>>  > Regards,
> >> >>>  > Senaka
> >> >>>  >
> >> >>>  >
> >> >>>  >> Hi,
> >> >>>  >>
> >> >>>  >>>  > In Axis2/Java case we do write the attachment content
> >> >>>  >>>  > directly from the InputStream to the File when the
> >> >>>  >>>  > attachment size is larger than the threshold. This avoids
> >> >>>  >>>  > loading the whole attachment to the memory at all.
> >> >>>  >>>
> >> >>>  >>>  In this case to find out the attachment size don't you need
> >> >>>  >>>  to do any mime parsing? How do you find the attachment size
> >> >>>  >>>  without searching for the mime boundaries?
> >> >>>  >>>
> >> >>>  >> Yes.. MIME is a boundary based packaging mechanism and you do
> >> >>>  >> not need to specify the length for each of the parts... Even
> >> >>>  >> the HTTP content length is not there if the message is chunked.
> >> >>>  >>
> >> >>>  >> What we did in Axis2/Java to overcome this is to read the data
> >> >>>  >> to a byte[] buffer of up to a certain size (the size
> >> >>>  >> threshold). If there is more data available in the mime part
> >> >>>  >> (if we have not encountered the boundary yet) then we know this
> >> >>>  >> attachment is bigger than the threshold. So we create the temp
> >> >>>  >> file, pump the content in the buffer to the file, then pump the
> >> >>>  >> rest of the stream to the file.. In this way we do not need to
> >> >>>  >> know the size of the attachment upfront.. BTW we do all of the
> >> >>>  >> above while we are parsing the MIME message at the MIME parser
> >> >>>  >> level..
> >> >>>  >>
> >> >>>  >>
> >> >>>  >>>  > This has the plus point that the attachment size will be
> >> >>>  >>>  > limited only by the available free space in the Temp
> >> >>>  >>>  > Directory.. Will that be possible in Axis2/C.. Or is that
> >> >>>  >>>  > what you have in mind :)..
> >> >>>  >>>
> >> >>>  >>>  Yes this is possible.
> >> >>>  >>>
> >> >>>  >> But in Axis2/Java we will get an OutOfMemory if we parse a
> >> >>>  >> large MIME part upfront, since it reads the attachment to
> >> >>>  >> memory. Maybe you can have a larger limit with C than in Java,
> >> >>>  >> but ultimately you'll come to a situation where you will not
> >> >>>  >> have enough memory to store that MIME part in memory at parsing
> >> >>>  >> time, unless you write it to a file while parsing..
> >> >>>  >>
> >> >>>  >> thanks,
> >> >>>  >> Thilina
> >> >>>  >>
> >> >>>  >>
> >> >>>  >>>
> >> >>>  >>>  >
> >> >>>  >>>  > thanks,
> >> >>>  >>>  > Thilina
> >> >>>  >>>  >
> >> >>>  >>>  >  >and keeping the file name inside
> >> >>>  >>>  > >  data_handler instead of the whole buffer. So the service
> >> >>>  >>>  > >  or the client will get the file name instead of the
> >> >>>  >>>  > >  buffered stream when it receives an attachment. This will
> >> >>>  >>>  > >  not prevent buffering the attachment at the transport, but
> >> >>>  >>>  > >  will prevent keeping it inside the om_tree till it reaches
> >> >>>  >>>  > >  the receiver.
> >> >>>  >>>  > >
> >> >>>  >>>  > >  Before implementing this I would like to know your
> >> >>>  >>>  > >  suggestions regarding this.
> >> >>>  >>>  > >
> >> >>>  >>>  > >  [1] https://issues.apache.org/jira/browse/AXIS2C-672
> >> >>>  >>>  > >
> >> >>>  >>>  > >  Thanks,
> >> >>>  >>>  > >  -Manjula
> >> >>>  >>>  > >
> >> >>>  >>>  > >  --
> >> >>>  >>>  > >  Manjula Peiris: http://manjula-peiris.blogspot.com/
> >> >>>  >>>  > >
> >> >>>  >>>  > >
> >> >>>  >>>  > >  
> >> >>> ---------------------------------------------------------------------
> >> >>>  >>>  > >  To unsubscribe, e-mail:
> >> [EMAIL PROTECTED]
> >> >>>  >>>  > >  For additional commands, e-mail:
> >> >>> [EMAIL PROTECTED]
> >> >>>  >>>  > >
> >> >>>  >>>  > >
> >> >>>  >>>  >
> >> >>>  >>>  >
> >> >>>  >>>  >
> >> >>>  >>>
> >> >>>  >>>
> >> >>>  >>>
> >> >>>  >>>
> >> >>>  >>>
> >> >>>  >>
> >> >>>  >> --
> >> >>>  >> Thilina Gunarathne - http://thilinag.blogspot.com
> >> >>>  >>
> >> >>>  >>
> >> >>>  >>
> >> >>>  >>
> >> >>>  >
> >> >>>  >
> >> >>>  >
> >> >>>  >
> >> >>>  >
> >> >>>  >
> >> >>>
> >> >>>
> >> >>>  --
> >> >>>  Samisa Abeysinghe
> >> >>>  Software Architect; WSO2 Inc.
> >> >>>
> >> >>>  http://www.wso2.com/ - "Oxygenating the Web Service Platform."
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Thilina Gunarathne - http://thilinag.blogspot.com
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >
> >
> 
> 
> 


