Hi Samisa,

IIRC, this discussion is about handling attachments and thus does not relate to caching. Though $subject says "Caching", what was actually discussed was a mechanism to buffer the attachment in a file rather than in memory, and that buffering has nothing to do with caching, which is a totally different concept, as in [1].
The previous mail I sent was a reply to Manjula's concern about handling the scenario where the MIME boundary is split across two reads. Unlike in the previous scenarios, the block that has been read will be flushed to a file instead of being kept in memory, so the parsing may have to be rethought. Sorry if it confused you. IMHO, writing a partially parsed buffer to a file is not that efficient, as we will have to parse it again later to discover the MIME boundaries and extract the attachments. Thus, I still believe that buffering to a file in real time, while parsing, is the better choice. To implement that, we will have to modify our mime_parser.c, and probably the data_handler implementation. Or, if not, am I misunderstanding $subject?

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html

Regards,
Senaka

> Senaka Fernando wrote:
>>> Hi Manjula, Thilina and others,
>>>
>>> Yep, I think I have exactly the same viewpoint as Thilina when it
>>> comes to handling attachment data, well, for the chunking part. I
>>> think I didn't get Thilina right in his first e-mail.
>>>
>>> However, a file per MIME part may not always be optimal. I would say
>>> rather that each file should have a fixed maximum size, and if that
>>> is exceeded you can divide it into two. Also, the user should always
>>> be given the option to choose between Thilina's method and this
>>> method through axis2.xml (or services.xml). Thus, a user can
>>> fine-tune memory use.
>>>
>>> When it comes to base64-encoded binary data, you can use a mechanism
>>> where the buffer always has a size that is a multiple of 4; then,
>>> when you flush, you decode the buffer and copy the result to the
>>> file, so caching should look essentially the same to the user.
>>>
>>> OK, so Manjula, you mean when the MIME boundary appears partially in
>>> the first read and partially in the second?
>>>
>>> Well, this is probably the best solution.
>>>
>>> You will allocate a buffer twice the size of the MIME boundary, and
>>> in your very first read, you will read two times the boundary size.
>>> Then you will search for the MIME boundary. Next, you will do a
>>> memmove() and move the contents of the buffer from the midpoint to
>>> the end down to the beginning of the buffer. After doing this, you
>>> will read a size equivalent to half the buffer (which, again, is the
>>> size of the MIME boundary marker) and store it from the midpoint of
>>> the buffer to the end. Then you will search again. You iterate this
>>> procedure until you read less than half the size of the buffer.
>>>
>>
>> If you are interested further in this mechanism, I used this approach
>> for resending binary data with TCPMon. You may check that as well.
>>
>> Also, strstr() has issues when there is a '\0' in the middle of the
>> data. Thus, you will have to use a temporary search marker and use
>> that in the process. Before calling strstr(), check whether
>> strlen(temp) is greater than or equal to the length of the MIME
>> boundary marker. If it is greater, you only need to search once. If
>> it is equal, you will need to search exactly twice. If it is less,
>> you increment temp by strlen(temp) and repeat until you cross the
>> midpoint. This makes the search more efficient.
>>
>> To make the search more efficient still, you can make the buffer size
>> one less than the size of the MIME boundary marker, so that in the
>> "equal" scenario you only have to search once.
>>
>> The assumption here is that strstr() and strlen() behave the same way
>> in a given implementation: on Windows, if strlen() is multibyte
>> aware, so is strstr(). So, no worries.
>>
>
> We have an efficient parsing mechanism already, tested and proven to
> work, with 1.3. Why on earth are we discussing this over and over
> again?
>
> Does caching get affected by the mime parser logic? IMHO no.
> They are two separate concerns, so why are we wasting time discussing
> parsing while the problem at hand is not parsing but caching?
>
> Writing the partially parsed buffer was a solution to caching. Do we
> have any other alternatives? If so, what, in short, are they?
>
> Samisa...
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
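P.S. For what it's worth, the sliding-window search described above can be sketched roughly as follows. This is only an illustration under my own assumptions, not the actual mime_parser.c logic; find_bytes() and find_boundary() are hypothetical names. A memcmp()-based helper is used instead of strstr() precisely because the attachment data may contain '\0' bytes:

```c
#include <stdio.h>
#include <string.h>

/* NUL-safe substring search. strstr() stops at the first '\0', and
   memmem() is a GNU extension, so a small portable helper is used. */
static const char *find_bytes(const char *hay, size_t hay_len,
                              const char *needle, size_t needle_len)
{
    size_t i;
    if (needle_len == 0 || hay_len < needle_len)
        return NULL;
    for (i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

/* Sliding-window search for a boundary that may straddle two reads:
   keep a window of twice the boundary size; after each failed search,
   shift the upper half down with memmove() and refill the upper half.
   Returns 1 if the boundary was found, 0 otherwise. */
static int find_boundary(FILE *in, const char *boundary)
{
    size_t blen = strlen(boundary);
    char buf[2 * 70];            /* RFC 2046 caps boundaries at 70 chars */
    size_t filled;

    if (blen == 0 || 2 * blen > sizeof(buf))
        return 0;

    /* Prime the window with two boundary-lengths of data. */
    filled = fread(buf, 1, 2 * blen, in);
    for (;;) {
        if (find_bytes(buf, filled, boundary, blen))
            return 1;
        if (filled < 2 * blen)   /* short read: stream exhausted */
            return 0;
        /* Shift the second half down, then refill the upper half. */
        memmove(buf, buf + blen, blen);
        filled = blen + fread(buf + blen, 1, blen, in);
    }
}
```

In a real parser the consumed data would of course be flushed to the cache file as the window slides, rather than discarded.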

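P.P.S. The multiple-of-4 base64 buffering idea could look something like the sketch below (again with hypothetical names, not actual Axis2/C code). Keeping the buffered length a multiple of 4 means every flush decodes only whole quanta, so no partially decoded state has to be carried across flushes:

```c
#include <stdio.h>
#include <string.h>

/* Decode one 4-character base64 quantum into up to 3 raw bytes.
   Returns the number of bytes produced, or -1 on bad input. */
static int b64_decode_quantum(const char in[4], unsigned char out[3])
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    int v[4], i, pad = 0;
    for (i = 0; i < 4; i++) {
        const char *p;
        if (in[i] == '=') { v[i] = 0; pad++; continue; }
        p = in[i] ? strchr(tbl, in[i]) : NULL;
        if (!p)
            return -1;
        v[i] = (int)(p - tbl);
    }
    out[0] = (unsigned char)((v[0] << 2) | (v[1] >> 4));
    out[1] = (unsigned char)(((v[1] & 0x0F) << 4) | (v[2] >> 2));
    out[2] = (unsigned char)(((v[2] & 0x03) << 6) | v[3]);
    return 3 - pad;
}

/* Flush a buffer whose length is a multiple of 4: decode quantum by
   quantum and append the raw bytes to the cache file. Returns 0 on
   success, -1 on a decode or write error. */
static int b64_flush(const char *buf, size_t len, FILE *out)
{
    size_t i;
    for (i = 0; i + 4 <= len; i += 4) {
        unsigned char raw[3];
        int n = b64_decode_quantum(buf + i, raw);
        if (n < 0 || fwrite(raw, 1, (size_t)n, out) != (size_t)n)
            return -1;
    }
    return 0;
}
```

For example, flushing the 8-character buffer "TWFuTWE=" writes the 5 raw bytes "ManMa" to the file, exactly as if the whole stream had been decoded in memory first.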