Re: [twsocket] Adding gzip to HttpCli
On 30-Jul-05 01:54:44 Maurizio Lotauro wrote:
> On 28-Jul-05 08:34:58 Francois Piette wrote:
>>> Do you want that the data passed to OnDocData is decompressed?
>> Yes I do.
> Are you really sure? Ok Ok, I'll try to do it :-)

Done. Now I have some points that I would like to discuss.

a) Exception in the THttpContentCoding.GetCoding method

This method is called indirectly during initialization, which does not seem to be the best moment to raise an exception. When run from Delphi with "Stop on Delphi exception" disabled, the developer sees only "Internal error 217 on ...", which says little about what the problem is. For the same reason it is useless to have a specific exception class. When run outside Delphi, the user sees "This application has encountered bla bla bla, do you want to send a report bla bla bla".

I tried moving the check into THttpContCodHandler.Create. In this case, when run from Delphi, the developer sees which specific exception is raised. Outside Delphi the behaviour and the message are the same: "This application ...".

I would prefer the first approach because the error is raised as soon as the application runs, while with the second it is raised only when the form or datamodule that contains the component is created. But the error message will disorient the developer if he has "Stop on Delphi exception" disabled. Opinions?

b) New properties

We need at least two new properties: one to disable the automatic use of content coding and another to enable the use of the quality specifier. I suggest using a record type to group all properties related to content coding. The property could be ContentCoding, with the fields Enabled (default false) and UseQuality (default false). If it is not enabled, the component will not add Accept-Encoding to the header. Should it then also ignore Content-Encoding?

c) THttpContCodHandler.Prepare returns false if there is an encoding that it is unable to decode. Currently HttpCli doesn't check the result, and in this case the body will not be decompressed at all. Is this acceptable, or should this situation be handled differently?

d) Two codings are added automatically: identity (quality=0.5) and * (quality=0). Currently they are enabled by default; should they be disabled? Is the default quality value OK?

That's all for the moment.

Bye, Maurizio.

--
To unsubscribe or change your settings for TWSocket mailing list please goto http://www.elists.org/mailman/listinfo/twsocket
Visit our website at http://www.overbyte.be
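A minimal sketch of points b) and d), written in Python rather than Delphi and not taken from the ICS source: it shows how an Accept-Encoding value could be composed from a list of codings depending on the proposed Enabled and UseQuality switches. The function name, its parameters and the quality of 1.0 for gzip are assumptions made for illustration; only the identity (0.5) and * (0) values come from the message above.

```python
def build_accept_encoding(codings, enabled=True, use_quality=False):
    """Return the Accept-Encoding value, or None when content coding is disabled.

    codings: list of (name, quality) pairs in preference order (hypothetical shape).
    """
    if not enabled:
        return None  # no header at all, as proposed for Enabled = false
    parts = []
    for name, quality in codings:
        if use_quality:
            # ":g" prints 1.0 as "1", 0.5 as "0.5" and 0.0 as "0"
            parts.append(f"{name};q={quality:g}")
        elif quality > 0:
            # Without q-values a quality of 0 ("not acceptable") cannot be
            # expressed, so such a coding is simply left out.
            parts.append(name)
    return ", ".join(parts)


# Assumed registration: gzip plus the two automatic entries from point d).
CODINGS = [("gzip", 1.0), ("identity", 0.5), ("*", 0.0)]

with_q = build_accept_encoding(CODINGS, use_quality=True)
without_q = build_accept_encoding(CODINGS)
```

With UseQuality on, the "*;q=0" entry tells the server that any coding not listed is unacceptable; with it off, that restriction cannot be stated, which is one reason the defaults in d) deserve a decision.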
Re: [twsocket] Adding gzip to HttpCli
> Xavier, have you any reason to assign to FLastResponse?

No important reason.

Xavier
Re: [twsocket] Adding gzip to HttpCli
>> I think I said that a choice has to be made about RcvdCount and that I have no definitive answer. The idea is to break as little existing code as possible. RcvdCount is used for progress bar updating, and for that it should be the compressed byte count. It is also used to allocate storage (or similar uses) for data, and for that it should be the decompressed byte count. Maybe for simplicity we should let RcvdCount be the compressed byte count? Really a question, the debate is open!

> I doubt that RcvdCount could be used to allocate storage. The body data will be put into RcvdStream, which is a stream, and normally a stream is able to allocate the storage itself.

RcvdStream is not the only way to get data from the component. Some (many?) applications use the OnDocData event to get data on the fly.

> It could be useful if the size of the body is known in advance so the whole storage is allocated in one step, but if anything you have this information only from the content length and not from RcvdCount. And the content length is not always specified, not to mention the case when it is wrong.

If it is specified but wrong, the component will hang, unless the server closes the connection before the specified length is reached.

> After all these considerations my conclusions are:
> - RcvdCount contains the count of bytes received from the server
> - ContentLength contains what is specified in the header

Agreed.

--
[EMAIL PROTECTED]
http://www.overbyte.be
Re: [twsocket] Adding gzip to HttpCli
Francois Piette [EMAIL PROTECTED] writes:

[...]

>> I doubt that RcvdCount could be used to allocate storage. The body data will be put into RcvdStream, which is a stream, and normally a stream is able to allocate the storage itself.

> RcvdStream is not the only way to get data from the component. Some (many?) applications use the OnDocData event to get data on the fly.

In that case, should the decompression be done completely by the application, or should we add some support for this?

[...]

>> After all these considerations my conclusions are:
>> - RcvdCount contains the count of bytes received from the server
>> - ContentLength contains what is specified in the header

> Agreed.

This is how it is implemented.

Bye, Maurizio.
Re: [twsocket] Adding gzip to HttpCli
>>> I doubt that RcvdCount could be used to allocate storage. The body data will be put into RcvdStream, which is a stream, and normally a stream is able to allocate the storage itself.

>> RcvdStream is not the only way to get data from the component. Some (many?) applications use the OnDocData event to get data on the fly.

> In that case, should the decompression be done completely by the application, or should we add some support for this?

If decompression is done on the fly (streaming, we already talked about this), then nothing special is required.

--
[EMAIL PROTECTED]
http://www.overbyte.be
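The "nothing special is required" remark can be illustrated with a small sketch, in Python rather than Delphi and not ICS code: each network chunk is pushed through a streaming gzip decoder, and whatever comes out is handed to a callback that plays the role of OnDocData. The names receive_gzip and on_doc_data are hypothetical; zlib.decompressobj is the standard Python API.

```python
import gzip
import zlib


def receive_gzip(chunks, on_doc_data):
    """Decode a gzip body chunk by chunk, passing decoded data to on_doc_data."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:  # a compressed chunk does not always yield decoded output
            on_doc_data(data)
    tail = decoder.flush()
    if tail:
        on_doc_data(tail)


body = b"hello, twsocket! " * 1000
wire = gzip.compress(body)
# Simulate network reads of 64 bytes each.
pieces = [wire[i:i + 64] for i in range(0, len(wire), 64)]
received = []
receive_gzip(pieces, received.append)
```

The application-side callback only ever sees decoded bytes, so it does not need to know whether the server compressed the body.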
Re: [twsocket] Adding gzip to HttpCli
Francois PIETTE [EMAIL PROTECTED] writes:

>>> RcvdStream is not the only way to get data from the component. Some (many?) applications use the OnDocData event to get data on the fly.

>> In that case, should the decompression be done completely by the application, or should we add some support for this?

> If decompression is done on the fly (streaming, we already talked about this), then nothing special is required.

The decompression will be done only if RcvdStream is assigned.

Bye, Maurizio.
Re: [twsocket] Adding gzip to HttpCli
On 26-Jul-05 20:37:54 Xavier Le Bris wrote:

> Hello,

Hello,

[...]

> FGzTime is necessary, not for me, but for anyone who wants to choose the best compression level/time ratio.

I think that this is a job for a test application, not for the component. You can use OnBeforeHeaderSend and OnDocEnd.

> http://www.pipeboost.com/ is not mine; but you must use HTTP 1.1 to have compression.

It works using HTTP 1.0 for me :-)

Bye, Maurizio.
Re: [twsocket] Adding gzip to HttpCli
On 26-Jul-05 17:47:41 Francois PIETTE wrote:

[...]

>>> We already talked about RcvdCount.

>> Yes, but we still haven't reached a conclusion :-)

> I think I said that a choice has to be made about RcvdCount and that I have no definitive answer. The idea is to break as little existing code as possible. RcvdCount is used for progress bar updating, and for that it should be the compressed byte count. It is also used to allocate storage (or similar uses) for data, and for that it should be the decompressed byte count. Maybe for simplicity we should let RcvdCount be the compressed byte count? Really a question, the debate is open!

I doubt that RcvdCount could be used to allocate storage. The body data will be put into RcvdStream, which is a stream, and normally a stream is able to allocate the storage itself.

It could be useful if the size of the body is known in advance so the whole storage is allocated in one step, but if anything you have this information only from the content length and not from RcvdCount. And the content length is not always specified, not to mention the case when it is wrong.

After all these considerations my conclusions are:
- RcvdCount contains the count of bytes received from the server
- ContentLength contains what is specified in the header

>> Similar question for ContentLength: should it contain what is specified in the header or the effective final length of the body?

> The final length will not be known before everything is decompressed.

Exactly.

> ContentLength, IMO, should stay in sync with what the header says. A new property could be added to hold the decompressed content length. Not that useful, though, as the user has RcvdStream.Size or can accumulate what he receives in OnDocData.

You read my mind :-)

Bye, Maurizio.
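The conclusion reached here (RcvdCount counts bytes as received from the server, ContentLength mirrors the header, and the decoded size is simply the size of the receiving stream) can be sketched as follows. This is Python, not Delphi, and the Receiver class with its attribute names is a hypothetical stand-in for the component, not the ICS implementation.

```python
import gzip
import io
import zlib


class Receiver:
    """Hypothetical stand-in for the component's receive path."""

    def __init__(self, content_length=None):
        self.content_length = content_length  # as stated in the header, may be None
        self.rcvd_count = 0                   # bytes exactly as received (compressed)
        self.rcvd_stream = io.BytesIO()       # holds only the decoded body
        self._decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)  # gzip framing

    def feed(self, chunk):
        self.rcvd_count += len(chunk)
        self.rcvd_stream.write(self._decoder.decompress(chunk))

    def finish(self):
        self.rcvd_stream.write(self._decoder.flush())

    def progress(self):
        """Fraction received, or None when the header gave no length."""
        if not self.content_length:
            return None
        return self.rcvd_count / self.content_length


body = bytes(range(256)) * 40
wire = gzip.compress(body)
r = Receiver(content_length=len(wire))
half = len(wire) // 2
r.feed(wire[:half])
r.feed(wire[half:])
r.finish()
decoded_size = len(r.rcvd_stream.getvalue())  # plays the role of RcvdStream.Size
```

Because both rcvd_count and content_length refer to bytes on the wire, a progress bar stays consistent, while the decoded length remains available from the stream without a new property.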
Re: [twsocket] Adding gzip to HttpCli
On 24-Jul-05 12:01:55 Xavier Le Bris wrote:

> Hello,
> A contribution to compression with gzip and some answers to Maurizio's questions:
> It is a good idea to put the whole stuff in another unit.

> a/ The site 'http://www.pipeboost.com/' works today for me in T_HttpCliGz. If not, check whether zlib.dll is loaded correctly.

This URL answers with the gzip encoding, but there is an error in your code. In xZlibDll, in the ZLibLoadDll procedure, if the first LoadLibrary fails you try different DLL names, but you do not reassign the ZLibDllHandle variable.

Another little change that I made in the same unit is to move the Windows unit from the uses clause of the implementation section to that of the interface section. I'm using Delphi 5 and it is unable to find the pByte declaration.

Is the URL yours? I noticed that if the HTTP version is 1.0, the server's answer contains the header Connection: keep-alive twice (the second has the first letters capitalized).

One last observation: I tried to compile your HttpGzAsy project, but I get an error because the xChrono unit is missing.

> b/ FGzTime is used to know the time of reception and decompression. It can help to choose a good compression level on the server side. For example, I first chose level 6 for one of my sites. After tests, level 1 was better...

OK, then it is not necessary for normal use. I have not carried it over.

> c/ I chose to decompress at the end of reception because, in my xGzipStream unit, the decompression is done for the whole stream after reading the number of uncompressed bytes. This value is at the end of the stream. We may decompress on the fly, but the units have to be rewritten. With the RAM available today, I don't think it is necessary.

The whole thing is made in such a way that it is the choice of the developer of the coding class whether it is handled on the fly or at the end.

> d/ I think that RcvdStream can contain the compressed data during the reception. So RcvdCount can be used for a progress bar. At the end of reception I consider, according to François's message of 09/05/2005 (the user does not have to know whether the received data were compressed or not), that RcvdStream and RcvdCount refer to the uncompressed data. I don't see why we have to keep the uncompressed data.

I imagine you mean compressed data.

> Only my variable GzContentLength keeps the compressed length.

I don't think that using RcvdStream for receiving the compressed data is a good idea, because the cleanup will be more delicate. Moreover, it is possible that more than one encoding is applied to the body, and in this case the situation will be even worse, not to mention the case of on-the-fly decoding. In my proposal RcvdStream will contain only the uncompressed data.

As for RcvdCount, I think it is better if it still contains the bytes actually received. Changing it could mess up existing code.

Similar question for ContentLength: should it contain what is specified in the header or the effective final length of the body? Francois?

> e/ FRequestVer is set to 1.1 because some sites don't compress if 1.0 is sent.

OK, but it is not necessary to limit it to version 1.1. In that case the server simply will not compress the data.

> f/ In my code, I have a GzAcceptEncoding boolean to disable compression if necessary. It would be nice to keep it in case of problems with some sites.

It is still possible to enable/disable each coding. Use HttpCli.ContentCodingHnd.CodingEnabled[name of coding].

> Sorry, I have no time today to answer the other questions.

No problem :-)

Bye, Maurizio.
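Point c/ relies on the uncompressed byte count stored at the end of a gzip stream. In the gzip format (RFC 1952) the trailer is a CRC-32 followed by ISIZE, the uncompressed size modulo 2^32, both little-endian 32-bit values. A short Python sketch (not Delphi, and not Xavier's xGzipStream code; the helper name gzip_isize is hypothetical) reads it like this:

```python
import gzip
import struct


def gzip_isize(data):
    """Return ISIZE, the last 4 bytes of a gzip member, as an unsigned integer."""
    (size,) = struct.unpack("<I", data[-4:])
    return size


body = b"abc" * 5000
packed = gzip.compress(body)
size_from_trailer = gzip_isize(packed)
```

The value is only known once the last bytes have arrived, and it wraps for bodies of 4 GiB or more, so it can size a buffer for end-of-reception decoding but is of no help for on-the-fly decoding.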
Re: [twsocket] Adding gzip to HttpCli
> So we should consider both. But there is a potential problem. The RFC states that more than one encoding can be applied to the body. We can then have a mixed situation.

Two compressions applied one after the other, or one compression for a part of the document and another one for the remaining part? In either case, I don't understand how it could work, except if the document is in MIME format and compression applies to MIME parts. That is a completely different problem.

> BTW Xavier's proposal does the decoding at the end. I think that initially decoding the whole thing will be easier to implement, and later we can add the streaming capability.

I think the code would be completely different. It's writing it twice. Streaming is very interesting, for example when downloading images or sound, to start displaying or playing before the end of the document has arrived. And this also applies to basic document types, where display can be done as the document is coming in.

--
[EMAIL PROTECTED]
http://www.overbyte.be

----- Original Message -----
From: Maurizio Lotauro [EMAIL PROTECTED]
To: ICS support mailing twsocket@elists.org
Sent: Friday, July 15, 2005 1:55 AM
Subject: Re: [twsocket] Adding gzip to HttpCli

Before starting: I got your answer from the archive of this ML because it seems that my ISP has lost a whole day of emails :(

On 12-Jul-05 02:41:45 Francois PIETTE wrote:

>>>> a) where is the best place to decode the received stream? Xavier does this in GetBodyLineNext when the end of the document is reached.

>>> I think decoding should be done on the fly.

>> What do you mean? Should data be decoded while it is being received, i.e. without first having the whole body?

> Yes, indeed. Most compressions, including gzio, are streaming type, so it is easy to decode them on the fly. Of course, if some compression is not streamable, then receiving the whole thing and then decompressing is OK.

So we should consider both. But there is a potential problem. The RFC states that more than one encoding can be applied to the body. We can then have a mixed situation.

BTW Xavier's proposal does the decoding at the end. I think that initially decoding the whole thing will be easier to implement, and later we can add the streaming capability.

[...]

>>> The coded version should remain purely internal to the component. Should that coded version be of some interest (but why?), then maybe an event or a new property could give access to it.

>> The only reason that I see is for debugging, in particular of the decoding routines.

> In the component, where received data is written to RcvdStream, it must instead be written to the decompression stream; the decoding stream is then read to get decompressed data (if any: you don't always get decompressed data each time you write compressed data), which is written to RcvdStream where it is available to the application. RcvdCount, used for determining the end of the document, must represent the compressed byte count. Maybe the component has to use another variable for that purpose (determining when the document is complete) and still expose the decompressed byte count in the RcvdCount property. Doing so will make decompression completely [well, mostly] transparent to applications.

> I see a potential problem: RcvdCount is used by applications to update a progress bar, so it must be the compressed byte count. But other applications use it to know how many bytes to process, so it must be the decompressed byte count. We can't have both! So a choice must be made. I don't know what the best is. Any opinion?

I have never used RcvdCount, so one choice or the other is the same for me. But I see a bigger problem. FRcvdStream and FRcvdCount are declared protected (like everything not declared public or published), so it is possible that components inherited from THttpCli have used these variables.

Bye, Maurizio.
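On the "mixed situation" question: RFC 2616 (section 14.11) says that when several content codings are applied to an entity they must be listed in Content-Encoding in the order in which they were applied, so the codings wrap the whole body one after the other and decoding runs in reverse order. A minimal Python sketch of that reading follows; it is not Delphi and not ICS code, and the DECODERS table and decode_body function are hypothetical names.

```python
import gzip
import zlib

# Hypothetical decoder table keyed by content-coding name.
DECODERS = {
    "gzip": gzip.decompress,
    "deflate": zlib.decompress,  # the HTTP "deflate" coding is the zlib format
    "identity": lambda data: data,
}


def decode_body(body, content_encoding):
    """Undo the codings listed in a Content-Encoding value, last applied first."""
    names = [n.strip().lower() for n in content_encoding.split(",") if n.strip()]
    for name in reversed(names):
        body = DECODERS[name](body)  # KeyError means an unsupported coding
    return body


original = b"<html>hello</html>" * 100
wire = gzip.compress(zlib.compress(original))  # deflate first, then gzip
decoded = decode_body(wire, "deflate, gzip")
```

This whole-body version matches the "decode at the end" approach; a streaming variant would chain one incremental decoder per coding in the same reverse order.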
Re: [twsocket] Adding gzip to HttpCli
Before starting: I got your answer from the archive of this ML because it seems that my ISP has lost a whole day of emails :(

On 12-Jul-05 02:41:45 Francois PIETTE wrote:

>>>> a) where is the best place to decode the received stream? Xavier does this in GetBodyLineNext when the end of the document is reached.

>>> I think decoding should be done on the fly.

>> What do you mean? Should data be decoded while it is being received, i.e. without first having the whole body?

> Yes, indeed. Most compressions, including gzio, are streaming type, so it is easy to decode them on the fly. Of course, if some compression is not streamable, then receiving the whole thing and then decompressing is OK.

So we should consider both. But there is a potential problem. The RFC states that more than one encoding can be applied to the body. We can then have a mixed situation.

BTW Xavier's proposal does the decoding at the end. I think that initially decoding the whole thing will be easier to implement, and later we can add the streaming capability.

[...]

>>> The coded version should remain purely internal to the component. Should that coded version be of some interest (but why?), then maybe an event or a new property could give access to it.

>> The only reason that I see is for debugging, in particular of the decoding routines.

> In the component, where received data is written to RcvdStream, it must instead be written to the decompression stream; the decoding stream is then read to get decompressed data (if any: you don't always get decompressed data each time you write compressed data), which is written to RcvdStream where it is available to the application. RcvdCount, used for determining the end of the document, must represent the compressed byte count. Maybe the component has to use another variable for that purpose (determining when the document is complete) and still expose the decompressed byte count in the RcvdCount property. Doing so will make decompression completely [well, mostly] transparent to applications.

> I see a potential problem: RcvdCount is used by applications to update a progress bar, so it must be the compressed byte count. But other applications use it to know how many bytes to process, so it must be the decompressed byte count. We can't have both! So a choice must be made. I don't know what the best is. Any opinion?

I have never used RcvdCount, so one choice or the other is the same for me. But I see a bigger problem. FRcvdStream and FRcvdCount are declared protected (like everything not declared public or published), so it is possible that components inherited from THttpCli have used these variables.

Bye, Maurizio.
Re: [twsocket] Adding gzip to HttpCli
>>> a) where is the best place to decode the received stream? Xavier does this in GetBodyLineNext when the end of the document is reached.

>> I think decoding should be done on the fly.

> What do you mean? Should data be decoded while it is being received, i.e. without first having the whole body?

Yes, indeed. Most compressions, including gzio, are streaming type, so it is easy to decode them on the fly. Of course, if some compression is not streamable, then receiving the whole thing and then decompressing is OK.

>>> b) must RcvdStream contain what was actually received from the server or the decoded version? In the second case, should what was received be dropped or kept somewhere?

>> RcvdStream should contain the decoded version.

> This has some impact, for example if the Cleanup routine is called. We should proceed very carefully. In the meantime I made the changes that I mentioned. It is not finished, because we should discuss and decide about RcvdStream. The possibilities that I see are the following.
>
> a) RcvdStream will initially receive the encoded body
> pro:
> - the code that handles the receiving will remain as is
> cons:
> - the encoded part must be removed from the stream before replacing it with the decoded version
> - the cleanup procedure will be more delicate (in particular if the stream was not empty at the start) because RcvdCount will change after the decoding and because the decoding could fail
>
> b) RcvdStream will receive only the decoded body
> pro:
> - if needed, the cleanup is easier and probably safer
> cons:
> - the code that handles the receiving must be reworked, probably using an internal variable that points to FRcvdStream (if no decoding is needed) or to the stream that will be decoded
> - RcvdCount can have some side effects on existing applications

>> The coded version should remain purely internal to the component. Should that coded version be of some interest (but why?), then maybe an event or a new property could give access to it.

> The only reason that I see is for debugging, in particular of the decoding routines.

In the component, where received data is written to RcvdStream, it must instead be written to the decompression stream; the decoding stream is then read to get decompressed data (if any: you don't always get decompressed data each time you write compressed data), which is written to RcvdStream where it is available to the application. RcvdCount, used for determining the end of the document, must represent the compressed byte count. Maybe the component has to use another variable for that purpose (determining when the document is complete) and still expose the decompressed byte count in the RcvdCount property. Doing so will make decompression completely [well, mostly] transparent to applications.

I see a potential problem: RcvdCount is used by applications to update a progress bar, so it must be the compressed byte count. But other applications use it to know how many bytes to process, so it must be the decompressed byte count. We can't have both! So a choice must be made. I don't know what the best is. Any opinion?

btw: Xavier, are you listening?

--
[EMAIL PROTECTED]
http://www.overbyte.be
Re: [twsocket] Adding gzip to HttpCli
> a) where is the best place to decode the received stream? Xavier does this in GetBodyLineNext when the end of the document is reached.

I think decoding should be done on the fly.

> b) must RcvdStream contain what was actually received from the server or the decoded version? In the second case, should what was received be dropped or kept somewhere?

RcvdStream should contain the decoded version. The coded version should remain purely internal to the component. Should that coded version be of some interest (but why?), then maybe an event or a new property could give access to it.

--
[EMAIL PROTECTED]
http://www.overbyte.be

----- Original Message -----
From: Maurizio Lotauro [EMAIL PROTECTED]
To: twsocket@elists.org
Sent: Friday, July 01, 2005 12:00 AM
Subject: [twsocket] Adding gzip to HttpCli

Hello,

I finally got some time to check the changes proposed to handle the gzip content encoding.

First, I think it would be better not to add specific gzip handling but a generic class to handle content encoding, using a registration mechanism for each encoding, like what happens with TGraphic. THttpCli will then compose the Accept-Encoding header according to the registered classes, which will then do the decoding if needed.

Two questions:

a) where is the best place to decode the received stream? Xavier does this in GetBodyLineNext when the end of the document is reached.

b) must RcvdStream contain what was actually received from the server or the decoded version? In the second case, should what was received be dropped or kept somewhere?

Opinions?

About the gzip implementation proposed by Xavier I have some questions. Why is FRequestVer set to a different value in the Create method? Why is Accept-Encoding added only if RequestVer is 1.1? What is FGzTime used for?

Bye, Maurizio.
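The registration idea from the original message (one handler per coding, the Accept-Encoding header derived from whatever is registered) could look roughly like the sketch below. It is Python, not Delphi, and it is not the ICS class design: ContentCodingRegistry and its methods are hypothetical, with set_enabled and prepare loosely modelled on the CodingEnabled property and the Prepare method mentioned elsewhere in the thread.

```python
import gzip
import zlib


class ContentCodingRegistry:
    """Hypothetical registry of content-coding decoders."""

    def __init__(self):
        self._decoders = {}  # name -> callable(bytes) -> bytes, in registration order
        self._enabled = {}

    def register(self, name, decoder):
        self._decoders[name] = decoder
        self._enabled[name] = True

    def set_enabled(self, name, enabled):
        if name not in self._decoders:
            raise KeyError(name)
        self._enabled[name] = enabled

    def accept_encoding(self):
        """Compose the header value from the enabled, registered codings."""
        return ", ".join(n for n in self._decoders if self._enabled[n])

    def prepare(self, content_encoding):
        """Return True only if every coding named in the response can be decoded."""
        names = [n.strip().lower() for n in content_encoding.split(",") if n.strip()]
        return all(self._enabled.get(n, False) for n in names)


registry = ContentCodingRegistry()
registry.register("gzip", gzip.decompress)
registry.register("deflate", zlib.decompress)
all_codings = registry.accept_encoding()
registry.set_enabled("deflate", False)
gzip_only = registry.accept_encoding()
```

Adding support for another coding then means registering one more decoder; the request header and the "can we decode this response" check both follow from the registry contents.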
Re: [twsocket] Adding gzip to HttpCli
On 11-Jul-05 19:23:20 Francois PIETTE wrote:

Welcome back Francois :-)

>> a) where is the best place to decode the received stream? Xavier does this in GetBodyLineNext when the end of the document is reached.

> I think decoding should be done on the fly.

What do you mean? Should data be decoded while it is being received, i.e. without first having the whole body?

>> b) must RcvdStream contain what was actually received from the server or the decoded version? In the second case, should what was received be dropped or kept somewhere?

> RcvdStream should contain the decoded version.

This has some impact, for example if the Cleanup routine is called. We should proceed very carefully. In the meantime I made the changes that I mentioned. It is not finished, because we should discuss and decide about RcvdStream. The possibilities that I see are the following.

a) RcvdStream will initially receive the encoded body
pro:
- the code that handles the receiving will remain as is
cons:
- the encoded part must be removed from the stream before replacing it with the decoded version
- the cleanup procedure will be more delicate (in particular if the stream was not empty at the start) because RcvdCount will change after the decoding and because the decoding could fail

b) RcvdStream will receive only the decoded body
pro:
- if needed, the cleanup is easier and probably safer
cons:
- the code that handles the receiving must be reworked, probably using an internal variable that points to FRcvdStream (if no decoding is needed) or to the stream that will be decoded
- RcvdCount can have some side effects on existing applications

> The coded version should remain purely internal to the component. Should that coded version be of some interest (but why?), then maybe an event or a new property could give access to it.

The only reason that I see is for debugging, in particular of the decoding routines.

Bye, Maurizio.