Re: [twsocket] Adding gzip to HttpCli

2005-08-01 Thread Maurizio Lotauro
On 30-Jul-05 01:54:44 Maurizio Lotauro wrote:

On 28-Jul-05 08:34:58 Francois Piette wrote:

 Do you want the data passed to OnDocData to be decompressed?

Yes I do.

Are you really sure? Ok Ok, I'll try to do it :-)

Done.

Now I have some points that I would like to discuss.

a) Exception in the THttpContentCoding.GetCoding method
This method is called indirectly during initialization, which does not
seem the best moment to raise an exception.
When run from Delphi with "Stop on Delphi exceptions" disabled, the
developer sees only an "Internal error 217 on ...", which is not very
helpful for finding out what the problem is. For the same reason it is
useless to have a specific exception class.
If run outside Delphi, the user sees "This application has encountered
bla bla bla do you want to send a report bla bla bla".
I tried moving the check into THttpContCodHandler.Create. In this
case, when run from Delphi, the developer will see which specific
exception is raised. Outside Delphi the behaviour and the message are
the same: "This application ...".
I would prefer the first approach because the error is raised as soon
as the application runs, while with the second it is raised only when
the form or datamodule that contains the component is created. But the
error message will disorient the developer if he has "Stop on Delphi
exceptions" disabled. Opinions?

b) New properties.
We need at least two new properties: one to disable the automatic
use of content coding and another to enable the use of the quality
specifier. I suggest using a record type to group all properties
related to content coding. The property could be ContentCoding,
with Enabled (default false) and UseQuality (default false)
fields.
If it is not enabled, the component will not add Accept-Encoding
to the header. Should it then also ignore Content-Encoding?

c) THttpContCodHandler.Prepare returns false if there is an encoding
that it is unable to decompress. Currently HttpCli doesn't check
the result, and in this case the body will not be decompressed at
all. Is this acceptable or should this situation be handled differently?

d) Two codings are added automatically: Identity (quality=0.5)
and * (quality=0). Currently they are enabled by default; should
they be disabled?
Are the default quality values OK?
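
To make points b) and d) concrete, here is a minimal sketch, in Python rather than Delphi, of how the Accept-Encoding value might be composed from a list of codings with and without the quality specifier. The function name, the tuple layout and the use_quality flag are assumptions for illustration only; they are not part of HttpCli.

```python
def build_accept_encoding(codings, use_quality=False):
    """Compose an Accept-Encoding value.

    codings: list of (name, quality, enabled) tuples in preference order.
    This layout is hypothetical, chosen only for the example.
    """
    parts = []
    for name, quality, enabled in codings:
        if not enabled:
            continue
        if use_quality:
            # ':g' prints 1.0 as "1", 0.5 as "0.5" and 0.0 as "0".
            parts.append(f"{name};q={quality:g}")
        elif quality > 0:
            # Without qvalues a refused coding (q=0) cannot be expressed,
            # so it is left out instead of being listed as acceptable.
            parts.append(name)
    return ", ".join(parts)


# The two automatically added codings from point d), plus gzip.
CODINGS = [("gzip", 1.0, True), ("identity", 0.5, True), ("*", 0.0, True)]

with_quality = build_accept_encoding(CODINGS, use_quality=True)
without_quality = build_accept_encoding(CODINGS)
```

One thing the sketch makes visible: "*" with quality 0 means "nothing else is acceptable", so it only makes sense when qvalues are actually sent.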

That's all for the moment.


Bye, Maurizio.

-- 
To unsubscribe or change your settings for TWSocket mailing list
please goto http://www.elists.org/mailman/listinfo/twsocket
Visit our website at http://www.overbyte.be


Re: [twsocket] Adding gzip to HttpCli

2005-07-29 Thread Xavier Le Bris
 
 Xavier, have you any reason to assign to FLastResponse?
 

No important reason

Xavier


Re: [twsocket] Adding gzip to HttpCli

2005-07-27 Thread Francois Piette
 I think I said about RcvdCount that a choice has to be made and that I have
 no definitive answer. The idea is to break as little existing code as possible.
 RcvdCount is used for progress bar updating and should be the compressed byte
 count. It is also used to allocate storage (or similar use) for data and
 should be the decompressed byte count. Maybe for simplicity we should let
 RcvdCount be the compressed byte count? Really a question, the debate is open!

 I doubt that RcvdCount could be used to allocate storage. The body
 data will be put into RcvdStream that is a stream, and normally a
 stream is able to allocate the storage itself.

RcvdStream is not the only way to get data from the component. Some (many?)
applications use the OnDocData event to get data on the fly.

 It could be useful if the size of the body is known in advance so the
 whole storage is allocated in one step, but at best you have this
 information only from the content length and not from RcvdCount.
 And the content length is not always specified, not to mention the
 case when it is wrong.

If specified but wrong, the component will hang, unless the server closes the
connection before the specified length is reached.

 After all these considerations my conclusions are:
 - RcvdCount contains the count of bytes received from the server
 - ContentLength contains what is specified in the header

Agreed.

--
[EMAIL PROTECTED]
http://www.overbyte.be



Re: [twsocket] Adding gzip to HttpCli

2005-07-27 Thread Maurizio Lotauro
Francois Piette [EMAIL PROTECTED] writes:

[...]

  I doubt that RcvdCount could be used to allocate storage. The body
  data will be put into RcvdStream that is a stream, and normally a
  stream is able to allocate the storage itself.
 
 RcvdStream is not the only way to get data from the component. Some (many?)
 applications use the OnDocData event to get data on the fly.

In that case, should the decompression be done entirely by the application, or
should we add some support for this?

[...]

  After all these considerations my conclusions are:
  - RcvdCount contains the count of bytes received from the server
  - ContentLength contains what is specified in the header
 
 Agreed.

This is how it is implemented.


Bye, Maurizio.



This mail has been sent using Alpikom webmail system
http://www.alpikom.it



Re: [twsocket] Adding gzip to HttpCli

2005-07-27 Thread Francois PIETTE
  I doubt that RcvdCount could be used to allocate storage. The body
  data will be put into RcvdStream that is a stream, and normally a
  stream is able to allocate the storage itself.

 RcvdStream is not the only way to get data from the component. Some (many?)
 applications use the OnDocData event to get data on the fly.

 In that case, should the decompression be done entirely by the
 application, or should we add some support for this?

If decompression is done on the fly (streaming, we already talked about 
this), then nothing special is required.

--
[EMAIL PROTECTED]
http://www.overbyte.be



Re: [twsocket] Adding gzip to HttpCli

2005-07-27 Thread Maurizio Lotauro
Francois PIETTE [EMAIL PROTECTED] writes:

  RcvdStream is not the only way to get data from the component. Some (many?)
  applications use the OnDocData event to get data on the fly.
 
  In that case, should the decompression be done entirely by the
  application, or should we add some support for this?
 
 If decompression is done on the fly (streaming, we already talked about 
 this), then nothing special is required.

Decompression will be done only if RcvdStream is assigned.


Bye, Maurizio.





Re: [twsocket] Adding gzip to HttpCli

2005-07-26 Thread Maurizio Lotauro
On 26-Jul-05 20:37:54 Xavier Le Bris wrote:

Hello,

Hello,

[...]

FGzTime is necessary, not for me, but for anyone who wants to choose the
best compression level/time ratio.

I think that this is the job for a test application, not for the
component. You can use OnBeforeHeaderSend and OnDocEnd.

http://www.pipeboost.com/ is not mine; but you must use Http 1.1 to have
compression.

It works using Http 1.0 for me :-)


Bye, Maurizio.



Re: [twsocket] Adding gzip to HttpCli

2005-07-26 Thread Maurizio Lotauro
On 26-Jul-05 17:47:41 Francois PIETTE wrote:

[...]

We already talked about RcvdCount.

Yes, but we still haven't reached a conclusion :-)

I think I said about RcvdCount that a choice has to be made and that I have
no definitive answer. The idea is to break as little existing code as possible.
RcvdCount is used for progress bar updating and should be the compressed byte
count. It is also used to allocate storage (or similar use) for data and
should be the decompressed byte count. Maybe for simplicity we should let
RcvdCount be the compressed byte count? Really a question, the debate is open!

I doubt that RcvdCount could be used to allocate storage. The body
data will be put into RcvdStream, which is a stream, and normally a
stream is able to allocate the storage itself.
It could be useful if the size of the body is known in advance so the
whole storage is allocated in one step, but at best you have this
information only from the content length and not from RcvdCount.
And the content length is not always specified, not to mention the
case when it is wrong.

After all these considerations my conclusions are:
- RcvdCount contains the count of bytes received from the server
- ContentLength contains what is specified in the header

 Similar question for ContentLength: should it contain what is
 specified in the header or the effective final length of the body?

The final length will not be known before everything is decompressed.

Exactly.

ContentLength, IMO, should stay in sync with what the header says. A new
property could be added to hold the decompressed content length. It is not
that useful, as the user has RcvdStream.Size or can accumulate what he
receives in OnDocData.

You read my mind :-)
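
The three numbers discussed above can be kept apart as in the following rough sketch (Python, not ICS code). The Download class and its method names are hypothetical stand-ins for the component and its OnDocData event; only the roles of the counters follow the conclusions of this message.

```python
import gzip
import zlib


class Download:
    """Hypothetical stand-in for the component; not the ICS API."""

    def __init__(self, content_length=None):
        self.content_length = content_length  # as stated in the header, may be None
        self.rcvd_count = 0      # bytes actually received from the server
        self.decoded_count = 0   # bytes handed to the application
        # 16 + MAX_WBITS tells zlib to expect gzip framing.
        self._inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)

    def on_wire_data(self, chunk):
        self.rcvd_count += len(chunk)
        decoded = self._inflater.decompress(chunk)
        if decoded:
            self.on_doc_data(decoded)

    def on_doc_data(self, data):
        # The application accumulates the decoded size itself.
        self.decoded_count += len(data)

    def progress(self):
        # Compare like with like: wire bytes against the header value.
        if not self.content_length:
            return None
        return self.rcvd_count / self.content_length


payload = b"hello world " * 500
wire = gzip.compress(payload)
d = Download(content_length=len(wire))
for i in range(0, len(wire), 16):
    d.on_wire_data(wire[i:i + 16])
```

With this split a progress bar driven by rcvd_count and content_length stays consistent, while the decoded size is only known once everything has been received.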


Bye, Maurizio.



Re: [twsocket] Adding gzip to HttpCli

2005-07-25 Thread Maurizio Lotauro
On 24-Jul-05 12:01:55 Xavier Le Bris wrote:

Hello,

A contribution to compression with gzip and some answers to Maurizio's
questions :

It is a good idea to put the whole stuff in another unit.

a/ The site 'http://www.pipeboost.com/' works today for me in T_HttpCliGz.
If not, check if zlib.dll is well loaded.

This URL answers with the gzip encoding, but there is an error in your
code. In xZlibDll, in the ZLibLoadDll procedure, if the first
LoadLibrary fails you then try different DLL names, but you do
not reassign the ZLibDllHandle variable.
Another small change that I made in the same unit is to move the
Windows unit from the uses clause of the implementation section to
that of the interface section. I'm using Delphi 5 and it is unable to
find the pByte declaration.
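
The ZLibDllHandle problem described above is a common pattern with fallback loading, so here is a small language-neutral sketch of the intended logic, written in Python. The load_library callable stands in for LoadLibrary (returning 0 on failure), and the DLL names in the usage lines are assumptions for illustration, not the names used in xZlibDll.

```python
def load_first(candidate_names, load_library):
    """Try each name in turn and return (handle, name) for the first success.

    load_library is any callable that returns a falsy value on failure,
    which is how LoadLibrary reports an error.
    """
    for name in candidate_names:
        # Keep the result of *every* attempt, not only the first one.
        handle = load_library(name)
        if handle:
            return handle, name
    raise OSError("could not load any of: " + ", ".join(candidate_names))


# Simulated loader: only the second (hypothetical) name is "installed".
installed = {"zlib1.dll": 42}
handle, name = load_first(["zlib.dll", "zlib1.dll"],
                          lambda n: installed.get(n, 0))
```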

Is the URL yours? I noticed that if the HTTP version is 1.0, the
server's answer contains the header Connection: keep-alive twice
(the second one has the first letters capitalized).

Last observation. I tried to compile your HttpGzAsy project but I get
an error because the xChrono unit is missing.

b/ FGzTime is used to know the time of reception and decompression. It can
help to choose a good compression level on the server side. For example, I
chose level 6 at first for one of my sites. After tests, level 1
was better...

OK, then it is not necessary for normal use. I have not carried it over.

c/ I chose to decompress at the end of reception because, in my xGzipStream
unit, the decompression is done for the whole stream after reading the number
of uncompressed bytes. This value is at the end of the stream. We could
decompress on the fly, but the units would have to be rewritten. With current
RAM sizes, I don't think it is necessary.

The whole thing is designed so that it is the choice of the
developer of the coding class whether it is handled on the fly or at
the end.
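
For reference, the "number of uncompressed bytes at the end of the stream" mentioned in c/ is the ISIZE field of the gzip trailer (RFC 1952): the last four bytes, little-endian, holding the original size modulo 2^32. A minimal Python sketch of reading it follows; the function name is made up for the example and this is not the xGzipStream code.

```python
import gzip
import struct


def gzip_isize(data):
    """Return the ISIZE trailer field of a complete gzip member.

    ISIZE is the uncompressed length modulo 2**32, stored in the last
    four bytes in little-endian order. It is only meaningful for a
    complete, single-member stream.
    """
    if len(data) < 18:  # 10-byte header + 8-byte trailer at the very least
        raise ValueError("too short to be a gzip stream")
    (isize,) = struct.unpack("<I", data[-4:])
    return isize


payload = b"x" * 12345
size = gzip_isize(gzip.compress(payload))
```

Because the field sits at the very end, it can only be used to size a buffer once the whole body has arrived, which is why an implementation relying on it decodes at the end of reception.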

d/ I think that the RcvdStream can contain the compressed data during the
reception. So, RcvdCount can be used for a progress bar. At the end of
reception, I consider, according to François's message of 09/05/2005 (the
user has not to know if the received data were compressed or not) that
RcvdStream and RcvdCount are relative to uncompressed data. I don't see why
we have to keep the uncompressed data. Only my variable GzContentLength keeps
 ^^
I imagine you mean compressed data.

the compressed length.

I don't think that using RcvdStream for receiving compressed data is a
good idea because it will be more delicate to clean up. Moreover, it is
possible that more than one encoding is applied to the body, and in
this case the situation will be even worse, not to mention the case
of on-the-fly decoding.
In my proposal RcvdStream will contain only the uncompressed data.
As for RcvdCount, I think it is better if it still
contains the bytes actually received. Changing it could mess up
existing code.

Similar question for ContentLength: should it contain what is
specified in the header or the effective final length of the body?
Francois?


e/ FRequestVer is set to 1.1, because some sites don't compress if 1.0 is
sent.

Ok, but it is not necessary to limit to the 1.1 version. In that case
the server simply will not compress the data.

f/ In my code, I have a GzAcceptEncoding boolean to disconnect compression if
necessary. It would be nice to keep it in case of problem with some sites.

It is still possible to enable/disable each coding. Use
HttpCli.ContentCodingHnd.CodingEnabled[name of coding]

Sorry, I have no time today to answer the other questions.

No problem :-)


Bye, Maurizio.



Re: [twsocket] Adding gzip to HttpCli

2005-07-15 Thread Francois Piette
 So we should consider both. But there is a potential problem. The RFC
 states that more than one encoding can be applied to the body. We can
 then have a mixed situation.

Two compressions applied one after the other, or one compression for a part of
the document and another one for the remaining part? In either case, I don't
understand how it could work unless the document is in MIME format and
compression applies to MIME parts. That is a completely different problem.
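
On the question of how several encodings combine: RFC 2616 (section 14.11) says that when multiple encodings have been applied, Content-Encoding lists them in the order in which they were applied, so they are layered one after the other over the whole body and have to be undone in reverse order. The following Python sketch illustrates that reading; the function and table names are assumptions for the example.

```python
import gzip
import zlib

# Decoders for a few standard content codings. HTTP "deflate" is the
# zlib format (RFC 1950), which zlib.decompress handles directly.
DECODERS = {
    "gzip": gzip.decompress,
    "x-gzip": gzip.decompress,
    "deflate": zlib.decompress,
    "identity": lambda body: body,
}


def decode_body(body, content_encoding):
    """Undo every coding listed in a Content-Encoding value."""
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    # Codings are listed in the order applied, so remove the last one first.
    for coding in reversed(codings):
        if coding not in DECODERS:
            raise ValueError("unsupported content coding: " + coding)
        body = DECODERS[coding](body)
    return body


original = b"The quick brown fox " * 50
# Server side: deflate applied first, then gzip on top of it.
encoded = gzip.compress(zlib.compress(original))
decoded = decode_body(encoded, "deflate, gzip")
```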

 BTW Xavier's proposal does the decoding at the end.
 I think that initially decoding the whole thing at once will be
 easier to implement, and later we can add the streaming
 capability.

I think the code would be completely different. It's writing it twice.
Streaming is very interesting, for example when downloading images or sound
and starting to display or play them before the end of the document has
arrived. And this also applies to basic document types, where display
can be done as the document is coming in.
--
[EMAIL PROTECTED]
http://www.overbyte.be




Re: [twsocket] Adding gzip to HttpCli

2005-07-14 Thread Maurizio Lotauro
Before starting: I got your answer from the archive of this ML because
it seems that my ISP has lost a whole day of emails :(

On 12-Jul-05 02:41:45 Francois PIETTE wrote:

 a) where is the best place to decode the received stream? Xavier does
 this in GetBodyLineNext when the end of document is reached.

I think decoding should be done on the fly.

 What do you mean? Data should be decoded during the receiving, i.e.
 without having first the whole body?

Yes, indeed. Most compressions, including gzio, are streaming type. So it is
easy to decode them on the fly.
Of course if some compression is not streamable, then receiving the whole
thing then decompressing is OK.

So we should consider both. But there is a potential problem. The RFC
states that more than one encoding can be applied to the body. We can
then have a mixed situation.

BTW Xavier's proposal does the decoding at the end.
I think that initially decoding the whole thing at once will be
easier to implement, and later we can add the streaming
capability.

[...]

Coded version should remain purely internal to the component. Should that
coded version be of some interest (but why?), then maybe an event or a
new property could give access to it.

 The only reason that I see is for debugging, in particular of the
 decoding routines.

In the component, where received data is written to RcvdStream, it must be
written to the decompression stream, and then the decoding stream is read to
get decompressed data (if any: you don't always have decompressed data each
time you write compressed data) and written to RcvdStream where it is
available for the application.

RcvdCount, used for determining the end of document, must represent the
compressed byte count.
Maybe the component has to use another variable for that purpose
(determining when the document is complete) and still expose the decompressed
byte count in the RcvdCount property. Doing so will make decompression
completely [well, mostly] transparent to applications. I see a potential
problem: RcvdCount is used by applications to update a progress bar, so it must
be the compressed byte count. But other applications use it to know how many
bytes to process, so it must be the decompressed byte count. We can't have
both! So a choice must be made. I don't know what the best is. Any opinion?

I have never used RcvdCount, so either choice is the same to
me.
But I see a bigger problem. FRcvdStream and FRcvdCount are declared
protected (as is everything not declared public or published), so
it is possible that components inherited from THttpCli use
these variables.


Bye, Maurizio.




Re: [twsocket] Adding gzip to HttpCli

2005-07-12 Thread Francois PIETTE

a) where is the best place to decode the received stream? Xavier does
this in GetBodyLineNext when the end of document is reached.



I think decoding should be done on the fly.


What do you mean? Data should be decoded during the receiving, i.e.
without having first the whole body?


Yes, indeed. Most compressions, including gzio, are streaming type. So it is 
easy to decode them on the fly.
Of course if some compression is not streamable, then receiving the whole 
thing then decompressing is OK.



b) should RcvdStream contain what was actually received from the server
or the decoded version? In the second case, should what was received be
dropped or kept somewhere?



RcvdStream should contain the decoded version.


This has some impact, for example if the Cleanup routine is called.
We should proceed very carefully.

In the meantime I made the changes that I mentioned. It is not
finished because we should discuss and decide about RcvdStream.
The possibilities that I see are the following.

a) RcvdStream will initially receive the encoded body
pro:
- the code that handles receiving will remain as is.
cons:
- the encoded part must be removed from the stream before replacing
it with the decoded version
- the cleanup procedure will be more delicate (in particular if the
stream was not empty at the start) because RcvdCount will change
after the decoding and because the decoding could fail.
b) RcvdStream will receive only the decoded body
pro:
- if needed, the cleanup is easier and probably safer
cons:
- the code that handles receiving must be reworked, probably using
an internal variable that points to FRcvdStream (if no decoding is
needed) or to the stream that will be decoded.
- RcvdCount can have some side effects on existing applications

Coded version should remain purely internal to the component. Should that
coded version be of some interest (but why?), then maybe an event or a
new property could give access to it.


The only reason that I see is for debugging, in particular of the
decoding routines.



In the component, where received data is written to RcvdStream, it must be 
written to the decompression stream, and then the decoding stream is read to 
get decompressed data (if any: you don't always have decompressed data each 
time you write compressed data) and written to RcvdStream where it is 
available for the application.
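
The write-then-read loop described in the paragraph above can be sketched as follows. This is Python using the standard zlib module as a stand-in for the decompression stream, not ICS code; stream_decode and its parameters are names invented for the example. It also shows the remark that a write of compressed data does not always yield decompressed data.

```python
import gzip
import io
import zlib


def stream_decode(chunks, rcvd_stream):
    """Feed compressed chunks to a decompressor as they arrive and write
    whatever comes out to rcvd_stream.

    Returns the number of chunks that produced no output at all.
    """
    inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)  # expect gzip framing
    empty_writes = 0
    for chunk in chunks:
        decoded = inflater.decompress(chunk)
        if decoded:
            rcvd_stream.write(decoded)
        else:
            empty_writes += 1  # e.g. the chunk held only header bytes
    tail = inflater.flush()
    if tail:
        rcvd_stream.write(tail)
    return empty_writes


payload = bytes(range(256)) * 64
wire = gzip.compress(payload)
# Deliberately tiny chunks, as if they came off a slow socket.
chunks = [wire[i:i + 5] for i in range(0, len(wire), 5)]
out = io.BytesIO()
empty = stream_decode(chunks, out)
```

With 5-byte chunks the first two cover only the 10-byte gzip header, so at least those produce no output.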


RcvdCount, used for determining the end of document, must represent the
compressed byte count.
Maybe the component has to use another variable for that purpose
(determining when the document is complete) and still expose the decompressed
byte count in the RcvdCount property. Doing so will make decompression
completely [well, mostly] transparent to applications. I see a potential
problem: RcvdCount is used by applications to update a progress bar, so it must
be the compressed byte count. But other applications use it to know how many
bytes to process, so it must be the decompressed byte count. We can't have
both! So a choice must be made. I don't know what the best is. Any opinion?


btw: Xavier, are you listening ?

--
[EMAIL PROTECTED]
http://www.overbyte.be




Re: [twsocket] Adding gzip to HttpCli

2005-07-11 Thread Francois PIETTE

a) where is the best place to decode the received stream? Xavier does
this in GetBodyLineNext when the end of document is reached.


I think decoding should be done on the fly.


b) should RcvdStream contain what was actually received from the server
or the decoded version? In the second case, should what was received be
dropped or kept somewhere?


RcvdStream should contain the decoded version.

Coded version should remain purely internal to the component. Should that
coded version be of some interest (but why?), then maybe an event or a new
property could give access to it.


--
[EMAIL PROTECTED]
http://www.overbyte.be



- Original Message - 
From: Maurizio Lotauro [EMAIL PROTECTED]

To: twsocket@elists.org
Sent: Friday, July 01, 2005 12:00 AM
Subject: [twsocket] Adding gzip to HttpCli



Hello,

I finally got some time to check the changes proposed to handle the
gzip content encoding.

First, I think it would be better not to add specific gzip
handling but a generic class to handle content encoding, using a
registration mechanism for each encoding, like what happens with
TGraphic.

So THttpCli will compose the Accept-Encoding header according
to the registered classes, which will then do the decoding if needed.
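
A rough sketch of the registration idea, in Python rather than Delphi: each coding is a class that registers itself under its name, the header is composed from the registry, and decoding is dispatched by name. All names here (CODINGS, register_coding, ContentCoding, decode_with) are hypothetical and only illustrate the mechanism; they are not the classes discussed later in the thread.

```python
import gzip

CODINGS = {}  # coding name -> handler class


def register_coding(cls):
    """Class decorator: make a coding known to the client."""
    CODINGS[cls.name] = cls
    return cls


class ContentCoding:
    name = ""

    def decode(self, data):
        raise NotImplementedError


@register_coding
class GzipCoding(ContentCoding):
    name = "gzip"

    def decode(self, data):
        return gzip.decompress(data)


def accept_encoding():
    # Header value composed from whatever has been registered.
    return ", ".join(CODINGS)


def decode_with(coding_name, body):
    handler = CODINGS.get(coding_name.strip().lower())
    if handler is None:
        return None  # unknown coding: the caller must decide what to do
    return handler().decode(body)


header = accept_encoding()
body = decode_with("gzip", gzip.compress(b"abc"))
```

Adding another coding then means adding one more registered class, without touching the client code.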

Two questions:
a) where is the best place to decode the received stream? Xavier does
this in GetBodyLineNext when the end of document is reached.
b) should RcvdStream contain what was actually received from the server
or the decoded version? In the second case, should what was received be
dropped or kept somewhere?

Opinions?


About the gzip implementation proposed by Xavier I have some
questions.

Why is FRequestVer set to a different value in the Create method?

Why is Accept-Encoding added only if RequestVer is 1.1?

What is FGzTime used for?


Bye, Maurizio.




Re: [twsocket] Adding gzip to HttpCli

2005-07-11 Thread Maurizio Lotauro
On 11-Jul-05 19:23:20 Francois PIETTE wrote:

Welcome back Francois :-)

 a) where is the best place to decode the received stream? Xavier does
 this in GetBodyLineNext when the end of document is reached.

I think decoding should be done on the fly.

What do you mean? Data should be decoded during the receiving, i.e.
without having first the whole body?

 b) should RcvdStream contain what was actually received from the server
 or the decoded version? In the second case, should what was received be
 dropped or kept somewhere?

RcvdStream should contain the decoded version.

This has some impact, for example if the Cleanup routine is called.
We should proceed very carefully.

In the meantime I made the changes that I mentioned. It is not
finished because we should discuss and decide about RcvdStream.
The possibilities that I see are the following.

a) RcvdStream will initially receive the encoded body
pro:
- the code that handles receiving will remain as is.
cons:
- the encoded part must be removed from the stream before replacing
it with the decoded version
- the cleanup procedure will be more delicate (in particular if the
stream was not empty at the start) because RcvdCount will change
after the decoding and because the decoding could fail.

b) RcvdStream will receive only the decoded body
pro:
- if needed, the cleanup is easier and probably safer
cons:
- the code that handles receiving must be reworked, probably using
an internal variable that points to FRcvdStream (if no decoding is
needed) or to the stream that will be decoded.
- RcvdCount can have some side effects on existing applications

Coded version should remain purely internal to the component. Should that
coded version be of some interest (but why?), then maybe an event or a new
property could give access to it.

The only reason that I see is for debugging, in particular of the
decoding routines.


Bye, Maurizio.

