Re: SUGGESTION: rollback like GetRight

2001-01-11 Thread Dan Harkless


Dan Harkless <[EMAIL PROTECTED]> writes:
> Don't cc me when posting to wget, please.  I don't need two copies.
> 
> ZIGLIO Frediano <[EMAIL PROTECTED]> writes:
> > In my opinion rollback should be added to handle two problems:
> > - the file changes
> > - the proxy returns unwanted data
> > 
> > Rollback without verification solves the first problem but not the second.
> > Rollback with full verification solves the second but not the first.
> > 
> > An example:
> > G garbage data
> > D correctly downloaded data
> > C data from the continued download
> > 
> > Suppose you download a file and obtain something like this (garbage from a
> > bad proxy):
> > DDDDG
> > With the current implementation of wget you finally obtain:
> > DDDDGCC
> > That is a wrong file.
> > If you roll back before continuing you can obtain
> > DDD
> > That is correct ... if the file didn't change!
> 
> It would be a lot simpler, and would handle almost all cases of continuing
> the download of a file that has changed on the server, if we made it so that
> when you specify both -c and -N, wget checks the timestamp on the local file
> against the one on the server, and if the one on the server is newer, it
> restarts the download from scratch.

I forgot to add that the timestamp method is superior to the
"rollback-verify" method in tons of cases because a file can change without
the specific portion you're checking having changed.

In any case, I've just added this to the TODO.
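
A minimal sketch of the -c plus -N check described above, written in Python for brevity (wget itself is C). The function name `should_restart` and its arguments are illustrative assumptions, not anything in wget; it only shows the comparison between the local file's modification time and the server's Last-Modified header.

```python
from email.utils import parsedate_to_datetime


def should_restart(local_mtime: float, last_modified: str) -> bool:
    """Return True if the server copy is newer than the partial local file.

    local_mtime   -- local mtime in seconds since the epoch
                     (for example from os.path.getmtime)
    last_modified -- value of the HTTP Last-Modified response header
    Both names are hypothetical; this is not wget's implementation.
    """
    server_mtime = parsedate_to_datetime(last_modified).timestamp()
    # A strictly newer server copy means the partial file is stale,
    # so the download should start again from byte 0 instead of resuming.
    return server_mtime > local_mtime
```

If the function returns False, the partial file would be kept and the download resumed as with plain -c.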

---
Dan Harkless            | To help prevent SPAM contamination,
GNU Wget co-maintainer  | please do not mention this email
http://sunsite.dk/wget/ | address in Usenet posts -- thank you.



Re: SUGGESTION: rollback like GetRight

2001-01-11 Thread Dan Harkless


Don't cc me when posting to wget, please.  I don't need two copies.

ZIGLIO Frediano <[EMAIL PROTECTED]> writes:
> In my opinion rollback should be added to handle two problems:
> - the file changes
> - the proxy returns unwanted data
> 
> Rollback without verification solves the first problem but not the second.
> Rollback with full verification solves the second but not the first.
> 
> An example:
> G garbage data
> D correctly downloaded data
> C data from the continued download
> 
> Suppose you download a file and obtain something like this (garbage from a
> bad proxy):
> DDDDG
> With the current implementation of wget you finally obtain:
> DDDDGCC
> That is a wrong file.
> If you roll back before continuing you can obtain
> DDD
> That is correct ... if the file didn't change!

It would be a lot simpler, and would handle almost all cases of continuing
the download of a file that has changed on the server, if we made it so that
when you specify both -c and -N, wget checks the timestamp on the local file
against the one on the server, and if the one on the server is newer, it
restarts the download from scratch.

---
Dan Harkless            | To help prevent SPAM contamination,
GNU Wget co-maintainer  | please do not mention this email
http://sunsite.dk/wget/ | address in Usenet posts -- thank you.



RE: SUGGESTION: rollback like GetRight

2001-01-11 Thread ZIGLIO Frediano

> 
> Jan Prikryl <[EMAIL PROTECTED]> writes:
> > Quoting ZIGLIO Frediano ([EMAIL PROTECTED]):
> > > I suggest two parameters:
> > > - rollback-size
> > > - rollback-check-size
> > > where 0 <= rollback-check-size <= rollback-size
> > > The first is used to calculate the beginning of the range (filesize -
> > > rollback-size) and the second is used for the check (wget should check
> > > the range [filesize - rollback-size,
> > > filesize - rollback-size + rollback-check-size) )
> > 
> > My understanding of the rollback problem is that there are some broken
> > proxies that append additional garbage text, for example after the
> > connection has timed out. Then, with `--rollback-size=NUM', after timing
> > out, wget would cut the last NUM bytes off the file and try to resume the
> > download.
> > 
> > Could you elaborate more on the situation where something like
> > `--rollback-check-size' would be needed? What should be checked there?
> 
> I think he wants an option where wget will verify that a certain section
> towards the end of the local file matches what's on the server, so that you
> don't have to guess or manually check how far to roll back by.  I don't
> think I'd implement it as he's suggesting, though.
> 
True and false.
In my opinion rollback should be added to handle two problems:
- the file changes
- the proxy returns unwanted data

Rollback without verification solves the first problem but not the second.
Rollback with full verification solves the second but not the first.

An example:
G garbage data
D correctly downloaded data
C data from the continued download

Suppose you download a file and obtain something like this (garbage from a
bad proxy):
DDDDG
With the current implementation of wget you finally obtain:
DDDDGCC
That is a wrong file.
If you roll back before continuing you can obtain
DDD
That is correct ... if the file didn't change!
If it did change, the D part of the local file differs from the server side,
so you get another wrong file.
So I roll back the file:
DDD(DG)
Then I start the download:
DDDCC
I can check whether CC == (DG), but with garbage data in the file that
comparison always fails!
Now return to the saved file:
DDDDG
... rollback ...
DDD(DG)
... start download ...
DDDC
... now I can check whether C == (D), ignoring possible garbage in the file.
If C == (D) you assume that the file is the same. If C != (D) you restart the
download.

This should make the two parameters clearer:
rollback-size is the size of (DG), rollback-check-size is the size of (D).
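
The procedure above can be sketched roughly as follows. This is an illustrative Python sketch, not wget code: the function `resume_offset` and the idea of passing the freshly fetched bytes in as `server_chunk` are assumptions made to keep the example self-contained.

```python
def resume_offset(local: bytes, server_chunk: bytes,
                  rollback_size: int, check_size: int) -> int:
    """Decide where to resume after rolling back a partial download.

    local        -- bytes already saved locally (may end in proxy garbage)
    server_chunk -- bytes the server sent starting at
                    len(local) - rollback_size
    Returns the offset to keep and resume from, or 0 to restart.
    All names here are hypothetical.
    """
    if not 0 <= check_size <= rollback_size:
        raise ValueError("need 0 <= check_size <= rollback_size")
    start = max(len(local) - rollback_size, 0)       # start of (DG)
    check_end = min(start + check_size, len(local))  # end of (D)
    n = check_end - start
    # Compare only the (D) part; the trailing garbage is ignored.
    if local[start:check_end] != server_chunk[:n]:
        return 0      # C != (D): the file changed, restart from scratch
    return start      # C == (D): keep local[:start] and continue
```

With a local file `b"abcdX"` (four good bytes plus one byte of garbage), a rollback size of 2 and a check size of 1, only `b"d"` is compared against the server data, so the garbage byte does not force a restart. With a check size of 2 the garbage is included in the comparison and the result is always a restart.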

freddy77







Re: SUGGESTION: rollback like GetRight

2001-01-11 Thread Dan Harkless


Jan Prikryl <[EMAIL PROTECTED]> writes:
> Quoting ZIGLIO Frediano ([EMAIL PROTECTED]):
> > I suggest two parameters:
> > - rollback-size
> > - rollback-check-size
> > where 0 <= rollback-check-size <= rollback-size
> > The first is used to calculate the beginning of the range (filesize -
> > rollback-size) and the second is used for the check (wget should check
> > the range [filesize - rollback-size,
> > filesize - rollback-size + rollback-check-size) )
> 
> My understanding of the rollback problem is that there are some broken
> proxies that append additional garbage text, for example after the
> connection has timed out. Then, with `--rollback-size=NUM', after timing
> out, wget would cut the last NUM bytes off the file and try to resume the
> download.
> 
> Could you elaborate more on the situation where something like
> `--rollback-check-size' would be needed? What should be checked there?

I think he wants an option where wget will verify that a certain section
towards the end of the local file matches what's on the server, so that you
don't have to guess or manually check how far to roll back by.  I don't
think I'd implement it as he's suggesting, though.

---
Dan Harkless            | To help prevent SPAM contamination,
GNU Wget co-maintainer  | please do not mention this email
http://sunsite.dk/wget/ | address in Usenet posts -- thank you.




RE: SUGGESTION: rollback like GetRight

2001-01-10 Thread Csaba Raduly

On 10/01/2001 08:50:18 ZIGLIO Frediano wrote:

>I suggest two parameters:
>- rollback-size
>- rollback-check-size
>where 0 <= rollback-check-size <= rollback-size
>The first is used to calculate the beginning of the range (filesize -
>rollback-size) and the second is used for the check (wget should check
>the range [filesize - rollback-size,
>filesize - rollback-size + rollback-check-size) )
>

I was thinking of making -c take an optional parameter specifying the rollback.
If this defaulted to 0, it could be passed straight to lseek( , , SEEK_END )
(it would be nice if it could accept a 'k' suffix).

The check size then could be specified separately.
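
A rough sketch of that idea in Python rather than C. The helper names `parse_rollback` and `seek_back` are made up for illustration, and treating 'k' as 1024 bytes is an assumption. Note that the offset passed with SEEK_END has to be negative to move backwards from the end of the file.

```python
import io
import os


def parse_rollback(text: str) -> int:
    """Parse a rollback amount such as "0", "512" or "4k" into bytes.

    The 'k' suffix is assumed to mean 1024 bytes.
    """
    text = text.strip().lower()
    value = int(text[:-1]) * 1024 if text.endswith("k") else int(text)
    if value < 0:
        raise ValueError("rollback must not be negative")
    return value


def seek_back(f: io.IOBase, rollback: int) -> int:
    """Position f `rollback` bytes before its end and return that offset."""
    size = f.seek(0, os.SEEK_END)
    # Never seek before the start of the file; a rollback of 0 leaves the
    # position at the end, which matches plain -c behaviour.
    return f.seek(-min(rollback, size), os.SEEK_END)
```

For example, `seek_back(f, parse_rollback("4k"))` on a file opened in binary mode would return the offset 4096 bytes before the end, or 0 if the file is shorter than that.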

Csaba


--
Csaba Ráduly, Programmer - OS/2  Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US support: +1 888 SOPHOS 9    UK Support: +44 1235 559933





RE: SUGGESTION: rollback like GetRight

2001-01-10 Thread ZIGLIO Frediano

Rollback is useful mainly for checking that the file has not changed.
You compare the downloaded data with your local file.

freddy77
> 
> Quoting ZIGLIO Frediano ([EMAIL PROTECTED]):
> 
> > I suggest two parameters:
> > - rollback-size
> > - rollback-check-size
> > where 0 <= rollback-check-size <= rollback-size
> > The first is used to calculate the beginning of the range (filesize -
> > rollback-size) and the second is used for the check (wget should check
> > the range [filesize - rollback-size,
> > filesize - rollback-size + rollback-check-size) )
> 
> My understanding of the rollback problem is that there are some broken
> proxies that append additional garbage text, for example after the
> connection has timed out. Then, with `--rollback-size=NUM', after timing
> out, wget would cut the last NUM bytes off the file and try to resume the
> download.
> 
> Could you elaborate more on the situation where something like
> `--rollback-check-size' would be needed? What should be checked there?
> 
> -- jan
> 
> +--
>  Jan Prikryl| vr|vis center for virtual reality and visualisation
>  <[EMAIL PROTECTED]> | http://www.vrvis.at
> +--
> 
> 







Re: SUGGESTION: rollback like GetRight

2001-01-10 Thread Jan Prikryl

Quoting ZIGLIO Frediano ([EMAIL PROTECTED]):

> I suggest two parameters:
> - rollback-size
> - rollback-check-size
> where 0 <= rollback-check-size <= rollback-size
> The first is used to calculate the beginning of the range (filesize -
> rollback-size) and the second is used for the check (wget should check
> the range [filesize - rollback-size,
> filesize - rollback-size + rollback-check-size) )

My understanding of the rollback problem is that there are some broken
proxies that append additional garbage text, for example after the
connection has timed out. Then, with `--rollback-size=NUM', after timing
out, wget would cut the last NUM bytes off the file and try to resume the
download.
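
As a hedged illustration of this reading of `--rollback-size=NUM' (cut the tail, then resume), here is a small Python sketch. The function name `rollback_and_range` is hypothetical and nothing here is taken from wget's source; it truncates an open file and builds the HTTP Range header value for the resumed request.

```python
import io
import os


def rollback_and_range(f: io.IOBase, num: int) -> str:
    """Cut the last `num` bytes off f and return a Range header value.

    f must be opened for binary read/write (for example mode "r+b").
    """
    size = f.seek(0, os.SEEK_END)
    keep = max(size - num, 0)   # never cut more than the file holds
    f.truncate(keep)            # drop the possibly corrupted tail
    f.seek(keep)                # new data is appended from here
    return f"bytes={keep}-"     # ask the server for everything after it
```

If the file is shorter than NUM bytes, this sketch falls back to an empty file and a request for the whole resource.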

Could you elaborate more on the situation where something like
`--rollback-check-size' would be needed? What should be checked there?

-- jan

+--
 Jan Prikryl| vr|vis center for virtual reality and visualisation
 <[EMAIL PROTECTED]> | http://www.vrvis.at
+--




RE: SUGGESTION: rollback like GetRight

2001-01-10 Thread ZIGLIO Frediano

I suggest two parameters:
- rollback-size
- rollback-check-size
where 0 <= rollback-check-size <= rollback-size
The first is used to calculate the beginning of the range (filesize -
rollback-size) and the second is used for the check (wget should check
the range [filesize - rollback-size,
filesize - rollback-size + rollback-check-size) )
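
The arithmetic of the two proposed parameters can be written out as a short Python sketch. The function name `rollback_ranges` is invented for illustration, and clamping to the file boundaries is an added assumption for files smaller than rollback-size.

```python
def rollback_ranges(filesize: int, rollback_size: int,
                    rollback_check_size: int) -> tuple[int, int]:
    """Return (start, check_end) for the proposed rollback.

    start     -- offset where the re-download begins
                 (filesize - rollback_size)
    check_end -- end (exclusive) of the range compared against the server
                 (start + rollback_check_size)
    """
    if not 0 <= rollback_check_size <= rollback_size:
        raise ValueError(
            "need 0 <= rollback-check-size <= rollback-size")
    start = max(filesize - rollback_size, 0)
    check_end = min(start + rollback_check_size, filesize)
    return start, check_end
```

For a 1000-byte partial file with a rollback size of 100 and a check size of 20, the download would restart at byte 900 and bytes 900 up to (not including) 920 would be compared with the local copy.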

freddy77

> 
> Hrvoje Niksic <[EMAIL PROTECTED]> writes:
> > Daniel Stenberg <[EMAIL PROTECTED]> writes:
> > 
> > > Could you elaborate on this and describe in what way, theoretically,
> > > the errors would sneak into the destination file?
> > 
> > By a silly proxy inserting a "transfer interrupted" string when the
> > transfer between the proxy and the actual server gets interrupted.
> 
> How awful.  Okay, I added this to the TODO.  I imagine it won't get done
> until someone with one of those broken proxies sends in a patch to implement
> it, though.
> 
> ---
> Dan Harkless            | To help prevent SPAM contamination,
> GNU Wget co-maintainer  | please do not mention this email
> http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
> 







Re: SUGGESTION: rollback like GetRight

2001-01-10 Thread Dan Harkless


Hrvoje Niksic <[EMAIL PROTECTED]> writes:
> Daniel Stenberg <[EMAIL PROTECTED]> writes:
> 
> > Could you elaborate on this and describe in what way, theoretically,
> > the errors would sneak into the destination file?
> 
> By a silly proxy inserting a "transfer interrupted" string when the
> transfer between the proxy and the actual server gets interrupted.

How awful.  Okay, I added this to the TODO.  I imagine it won't get done
until someone with one of those broken proxies sends in a patch to implement
it, though.

---
Dan Harkless            | To help prevent SPAM contamination,
GNU Wget co-maintainer  | please do not mention this email
http://sunsite.dk/wget/ | address in Usenet posts -- thank you.