On Thursday 24 July 2003 14:43, Kevin Steen wrote:
> At 24/07/2003 14:01, you wrote:
>  >On Thursday 24 July 2003 12:36, Michael Schierl wrote:
>  >> Toad schrieb:
>  >> > Changes (a ton, mostly not mine):
>  >> > * Implemented support for ZIP containers
>  >
>  >...
>  >
>  >Call me skeptical, but I think this is an amazingly bad idea. It removes
>  > any concept of having redundant data de-duplicated automatically. Also,
>  > downloading a 1 MB file will potentially take quite a while. Smaller files
>  > can be downloaded with a greater degree of parallelism. I am simply not
>  > convinced
>  >that partial availability is a problem with a properly routed node, and
>  > that is all this will achieve. In a way, I think this will make the
>  > problem worse,
>  >because if the entire file cannot be retrieved or re-assembled, then the
>  >whole site is unavailable, rather than perhaps a few small parts of it.
>
> Supporting containers allows freesite authors to make the decision for
> themselves, with the 1MB limit preventing drastic duplication on the
> network.

I am not convinced that this decision should be left to the author. If they 
are that concerned about it, they can upload a ZIP of the whole site 
separately themselves and give a link to it. At best, building it into the 
node is a bodge; at worst, it is counterproductive. What happens when the 
same files are linked from multiple pages, e.g. active links? Do you bundle 
the files separately into each "archive set"? Where do you draw the line?

> I see the main use for containers being to keep the _required_
> parts of a freesite together - namely the html files, layout images, PGP
> key and Activelink.

Except that for a site with more than 2 pages, this becomes extremely 
cumbersome to do manually. An automated approach could analyse the HTML and 
the documents it links to, but that has its own limitations: how do you 
decide how to split the files into archives? What about a file that has to 
go into every archive? How difficult will it be to come up with "auxiliary" 
archives for files accessed from multiple pages?

It is logically incoherent, and cannot be dealt with in a way that is both 
generic and consistent. Therefore, I believe it should not be catered for, 
especially as it doesn't add any new functionality, and the benefits it 
provides are at best questionable.

> For me, having all of those available goes a long way
> to differentiating "good" freesites from "bad" ones. Also, there should be
> some saving on bandwidth and processing by not having to deal with so many
> small files.

I disagree. Multiple small files can be fetched in parallel; a bigger file 
much less so. In fact, there are answers in the FAQ about how to make IE and 
Mozilla use more simultaneous connections. If an archive file goes missing, 
that's it - no site at all. I do not believe that would be an improvement.

Say you use FEC: how many parts will a 1 MB file be split into? Four? How is 
that going to be faster than downloading 20 much smaller files in parallel? 
It strikes me that archive-based sites are simply not the correct tool for 
the job in pretty much all cases.

Freenet is a high-latency, potentially high-bandwidth network, but in this 
case that does not matter, because the latency of a parallel download is 
fixed to that of the slowest individual download. As they are all happening 
in parallel, the latency penalty is effectively paid once, just as it would 
be for the archive's split-file.
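
To put rough numbers on that (the fetch times below are invented, purely to 
illustrate the shape of the argument):

    # Illustration only - the times are made up. The wall-clock cost of a
    # parallel fetch is the slowest individual fetch, however many there are.
    small_files  = [12, 8, 30, 15, 9, 22, 11, 7, 25, 14]  # seconds each
    fec_segments = [45, 38, 52, 41]           # 4 segments of a 1 MB archive

    print(max(small_files))    # 30 - latency penalty paid once
    print(max(fec_segments))   # 52 - also paid once, but each part is bigger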

Additionally, a larger number of smaller downloads will probabilistically 
come from more distinct hosts, thus maximising the use of the bandwidth on 
the requesting node - in effect making it faster.
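
As a back-of-an-envelope sketch (the pool of 50 candidate nodes is an 
assumption, not a measurement): if each of n parallel requests is served by 
one of m equally likely nodes, the expected number of distinct sources is 
m * (1 - (1 - 1/m)^n).

    # Back-of-an-envelope only; 50 candidate nodes is an assumption.
    m = 50.0                        # nodes a request might be served from

    def expected_sources(n):        # expected distinct sources for n requests
        return m * (1 - (1 - 1/m) ** n)

    print(expected_sources(4))      # ~3.9  - a 4-segment archive
    print(expected_sources(20))     # ~16.6 - 20 small files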

>  >Additionally, it means that even if you want to look at one or two pages
>  > of a 100 page site, you still have to download the entire site.
>
> A lot of sites consist of a Table of Contents as the front page, with the
> content in separate files. I've always found it bad for my karma when I
> click on a very interesting link and end up with a "Data Not Found"
> message!

Ultimately, the chances of a file going missing are the same whether it is 
the archive or a single file on the site. How is making sure that losses 
come in bigger chunks going to help, on top of having to wait longer for one 
big file to trickle down to your node?
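
Working that through with a made-up retrieval probability (and ignoring FEC 
check blocks, which change the exact figure but not the all-or-nothing 
shape):

    # Illustration only - p is invented and FEC redundancy is ignored.
    p = 0.95                # chance any single key is still retrievable

    pages = 100
    print(pages * p)        # ~95 pages still reachable if inserted one by one

    segments = 4            # archive split into 4 parts, all required
    print(p ** segments)    # ~0.81 chance of getting *any* of the site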

I do not see this as a viable alternative to proper verification and use of 
insertion tools. If your files go missing, then use a higher HTL or re-insert 
more frequently.

Regards.

Gordan
_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
