On Thursday 24 July 2003 14:43, Kevin Steen wrote:
> At 24/07/2003 14:01, you wrote:
> >On Thursday 24 July 2003 12:36, Michael Schierl wrote:
> >> Toad schrieb:
> >> > Changes (a ton, mostly not mine):
> >> > * Implemented support for ZIP containers
> >
> >...
> >
> >Call me skeptical, but I think this is an amazingly bad idea. It removes
> >any concept of having redundant data de-duplicated automatically. Also,
> >downloading a 1 MB file will potentially take quite a while. Smaller
> >files can be downloaded with a greater degree of parallelism. I am
> >simply not convinced that partial availability is a problem with a
> >properly routed node, and that is all this will achieve. In a way, I
> >think this will make the problem worse, because if the entire file
> >cannot be retrieved or re-assembled, then the whole site is unavailable,
> >rather than perhaps a few small parts of it.
>
> Supporting containers allows freesite authors to make the decision for
> themselves, with the 1MB limit preventing drastic duplication on the
> network.
I am not convinced that this decision should be left to the author. If they are that concerned about it, they can upload a ZIP of the whole site separately themselves and give a link to it. At best, building it into the node is a bodge; at worst, it is counterproductive. What happens when the same files are linked from multiple pages, e.g. active links? Do you bundle the files separately into each "archive set"? Where do you draw the line?

> I see the main use for containers being to keep the _required_
> parts of a freesite together - namely the html files, layout images, PGP
> key and Activelink.

Except that for a site with more than 2 pages, this becomes extremely cumbersome to separate manually. An automated approach could analyse the HTML and the linked documents, but that has its own limitations: how do you decide how to split the files into archives? What about a file that has to go into every archive? How difficult will it be to come up with "auxiliary" archives for files accessed from multiple pages? It is logically incoherent, and cannot be handled in a way that is both generic and consistent. Therefore, I believe it should not be catered for, especially as it doesn't add any new functionality and the benefits it provides are at best questionable.

> For me, having all of those available goes a long way
> to differentiating "good" freesites from "bad" ones. Also, there should be
> some saving on bandwidth and processing by not having to deal with so many
> small files.

I disagree. Multiple small files can be downloaded in parallel; a single bigger file cannot be, to anywhere near the same degree. In fact, there are answers in the FAQ about how to make IE and Mozilla use more simultaneous connections. And if an archive file goes missing, that's it - no site at all. I do not believe that would be an improvement. Say you use FEC: how many parts will a 1 MB file be split into? 4? How is that going to be faster than downloading 20 much smaller files in parallel? It strikes me that archive-based sites are simply not the correct tool for the job in pretty much all cases. Freenet is a high-latency, potentially high-bandwidth network, but in this case that doesn't matter, because the latency of a parallel download is fixed at that of the slowest individual download. As they are all happening in parallel, the latency penalty is effectively taken once, just as it would be for the archive's split-file. Additionally, more smaller downloads will probabilistically come from more different hosts, thus maximising the use of the bandwidth on the requesting node - in effect making it faster.

> >Additionally, it means that even if you want to look at one or two pages
> >of a 100 page site, you still have to download the entire site.
>
> A lot of sites consist of a Table of Contents as the front page, with the
> content in separate files. I've always found it bad for my karma when I
> click on a very interesting link and end up with a "Data Not Found"
> message!

Ultimately, the chances of a file going missing are the same, whether it is the archive or a single file on the site. How is making sure that losses come in bigger chunks going to help, on top of having to wait longer for one big file to trickle down to your node? I do not see this as a viable alternative to proper verification and the use of insertion tools. If your files go missing, use a higher HTL or re-insert more frequently.
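To put some rough numbers on the parallelism point, here is a back-of-the-envelope sketch in Python. The per-request latency, per-transfer bandwidth and key-availability figures are invented for illustration, and the 4-block split is simply the figure from my FEC question above; treat it as a sketch of the reasoning, not a measurement of real node behaviour.

# Crude model: 20 small files fetched in parallel vs. one 1 MB container
# split into 4 FEC data blocks (the "4?" from above). All figures invented.

REQUEST_LATENCY = 30.0    # seconds before data starts flowing (made up)
BANDWIDTH = 5 * 1024      # bytes/sec per transfer (made up); parallel
                          # requests tend to hit different nodes, so each
                          # transfer is assumed to get its own share
P_KEY_FOUND = 0.9         # chance any single key is retrievable (made up)

def fetch_time(size_bytes):
    """Fixed routing latency plus transfer time for one key."""
    return REQUEST_LATENCY + size_bytes / BANDWIDTH

# Case 1: a page made of 20 x 5 KB files, all requested at once.
# The latency penalty is paid once; total time is that of the slowest fetch.
parallel_time = max(fetch_time(5 * 1024) for _ in range(20))

# Case 2: the same content wrapped in a 1 MB container, split into 4 blocks.
container_time = max(fetch_time(256 * 1024) for _ in range(4))

print("20 small files in parallel : ~%.0f s" % parallel_time)
print("1 MB container, 4 blocks   : ~%.0f s" % container_time)

# Availability for viewing a single page (check blocks ignored for brevity):
# separate files need one key; the container needs every one of its blocks.
print("Chance of the one page you wanted : %.0f%%" % (P_KEY_FOUND * 100))
print("Chance of the whole container     : %.0f%%" % (P_KEY_FOUND ** 4 * 100))

With these invented figures the parallel fetch finishes in well under half the time, and the odds of seeing the one page you actually wanted are noticeably better than the odds of reassembling the whole container.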
Regards,

Gordan

_______________________________________________
devl mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org:8080/cgi-bin/mailman/listinfo/devl
