On Fri, Jul 25, 2003 at 11:52:04AM +0100, Gordan wrote:
> On Friday 25 Jul 2003 03:37, Nick Tarleton wrote:
> 
> > > If you don't zip active links this is a non issue.
> >
> > The whole point of an ActiveLink is to 1) indicate if the site is available
> > and 2) to precache the mapfile. If you're going to zip anything, the
> > ActiveLink should be there too.
> 
> There is a problem with this, too. It has, IIRC, recently been demonstrated 
> that using SSK sites with map files is considerably slower on latency than 
> referencing each file directly by CHK. In this particular case, we have an 
> additional problem of latency because the node would not only have to look up 
> the CHK for the file, but also decompress it. Not good.

That information is obsolete, it was caused by a bug which has been
fixed for months. However, decompression (and more importantly,
iterating through the ZIP) will impose some overhead, yeah (this can be
ameliorated by caching ZIPs...)
> 
> > > Well, things like NIMs wouldn't be zipped. MOST of the content on Freenet
> > > SHOULD NOT be zipped, and if it is done right it won't be. However there
> > > are places where it could help. Could you explain exactly what you mean
> > > when you say "manual pre-caching" and "automated pre-caching".
> >
> > Indeed. The best sites for zipping are infrequently updated and contain a
> > lot of separate content, Thoughtcrime being possibly the best example.
> > TFEE, though, is one site I would NOT like to see zipped, because I rarely
> > use anything but the Recently Updated list, the KSK Logs, and the flog, and
> > it would suck to download 1MB every day on dial-up.
> 
> So perhaps the limit hould be put much, much lower than 1 MB. Maybe 128KB, 
> just enough to get the HTML for the main page, the active link, and maybe the 
> description.txt, public key and in extreme cases HTML for the pages in 
> frames, if the evil frames are used.

Yeah, probably a good idea.
> 
> Even so, I am still not convinced it is a good idea.
> 
> > A ridiculous bandwidth/retrievability tradeoff for Thoughtcrime-like sites:
> > With each edition, only the new content (if content can *change*, it gets
> > less efficient) is inserted in a new zip, which is inserted under a CHK.
> > That way, if I selected something from the first edition, I would get the
> > first edition zip and everything else from the first edition.
> 
> Umm... Are you saying that there should be _incremental_ site updates using 
> archives? So we have to go and get more and more old zips to get the older 
> files? Please tell me I'm misunderstanding what you said.

_I_ am not.
> 
> > Or just use zips for multiple-file documents, like the Unabomber manifesto
> > and the 9/11 de-debunking.
> 
> What do you mean by "multiple-file documents"?
> 
> I can understand that some files should conceptually be kept together, but the 
> chances are that with any kind of growth, data duplication will become 
> detrimental. Space in the network should be treated as a luxury, not a 
> commodity.
> 
> For example, it is plausible that many sites will want to use the same skin, 
> which is fine. But if somebody decides they want to change 10% of the site 
> skin, they will still upload the whole zip, thus yielding a new "big" file in 
> the network that is 90% redundant with other content in the network.

Yeah, but balance this against load... of course, with good load
balancing (i.e. ngrouting), maybe load will be less of an issue.
> 
> The network is specifically designed to prevent duplication of files and thus 
> use the space in the most efficient way possible, and the idea of using ZIP 
> archives goes against it in a lot of cases. Even bundling the activelink with 
> the root page would be bad, especially if everyone started using it. It could 
> cause an unnecessary increase in network load for sites with a lot of links 
> on them.
> 
> I am rather surprised there is that much call for this archive feature when 
> the pure file compression feature would be much more useful (IMHO) and cause 
> none of the disadvantages that archives seem to bring with them. And yet 
> there seems to be no drive toward the transparent compression-only feature 
> which would bring benefits wihout any drawbacks, except perhaps for slightly 
> slower inserts (proportionally negligible) and slightly more latency on 
> retrieval while the file is decompressed (again, proportionally negligible). 
> In return it would save/increase bandwidth and speed up transfers to offset 
> any latency increases caused by decompression times.

Moderately useful - any large files are already compressed.
> 
> > > Yes that would work, but it would be slow with high latency. Even if the
> > > latency were as low as the WWW it would not really be good enough to do
> > > this sort of thing. Having zips would allow you to have dosens possibly
> > > hundreds of VERY SMALL images on a site as part of a theme.
> >
> > For all the important images on a site to be in a zip would be a good
> > thing; just an extension of multiple-file documents.
> 
> I personally see the segmentation of content to the n-th degree to actually be 
> useful in terms of modularity and content de-duplication. The design of data 
> storage in Freenet seems to be ideally suited to storing web-page type data 
> in an optimal fashion without wasting space, and gaining data spread in the 
> process where the same file is accessed on multiple sites, typically obvious 
> in case of active-links, but not limited to them.
> 
> Gordan

-- 
Matthew J Toseland - [EMAIL PROTECTED]
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.

Attachment: pgp00000.pgp
Description: PGP signature

Reply via email to