I just came up with some interesting ideas on splitfiles. Some of these may already have been discussed, although I have had a hard time finding anything on these topics in the documentation and mailing lists. The ideas are:

1. Random segment choosing. After receiving the list of pieces, the node should download them in a random order, so that when downloads fail the pieces at the end of the file are no harder to grab than the rest. Redundant segments should be included in the random list so that they are no harder to retrieve either. The node can stop downloading as soon as it has enough segments to recreate the file.

2. Download recovery. Splitfiles allow nodes to recover easily from broken downloads, so that not all of the data needs to be retransferred. This is basically automatic because of the nature of Freenet, but it would be more convenient for inexperienced users to be able to click on a partial file and finish the download. This would require a download manager client.

3. Data synchronization. Let's say I'm using Debian's CD image tool (it downloads all the files to create a pseudo-image, then uses rsync to turn it into the official image) over Freenet. The tool could link to all the files needed to construct the pseudo-image and download them from Freenet. Then it could download special splitfile metadata. This metafile would include a rolling checksum on the entire file and strong checksums on all the segments (which would already be there if CHKs are used for the segments). The pseudo-image program could then do some rsync-like magic to calculate, at every offset, where the checksums on the segments of the pseudo-image match those on the official image. The program could then download just the segments it does not have and/or the redundant segments, then reconstruct the file. If a new version of the CD is posted on Freenet, the sync program can use the same method to download only the changes.

4. Redundancy reduction. Take the previous example, where two similar versions of the Debian CD are posted on Freenet. Let's say version A is already well established in Freenet and version B is about to be inserted.
If variable-size splitfiles are used, B could be designed to reuse the identical segments in A. (This is quite different from a diff.) The checksum data for A could be downloaded and B's data compared against it. Matching segments would be listed in order in B's splitfile index, and the missing ones would be inserted. The problem, though, is that the missing data might not fit evenly between the already-existing data, so a few segments would differ in size from the others.

Even more advanced, a complete checksum index with segment connectivity could be added to Freenet. Every linked segment in Freenet would be listed there, along with each splitfile that references it. Nodes wanting to put data into Freenet could download a checksum list of the most-linked segments, then upload data using these segments. An index might imply centralization, but a decent checksum list could be pulled from the inserter's data store or some other decentralized resource.

If implemented, these ideas would make Freenet a little more complex but more efficient, in both storage and bandwidth requirements.

I would like to get into some Freenet development, but I don't have a good idea of where to start. Any suggestions?

-Scott Young
_______________________________________________
freenet-tech mailing list
[EMAIL PROTECTED]
http://lists.freenetproject.org/mailman/listinfo/tech
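
To make idea 1 (random segment choosing) concrete, here is a minimal Python sketch. It is only an illustration under assumptions: download_splitfile, the fetch callback, and the needed count are hypothetical names, not part of any Freenet API, and the sketch assumes that any needed segments out of the full list (data plus redundant) are enough to rebuild the file.

```python
import random


def download_splitfile(segment_keys, needed, fetch, rng=None):
    """Fetch segments in random order until `needed` of them have arrived.

    segment_keys: keys of all segments, data and redundant alike.
    needed:       how many segments suffice to rebuild the file (assumed).
    fetch:        hypothetical callable; returns bytes, or None on failure.
    """
    rng = rng or random.Random()
    order = list(segment_keys)
    rng.shuffle(order)              # no segment is favoured by its position

    received = {}
    for key in order:
        data = fetch(key)
        if data is not None:        # a failed fetch is simply skipped
            received[key] = data
        if len(received) >= needed:
            break                   # enough segments; stop downloading
    return received
```

Because redundant segments sit in the same shuffled list as the data segments, a failed fetch costs one extra request rather than a retry of a specific "hard" piece.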

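
The rsync-like matching in idea 3 could look roughly like the Python sketch below. The assumptions are mine, not part of the proposal: segments have a fixed size, the metadata carries one weak rolling checksum per segment (as rsync does for its blocks) alongside the strong one, and SHA-256 stands in for whatever strong checksum the segment keys provide. All function names are hypothetical.

```python
import hashlib

M = 1 << 16  # modulus for the weak checksum, as in rsync's rolling sum


def weak(block):
    """Weak checksum (a, b) that can be rolled forward one byte at a time."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def strong(block):
    """Stand-in strong checksum; a real client would use the segment key."""
    return hashlib.sha256(block).hexdigest()


def make_metadata(official, size):
    """Per-segment (weak, strong) checksums of the official file."""
    return [(weak(official[o:o + size]), strong(official[o:o + size]))
            for o in range(0, len(official), size)]


def find_local_segments(local, meta, size):
    """Return {segment index: offset in local} for segments already held.

    A trailing segment shorter than `size` is never matched in this sketch.
    """
    by_weak = {}
    for idx, (w, s) in enumerate(meta):
        by_weak.setdefault(w, []).append((idx, s))

    found = {}
    if len(local) < size:
        return found
    a, b = weak(local[:size])
    off = 0
    while True:
        # Cheap weak lookup first; confirm with the strong checksum.
        for idx, s in by_weak.get((a, b), []):
            if idx not in found and strong(local[off:off + size]) == s:
                found[idx] = off
        if off + size >= len(local):
            break
        out, new = local[off], local[off + size]
        a = (a - out + new) % M         # slide the window by one byte
        b = (b - size * out + a) % M
        off += 1
    return found


def missing_segments(local, meta, size):
    """Indices of the segments that still have to be downloaded."""
    have = find_local_segments(local, meta, size)
    return [i for i in range(len(meta)) if i not in have]
```

The weak checksum is checked at every byte offset because it is cheap to update; the strong checksum is computed only where the weak one matches, which keeps the scan of a large pseudo-image affordable.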