[Freenet-dev] 0.3 Protocol docs, take 1

h...@finney.org Sun, 6 Aug 2000 13:54:15 -0700

I apologize for the use of text, which I know is not as portable or widely
supported as PDF.  Hopefully users can find a text-to-pdf converter so
they will be able to read this message.


Adam wrote at http://whiterose.sourceforge.net/r3proto.pdf:

> Progressive CHKs
>
> The concept of a Content Hash Key (CHK) is simple. The key is the hash
> of the content of a document.  This means that you can always check that
> you have the correct document by checking the hash. However you can only
> check the hash if you have the whole document. This is fine for small
> documents, but for large ones it can be a huge waste of bandwidth to
> download the whole document only to find it is incorrect.
>
> Progressive CHKs allow the document to be checked in stages, thus problems
> can be detected as soon as you have the incorrect block of data.
>
> The length of a block is contained in the header Storable.PartSize. When
> getting a stream of data do the following.
>  -  Set check_hash to the CHK
>  -  Read in PartSize bytes of data, call it block_data
>  -  Read in a hash. In the case of SHA1 this means 20 bytes (160 bits),
>     call it temp_hash
>  -  Read in 1 byte. This is the control byte. If it is 0x00 (CB OK)
>     then proceed. If it is 0x01 (CB RESTARTED) then an upstream node
>     has found a problem. Switch out of tunnel mode and back into message
>     parsing mode. Expect a QueryRestarted message.
>
> Hash block_data and temp_hash and assert that the hash equals
> check_hash. If it does, proceed. If it doesn't then send 0x01 as the
> control byte to all client nodes and then send QueryRestarted. The
> sending node should be viewed as evil because it lied.
>
> Copy temp_hash to check_hash. The last part might not be a whole PartSize
> bytes long if DataLength isn't big enough. In this case you should take
> PartSize to be the amount of data left.

This is a good description of a conceptually rather complex system
overall, but there are some areas that could be improved.

It's not made clear that you are also passing the data through as you
read it.  The exception is the control byte; if you read a 1, you pass
it through and abort; if it is a 0, you do the hash check and if that
is no good you send a 1 and abort, else send a 0 and proceed with the
next block.
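
To make that concrete, here is a rough Python sketch of the decision a
relaying node makes for one block.  The function name and return shape
are made up for illustration.  It assumes "hash block_data and temp_hash"
means SHA-1 over the two concatenated in that order, and it treats any
control byte other than 0x00 as a restart, neither of which the draft
spells out.

```python
import hashlib

CB_OK = b"\x00"          # control byte values named in the quoted draft
CB_RESTARTED = b"\x01"


def relay_block(check_hash, block_data, temp_hash, control_in):
    """Decide which control byte to forward for one block.

    Returns (control_out, next_check_hash).  next_check_hash is None
    when the transfer should be aborted.  (Hypothetical helper, not
    part of the protocol draft.)
    """
    if control_in != CB_OK:
        # Upstream already flagged a problem: pass the 1 through and abort.
        # Assumption: unknown control values are handled like 0x01.
        return CB_RESTARTED, None

    # Assumption: the check is SHA-1 over block_data followed by temp_hash.
    digest = hashlib.sha1(block_data + temp_hash).digest()
    if digest != check_hash:
        # The sender lied: send a 1 downstream and abort.
        return CB_RESTARTED, None

    # Block is good: send a 0 and use temp_hash to check the next block.
    return CB_OK, temp_hash
```

The block_data and temp_hash bytes themselves would already have been
forwarded as they were read; only the control byte waits on the check.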

Also, the loop structure is not completely clear.  Really, the per-block
processing happens from the step "Read in PartSize bytes of data, call
it block_data", down to "Copy temp_hash to check_hash".  The part before
this is only done once, and the part after it talks about special handling
that might be needed on the last block.
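
The loop structure I have in mind looks roughly like the Python sketch
below.  It is only an illustration: the function name is invented, the
blocks are assumed to be already split into (block_data, temp_hash)
pairs, the control byte is left out, and the check is assumed to be
SHA-1 over block_data followed by temp_hash.

```python
import hashlib


def first_bad_block(chk, parts):
    """Return the index of the first block that fails its check, or None.

    parts is an iterable of (block_data, temp_hash) pairs that have
    already been split out of the stream.  (Hypothetical helper.)
    """
    check_hash = chk  # done once, before the loop

    for index, (block_data, temp_hash) in enumerate(parts):
        # Per-block processing starts here.
        digest = hashlib.sha1(block_data + temp_hash).digest()
        if digest != check_hash:
            return index
        # Per-block processing ends here: the hash just read becomes
        # the value the next block must match.
        check_hash = temp_hash

    return None
```

Note that the only state carried from one block to the next is
check_hash.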

You might move the text relating to short final blocks to the first
per-block step, "Read in PartSize bytes", and add something like: on the
final block, read in whatever bytes remain, and treat PartSize as that
value for the rest of the processing on that block.

There is a field for the total message length in the headers, of course.
I assume it includes the hashes and control bytes?  If so, you can tell
when you have reached the final block by keeping track of how many bytes
you have left in total.  If there are n bytes left, you need to read no
more than n - 20 - 1 bytes as the block_data, to leave room for the hash
and the control byte.  If this value is <= PartSize then you are on the
last block.

If there are fewer than 21 bytes left in the last block, that would be
malformed.  I guess you should just pass the data through and then send
a 1 to indicate bad data.
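
The length bookkeeping from the last two paragraphs could be sketched in
Python as follows.  This rests on my assumption above that the total
length counts the hashes and control bytes; the function and constant
names are invented for the example.

```python
HASH_LEN = 20                      # SHA-1 digest size in bytes
CONTROL_LEN = 1                    # one control byte per block
OVERHEAD = HASH_LEN + CONTROL_LEN  # 21 bytes trail every block


def next_block_len(remaining, part_size):
    """Return (block_len, is_last) for the next block.

    remaining is the number of bytes still to come, assumed to include
    every trailing hash and control byte.  Raises ValueError when fewer
    than 21 bytes are left, which is the malformed case.
    (Hypothetical helper.)
    """
    room = remaining - OVERHEAD    # the n - 20 - 1 from the text
    if room < 0:
        raise ValueError("malformed: fewer than %d bytes left" % OVERHEAD)
    if room <= part_size:
        return room, True          # short (or exactly full) final block
    return part_size, False
```

For example, with PartSize 100 and 71 bytes left, the final block_data
is 50 bytes.  On ValueError the caller would pass the leftover bytes
through and then send a 1, as suggested above.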

Hal

_______________________________________________
Freenet-dev mailing list
Freenet-dev at lists.sourceforge.net
http://lists.sourceforge.net/mailman/listinfo/freenet-dev
