I like the idea of having an optional hash field in the Blob.
It could be used as an HTTP entity tag (ETag).
But the contract has to be loose, to allow it to be null (when asynchronous computation isn't yet done) without big problems for the rest of the application.

The choice of algorithm is also a problem: do we want to make it pluggable? We cannot mandate an algorithm at the repository level, as customers will have different requirements.
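
To make the two points above concrete, here is a minimal Java sketch of a blob whose hash is optional and whose algorithm is chosen by the caller. The class HashedBlob and its methods are hypothetical names used for illustration only, not the Nuxeo Blob API; the only real API used is java.security.MessageDigest. The entity tag is simply absent (null) until the digest has been computed, so callers can omit the header in the meantime.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Optional;

/** Hypothetical sketch of a blob with an optional digest; not the Nuxeo Blob API. */
final class HashedBlob {
    private final byte[] content;
    // Stays null until computeDigest() has run (for example from an asynchronous job).
    private volatile String digest;

    HashedBlob(byte[] content) {
        this.content = content.clone();
    }

    /** Empty while no digest has been computed yet. */
    Optional<String> getDigest() {
        return Optional.ofNullable(digest);
    }

    /** Computes the digest with a caller-chosen algorithm name such as "MD5" or "SHA-256". */
    void computeDigest(String algorithm) {
        try {
            byte[] raw = MessageDigest.getInstance(algorithm).digest(content);
            StringBuilder hex = new StringBuilder(raw.length * 2);
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            // Prefix with the algorithm name so values from different algorithms never collide.
            digest = algorithm + ":" + hex;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown digest algorithm: " + algorithm, e);
        }
    }

    /** Quoted entity tag value, or null when no digest is available yet. */
    String entityTag() {
        String current = digest;
        return current == null ? null : "\"" + current + "\"";
    }
}
```

Prefixing the stored value with the algorithm name is one possible way to let different customers use different algorithms without changing the field itself.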

Florent

On 26 Apr 2007, at 11:46, Olivier Grisel wrote:

Hi,

We will probably soon need to store checksums of file attachments as part of a customer project. The goal is to quickly find duplicates when importing a
bunch of files from a file-system folder into a Nuxeo workspace.
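
As an illustration of the duplicate search, the following Java sketch groups in-memory file contents by their SHA-1 checksum and keeps only the groups with more than one member. DuplicateFinder and its methods are hypothetical names, SHA-1 is just an assumed choice of algorithm, and a real import would read the files from disk and compare against checksums already stored in the repository.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that finds files with identical content by comparing checksums. */
final class DuplicateFinder {

    /** Returns the lowercase hexadecimal SHA-1 digest of the given bytes. */
    static String sha1Hex(byte[] data) {
        try {
            byte[] raw = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder hex = new StringBuilder(raw.length * 2);
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Maps each checksum shared by two or more files to the names of those files. */
    static Map<String, List<String>> findDuplicates(Map<String, byte[]> files) {
        Map<String, List<String>> byDigest = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : files.entrySet()) {
            byDigest.computeIfAbsent(sha1Hex(entry.getValue()), k -> new ArrayList<>())
                    .add(entry.getKey());
        }
        // Drop checksums that occur only once: those files have no duplicate.
        byDigest.values().removeIf(names -> names.size() < 2);
        return byDigest;
    }
}
```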

I wondered whether it would be relevant to add SHA (or MD5?) checksums by default in Nuxeo, either as a new Blob feature or in a dedicated field of the file schema
computed by a core event listener.

Possible usage:

- search / query for documents by checksum, provided that the checksum field is
indexed by the search service;
- invalidation key for a hypothetical transform service cache;
- making it easier to do integrity checks on the client side by having some
browser plugin sign only the checksum instead of the complete binary;
- additional metadata displayed in the UI so that geeks can check the integrity of their latest Prison Break^W^W^W Wikipedia ISO before burning it to a DVD;
...
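
The cache invalidation idea from the list above could look roughly like the following Java sketch. TransformCache is a hypothetical class, not part of the transform service: results are keyed by transform name plus checksum, so changed content (and therefore a changed checksum) naturally misses the cache, and a missing checksum simply bypasses it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache of transform results keyed by transform name and content checksum. */
final class TransformCache {
    private final Map<String, byte[]> results = new ConcurrentHashMap<>();

    /**
     * Returns the cached result for (transformName, digest), computing it on a miss.
     * When digest is null (checksum not computed yet) the cache is bypassed entirely.
     */
    byte[] get(String transformName, String digest, byte[] content,
               Function<byte[], byte[]> transform) {
        if (digest == null) {
            return transform.apply(content);
        }
        String key = transformName + "|" + digest;
        return results.computeIfAbsent(key, k -> transform.apply(content));
    }
}
```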

--
Olivier

_______________________________________________
ECM mailing list
[email protected]
http://lists.nuxeo.com/mailman/listinfo/ecm


--
Florent Guillaume, Director of R&D, Nuxeo
Open Source Enterprise Content Management (ECM)
http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87


