Hi Thomas,

> Instead of returning an InputStream, Jackrabbit would return a
> DataStoreInputStream with the additional method getDataIdentifier().
> Then the module can read the identifier, check if the item is already
> processed, and avoid reading the data itself if this identifier is
> already processed.
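
To check that I read the proposal correctly, here is a minimal sketch of how I picture it. Only DataStoreInputStream and getDataIdentifier() come from your mail; the String identifier, the ProcessingModule class and its in-memory set of processed identifiers are my own assumptions for illustration, not existing Jackrabbit API.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

public class DataIdentifierSketch {

    /** Hypothetical stream type: wraps the binary and exposes its data store identifier. */
    static class DataStoreInputStream extends FilterInputStream {
        private final String dataIdentifier; // assumption: identifier modelled as a plain String

        DataStoreInputStream(InputStream in, String dataIdentifier) {
            super(in);
            this.dataIdentifier = dataIdentifier;
        }

        String getDataIdentifier() {
            return dataIdentifier;
        }
    }

    /** Hypothetical module (e.g. text extraction) that remembers which identifiers it has handled. */
    static class ProcessingModule {
        private final Set<String> processed = new HashSet<>();
        private long bytesRead = 0;

        /** Returns true if the data was read, false if it was skipped as already processed. */
        boolean process(InputStream in) throws IOException {
            if (in instanceof DataStoreInputStream) {
                String id = ((DataStoreInputStream) in).getDataIdentifier();
                // Set.add returns false when the identifier is already known.
                if (!processed.add(id)) {
                    return false; // skip without touching the stream contents
                }
            }
            // Plain InputStreams carry no identifier, so they are always read.
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                bytesRead += n;
            }
            return true;
        }

        long getBytesRead() {
            return bytesRead;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        ProcessingModule module = new ProcessingModule();

        // Two streams over the same binary, as after an internal copy of a node.
        try (InputStream first = new DataStoreInputStream(new ByteArrayInputStream(data), "id-1");
             InputStream second = new DataStoreInputStream(new ByteArrayInputStream(data), "id-1")) {
            System.out.println("first read:  " + module.process(first));
            System.out.println("second read: " + module.process(second));
        }
        System.out.println("bytes read:  " + module.getBytesRead());
    }
}
```

In this sketch the second stream is skipped because its identifier is already in the set, so the binary is read only once. It also shows where my questions come from: each module would have to keep its own record of processed identifiers somewhere.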

What exactly does this mean? Would you store the data identifier in the index, and therefore in every module? And what would you do when a node is copied internally? The data store should know that it does not need to read the binary, so that an extra read and write to the data store is avoided.

> For text extraction, a separate
> file may make sense, but probably not for 'virus scan' because that's
> only a flag (you don't need the data). Thumbnails: for better
> performance you want to keep them together, and not save them
> separately (that is, in the data store).

Can you explain this a little more? I don't see what virus scans and thumbnails have to do with this problem. I don't think I can follow your reasoning at all.

Greets,
Claus
