This actually is a valid way, and would work to reconnect raw files with sidecar file/databases against the scenarios I've postulated.
The downside is the time it takes to recalculate the hash. Quick test: running md5 against a directory of NEF (Nikon raw) files took 135 seconds for 721 files at about 30 MB each. That's about 5 raw images per second, so call it 1/5 s each. At that rate, a library of 100,000 images would take about 5.5 hours to checksum. If the unique ID is stored in the file, then the program only has to read part of the file instead of all of it. This may or may not be a win; I don't know how much of that time is just positioning the read heads to slurp up the file. If your disks are capable of 6 Gb/s, that's roughly 600 MB/s, allowing 20% overhead for sector layout and error-correction data. That's about 1/20 s per image, roughly a quarter of the measured time. Realistically, looking at a few drives, a 6 Gb/s drive gets about 160 MB/s with medium-to-largish files. If the reads are short -- 4 KB -- then the MB/s goes to hell, but you can still do about 150-200 IOPS.

A big downside of writing metadata back to the original file is the space it takes in the backup: if I change a file, the whole file gets backed up again. You don't want to do this on a regular basis.

In my book, a good photo management solution would give you the choice. Any combination of:

* Keep the original untouched.
* Calculate a hash of the original.
* Insert the hash into the original.
* Back up the original to a temporary folder while spot checks are done with metadata updates.
* Dump the first N sectors of the file to a backing store before making changes in those N sectors.

If I had to choose today I would keep the original untouched, but would want the photo management program to write all metadata (including the original image checksum) to all derived files.

Regards

Sherwood

On Thu, 30 Jan 2020 at 10:17, Guillermo Rozas <[email protected]> wrote:

> > If I can do so safely, at a minimum I want a unique ID in the master
> > image that can be propagated with the image to all derived ones.
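The back-of-the-envelope numbers above can be checked with a short sketch (all the sizes and rates are the assumed figures from the test, not measurements of your hardware):

```python
# Estimates based on the figures quoted above (assumed values).
file_mb = 30                      # average NEF size, MB
measured_per_file = 135 / 721     # seconds per file from the md5 test (~0.19 s)
library = 100_000                 # image count

print(f"measured: {measured_per_file:.2f} s/file, "
      f"full library: {measured_per_file * library / 3600:.1f} h")

# Theoretical sequential read: 6 Gb/s interface minus ~20% overhead.
link_mbps = 6000 / 8 * 0.8        # = 600 MB/s
print(f"sequential 600 MB/s: {file_mb / link_mbps:.3f} s/file")

# Realistic spinning disk: ~160 MB/s sustained.
print(f"realistic 160 MB/s: {file_mb / 160:.2f} s/file")
```

Running it reproduces the numbers in the text: roughly 0.19 s per file measured (about 5 hours for 100,000 images), 0.05 s per file at the theoretical 600 MB/s, and about 0.19 s per file at a realistic 160 MB/s.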
> > At best I want all critical metadata -- the stuff that takes hours to
> > put in -- keywords, caption, description to reside in the image, and in the
> > database, and in the sidecar files, and in every derived image.
>
> An option to do that without touching the RAW file could be:
> - calculate a cryptographic hash on the original RAW file when you
> download it
> - save the hash in a sidecar file, on an XMP tag which you know is
> carried over by all the programs you use
>
> This also has the nice property of warning you if your RAW file is
> corrupted in the future.
>
> Best regards,
> Guillermo
>
> ____________________________________________________________________________
> darktable user mailing list
> to unsubscribe send a mail to
> [email protected]
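For what it's worth, the hash-plus-sidecar idea could be sketched roughly like this. This is only an illustration: it writes the digest to a plain `.hash` text file next to the RAW rather than to a proper XMP tag, and the helper names are made up. The RAW itself is never touched.

```python
import hashlib
from pathlib import Path

def file_hash(path, algo="sha256", chunk=1 << 20):
    """Hash a file in chunks so a large RAW never has to fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def write_sidecar(raw_path):
    """At download time: record the original's hash next to it."""
    raw = Path(raw_path)
    digest = file_hash(raw)
    Path(str(raw) + ".hash").write_text(digest + "\n")
    return digest

def verify(raw_path):
    """Later: return False if the RAW no longer matches its recorded hash."""
    raw = Path(raw_path)
    recorded = Path(str(raw) + ".hash").read_text().strip()
    return file_hash(raw) == recorded
```

The chunked read matters at this scale: it keeps memory flat whether the file is 30 MB or 300 MB, and `verify()` gives you the corruption warning Guillermo mentions.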
