This actually is a valid way, and would work to reconnect raw files with
sidecar files/databases against the scenarios I've postulated.

The downside of it is the time to recalculate the hash.

Quick test: md5 against a directory of NEF (Nikon raw files): 135 seconds
for 721 files at about 30 MB each.
That's about 5 raw images per second, or roughly 1/5 s each.  At that rate
a library of 100,000 images would take about 5 hours to checksum.
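For anyone who wants to repeat the test, here is a minimal sketch of it in
Python: MD5 every .NEF under a directory and report files per second. The
".NEF" suffix and the chunk size are assumptions; adjust for your library.

```python
import hashlib
import time
from pathlib import Path

def md5_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def hash_library(root):
    """Hash every .NEF under root and print the throughput."""
    start = time.monotonic()
    digests = {p: md5_file(p) for p in Path(root).rglob("*.NEF")}
    elapsed = time.monotonic() - start
    if digests and elapsed > 0:
        print(f"{len(digests)} files in {elapsed:.1f} s "
              f"({len(digests) / elapsed:.1f} files/s)")
    return digests
```

Run `hash_library("/path/to/raws")` against your own directory to get the
equivalent numbers for your hardware.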

If the unique ID is in the file, then the program only has to read part of
the file instead of all of it.  This may or may not be a win.  I don't know
how much of that time is spent just positioning the read heads to slurp up
the file.  If your disks are capable of 6 Gb/s, that's roughly 600 MB/s,
allowing 20% overhead for sector layout and error-correction data.  That's
about 1/20 s per image, about a quarter of the time.  Realistically,
looking at a few drives, a 6 Gb/s drive gets about 160 MB/s with
medium-largish files.  If the reads are short -- 4K -- then the MB/s goes
to hell, but you can do about 150-200 IOPS.
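A quick back-of-envelope check of those numbers, taking the figures from
the text as assumptions (30 MB per raw file, ~160 MB/s sustained, and the
midpoint of the 150-200 IOPS range for short random reads):

```python
# Assumed figures from the discussion above.
FILE_MB = 30
SEQ_MB_S = 160        # realistic sustained throughput
IOPS = 175            # midpoint of 150-200 IOPS for short reads

full_read_s = FILE_MB / SEQ_MB_S      # read the whole file to hash it
short_read_s = 1 / IOPS               # one short read for an embedded ID

print(f"full file:  {full_read_s * 100_000 / 3600:.1f} h per 100,000 images")
print(f"short read: {short_read_s * 100_000 / 3600:.2f} h per 100,000 images")
```

So a whole-file pass over 100,000 images lands around five hours, which
matches the measured md5 rate, while a single short read per file to grab
an embedded ID would be on the order of ten minutes.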

A big downside of writing metadata back to the original file is the space
it takes in the backup.  If I change a file, the whole file gets backed
up.  You don't want to do this on a regular basis.

In my book, a good photo management solution would give you the choice.
Any combination of:
* Keep the original untouched.
* Calculate a hash of the original.
* Insert the hash into the original.
* Back up the original to a temporary folder while spot checks are done
with metadata updates.
* Dump the first N sectors of the file to a backing store before making
changes in that N sectors.

If I had to choose today, I would keep the original untouched but would want
the photo management program to write all metadata (including the original
image checksum) to all derived files.
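Combining that with Guillermo's suggestion below, a sketch of the idea:
compute the original's checksum once and stamp it into a small XMP sidecar
that can travel with every derived file.  The "OriginalImageHash" property
name here is an assumption, not a standard tag; you would pick one that
your tools are known to preserve.

```python
import hashlib
from pathlib import Path

# Hand-rolled XMP packet; the xmp:OriginalImageHash property is a
# made-up name for illustration, not a registered XMP tag.
XMP_TEMPLATE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmp="http://ns.adobe.com/xap/1.0/"
    xmp:OriginalImageHash="{hash}"/>
 </rdf:RDF>
</x:xmpmeta>
"""

def write_hash_sidecar(raw_path):
    """MD5 the raw file and write the digest into a .xmp sidecar."""
    digest = hashlib.md5(Path(raw_path).read_bytes()).hexdigest()
    sidecar = Path(raw_path).with_suffix(".xmp")
    sidecar.write_text(XMP_TEMPLATE.format(hash=digest))
    return digest, sidecar
```

Re-hashing the raw later and comparing against the sidecar value also gives
you the corruption warning Guillermo mentions.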

Regards

Sherwood



On Thu, 30 Jan 2020 at 10:17, Guillermo Rozas <[email protected]> wrote:

> > If I can do so safely, at a minimum I want a unique ID in the master
> image that can be propagated with the image to all derived ones.
> > At best I want all critical metadata -- the stuff that takes hours to
> put in -- keywords, caption, description to reside in the image, and in the
> database, and in the sidecar files, and in every derived image.
>
> An option to do that without touching the RAW file could be:
> - calculate a cryptographic hash on the original RAW file when you
> download it
> - save the hash in a sidecar file, on an XMP tag which you know is
> carried over by all the programs you use
>
> This also has the nice property of warning you if your RAW file is
> corrupted in the future.
>
> Best regards,
> Guillermo
>

____________________________________________________________________________
darktable user mailing list
to unsubscribe send a mail to [email protected]
