There is an image archiving system called TARDIS (http://tardis.edu.au/)
that sounds more-or-less exactly like what you describe.
I agree that it would be "nice" if you can get your synchrotron to do it
for you, but since every single beamline and home-source setup in the
world has already been providing you with a "database" that is more
commonly called the "image header", I don't think it is too hard to
imagine how accurate the data in your "database" is going to be.
If I may interject my two cents, I have found that when a user is asked
to fill out a form, compliance is inversely proportional to the number
of fields on the form. But far more important than that: if you ask
them to answer a question that they simply don't know the answer to,
they will likely skip the whole thing. An excellent example (I think)
is asking for the space group BEFORE they have even taken their first
snapshot of a brand new crystal. This datum is simply not known until
AFTER the structure is solved! For example, is it P41 or P43? You
don't "really" know that until after you see a helix in the map. What
is the molecular weight? That depends on whether or not it is a
complex. (If I had a nickel for every user who was certain they had a
protein-DNA complex with a "very low solvent content", I would be quite
rich.)
All that said, I don't think it is unreasonable to expect an image
header (or any other database) to contain motor positions, detector
type, wavelength, beam center etc. Clearly this is not always the case,
and this problem still needs a lot of work, but my point is that we
should try to write down things that we "really know" (observations) and
not try to muddle the database with derived quantities (interpretations).
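As a rough illustration of keeping observations apart from
interpretations, here is a minimal Python sketch. The field names
(motor_positions, detector_type, wavelength, beam_center) are
hypothetical stand-ins for this example only; real image headers use
their own keys, and this is not any particular beamline's format.

```python
# Hypothetical field names for things the instrument actually measured.
# Real image headers use their own keys; these are assumptions.
OBSERVED_FIELDS = ("motor_positions", "detector_type", "wavelength", "beam_center")


def missing_observations(header):
    """Return the observed fields that are absent or empty in a header dict."""
    return [name for name in OBSERVED_FIELDS if header.get(name) in (None, "")]


def split_record(record):
    """Separate a flat record into observations and everything else."""
    observations = {k: v for k, v in record.items() if k in OBSERVED_FIELDS}
    interpretations = {k: v for k, v in record.items() if k not in OBSERVED_FIELDS}
    return observations, interpretations


# Example: a header with two observations and one user-supplied guess.
example = {"detector_type": "example-detector", "wavelength": 1.0, "space_group": "P41"}
observed, interpreted = split_record(example)
gaps = missing_observations(example)
```

Here the space group ends up in the "interpreted" part, where it can be
revised later without touching what was actually recorded, and "gaps"
lists the observations the header failed to capture.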
When it comes to what you "really know" about the sample, all you can
realistically hope to be sure of is the list of chemicals that went into
the drop: macromolecule sequence, salts, PEGs, and their respective
concentrations. Sometimes you don't even know that! (e.g. after proteolysis).
However, the macromolecule sequence is INCREDIBLY useful for deriving
(or at least guessing) a great many other things (such as the molecular
weight, solvent content, number of heavy atom sites). The list of salts
is also absolutely critical for doing radiation damage predictions.
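To make the "derive it from the sequence" point concrete, here is a
small Python sketch. The function names are made up for this example.
The residue masses are the usual average values, and the solvent
estimate uses the common Matthews approximation (1 - 1.23/Vm), which
assumes a typical protein density. Counting methionines as candidate
selenium sites is likewise only an assumption about how the protein was
labelled. Note that the number of copies in the cell is still a guess.

```python
# Average residue masses in daltons (standard textbook values; check
# them against a trusted table before relying on the result).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water per chain for the free termini


def clean(sequence):
    """Strip whitespace and upper-case a one-letter protein sequence."""
    return "".join(sequence.split()).upper()


def molecular_weight(sequence):
    """Approximate mass in Da; raises KeyError on non-standard letters."""
    return sum(RESIDUE_MASS[aa] for aa in clean(sequence)) + WATER_MASS


def count_residue(sequence, residue):
    """Count one residue type, e.g. 'M' as possible SeMet sites (assumption)."""
    return clean(sequence).count(residue.upper())


def solvent_fraction(cell_volume_A3, copies_in_cell, mw_da):
    """Matthews estimate: Vm = V / (Z * MW), solvent ~ 1 - 1.23 / Vm."""
    vm = cell_volume_A3 / (copies_in_cell * mw_da)
    return 1.0 - 1.23 / vm
```

For instance, molecular_weight("MKT AYIAK") gives a mass for the
nine-residue peptide, and count_residue of "M" on the same string gives
one possible selenium site. If the solvent fraction comes out negative
or implausibly low, the guessed number of copies is probably wrong.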
So, as my rant comes to an end, I would strongly suggest focusing on
trying to capture the important things that we actually do know, rather
than confusing our poor users further by asking them to write down a lot
of things that they don't.
-James Holton
MAD Scientist
Andreas Förster wrote:
Dear all,
going through some previous lab member's data and trying to make sense
of it, I was wondering what kind of solutions exist to simplify the
archiving and retrieval process.
In particular, what I have in mind is a web interface that allows a
user who has just returned from the synchrotron or the in-house
detector to fill in a few boxes (user, name of protein, mutant, light
source, quality of data, number of frames, status of project, etc) and
then upload his data from the USB stick, portable hard drive or remote
storage.
The database application would put the data in a safe place (some file
server that's periodically backed up) and let users browse through all
the collected data of the lab with minimal effort later.
It doesn't seem too hard to implement this, which is why I'm asking if
anyone has done so already.
Thanks.
Andreas