Re: [ccp4bb] database-assisted data archive
The PiMS team intends that the CCP4 records link not only with the synchrotron, but further back to crystallogenesis records in xtalPiMS, and protein production records in PiMS. The benefits this will provide include:
- if you find an unexpected piece of electron density, navigating to records that show what substances were in the sample
- designing crystallogenesis screens in the light of data not only about crystals obtained, but also about diffraction.

Paul Paukstelis rightly points out that convincing anyone to actually use such a system is hard, even though the cost of lost work is significant. To address this, we need to ensure:
- data entry is as automatic as possible
- everything joins up, so that one act of data entry has multiple payoffs.

The aim must be seamless data transfer and consistent user interfaces, all the way from target selection to structure interpretation, delivered in a way that is extensible as methods evolve, and which supports not only PX but also other methods. This is a large challenge, but it is achievable.

Andreas, in the short term I suggest you look at keeping your files in a Subversion repository. This provides a central backup, and it can easily be mapped as a folder on Linux, OS X, and Windows, because it implements the WebDAV standard. Each project can have a sub-folder.

regards,
Chris

Chris Morris chris.mor...@stfc.ac.uk
Tel: +44 (0)1925 603689 Fax: +44 (0)1925 603634 Mobile: 07921-717915
https://www.pims-lims.org/
Daresbury Lab, Daresbury, Warrington, UK, WA4 4AD

Date: Wed, 18 Aug 2010 12:19:36 +0100
From: Georgios Pelios
Subject: Re: database-assisted data archive

Dear all

As CCP4, we are currently developing the new CCP4i, which will include a database application that will store project and job data. The database schema has already been designed, but its design is not final and can be modified depending on user feedback. We are now in the process of writing the database API.
Any suggestions and ideas regarding data storage and retrieval are welcome.

George Pelios
CCP4
Re: [ccp4bb] database-assisted data archive
TARDIS and MyTARDIS (for public and private data respectively) are currently in production at the Australian Synchrotron, and the project has just received funding to expand to all data produced on all beamlines (not just macromolecular), and also to all instruments at the Australian nuclear facility ANSTO.

One of the major drawcards of this system is that for users of the Australian Synchrotron there is zero barrier to entry as far as data cataloguing and access are concerned. Once the frames come off the beamline, their headers are extracted and catalogued in a database. This is all accessible for download anywhere today via the web portal http://tardis.synchrotron.org.au under one's synchrotron user account. Information is gathered from the proposal and scheduling systems at the facility and fed to this MyTARDIS node, so there is literally nothing a user *has* to enter to have their data described and accounted for in the system.

Furthermore, an instance of MyTARDIS can be set up at the lab or institution to receive a local copy of the data and metadata. For instance, if a crystallographer from Melbourne University has a MyTARDIS set up in their lab, the MyTARDIS node at the Australian Synchrotron detects whether new data off a beamline is owned by this crystallographer and sends a copy of all data and its associated metadata for download through a local web portal - under their regular university login system. A sharing interface allows crystallographers to grant access to fellow researchers so that they can also download data and browse/search through metadata.

Later on, a user will be able to add datasets with results and log files to these catalogued raw diffraction datasets and publish them. Published data appears in the central index TARDIS.edu.au and contains a persistent handle for citation. No data is actually stored at this central index: TARDIS.edu.au simply provides rich metadata and download links to federated MyTARDIS nodes and their stored data.
There are plans to have (at least) the first diffraction image converted to JPG or PNG and stored/displayed by the web portal (as Andreas mentioned), as well as crystal quality ranking and other (e.g. XDS) processing.

As a final note, while the preferred method of data storage in TARDIS is the zero-effort one via synchrotrons, there is a method of manually depositing diffraction datasets, irrespective of date or origin. See http://tardis.edu.au/deposit for more details.

A mailing list (Google Group) has just been set up for discussion of TARDIS/MyTARDIS. Feel free to join in to keep abreast of changes and discuss finer points of the solution: http://groups.google.com/group/tardis-users
Re: [ccp4bb] database-assisted data archive
Hi everyone

Thanks for your emails. Apparently, there is a wide range of suggestions and ideas about what (and how) can be stored in a database of X-ray crystallography data. We would all like to be able to store all our research data in a database, with as little effort as possible, in a very simple way, and we should also be able to easily upload it to our laptops. How many of you think this is possible?

As I mentioned in an earlier email, the new CCP4i will include a small database. The primary function of the database will be to store data on projects, jobs, files and users. One of the objectives is to allow other programs (such as Coot, CCP4mg and iMosflm) to be more integrated with the main CCP4 suite interface, allowing data from those programs to be accessible from CCP4i and vice versa. The database will be tested thoroughly before it is released as part of the new CCP4i. We welcome user feedback.

George Pelios
CCP4
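For what it's worth, the project/job/file/user bookkeeping described above can be prototyped with a tiny relational schema. This is a sketch using SQLite, not the actual CCP4i schema (which, per George, is still under design); every table and column name here is illustrative.

```python
import sqlite3

# Minimal illustrative schema: users own projects, projects contain
# jobs (program runs), and jobs read/write files.
SCHEMA = """
CREATE TABLE user    (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                      owner INTEGER REFERENCES user(id));
CREATE TABLE job     (id INTEGER PRIMARY KEY,
                      project INTEGER REFERENCES project(id),
                      program TEXT, started TEXT, status TEXT);
CREATE TABLE file    (id INTEGER PRIMARY KEY,
                      job INTEGER REFERENCES job(id),
                      path TEXT, role TEXT);  -- role: 'input' or 'output'
"""

def open_db(path=":memory:"):
    """Create (or open) the toy project/job/file database."""
    con = sqlite3.connect(path)
    con.executescript(SCHEMA)
    return con
```

The point of even a schema this small is the join: given any output file, one query walks back to the job that made it and the project it belongs to, which is precisely the "one act of data entry, multiple payoffs" property discussed elsewhere in this thread.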
Re: [ccp4bb] database-assisted data archive
Dear Andreas,

If you really want to do this, and want to define what the data is, it's not _so_ difficult to do it yourself, with Ruby on Rails (http://rubyonrails.org/). You have to know how to script a bit, and know a bit about Model/View/Controller frameworks. http://www.youtube.com/watch?v=Gzj723LkRJY

That's not what you asked, but if you want to define what the data to be input is, you end up being unhappy with someone else's implementation.

Mark

2010/8/18 Andreas Förster
> Dear all,
>
> going through some previous lab member's data and trying to make sense of
> it, I was wondering what kind of solutions exist to simplify the archiving and
> retrieval process.
>
> In particular, what I have in mind is a web interface that allows a user
> who has just returned from the synchrotron or the in-house detector to fill
> in a few boxes (user, name of protein, mutant, light source, quality of
> data, number of frames, status of project, etc) and then upload his data
> from the USB stick, portable hard drive or remote storage.
>
> The database application would put the data in a safe place (some file
> server that's periodically backed up) and let users browse through all the
> collected data of the lab with minimal effort later.
>
> It doesn't seem too hard to implement this, which is why I'm asking if
> anyone has done so already.
>
> Thanks.
>
> Andreas
>
> --
> Andreas Förster, Research Associate
> Paul Freemont & Xiaodong Zhang Labs
> Department of Biochemistry, Imperial College London
> http://www.msf.bio.ic.ac.uk

--
Skype: markabrooks
Re: [ccp4bb] database-assisted data archive
On Wednesday 18 August 2010 11:25:19 am Andreas Förster wrote:
> Thanks to everyone for the good ideas and suggestions. Let me clarify
> what I want. A simple system that does one task. I'm with James Holton
> on complexity and with several others on wikis and databases. They're
> simple to set up and easy to use, but no one uses them, besides the one
> who implemented them. I've seen this with a lab wiki and a plasmid
> database. If the boss just approves of the project but doesn't enforce
> usage, it won't be used.
>
> That's why what I really want is an unavoidable system.

Our protocol makes use of a FileMaker database (the one Juergen Bosch mentioned earlier) that tracks all mounted crystals. It is both handy and, as you say you want (but be careful what you wish for), unavoidable. Juergen was largely responsible for setting it up in the first place, but it has remained in continuous use since then.

This works for us because the great bulk of our data collection is done using the BluIce interface to the SSRL beamlines. As a requirement for data collection, users must provide a spreadsheet that indexes each crystal and its location in the SSRL sample cassette. We create this spreadsheet directly as an export from our lab database. The database itself assigns a unique systematic directory name for each crystal. The spreadsheet is then used by the beamline software to screen and collect data from all the crystals. The beamline software fills in screening information as it goes, including the cell dimensions, etc., as determined by the automated software. The data images for each crystal are put into a uniquely named directory as specified in the spreadsheet. After the run, the updated spreadsheet is merged back into our lab database and the data images are archived keeping their systematic, uniquely determined directory names.

Yes, if you work hard at it you can manage to mess up, say, the human-interpretable meaning of the assigned systematic name.
But you cannot avoid the system altogether, because the only way to reserve a slot for your crystal in the cassette being sent for data collection is to enter its identifying information in the lab database.

There is still room to lose track of archived data at a larger scale. Last I asked, TARDIS and the like cannot really help much with this. If your 600 gigabytes of archived data from 2008 are indexed as being stored on disk XD_2008_2 in Room K407 of building HSB, it can tell you exactly what directory on that disk corresponds to the data from which crystal. Unfortunately, it doesn't tell you that in fact that disk was moved to a room down the hall 6 months ago when the lab was reorganized :-)

The drawbacks of this system are:
- I wish I knew of an open-source Linux-compatible equivalent to FileMaker. Nothing else I have looked at offered this level of easy yet controlled access via a web browser from remote locations.
- Compliance with the protocol drops to less than 100% for datasets collected at home rather than at a beamline.
- One is still faced with the issue of how to deal with archiving terabytes of data.

- Ethan

> I'm thinking of
> an uploader that sits on the file server. Only the uploader has write
> permission. The user calls the uploader because data is only backed up
> on the file server, puts the data directory name into a box and fills in
> a few other boxes (four or five) because otherwise the uploader won't
> work. The uploader interface could then be used to query the file
> server and find datasets. But the key is that the system MUST be used
> to archive data - basically like flickr, but with the tag boxes
> mandatory. It looks like TARDIS (http://tardis.edu.au/) might have
> such capabilities.
>
> The discussion regarding LIMS and ISPyB and other fancy tracking systems
> was fascinating, but I don't see those as relevant for my archiving
> task. For the same reason, xTrack doesn't fit my bill.
> I want to bury
> data, but not so deep that I don't find them should I ever need to. I
> don't care about space group or crystallization conditions or processing
> information - the CCP4_DATABASE breaks with time anyway, either because
> a user renamed directories or because the user's home directory has been
> moved to /oldhome to make space for new users. I just want to be able
> to always find old data.
>
> Going off on a tangent, associating a jpg of the first image (with
> resolution rings) to each dataset is great. Can the generation of such
> images be automated, i.e. a script for the whole directory tree?
>
> All best.
>
> Andreas
>
> On 18/08/2010 11:44, Eleanor Dodson wrote:
> > I would contact Johan Turkenburg here - he and Sam Hart have organised
> > the York data archive brilliantly - it is now pretty straightforward to
> > access any data back to ~ 1998 I think.
> >
> > Eleanor
> > j...@ysbl.york.ac.uk
> >
> > Andreas Förster wrote:
> >>
Re: [ccp4bb] database-assisted data archive
Thanks to everyone for the good ideas and suggestions. Let me clarify what I want: a simple system that does one task. I'm with James Holton on complexity and with several others on wikis and databases. They're simple to set up and easy to use, but no one uses them, besides the one who implemented them. I've seen this with a lab wiki and a plasmid database. If the boss just approves of the project but doesn't enforce usage, it won't be used.

That's why what I really want is an unavoidable system. I'm thinking of an uploader that sits on the file server. Only the uploader has write permission. The user calls the uploader because data is only backed up on the file server, puts the data directory name into a box and fills in a few other boxes (four or five) because otherwise the uploader won't work. The uploader interface could then be used to query the file server and find datasets. But the key is that the system MUST be used to archive data - basically like flickr, but with the tag boxes mandatory. It looks like TARDIS (http://tardis.edu.au/) might have such capabilities.

The discussion regarding LIMS and ISPyB and other fancy tracking systems was fascinating, but I don't see those as relevant for my archiving task. For the same reason, xTrack doesn't fit my bill. I want to bury data, but not so deep that I don't find them should I ever need to. I don't care about space group or crystallization conditions or processing information - the CCP4_DATABASE breaks with time anyway, either because a user renamed directories or because the user's home directory has been moved to /oldhome to make space for new users. I just want to be able to always find old data.

Going off on a tangent, associating a jpg of the first image (with resolution rings) to each dataset is great. Can the generation of such images be automated, i.e. a script for the whole directory tree?

All best.
Andreas

On 18/08/2010 11:44, Eleanor Dodson wrote:
I would contact Johan Turkenburg here - he and Sam Hart have organised the York data archive brilliantly - it is now pretty straightforward to access any data back to ~ 1998 I think.

Eleanor
j...@ysbl.york.ac.uk

Andreas Förster wrote:
Dear all,

going through some previous lab member's data and trying to make sense of it, I was wondering what kind of solutions exist to simplify the archiving and retrieval process.

In particular, what I have in mind is a web interface that allows a user who has just returned from the synchrotron or the in-house detector to fill in a few boxes (user, name of protein, mutant, light source, quality of data, number of frames, status of project, etc) and then upload his data from the USB stick, portable hard drive or remote storage.

The database application would put the data in a safe place (some file server that's periodically backed up) and let users browse through all the collected data of the lab with minimal effort later.

It doesn't seem too hard to implement this, which is why I'm asking if anyone has done so already.

Thanks.

Andreas
--
Andreas Förster, Research Associate
Paul Freemont & Xiaodong Zhang Labs
Department of Biochemistry, Imperial College London
http://www.msf.bio.ic.ac.uk
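The "unavoidable" uploader Andreas describes boils down to one rule: nothing is copied to the file server until every mandatory tag box is filled in. A minimal sketch of that validation gate follows; the field names are my own invention for illustration, not a proposal for a fixed schema.

```python
# Illustrative mandatory fields - four or five, as Andreas suggests.
REQUIRED = ("user", "protein", "light_source", "n_frames", "directory")

def validate(form):
    """Return the list of mandatory fields that are missing or blank.
    An empty list means the upload may proceed."""
    return [f for f in REQUIRED if not str(form.get(f, "")).strip()]

def archive(form, copy_func):
    """Refuse the upload unless the form is complete; only then does
    any data reach the write-protected file server (via copy_func)."""
    missing = validate(form)
    if missing:
        raise ValueError("missing mandatory fields: " + ", ".join(missing))
    copy_func(form["directory"])
```

Because the uploader is the only process with write permission on the archive, skipping the form is simply not possible - the same property the FileMaker/cassette protocol achieves at SSRL.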
Re: [ccp4bb] database-assisted data archive
What about XTrack? http://xray.bmc.uu.se/xtrack/

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of James Holton
Sent: 18 August 2010 16:54
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] database-assisted data archive

There is an image archiving system called TARDIS (http://tardis.edu.au/) that sounds more-or-less exactly like what you describe. I agree that it would be "nice" if you can get your synchrotron to do it for you, but since every single beamline and home-source setup in the world has already been providing you with a "database" that is more commonly called the "image header", I don't think it is too hard to imagine how accurate the data in your "database" is going to be.

If I may interject my two cents, I have found that when a user is asked to fill out a form, compliance is inversely proportional to the number of fields on the form. But far more important than that: if you ask them to answer a question that they simply don't know the answer to, they will likely skip the whole thing. An excellent example (I think) is asking for the space group BEFORE they have even taken their first snapshot of a brand new crystal. This datum is simply not known until AFTER the structure is solved! For example, is it P41 or P43? You don't "really" know that until after you see a helix in the map. What is the molecular weight? That depends on whether or not it is a complex. (If I had a nickel for every user who was certain they had a protein-DNA complex with a "very low solvent content", I would be quite rich.)

All that said, I don't think it is unreasonable to expect an image header (or any other database) to contain motor positions, detector type, wavelength, beam center etc. Clearly this is not always the case, and this problem still needs a lot of work, but my point is that we should try to write down things that we "really know" (observations) and not try to muddle the database with derived quantities (interpretations).
When it comes to what you "really know" about the sample, all you can realistically hope to be sure of is the list of chemicals that went into the drop: macromolecule sequence, salts, PEGs, and their respective concentrations. Sometimes you don't even know that! (i.e. proteolysis). However, the macromolecule sequence is INCREDIBLY useful for deriving (or at least guessing) a great many other things (such as the molecular weight, solvent content, number of heavy atom sites). The list of salts is also absolutely critical for doing radiation damage predictions.

So, as my rant comes to an end, I would strongly suggest focusing on trying to capture the important things that we actually do know, rather than confusing our poor users further by asking them to write down a lot of things that they don't.

-James Holton
MAD Scientist

Andreas Förster wrote:
> Dear all,
>
> going through some previous lab member's data and trying to make sense
> of it, I was wondering what kind of solutions exist to simplify the
> archiving and retrieval process.
>
> In particular, what I have in mind is a web interface that allows a
> user who has just returned from the synchrotron or the in-house
> detector to fill in a few boxes (user, name of protein, mutant, light
> source, quality of data, number of frames, status of project, etc) and
> then upload his data from the USB stick, portable hard drive or remote
> storage.
>
> The database application would put the data in a safe place (some file
> server that's periodically backed up) and let users browse through all
> the collected data of the lab with minimal effort later.
>
> It doesn't seem too hard to implement this, which is why I'm asking if
> anyone has done so already.
>
> Thanks.
>
> Andreas

Evotec (UK) Ltd is a limited company registered in England and Wales. Registration number: 2674265. Registered office: 114 Milton Park, Abingdon, Oxfordshire, OX14 4SA, United Kingdom.
Re: [ccp4bb] database-assisted data archive
There is an image archiving system called TARDIS (http://tardis.edu.au/) that sounds more-or-less exactly like what you describe. I agree that it would be "nice" if you can get your synchrotron to do it for you, but since every single beamline and home-source setup in the world has already been providing you with a "database" that is more commonly called the "image header", I don't think it is too hard to imagine how accurate the data in your "database" is going to be.

If I may interject my two cents, I have found that when a user is asked to fill out a form, compliance is inversely proportional to the number of fields on the form. But far more important than that: if you ask them to answer a question that they simply don't know the answer to, they will likely skip the whole thing. An excellent example (I think) is asking for the space group BEFORE they have even taken their first snapshot of a brand new crystal. This datum is simply not known until AFTER the structure is solved! For example, is it P41 or P43? You don't "really" know that until after you see a helix in the map. What is the molecular weight? That depends on whether or not it is a complex. (If I had a nickel for every user who was certain they had a protein-DNA complex with a "very low solvent content", I would be quite rich.)

All that said, I don't think it is unreasonable to expect an image header (or any other database) to contain motor positions, detector type, wavelength, beam center etc. Clearly this is not always the case, and this problem still needs a lot of work, but my point is that we should try to write down things that we "really know" (observations) and not try to muddle the database with derived quantities (interpretations).

When it comes to what you "really know" about the sample, all you can realistically hope to be sure of is the list of chemicals that went into the drop: macromolecule sequence, salts, PEGs, and their respective concentrations. Sometimes you don't even know that!
(i.e. proteolysis). However, the macromolecule sequence is INCREDIBLY useful for deriving (or at least guessing) a great many other things (such as the molecular weight, solvent content, number of heavy atom sites). The list of salts is also absolutely critical for doing radiation damage predictions.

So, as my rant comes to an end, I would strongly suggest focusing on trying to capture the important things that we actually do know, rather than confusing our poor users further by asking them to write down a lot of things that they don't.

-James Holton
MAD Scientist

Andreas Förster wrote:
Dear all,

going through some previous lab member's data and trying to make sense of it, I was wondering what kind of solutions exist to simplify the archiving and retrieval process.

In particular, what I have in mind is a web interface that allows a user who has just returned from the synchrotron or the in-house detector to fill in a few boxes (user, name of protein, mutant, light source, quality of data, number of frames, status of project, etc) and then upload his data from the USB stick, portable hard drive or remote storage.

The database application would put the data in a safe place (some file server that's periodically backed up) and let users browse through all the collected data of the lab with minimal effort later.

It doesn't seem too hard to implement this, which is why I'm asking if anyone has done so already.

Thanks.

Andreas
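As an aside, the derivations Holton mentions - molecular weight from sequence, and solvent content from it via the Matthews coefficient - each take only a few lines. The sketch below uses standard average amino acid residue masses and the common Matthews approximation (solvent fraction roughly 1 - 1.23/V_M, with V_M in cubic angstroms per dalton); it is a rough estimate for sanity-checking, not a replacement for proper cell-content analysis.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per chain for the terminal groups

def protein_mw(seq):
    """Approximate average molecular weight (Da) of one protein chain."""
    return sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER

def solvent_fraction(cell_volume, mw, z):
    """Matthews estimate: V_M = V_cell / (Z * MW),
    solvent fraction ~ 1 - 1.23 / V_M."""
    vm = cell_volume / (z * mw)
    return 1.0 - 1.23 / vm
```

This is exactly the kind of derived quantity Holton argues should be computed from the recorded sequence rather than entered by the user.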
Re: [ccp4bb] database-assisted data archive
Dear All,

I would just like to add to Enrico's mention of ISPyB. This LIMS system automatically logs all your data collected at the beamline (experimental parameters, screening images, data sets, edge scans, XRF spectra, crystal snapshots etc.), and the records are stored indefinitely. Your colleagues can also follow data collections in real time by logging on from their home labs. In addition, you can upload large amounts of information on your samples (acronym, space group, pin barcode etc.) to the database, which can be recovered at the beamline through MXCuBE and the sample changer, tying all data collections to this information. You can also track your dewars to and from the ESRF using it - even receiving an email when a dewar reaches the beamline. It has recently delved into the world of data analysis, as you can rank crystals against each other using a number of criteria. For those not in an exclusive relationship with the ESRF, you will be glad to hear it is also available at Diamond and, I believe, will be at PETRA III.

Cheers, Matt

Some links:
ISPyB: http://www.esrf.eu/UsersAndScience/Experiments/MX/How_to_use_our_beamlines/ISPYB
Sample tracking: http://www.esrf.eu/UsersAndScience/Experiments/MX/How_to_use_our_beamlines/ISPYB/ispyb-dewar-tracking
Ranking: http://www.esrf.eu/UsersAndScience/Experiments/MX/How_to_use_our_beamlines/ISPYB/ispyb-sample-ranking

Enrico Stura wrote:
Knowing where all the important files are is really all that is needed. Sophistication can come later. I would welcome a CCP4 database-assisted data archive system. Here is my contribution to the discussion:

I agree with Paul Paukstelis that getting users to use any database-assisted data archive system is the biggest obstacle. I have had problems with compliance with my system, where all that the student has to do is to provide file and directory names each Friday to keep the database up to date.
It is a simple HTML-based access system in which one can access the data through hyperlinks wherever it is stored. Users need only provide the names of the directories where the various pieces of data are stored within the accessible network, and the data manager (any HTML-competent individual) can then set up the links to the main control platform (a start-up HTML page). The advantage of such a system is that it is platform independent and needs only a well-configured browser. It is backward compatible with any old data.

George Pelios may want to consider an automated system in which Mosflm, Scala and all subsequent programs contribute to creating and updating a raw-data retrieval file on the basis of the files they have used. When the project is finished, a backup program should be able to retrieve all such files to be stored in a consolidated manner for transfer to a long-term storage server.

A brief description of the system I use for synchrotron data collection: prior to the synchrotron trip, each sample taken to the synchrotron is entered in a table that represents its position in the puck, with hyperlinks to a file describing its position in the crystallization tray (this file will have hyperlinks to crystallization and all prior preparation steps). As data is collected, a short comment is added (resolution and number of frames are included if data has been collected); as the data is transferred to the home lab, a link to the directory where the data is stored is then added. To give an idea of data quality, Mosflm and a GIMP screen capture are used to create a jpg of the first data image (with the frame filename added), which is stored in the same directory as the raw data frames. This image is accessed when clicking on the comment. Compliance with the system can be checked by clicking on comments other than "not tested". It is all manual, but it is not very time consuming once the initial HTML templates have been set up.
Still, I am looking forward to a simple CCP4-designed system that can do something similar automatically. I would also recommend looking at ISPyB, implemented at the ESRF, which is also web based: www.esrf.eu/UsersAndScience/Experiments/MX/Software/ispyb

Enrico.

--
Matthew Bowler
Structural Biology Group
European Synchrotron Radiation Facility
B.P. 220, 6 rue Jules Horowitz
F-38043 GRENOBLE CEDEX, FRANCE
Tel: +33 (0) 4.76.88.29.28 Fax: +33 (0) 4.76.88.29.04
http://www.esrf.fr/UsersAndScience/Experiments/MX/
Re: [ccp4bb] database-assisted data archive
Knowing where all the important files are is really all that is needed. Sophistication can come later. I would welcome a CCP4 database-assisted data archive system. Here is my contribution to the discussion:

I agree with Paul Paukstelis that getting users to use any database-assisted data archive system is the biggest obstacle. I have had problems with compliance with my system, where all that the student has to do is to provide file and directory names each Friday to keep the database up to date.

It is a simple HTML-based access system in which one can access the data through hyperlinks wherever it is stored. Users need only provide the names of the directories where the various pieces of data are stored within the accessible network, and the data manager (any HTML-competent individual) can then set up the links to the main control platform (a start-up HTML page). The advantage of such a system is that it is platform independent and needs only a well-configured browser. It is backward compatible with any old data.

George Pelios may want to consider an automated system in which Mosflm, Scala and all subsequent programs contribute to creating and updating a raw-data retrieval file on the basis of the files they have used. When the project is finished, a backup program should be able to retrieve all such files to be stored in a consolidated manner for transfer to a long-term storage server.

A brief description of the system I use for synchrotron data collection: prior to the synchrotron trip, each sample taken to the synchrotron is entered in a table that represents its position in the puck, with hyperlinks to a file describing its position in the crystallization tray (this file will have hyperlinks to crystallization and all prior preparation steps). As data is collected, a short comment is added (resolution and number of frames are included if data has been collected); as the data is transferred to the home lab, a link to the directory where the data is stored is then added.
To give an idea of data quality, Mosflm and a GIMP screen capture are used to create a jpg of the first data image (with the frame filename added), which is stored in the same directory as the raw data frames. This image is accessed when clicking on the comment. Compliance with the system can be checked by clicking on comments other than "not tested". It is all manual, but it is not very time consuming once the initial HTML templates have been set up.

Still, I am looking forward to a simple CCP4-designed system that can do something similar automatically. I would also recommend looking at ISPyB, implemented at the ESRF, which is also web based: www.esrf.eu/UsersAndScience/Experiments/MX/Software/ispyb

Enrico.

--
Enrico A. Stura D.Phil. (Oxon), Tel: 33 (0)1 69 08 4302
Office Room 19, Bat. 152, Tel: 33 (0)1 69 08 9449
Lab LTMB, SIMOPRO, IBiTec-S, CE Saclay, 91191 Gif-sur-Yvette, FRANCE
http://www-dsv.cea.fr/en/institutes/institute-of-biology-and-technology-saclay-ibitec-s/unites-de-recherche/department-of-molecular-engineering-of-proteins-simopro/molecular-toxinology-and-biotechnology-laboratory-ltmb/crystallogenesis-e.-stura
http://www.chem.gla.ac.uk/protein/mirror/stura/index2.html
e-mail: est...@cea.fr Fax: 33 (0)1 69 08 90 71
Re: [ccp4bb] database-assisted data archive
Dear all

As CCP4, we are currently developing the new CCP4i, which will include a database application that will store project and job data. The database schema has already been designed, but its design is not final and can be modified depending on user feedback. We are now in the process of writing the database API. Any suggestions and ideas regarding data storage and retrieval are welcome.

George Pelios
CCP4

-Original Message-
From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Andreas Förster
Sent: 18 August 2010 10:53
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] database-assisted data archive

Dear all,

going through some previous lab member's data and trying to make sense of it, I was wondering what kind of solutions exist to simplify the archiving and retrieval process.

In particular, what I have in mind is a web interface that allows a user who has just returned from the synchrotron or the in-house detector to fill in a few boxes (user, name of protein, mutant, light source, quality of data, number of frames, status of project, etc) and then upload his data from the USB stick, portable hard drive or remote storage.

The database application would put the data in a safe place (some file server that's periodically backed up) and let users browse through all the collected data of the lab with minimal effort later.

It doesn't seem too hard to implement this, which is why I'm asking if anyone has done so already.

Thanks.

Andreas
--
Andreas Förster, Research Associate
Paul Freemont & Xiaodong Zhang Labs
Department of Biochemistry, Imperial College London
http://www.msf.bio.ic.ac.uk
Re: [ccp4bb] database-assisted data archive
I did something like that for plasmids by putting together a web interface with PHP and MySQL. It was simple, maybe a little ugly, but it worked nicely. The problem was that convincing anyone to actually use it proved virtually impossible.

--paul

On 08/18/2010 04:52 AM, Andreas Förster wrote:
> [original message trimmed]

--
Paul Paukstelis, Ph.D
Assistant Professor
University of Maryland
Chemistry & Biochemistry Dept.
Center for Biomolecular Structure & Organization
pauks...@umd.edu
301-405-9933
Re: [ccp4bb] database-assisted data archive
I would contact Johan Turkenburg here - he and Sam Hart have organised the York data archive brilliantly - it is now pretty straightforward to access any data back to ~1998, I think.

Eleanor
j...@ysbl.york.ac.uk

Andreas Förster wrote:
> [original message trimmed]
Re: [ccp4bb] database-assisted data archive
Do you want the frames to be accessible too? If not, then a wiki would be an easy solution. Alternatively, a FileMaker database would do the trick too.

Jürgen
..
Jürgen Bosch
Johns Hopkins Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Phone: +1-410-614-4742
Lab: +1-410-614-4894
Fax: +1-410-955-3655
http://web.mac.com/bosch_lab/

On Aug 18, 2010, at 5:52, Andreas Förster wrote:
> [original message trimmed]
[ccp4bb] database-assisted data archive
Dear all,

going through some previous lab member's data and trying to make sense of it, I was wondering what kind of solutions exist to simplify the archiving and retrieval process.

In particular, what I have in mind is a web interface that allows a user who has just returned from the synchrotron or the in-house detector to fill in a few boxes (user, name of protein, mutant, light source, quality of data, number of frames, status of project, etc.) and then upload his data from the USB stick, portable hard drive or remote storage.

The database application would put the data in a safe place (some file server that's periodically backed up) and let users browse through all the collected data of the lab with minimal effort later.

It doesn't seem too hard to implement this, which is why I'm asking if anyone has done so already.

Thanks.

Andreas

--
Andreas Förster, Research Associate
Paul Freemont & Xiaodong Zhang Labs
Department of Biochemistry, Imperial College London
http://www.msf.bio.ic.ac.uk
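The metadata side of what Andreas asks for can be sketched in a few lines. The following is a minimal, hypothetical example using Python's built-in sqlite3; the column names follow the fields listed in his message (user, protein, mutant, light source, data quality, number of frames, project status), plus a `location` column recording where the frames were copied on the backed-up file server. The schema and helper names are illustrative, not an existing package.

```python
import sqlite3

# Minimal sketch of the archive metadata store described above.
# All names are illustrative; a real deployment would sit behind
# the web upload form and point `location` at the file server.
def open_archive_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS datasets (
            id           INTEGER PRIMARY KEY,
            user         TEXT NOT NULL,
            protein      TEXT NOT NULL,
            mutant       TEXT,
            light_source TEXT,
            data_quality TEXT,
            n_frames     INTEGER,
            status       TEXT,
            location     TEXT  -- where the frames landed on the file server
        )""")
    return conn

def register_dataset(conn, **fields):
    cols = ", ".join(fields)
    marks = ", ".join("?" for _ in fields)
    cur = conn.execute(f"INSERT INTO datasets ({cols}) VALUES ({marks})",
                       list(fields.values()))
    conn.commit()
    return cur.lastrowid

conn = open_archive_db()
rid = register_dataset(conn, user="af", protein="lysozyme",
                       light_source="synchrotron", n_frames=360,
                       status="collected",
                       location="/archive/af/lysozyme_001")
rows = conn.execute("SELECT protein, n_frames FROM datasets").fetchall()
print(rows)  # → [('lysozyme', 360)]
```

Browsing "all the collected data of the lab" then reduces to SELECT queries over this one table, which is the part that, as the other replies note, is technically easy; the hard part remains getting people to fill in the form.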