Just to say that I found this post from Micha a few months back very useful/inspiring.
I'm able to import two-colour experiments quite quickly from three text
files (it might work with a single file, but I didn't try it):

1. biosources to labeled extracts (all one-to-one information)
2. labeled extracts to hybs (two-to-one information)
3. hybs, scans, raw bioassays, raw data files (all one-to-one information)

Then the only manual steps are to inherit the relevant biosource's
annotation to the raw bioassay and to run the data import plugin for each
raw bioassay.

Now that a project's default protocols are added by the batch importers
(2.9.x), I don't bother with these in the text files, which simplifies
things greatly.

Thanks!
Bob.

Micha Bayer writes:
> Hi,
>
> I thought it might be useful for others to have a step-by-step account
> of how to import a whole experiment with the new core batch importers,
> so here it goes:
>
> My aim was to import the hybridization-related information for a whole
> experiment in a way that would provide MIAME-compliance for this aspect
> of the data. I also wanted to see if I can do this working from a single
> spreadsheet (rather than having separate ones for each importer), and
> one can, which is great.
>
> The column headers of the spreadsheet I used looked like this in the
> end:
>
> RawBioAssay  FileName  ArrayName  ArrayBatch
> ArraySlide  Platform  RawDataType  Scan  Hybridization
> LabeledExtract  Dye  Extract  Sample  BioSource
> Protocol[image_analysis]  Protocol[scanning]
> Protocol[hybridization]  Protocol[labeling]  Protocol[extraction]
> Protocol[treatment]  StrainOrLine  Time
>
> In this case the last two columns contained annotation (experimental
> factors) specific to my experiment.
>
> Using this spreadsheet (and suitable import configs), I ran each of the
> batch importers for BioSource, Sample, Extract, Labelled Extract,
> Hybridization, Scan and Raw Bioassay, in this order.
> I had to first manually create a project, protocols and an array design,
> but that's fine since it is infrequent stuff, compared to the other
> entities.
>
> The last thing I did was to run the annotation batch importer on my new
> raw bioassays, which works but is not ideal because of the lack of
> inheritance from the appropriate entities upstream (this will be fixed
> in BASE 2.9 though, see this thread:
> http://www.mail-archive.com/basedb-users@lists.sourceforge.net/msg01596.html).
>
> All in all the import of the hybs using the batch importers only takes
> about 10 minutes -- that's getting very acceptable. Nice work, guys!!
>
> There is still some manual repetition involved but one could get round
> this by writing a fairly simple plugin that just calls all the other
> batch importers in turn. I'll add that to my TODO list but it might take
> me some time to get round to this as I am snowed under with lots of
> other stuff.
>
> Attached below is a more detailed point-by-point walk-through of what I
> did. Hope this is of use.
>
> cheers
> Micha
>
>
> 1.  Create all required protocols manually or check suitable
>     protocols already exist.
> 2.  Create new array design manually or check suitable design
>     already exists.
> 3.  Create a new project with default settings for platform, raw
>     data type, array design and protocols. These will be associated
>     with all entities created from here on.
> 4.  Set the new project active - this will make it the current
>     project.
> 5.  Format your hybridization data as per example above and save as
>     tab delimited text. Make sure that the names of existing entities
>     you refer to in the spreadsheet match those in the database, if
>     you are planning on matching by name. Upload this file to BASE.
> 6.  Upload raw data files to BASE and unzip in suitable directory
>     (if you want to have the files associated). (N.B. This example
>     here does not include storing raw data in the database)
> 7.  Create suitable import configurations for each of the batch
>     importers - this only needs to be done once if the same
>     spreadsheet format is used for later imports.
> 8.  Batch-import all required entities by selecting the list view of
>     each of them in turn and running their respective batch import
>     plugin with the spreadsheet as input - import configs should be
>     detected automatically. It's best to stick to the following order:
>
>     a. BioSource
>     b. Sample
>     c. Extract
>     d. Labelled Extract
>     e. Hybridization
>     f. Scan
>     g. Raw Bioassay
>
> 9.  Select all newly created bioassays and click "New
>     Experiment...". This will associate all selected bioassays with
>     the new experiment.
>
> ANNOTATION
> This is a (fairly dirty) temporary workaround which does not use
> inheritance. From BASE 2.9 on it should be possible to use inheritance
> with the mass annotation importer.
>
> 10. Check that suitable annotation types (= experimental factors)
>     exist (Administrate -> Types -> Annotation Types) or create new
>     ones with names that match the entries in the spreadsheet.
> 11. Batch-annotate all raw bioassays in the experiment. In the list
>     view of the Raw Bioassays, select "Import..." and then select the
>     Annotation Importer from the list of plugins available. A suitable
>     import config should be detected automatically. This will annotate
>     each RawBioassay with the appropriate factor value combination.
>
>
> ==================================
> Dr Micha M Bayer
> Bioinformatics Specialist
> Genetics Programme
> The Scottish Crop Research Institute
> Invergowrie
> Dundee
> DD2 5DA
> Scotland, UK
> Telephone +44(0)1382 562731 ext. 2309
> Fax +44(0)1382 562426
> http://www.scri.ac.uk/staff/michabayer
> ==================================
>
> ______________________________________________________________________
> SCRI, Invergowrie, Dundee, DD2 5DA.
> The Scottish Crop Research Institute is a charitable company limited by
> guarantee.
> ______________________________________________________________________

--
Bob MacCallum | VectorBase Developer | Kafatos/Christophides Groups |
Division of Cell and Molecular Biology | Imperial College London |
Phone +442075941945 | Email r.maccal...@imperial.ac.uk

_______________________________________________
The BASE general discussion mailing list
basedb-users@lists.sourceforge.net
unsubscribe: send a mail with subject "unsubscribe" to
basedb-users-requ...@lists.sourceforge.net
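
P.S. For anyone preparing the single-spreadsheet layout from Micha's walk-through (step 5), here is a minimal Python sketch that writes the tab-delimited file and checks that each hybridization has two labeled extracts. The column names are the ones quoted above. Everything else is an assumption made for illustration: the helper names `write_sheet` and `channels_per_hyb`, the item names and dye names in the example rows, and the one-row-per-labeled-extract layout for two-colour hybs. It only prepares the text file and does not talk to BASE.

```python
import csv
import io

# Column headers as quoted in Micha's message (single-spreadsheet layout).
COLUMNS = [
    "RawBioAssay", "FileName", "ArrayName", "ArrayBatch", "ArraySlide",
    "Platform", "RawDataType", "Scan", "Hybridization", "LabeledExtract",
    "Dye", "Extract", "Sample", "BioSource",
    "Protocol[image_analysis]", "Protocol[scanning]",
    "Protocol[hybridization]", "Protocol[labeling]", "Protocol[extraction]",
    "Protocol[treatment]", "StrainOrLine", "Time",
]


def write_sheet(rows, handle):
    """Write rows (dicts keyed by column name) as tab-delimited text.

    Missing columns are left empty; an unknown column name raises
    ValueError, which catches misspelled headers early.
    """
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def channels_per_hyb(rows):
    """Count distinct labeled extracts per hybridization (2 for two-colour)."""
    extracts = {}
    for row in rows:
        extracts.setdefault(row["Hybridization"], set()).add(row["LabeledExtract"])
    return {hyb: len(names) for hyb, names in extracts.items()}


# Hypothetical example: one two-colour hyb, one row per labeled extract.
common = {"RawBioAssay": "rba1", "FileName": "rba1.txt",
          "Scan": "scan1", "Hybridization": "hyb1"}
rows = [
    dict(common, LabeledExtract="le1", Dye="cy3", Extract="e1",
         Sample="s1", BioSource="bs1"),
    dict(common, LabeledExtract="le2", Dye="cy5", Extract="e2",
         Sample="s2", BioSource="bs2"),
]

buffer = io.StringIO()
write_sheet(rows, buffer)
print(channels_per_hyb(rows))
```

To produce a file for upload, pass an open file instead of the `StringIO` buffer, for example `open("hybs.txt", "w", newline="")`.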