Great--I'll switch to the development version and give that a try.

Thanks-
Joe

On 19 Apr 2010, at 11:26, Kasper Daniel Hansen wrote:

> Hi Joe
>
> This is addressed in the development version.  We now have the
> capability of giving importFromAlignedReads a (named) vector of
> filenames instead of a named list of AlignedRead objects.  This vector
> of filenames will be read in one at a time, so you just need enough
> memory to process a single lane.  I have processed around 160 lanes'
> worth of data using this approach.
>
> There is an extended example in the 'with ShortRead' vignette.
>
> importFromAlignedReads also has the capability of directly summing
> several columns (if you need this).  So let us say you have 6 files
> (lanes) and you want to end up with a database with 2 columns
> (assuming you have a 3x2 experiment and you have decided to add up
> over the lanes).  Then you can do this using a construction where the
> names of the files are like
>   "a", "a", "a", "b", "b", "b"
> (this will create two columns named "a" and "b", each holding 3 lanes'
> worth of data).
>
> In this case, all 3 lanes sharing a name will be read into memory at
> the same time - it is less memory efficient but it was much easier to
> code.  If that is impossible you should create a standard 6 column
> database and then use collapseExpData.  Summing inside
> importFromAlignedReads is more of a convenience (and speed) trick.
>
> I uploaded a new version 1.1.6 yesterday which I recommend, because of
> some documentation updates.  This version should replace 1.1.5 on the
> Bioconductor development servers sometime tomorrow.
>
> Kasper
>
>
> On Mon, Apr 19, 2010 at 11:06 AM, joseph franklin
> <[email protected]> wrote:
>> I'm addressing this to Jim Bullard, who has been really helpful answering
>> some of my questions, as well as the list, in case anyone has some advice
>> for me.
>>
>> I've started using Genominator (I'm using the release version right now) to
>> quantitate and analyze RNA-seq data, and have been really successful
>> aggregating AlignedRead objects with my own annotation tables to produce
>> per-gene counts.  I've done this with sets of 2-3 AlignedRead objects (each
>> representing an Illumina lane), but I'd like to extend the approach to a few
>> dozen lanes.  Since this is far too much data to fit in memory, I need an
>> efficient way to combine many AlignedRead objects at once that doesn't rely
>> on them being loaded as objects at the same time.
>>
>> I imagine that I need to load the objects into tables using
>> importFromAlignedReads, and then join the appropriate columns, either before
>> or after aggregation (the manual hints that afterwards is preferable).
>> However, there are a few points I'm confused about (probably resulting from
>> my limited experience with SQLite):
>>
>> - I've been unable to load a SQLite database file that was previously
>> created with importFromAlignedReads--what is the best way to open the
>> database connection--for instance, during a new R session?
>>
>> - Can AlignedRead objects only be imported (via importFromAlignedReads) as
>> named lists of two or more objects?  What about single AlignedRead objects?
>> I would imagine that a solution to my problem would be to create a separate
>> table in a database file for each of my AlignedRead objects (I made a loop
>> to do this), and then join these tables (as long as I can create a
>> connection to the database).
>>
>> I think my problems could be solved if I could load the AlignedRead objects
>> from multiple lanes into tables in a database file, load it, and join the
>> appropriate columns from the various tables (and then aggregate with the
>> annotations in a single step--this would seem to be the most
>> straightforward).  Any advice on accomplishing these steps would be much
>> appreciated.
>>
>> Thanks again,
>> Joe Franklin
>>
>> ________________________________
>> Joseph Franklin
>> Department of Cell Biology
>> Yale University
>> 295 Congress Ave, BCMM 137
>> New Haven, CT 06519
>> USA
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
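
The grouped import Kasper describes above (file names like "a", "a", "a", "b", "b", "b" ending up as two summed columns) can be illustrated with a small sketch. This is not Genominator's API or implementation: it uses Python's standard sqlite3 module, and the function name import_lanes, the table name counts, and the toy reads are all hypothetical. It also makes one deliberate design choice that differs from what Kasper says the package does: each lane is tallied and added to the database on its own, so only one lane is in memory at a time even when several lanes share a name.

```python
import sqlite3
from collections import Counter


def import_lanes(conn, named_lanes, table="counts"):
    """Sum per-position read counts into one column per distinct lane name.

    named_lanes is a list of (name, reads) pairs; reads is an iterable of
    (chr, location, strand) tuples. Names may repeat, and lanes sharing a
    name are added into the same column. Hypothetical helper, not Genominator.
    """
    # Distinct names in first-seen order become the data columns.
    # Names are interpolated into SQL, so they are assumed to be trusted.
    columns = list(dict.fromkeys(name for name, _ in named_lanes))
    col_defs = ", ".join(f'"{c}" INTEGER NOT NULL DEFAULT 0' for c in columns)
    conn.execute(
        f'CREATE TABLE "{table}" ('
        f"chr TEXT, location INTEGER, strand INTEGER, {col_defs}, "
        f"PRIMARY KEY (chr, location, strand))"
    )
    for name, reads in named_lanes:
        # Only this one lane is tallied in memory at a time.
        tally = Counter(reads)
        # Make sure a row exists for every position seen in this lane...
        conn.executemany(
            f'INSERT OR IGNORE INTO "{table}" (chr, location, strand) '
            f"VALUES (?, ?, ?)",
            list(tally.keys()),
        )
        # ...then add this lane's counts to the column for its name.
        conn.executemany(
            f'UPDATE "{table}" SET "{name}" = "{name}" + ? '
            f"WHERE chr = ? AND location = ? AND strand = ?",
            [(n, c, l, s) for (c, l, s), n in tally.items()],
        )
    conn.commit()
    return columns


# Toy stand-ins for six lanes of a 3x2 experiment (made-up reads).
lanes = [
    ("a", [("chr1", 10, 1), ("chr1", 10, 1), ("chr2", 5, -1)]),
    ("a", [("chr1", 10, 1)]),
    ("a", [("chr2", 5, -1)]),
    ("b", [("chr1", 10, 1)]),
    ("b", [("chr1", 20, 1)]),
    ("b", [("chr1", 20, 1), ("chr2", 5, -1)]),
]

conn = sqlite3.connect(":memory:")
columns = import_lanes(conn, lanes)
result = conn.execute(
    "SELECT chr, location, strand, a, b FROM counts ORDER BY chr, location"
).fetchall()
for row in result:
    print(row)
```

Passing a file path instead of ":memory:" to sqlite3.connect would keep the table on disk, so a later session could reopen it rather than re-importing; how Genominator itself reopens an existing database is not covered by this sketch.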
