Ross,

About the index files: It is much easier to have pre-built index files. However, when running a 2-pass STAR run, a user will need to generate their own reference index files based on the SJ.out.tab file created in the first pass. Is this incorporated into your tool?
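
For reference, the two-pass sequence described above can be sketched as three STAR invocations. This is only an illustration under stated assumptions, not what either wrapper does: the function name and every path are hypothetical, nothing is executed, and it assumes the STAR options `--runMode genomeGenerate`, `--sjdbFileChrStartEnd` and `--sjdbOverhang` behave as described in the STAR manual.

```python
def two_pass_commands(genome_fasta, reads, base_index, work_dir,
                      overhang=59, threads=4):
    """Build (but do not run) the STAR command lines of a manual 2-pass run.

    All file and directory names are hypothetical placeholders.
    """
    pass1_prefix = f"{work_dir}/pass1/"
    pass2_index = f"{work_dir}/pass2_index"
    pass2_prefix = f"{work_dir}/pass2/"
    # Splice junctions that STAR writes during the first pass.
    junctions = pass1_prefix + "SJ.out.tab"

    # Pass 1: align against the pre-built index to discover junctions.
    pass1 = ["STAR", "--runThreadN", str(threads),
             "--genomeDir", base_index,
             "--readFilesIn", *reads,
             "--outFileNamePrefix", pass1_prefix]

    # Regenerate the index with the pass-1 junctions inserted.
    regenerate = ["STAR", "--runMode", "genomeGenerate",
                  "--runThreadN", str(threads),
                  "--genomeDir", pass2_index,
                  "--genomeFastaFiles", genome_fasta,
                  "--sjdbFileChrStartEnd", junctions,
                  "--sjdbOverhang", str(overhang)]

    # Pass 2: realign the same reads against the regenerated index.
    pass2 = ["STAR", "--runThreadN", str(threads),
             "--genomeDir", pass2_index,
             "--readFilesIn", *reads,
             "--outFileNamePrefix", pass2_prefix]

    return [pass1, regenerate, pass2]


if __name__ == "__main__":
    for cmd in two_pass_commands("hg19.fa", ["sample.fastq"], "hg19_index", "run1"):
        print(" ".join(cmd))
```

The point of the sketch is that the second index is per-sample (or per-experiment) data, so a tool relying only on pre-built indexes would need an extra step like the middle command.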

About shared memory: I am under the impression that the latest version of STAR has deprecated this feature. I am unclear how this would help unless a single large-memory machine was dedicated to running all STAR jobs. Is this the case?

Also, does the tool merge the SAM/BAM file with the output chimeric SAM file?

David Hoover

On 9/24/2014 7:03 PM, Ross wrote:
Hi All,

That STAR wrapper (fubar in testtoolshed) was derived from one originally written by Jeremy Goecks. I modified it for multiple inputs and added a few tweaks. It has been in production use in our group for about 6 months, so I'm pretty sure it works reasonably well, in our hands at least.

I would really appreciate any available help getting it to a proven useful state - suggestions and code welcomed. I have not moved it to the main toolshed because aside from some encouragement, I've had no feedback to suggest it's working - or not. It is extremely fast - we regularly see 200-300M reads per minute in the logs!

We regularly run a whole experiment's worth (e.g. 12-24) of fastq files simultaneously with the shared memory option working on our cluster - see the readme.

STAR index files made with a gene model (which requires a valid gff3) are huge, 20-30GB for hg19, hence the need for shared memory if you run multiple jobs. That will eventually become a serious problem if you really want to allow users to make their own; we definitely do not. You need to be very careful about matching the gene model gff3 file to the reference. I had enough trouble getting it right for the few major genomes we use that I do not want users trying it and generating 25GB of rubbish every time they get it wrong.
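
One cheap guard against that kind of mismatch is to compare sequence names before building anything. The following is a minimal sketch, not part of the wrapper: the function names are hypothetical, and it only checks that every sequence name in column 1 of the gff3 also appears as a FASTA header in the reference. It says nothing about coordinates or assembly versions.

```python
def fasta_seqids(fasta_lines):
    """Collect sequence names from FASTA headers (text up to the first space)."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    }


def gff3_seqids(gff3_lines):
    """Collect column-1 sequence names from GFF3 feature lines."""
    ids = set()
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no more feature lines follow
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        ids.add(line.split("\t", 1)[0])
    return ids


def unmatched_seqids(fasta_lines, gff3_lines):
    """Return annotation sequence names that are absent from the reference."""
    return sorted(gff3_seqids(gff3_lines) - fasta_seqids(fasta_lines))


if __name__ == "__main__":
    fasta = [">chr1 some description", "ACGT", ">chr2", "GGCC"]
    gff3 = [
        "##gff-version 3",
        "chr1\tsrc\tgene\t1\t4\t.\t+\t.\tID=g1",
        "2\tsrc\tgene\t1\t4\t.\t+\t.\tID=g2",
    ]
    # "2" is reported because the reference calls that sequence "chr2".
    print(unmatched_seqids(fasta, gff3))
```

Running a check like this on open file handles takes seconds, which is a lot cheaper than discovering the problem after a 25GB index build.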

There are challenges to do with needing different indexes for different read lengths, but we are seeing fairly consistent 60bp single-end reads for most of the incoming RNA-seq experiments.
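
The read-length dependence comes from STAR's `--sjdbOverhang` option, which the STAR manual recommends setting to the read length minus one when an index is built with annotated junctions. A small sketch of how an index could be picked per read length follows; the helper names and the directory naming scheme are hypothetical, not something the wrapper does.

```python
def sjdb_overhang(read_length):
    """Overhang recommended for a given read length (read length minus one)."""
    if read_length < 2:
        raise ValueError("read length must be at least 2")
    return read_length - 1


def index_dir_for(genome, read_length, root="star_indexes"):
    """Hypothetical naming scheme: one index directory per genome and overhang."""
    return f"{root}/{genome}_sjdb{sjdb_overhang(read_length)}"


if __name__ == "__main__":
    # 60bp reads, as in most of our incoming experiments.
    print(sjdb_overhang(60))
    print(index_dir_for("hg19", 60))
```

With a scheme like this, a data manager would only need to build a handful of indexes per genome to cover the common read lengths.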

A data manager would be a boon if anyone cares to write one...


On Thu, Sep 25, 2014 at 6:55 AM, Curtis Hendrickson (Campus) <curt...@uab.edu> wrote:

    Bjorn

    We'd be interested in this tool, as well. Any idea how close to
    functional it is?
    I see it's only on TEST toolshed, and not on production, at this
    point.

    I don't see any related Trello card when searching on "star"

    Regards,
    Curtis
    Galaxy Admin @ University of Alabama at Birmingham

    -----Original Message-----
    From: galaxy-dev-boun...@lists.bx.psu.edu
    [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Björn Grüning
    Sent: Wednesday, September 24, 2014 3:15 PM
    To: galaxy-dev@lists.bx.psu.edu; hoove...@helix.nih.gov >> David Hoover
    Subject: Re: [galaxy-dev] tool for STAR RNA-seq aligner

    Hi David,

    Yes, there is initial code in the test toolshed
    (https://testtoolshed.g2.bx.psu.edu/). I think Ross has done some
    work on it.
    The main problem with STAR is that it needs special indices (and a
    lot of them), and it would be great to offer data managers for it.

    Cheers,
    Bjoern

    Am 24.09.2014 um 22:05 schrieb David Hoover:
    > Hi,
    >
    > I am developing a tool for STAR (https://code.google.com/p/rna-star/),
    > and I realize I may be reinventing another wheel.  Has anyone else
    > created a tool for STAR?  There's nothing else in the toolsheds for
    > it yet.
    >
    > David
    >
    > --------------------
    > David Hoover, PhD
    > Helix Systems Staff
    > SCB/DCSS/CIT/NIH
    > 301-435-2986
    > http://helix.nih.gov
    >
    >
    >
    >
    >
    > ___________________________________________________________
    > Please keep all replies on the list by using "reply all"
    > in your mail client.  To manage your subscriptions to this and other
    > Galaxy lists, please use the interface at:
    > http://lists.bx.psu.edu/
    >
    > To search Galaxy mailing lists use the unified search at:
    > http://galaxyproject.org/search/mailinglists/
    >