Jennifer,
What's the status of bowtie2/mm9 index on PSU main?
When I select tophat2, it offers me mm9 as a choice for built-in indexes.
However, when the job runs, I get the following error, indicating the
bowtie2/mm9 indexes are missing (below).
Any insight into whether this is expected, or what the ETA is until the index
would be installed, would be great.
I'm trying to reproduce work on PSU I ran on my local galaxy, so that we can
link to it for supplemental materials for a paper.
Thanks,
Curtis
PS - I clicked the submit bug button a few days ago, but haven't received a
response yet.
Fatal error: Tool execution failed
[2013-10-29 10:13:27] Beginning TopHat run (v2.0.9)
-----------------------------------------------
[2013-10-29 10:13:27] Checking for Bowtie
Bowtie version: 2.1.0.0
[2013-10-29 10:13:27] Checking for Samtools
Samtools version: 0.1.18.0
[2013-10-29 10:13:27] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files
(/galaxy/data/mm9/mm9full/bowtie2_index/mm9full.*.bt2)
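For anyone hitting the same error on a local Galaxy, here is a minimal sketch of a check for the files TopHat is looking for. It assumes the standard six-file layout that bowtie2-build writes for a regular (non-large) index; the function name is hypothetical and the base path is simply the one from the error message above.

```python
from pathlib import Path

# Suffixes bowtie2-build writes for a regular (non-large) index.
# Assumption: large indexes (.bt2l) are not in use here.
BT2_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def missing_bt2_files(index_base):
    """Return the expected index files that do not exist for this base path."""
    base = str(index_base)
    return [base + suffix for suffix in BT2_SUFFIXES if not Path(base + suffix).is_file()]


if __name__ == "__main__":
    # Base path copied from the TopHat error message.
    missing = missing_bt2_files("/galaxy/data/mm9/mm9full/bowtie2_index/mm9full")
    if missing:
        print("missing index files:")
        print("\n".join(missing))
    else:
        print("index complete")
```

An empty list means all six files are present under that base name; it says nothing about whether they are valid.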
From: Jennifer Jackson [mailto:j...@bx.psu.edu]
Sent: Friday, September 20, 2013 4:00 PM
To: Curtis Hendrickson (Campus)
Subject: Re: [galaxy-dev] datacache & bowtie2 for mm9 ?
Thanks Curtis,
I am actually working on getting mm9 out there right now. No promises, but it
is just one genome (well, three, including variants)! If the technical side is
a go, then I will do it. Ideally others will follow soonish. We'll see.
The last news brief has help for the Data Manager; it may be that you need to
make some config changes to get it going. I am certainly no expert - this is
Dan's and under active development - but that is where I would start.
Jen
On 9/20/13 1:25 PM, Curtis Hendrickson (Campus) wrote:
Thanks for the rapid reply! I have some questions and comments, but need to
read up on Data Managers (that admin page seems non-functional in our local
galaxy, despite being on latest code) first.
Regards,
Curtis
From: Jennifer Jackson [mailto:j...@bx.psu.edu]
Sent: Friday, September 20, 2013 2:34 PM
To: Curtis Hendrickson (Campus)
Cc: galaxy-...@bx.psu.edu
Subject: Re: [galaxy-dev] datacache & bowtie2 for mm9 ?
Hello Curtis,
The datacache was originally pointed to the data staging area and is now
pointed to the data published area. The difference is that the published area
contains data and location (.loc) files that are in sync and have completed
final testing. It is your choice about whether to use the staged-only data - it
depends how risk tolerant your project is and if you plan on testing. But, that
said, I think it is almost certainly fine or our team wouldn't have staged it
yet. A vanishingly small number of datasets are pulled back once they make it
to staging, and this is why we were comfortable pointing datacache there in the
first place (we were unable to point to the published area at first, but wanted to
make the data available ASAP).
Going forward - I can let you know that these indexes are very easy to create:
one command-line execution, then add one line to the associated .loc file.
Instructions are here, see "Bowtie and Tophat":
http://wiki.galaxyproject.org/Admin/NGS%20Local%20Setup
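As a rough sketch of those two steps: the snippet below only assembles the bowtie2-build command and the .loc entry as strings; it does not run anything. The four-column, tab-separated layout (build id, dbkey, display name, index base path) is an assumption based on the bowtie2_indices.loc.sample file shipped with Galaxy, so check it against your own copy. The FASTA path and display name are hypothetical.

```python
import shlex


def bowtie2_build_command(fasta, index_base):
    """Argument list for building a Bowtie 2 index from a reference FASTA."""
    return ["bowtie2-build", fasta, index_base]


def loc_line(build_id, dbkey, display_name, index_base):
    """One tab-separated .loc entry (assumed column order, see note above)."""
    return "\t".join([build_id, dbkey, display_name, index_base])


if __name__ == "__main__":
    index_base = "/galaxy/data/mm9/mm9full/bowtie2_index/mm9full"
    # Hypothetical location of the reference FASTA.
    cmd = bowtie2_build_command("/path/to/mm9full.fa", index_base)
    print(" ".join(shlex.quote(arg) for arg in cmd))
    # Hypothetical display name; the last column is the index base, not a file.
    print(loc_line("mm9full", "mm9", "Mouse (mm9) full", index_base))
```

Note that the last .loc column is the index base name (without the .1.bt2 etc. suffixes), which is the same form TopHat reports in its error messages.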
For one or a few genomes, this is not a problem. For hundreds of genomes with
variants, it can become tedious even with helper tools, and in our case the
processing interacted with disk that was undergoing changes (as we have been
working on system configuration most of the summer). Also, now that the Data
Manager is available, creating batch indexes for use via rsync has become a
lower priority. Even so, I would expect more indexes to be fully published once
the final configuration is in place, as many are already staged or close to
being staged (watch the yellow banner on Main).
Hopefully this helps to explain the data, guides you to making an informed
decision, and aids with creating your own indexes as needed,
Thanks!
Jen
Galaxy team
On 9/18/13 1:04 PM, Curtis Hendrickson (Campus) wrote:
Folks,
First, I wanted to thank you for making the datacache available
(http://wiki.galaxyproject.org/Admin/Data%20Integration;
rsync://datacache.g2.bx.psu.edu). It's a great resource.
However, what is the best way to stay abreast of changes to what's in
datacache, and understand how these indexes are computed?
We are currently upgrading to bowtie2, but I notice that the bowtie2 indices
for mm9, which used to be in
rsync://datacache.g2.bx.psu.edu/indexes/mm9/mm9*/bowtie2_index
have been removed, and only the hg19 genome has bowtie2 indices. Why only that
one, and not the others?
Where are the scripts you use to make these indices, in case I want to create
bowtie2 indices for other
So, how do I find out *why* they were removed? (Can I safely use the copy I
have, or was there a problem with them?)
More generally, how do I understand the policies and logic behind the datacache
indices, and be notified of changes, short of running my own periodic
rsync/diff?
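For reference, the periodic rsync/diff approach can be kept fairly small. The sketch below compares two saved listings (for example, output captured from "rsync --list-only -r" on different days) and reports what appeared or disappeared. The parsing assumes the usual five-column listing format (mode, size, date, time, path) and uses a simple heuristic to skip other lines; the function names and the sample lines are made up for illustration.

```python
def paths_from_listing(lines):
    """Extract paths from rsync --list-only style lines (assumed 5-column format)."""
    paths = []
    for line in lines:
        parts = line.rstrip("\n").split(None, 4)
        # Heuristic: a listing line starts with a 10-character mode string.
        if len(parts) == 5 and len(parts[0]) == 10:
            paths.append(parts[4])
    return paths


def diff_listings(old_paths, new_paths):
    """Return paths removed from and added to the newer listing."""
    old, new = set(old_paths), set(new_paths)
    return {"removed": sorted(old - new), "added": sorted(new - old)}


if __name__ == "__main__":
    # Made-up sample lines, not real datacache output.
    old = ["-rw-r--r--      1,024 2013/09/01 12:00:00 mm9/bowtie2_index/mm9.1.bt2",
           "-rw-r--r--      1,024 2013/09/01 12:00:00 hg19/bowtie2_index/hg19.1.bt2"]
    new = ["-rw-r--r--      1,024 2013/09/01 12:00:00 hg19/bowtie2_index/hg19.1.bt2"]
    print(diff_listings(paths_from_listing(old), paths_from_listing(new)))
```

This only shows that something changed, not why, so it does not replace an announcement or changelog from the maintainers.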
Finally, since I'm doing "reproducible research", is anything planned for
systematically versioning genome indices, so I can easily tell what version of
a system (i.e., what BWA version) was used to create the index, and be sure that
an index will not suddenly disappear?
Thanks,
Curtis
Research Associate/CTSA-Informatics Team
University of Alabama at Birmingham
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
--
Jennifer Hillman-Jackson
http://galaxyproject.org
--- Begin Message ---
GALAXY TOOL ERROR REPORT
------------------------
This error report was sent from the Galaxy instance hosted on the server
"usegalaxy.org"
-----------------------------------------------------------------------------
This is in reference to dataset id 6934567 from history id 1653696
-----------------------------------------------------------------------------
You should be able to view the history containing the related history item
26: Tophat2 on data 2, data 13, and data 1: align_summary
by logging in as a Galaxy admin user to the Galaxy instance referenced above
and pointing your browser to the following link.
usegalaxy.org/history/view?id=90532bba2005aa6b
-----------------------------------------------------------------------------
The user 'curt...@uab.edu' provided the following information:
mm9 was on the available genomes list, but index files appear to be missing.
Easy fix?
Thanks!
Curtis
-----------------------------------------------------------------------------
job id: 5977429
tool id: tophat2
job pid or drm id: 18211
-----------------------------------------------------------------------------
job command line:
tophat2 --num-threads 8 --read-mismatches 2
--read-edit-dist 2 --read-realign-edit-dist 2
-a 8 -m 0 -i 70 -I 500000
-g 20 --min-segment-intron 50 --max-segment-intron
500000 --segment-mismatches 2 --segment-length 25
--library-type fr-unstranded
--max-insertion-length 3 --max-deletion-length 3
-G /galaxy-repl/main/files/006/934/dataset_6934512.dat
--no-coverage-search
-r 200 --mate-std-dev=20
/galaxy/data/mm9/mm9full/bowtie2_index/mm9full
/galaxy-repl/main/files/006/934/dataset_6934487.dat
/galaxy-repl/main/files/006/934/dataset_6934488.dat
-----------------------------------------------------------------------------
job stderr:
Fatal error: Tool execution failed
[2013-10-24 21:33:44] Beginning TopHat run (v2.0.9)
-----------------------------------------------
[2013-10-24 21:33:44] Checking for Bowtie
Bowtie version: 2.1.0.0
[2013-10-24 21:33:44] Checking for Samtools
Samtools version: 0.1.18.0
[2013-10-24 21:33:44] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files
(/galaxy/data/mm9/mm9full/bowtie2_index/mm9full.*.bt2)
-----------------------------------------------------------------------------
job stdout:
-----------------------------------------------------------------------------
job info:
None
-----------------------------------------------------------------------------
job traceback:
None
-----------------------------------------------------------------------------
(This is an automated message).
--- End Message ---