Thanks Jennifer,
I was getting an error when running tophat2 with a FASTA file from my
history as the reference, so I modified line 105 of the tophat wrapper
(bowtie2-build instead of bowtie-build in the command line).
Now "Tophat for Illumina Find splice junctions using RNA-seq data" runs
without error.
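For readers following along outside Galaxy: the change above amounts to building a Bowtie 2 index (.bt2 files) for the history FASTA before TopHat 2 is called. Here is a minimal sketch of that step, assuming bowtie2-build is on the PATH. The helper name build_bt2_index and the file names are hypothetical, and this is not the wrapper's actual code.

```shell
#!/usr/bin/env bash
# Build a Bowtie 2 index for a FASTA file so that TopHat 2 can find
# <base>.*.bt2. build_bt2_index is a hypothetical helper, not part of
# the Galaxy tophat wrapper.
build_bt2_index() {
    local fasta="$1" base="$2"
    if ! command -v bowtie2-build >/dev/null 2>&1; then
        echo "bowtie2-build not found on PATH" >&2
        return 1
    fi
    # Usage of the real tool: bowtie2-build <reference_in> <bt2_base>
    bowtie2-build "$fasta" "$base" >/dev/null
}

# Example with hypothetical file names:
#   build_bt2_index reference.fa reference
```

With the older bowtie-build, the same FASTA would yield .ebwt files, which is consistent with the "Could not find Bowtie 2 index files" message seen earlier in this thread.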
Thank you again for your help,
Sarah
Jennifer Jackson wrote:
Hi Sarah,
On 4/11/13 8:02 AM, Sarah Maman wrote:
Thanks Jennifer,
Sorry, my previous mail contained an error: the reference genome
from my history was in FASTA format (it was named as a GTF file, but
the format was FASTA).
So, when I run tophat with a reference genome from my history, here
is the error message (my reference genome is a FASTA file):
OK, now this looks like a tool/index mismatch problem. Most likely
rooted in a binary path issue.
Error in tophat:
[2013-04-11 14:57:12] Beginning TopHat run (v2.0.5)
-----------------------------------------------
[2013-04-11 14:57:12] Checking for Bowtie
Bowtie version: 2.0.0.7
[2013-04-11 14:57:12] Checking for Samtools
Samtools version: 0.1.19.0
[2013-04-11 14:57:13] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files
(/tmp/1078173.1.workq/tmpzxEFNK/dataset_6485.*.bt2)
Settings:
blablabla OK.....
Total time for backward call to driver() for mirror index: 00:00:57
TopHat v2.0.5
tophat -p 4 /tmp/1078173.1.workq/tmpzxEFNK/dataset_6485
/work/galaxy/database/files/006/dataset_6528.dat
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM
file.
Epilog : job finished at jeu. avril 11 14:57:18 CEST 2013
And here are my bowtie and tophat versions:
$ which bowtie
bowtie -v
0.12.8
This is good
$ which tophat
tophat -v
2.0.5
This is most likely the problem. There is probably a symbolic link
pointing from tophat -> tophat2. You will want to remove that. The
tool wrappers will be looking for the correct binary/indexes for the
version they are each dependent on. This means that if you are running
Tophat2 for Illumina, you want both tophat2 and bowtie2 to be used,
along with the bowtie2 indexes. This is detailed in the wikis I sent
in the Tophat/Bowtie sections, for both dependencies and the index set up.
My guess at this point, without seeing your exact files, is that you
need to add the index path to the bowtie2 loc file, and remove/adjust
the symbolic link as I stated above, then restart, and test again to
see if that fixes the problem.
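One quick way to see a tool/index mismatch is to check which kind of index files exist for a given basename. The sketch below relies on Bowtie 1 indexes ending in .ebwt and Bowtie 2 indexes ending in .bt2. The helper name index_kind is hypothetical.

```shell
#!/usr/bin/env bash
# Report which kind of Bowtie index exists for an index basename.
# Bowtie 1 writes <base>.1.ebwt (and related files); Bowtie 2 writes
# <base>.1.bt2. index_kind is a hypothetical helper name.
index_kind() {
    local base="$1"
    if [ -e "$base.1.bt2" ]; then
        echo "bowtie2"
    elif [ -e "$base.1.ebwt" ]; then
        echo "bowtie1"
    else
        echo "none"
    fi
}

# Example with a hypothetical basename:
#   index_kind /path/to/indexes/genome
```

If this reports anything other than "bowtie2" for the basename TopHat 2 is given, the "Could not find Bowtie 2 index files" error is the expected result.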
But we also have this available on our cluster:
bowtie2 --version is 2.0.0-beta7
This is also good, if "which bowtie" == v0.12.8 and "which bowtie2" == v2.
If "bowtie" is pointing to v2 on your cluster nodes, then remove that
symbolic link, so that it points to the correct binary (v0.12.8)
instead. The same goes for bowtie2, which should point to the v2 binary.
bowtie/tophat and bowtie2/tophat2 are not the same executable and use
different indexes - this is most likely why you had to use the bowtie
v0.12.8 loc to get tophat2 going to begin with.
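To see where each name actually resolves, and whether it is a symbolic link, something along these lines could be run on a cluster node as the galaxy user. It is only a sketch: describe_bin is a hypothetical helper name, and the four tool names are the ones discussed in this thread.

```shell
#!/usr/bin/env bash
# Show where a command name resolves on the PATH and, if that path is a
# symbolic link, what it points to. describe_bin is a hypothetical helper.
describe_bin() {
    local name="$1" path
    path=$(command -v "$name" 2>/dev/null) || true
    if [ -z "$path" ]; then
        echo "$name: not found on PATH"
    elif [ -L "$path" ]; then
        echo "$name: $path -> $(readlink "$path")"
    else
        echo "$name: $path"
    fi
}

for tool in bowtie bowtie2 tophat tophat2; do
    describe_bin "$tool"
done
```

If a line shows "tophat" or "bowtie" as a link to a v2 binary, removing that link with rm on the link path deletes only the link and leaves the target binary in place.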
Hope it works this time! Please keep replies on the list to help us
with tracking,
Jen
Galaxy team
Could you please tell me how to point to the v2 binaries (how to
change symbolic links) ?
Thanks in advance,
Sarah
Jennifer Jackson wrote:
Hi Sarah,
It still sounds like there is a path problem - this is why the tools
are looking in the wrong loc file. When bowtie2/tophat2 installs, it
will create a symbolic link that names itself as just "bowtie" or
"tophat", pointing to the v2 binaries.
When you run these, what do you get?
$ which bowtie
$ which tophat
My guess is that these are symbolic links pointing to the v2
binaries. You will want to remove these. This is noted in the NGS
set-up wiki, but easy to miss.
For the custom _reference genome_ portion below, there is a mix-up
here. A custom _reference genome_ is in fasta format, not GTF
format. I think what you are doing is using a _reference annotation_
file with the process. Both can be used with RNA-seq tools, but the
_reference genome_ is the one with the indexes. The link I sent
about _custom reference genomes_ explains how to use one of these,
if you still want to try.
I think it is worth reviewing the path and loc info, plus the output
of the binary commands above, unless this already helps you to solve
the problem on your own.
Thanks!
Jen
Galaxy team
On 4/11/13 6:16 AM, Sarah Maman wrote:
Thanks a lot Jennifer,
The restart and the full paths were OK.
I don't know why, but version 2 of Tophat (the tophat tool
available from Galaxy) looks for indexes in the bowtie-index.loc
file and not in bowtie2-index.loc.
So I've added my bowtie2 index paths to the bowtie-index.loc file
and tophat runs...
But when I want to run tophat with a reference genome from my
history, here is the error message (my reference genome is a GTF
file):
Error in tophat:
[2013-04-11 14:57:12] Beginning TopHat run (v2.0.5)
-----------------------------------------------
[2013-04-11 14:57:12] Checking for Bowtie
Bowtie version: 2.0.0.7
[2013-04-11 14:57:12] Checking for Samtools
Samtools version: 0.1.19.0
[2013-04-11 14:57:13] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files
(/tmp/1078173.1.workq/tmpzxEFNK/dataset_6485.*.bt2)
Settings:
blablabla OK.....
Total time for backward call to driver() for mirror index: 00:00:57
TopHat v2.0.5
tophat -p 4 /tmp/1078173.1.workq/tmpzxEFNK/dataset_6485
/work/galaxy/database/files/006/dataset_6528.dat
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM
file.
Epilog : job finished at jeu. avril 11 14:57:18 CEST 2013
Thanks in advance,
Sarah
Jennifer Jackson wrote:
Hi Sarah,
Let's try to sort this out. Your problem does not seem to be the
same as in the question referenced, but we can see. First - just
to double check - since setting up the genome, you have restarted
the server? If not, do that first and check to see if that fixes
the problem. Basically, you want to follow this checklist and
restarting is the final step:
http://wiki.galaxyproject.org/Admin/NGS%20Local%20Setup
If the problem persists, then would you please send a few more
details:
1 - full paths* on your system where you keep the .bt2 indexes, sam
index, and .fa file. Maybe do an "ls -l" on these dirs so we can
check the symbolic links are in place and named correctly.
* as a note, these should be "hard paths" and not symbolic (except
for the .fa links), and must have permissions set to be accessible
to the "galaxy user"
2 - lines from your bowtie2_indices.loc and sam_fa_indices.loc
files for this genome. I may have you double check your builds.txt
file later. If this doesn't sound familiar, it could be the
problem: the genome must be in there, too. See this wiki:
http://wiki.galaxyproject.org/Admin/Data%20Integration
3 - full error message you get when you try to run this using a
genome in fasta format from your history. It really shouldn't be
the same error - something is not right with the settings and a
custom genome is not actually being used if that is the case. Give
it another try and see what happens, then send that info. This is
a bit of a side case, we should get your basic install correct,
but knowing how to do this is a good thing and easy to learn.
http://wiki.galaxyproject.org/Support#Custom_reference_genome
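The first two checks in the list above can be partly automated. The sketch below assumes the .loc file is tab-separated with the index basename in the 4th column and '#' starting comment lines; that layout is an assumption to confirm against your own file. check_loc is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# For each non-comment line of a .loc file, take the index basename from
# the 4th tab-separated column (an assumed layout) and report whether
# <basename>.1.bt2 exists and is readable by the current user.
# check_loc is a hypothetical helper name.
check_loc() {
    local loc="$1" id dbkey name base
    while IFS=$'\t' read -r id dbkey name base || [ -n "$id" ]; do
        # Skip blank lines and comments.
        case "$id" in ''|'#'*) continue ;; esac
        if [ -r "$base.1.bt2" ]; then
            echo "OK $dbkey $base"
        else
            echo "MISSING $dbkey $base"
        fi
    done < "$loc"
}

# Example with a hypothetical path:
#   check_loc /path/to/galaxy/tool-data/bowtie2_indices.loc
```

Running it as the galaxy user matters, since the readability test then reflects the permissions that the jobs will actually have.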
It is OK to mask out anything like user names/groups you don't
want to share. Please keep on the list in case we need other feedback.
Thanks!
Jen
Galaxy team
On 4/10/13 3:15 AM, Sarah Maman wrote:
Hello,
When I run tophat ("Tophat for Illumina Find splice junctions
using RNA-seq data"), the job fails with truncated files.
However, index files are available and I get exactly the same
error message using a built-in index or one from my history.
Tool execution generated the following error message:
Error in tophat:
[2013-04-10 09:17:07] Beginning TopHat run (v2.0.5)
-----------------------------------------------
[2013-04-10 09:17:07] Checking for Bowtie
Bowtie version: 2.0.0.7
[2013-04-10 09:17:07] Checking for Samtools
Samtools version: 0.1.19.0
[2013-04-10 09:17:07] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files
(/work/galaxy/Danio_rerio.Zv9.62.dna.chromosome.22.fa.*.bt2)
The tool produced the following additional output:
TopHat v2.0.5
tophat -p 4 /work/galaxy/Danio_rerio.Zv9.62.dna.chromosome.22.fa
/work/galaxy/database/files/006/dataset_6528.dat
[bam_header_read] EOF marker is absent. The input is probably
truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM
file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to
index the BAM file.
Epilog : job finished at mer. avril 10 09:17:12 CEST 2013
In this post
(http://dev.list.galaxyproject.org/tophat-for-illumina-looking-in-wrong-directory-for-bowtie2-indexes-tt4658609.html#none),
no solution was found.
Do you have any idea?
Sarah Maman
--
--*--
Sarah Maman
INRA - LGC - SIGENAE
http://www.sigenae.org/
Chemin de Borde-Rouge - Auzeville - BP 52627
31326 Castanet-Tolosan cedex - FRANCE
Tel: +33(0)5.61.28.57.08
Fax: +33(0)5.61.28.57.53
--*--
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org