Re: [galaxy-dev] Trackster and gff file with multiple chromosome annotations

2012-10-31 Thread Yec'han Laizet

I will modify the gff file as you mentioned and update galaxy.

Thanks a lot.

Yec'han




Yec'han LAIZET
Engineer
Plateforme Genome Transcriptome
Tel: 05 57 12 27 75
_
INRA-UMR BIOGECO 1202
Equipe Genetique
69 route d'Arcachon
33612 CESTAS



___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

 http://lists.bx.psu.edu/


Re: [galaxy-dev] Trackster and gff file with multiple chromosome annotations

2012-10-29 Thread Jeremy Goecks
> Whatever the file type I set for the gff file (gff3, gff or gtf), I get the
> transcript_id error:
>
> Traceback (most recent call last):
>   File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 91, in <module>
>     main()
>   File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 30, in main
>     for feature in read_unordered_gtf( open( in_fname, 'r' ) ):
>   File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 375, in read_unordered_gtf
>     transcript_id = line_attrs[ 'transcript_id' ]
> KeyError: 'transcript_id'

This was due to an incomplete feature. Turns out that GFF support hadn't been 
included in feature search; I've added it in -central changeset fa045aad74e9:

https://bitbucket.org/galaxy/galaxy-central/changeset/fa045aad74e90f16995e0cbb670a59e6b9becbed

> Is the gff file not correct?

I believe there is an issue with your GFF: it is using non-standard identifiers 
in the attributes (last) column. To the best of my knowledge, 'name' is not a 
valid field for connecting features in GFF3 (which is my best guess for the 
file version), but your GFF uses this field anyways.

To fix this issue, I replaced 'name' with 'ID' (which is compliant GFF3) from 
the command line:

--
% sed s/name/ID/ ~/Downloads/test.gff > ~/Downloads/test_with_ids.gff
--

and this fixed the issue. 
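
Note that the sed command above replaces the first occurrence of 'name' anywhere on each line, so it could in principle also touch a sequence name or an attribute value. A more targeted variant, as a minimal Python sketch, rewrites the key only in the ninth (attributes) column. The helper names rename_attribute_key and convert_file are hypothetical and not part of Galaxy, and the sketch assumes tab-separated columns with ';'-separated attributes:

```python
import re


def rename_attribute_key(line, old_key="name", new_key="ID"):
    """Rename one attribute key in the 9th column of a single GFF line.

    Hypothetical helper (not part of Galaxy). Assumes tab-separated columns
    and ';'-separated attributes.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]  # keep the original line ending, if any
    if not body.strip() or body.startswith("#"):
        return line  # blank lines and comments pass through unchanged
    fields = body.split("\t")
    if len(fields) < 9:
        return line  # not a 9-column feature line
    # Match the key only where an attribute starts (start of the column or
    # right after ';') and only when followed by whitespace or '='.
    pattern = r"(^|;\s*)" + re.escape(old_key) + r"(?=[\s=])"
    fields[8] = re.sub(pattern, lambda m: m.group(1) + new_key, fields[8])
    return "\t".join(fields) + ending


def convert_file(in_path, out_path):
    """Write a copy of in_path with the attribute key renamed on every line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(rename_attribute_key(line))
```

With the paths from the sed command, this would be called as convert_file("test.gff", "test_with_ids.gff"). Values that merely contain the text 'name' are left alone.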

Finally, there is a sed wrapper in the toolshed should you want to do this 
conversion in Galaxy:

http://toolshed.g2.bx.psu.edu/repository/browse_categories?sort=name&operation=view_or_manage_repository&f-deleted=False&f-free-text-search=sed&id=9652a50c5a932f3e

Best,
J.

Re: [galaxy-dev] Trackster and gff file with multiple chromosome annotations

2012-10-24 Thread Yec'han Laizet

Here are the links to get the gff and the related genome files:

http://genomeportal.jgi-psf.org/Crypa2/download/Cparasiticav2.GeneCatalog20091217.gff.gz

http://genomeportal.jgi-psf.org/Crypa2/download/Cryphonectria_parasiticav2.nuclearAssembly.unmasked.gz

Whatever the file type I set for the gff file (gff3, gff or gtf), I get 
the transcript_id error:


Traceback (most recent call last):
  File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 91, in <module>
    main()
  File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 30, in main
    for feature in read_unordered_gtf( open( in_fname, 'r' ) ):
  File "/home/pgtgal/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 375, in read_unordered_gtf
    transcript_id = line_attrs[ 'transcript_id' ]
KeyError: 'transcript_id'

If I fix the transcript_id problem, I get the other error:

Traceback (most recent call last):
  File "~/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 91, in <module>
    main()
  File "~/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 30, in main
    for feature in read_unordered_gtf( open( in_fname, 'r' ) ):
  File "~/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 389, in read_unordered_gtf
    feature = GFFFeature( None, intervals=intervals )
  File "~/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 65, in __init__
    ( interval.chrom, self.chrom ) )
ValueError: interval chrom does not match self chrom: scaffold_10 != scaffold_10


Is the gff file not correct?

PS: I am using Galaxy changeset 7828:b5bda7a5c345
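
Judging from the traceback, the ValueError is raised when lines grouped into one feature do not all report the same chromosome. A minimal Python sketch for checking a file for that situation follows. The function name ids_on_multiple_chroms is hypothetical, and the sketch assumes that features are grouped by a single attribute key (here 'name') in a tab-separated file:

```python
import re
from collections import defaultdict


def ids_on_multiple_chroms(lines, key="name"):
    """Return {identifier: set of seqids} for identifiers seen on >1 seqid.

    Hypothetical diagnostic helper (not part of Galaxy). Handles both
    'key "value"' and 'key=value' attribute styles.
    """
    seen = defaultdict(set)
    pattern = re.compile(
        r'(?:^|;\s*)' + re.escape(key) + r'[\s=]+"?([^";]+)"?'
    )
    for line in lines:
        if line.startswith("#"):
            continue  # skip comments and directives
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip anything that is not a 9-column feature line
        match = pattern.search(fields[8])
        if match:
            # Column 1 is stored unmodified so hidden differences survive.
            seen[match.group(1).strip()].add(fields[0])
    return {ident: chroms for ident, chroms in seen.items() if len(chroms) > 1}
```

Printing the returned seqids with repr() may help when two names look identical on screen, as in 'scaffold_10 != scaffold_10', since repr() exposes trailing whitespace or other invisible characters. Whether that is the cause here is an assumption, not something confirmed in this thread.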

Yec'han




Yec'han LAIZET
Engineer
Plateforme Genome Transcriptome
Tel: 05 57 12 27 75
_
INRA-UMR BIOGECO 1202
Equipe Genetique
69 route d'Arcachon
33612 CESTAS




Re: [galaxy-dev] Trackster and gff file with multiple chromosome annotations

2012-10-23 Thread Jeremy Goecks
Yes, you should be able to use a single GFF for the complete genome.

This error stems from the same issue as before, namely that Galaxy is treating 
your GFF file as GTF. 

If you think your GFF is well formatted and there is an issue with Galaxy's 
handling of GFF, please send me your GFF and I'll take a look.
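
One quick way to see which flavour a file resembles is to count the attribute keys it actually uses: GTF lines carry gene_id and transcript_id, while GFF3 connects features through ID and Parent. The following is a minimal Python sketch, not Galaxy code. The function name attribute_keys is hypothetical, and splitting on ';' assumes that no attribute value contains a semicolon:

```python
import re
from collections import Counter


def attribute_keys(lines):
    """Count the attribute keys used in the 9th column of GFF/GTF lines.

    Hypothetical diagnostic helper (not part of Galaxy).
    """
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip comments and directives
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip anything that is not a 9-column feature line
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if not attr:
                continue
            # The key is everything before the first whitespace or '='.
            key = re.split(r"[\s=]", attr, maxsplit=1)[0]
            counts[key] += 1
    return counts
```

If 'transcript_id' is missing from the result, a GTF-only reader that looks up that key, such as read_unordered_gtf in the traceback, would be expected to fail with the KeyError shown.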

Best,
J.

On Oct 23, 2012, at 9:24 AM, Yec'han Laizet wrote:

> Hello,
>
> Is it possible to load a single gff file containing the annotations of
> several chromosomes for my custom build in one step?
>
> With the current version of galaxy, it seems that I can load a gff file
> referring only to one chromosome. It's pretty tedious to load 43 gff files
> separately for my custom build...
>
> If I try, I get this error:
>
> Traceback (most recent call last):
>   File "~/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 91, in <module>
>     main()
>   File "~/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 30, in main
>     for feature in read_unordered_gtf( open( in_fname, 'r' ) ):
>   File "~/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 389, in read_unordered_gtf
>     feature = GFFFeature( None, intervals=intervals )
>   File "~/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 65, in __init__
>     ( interval.chrom, self.chrom ) )
> ValueError: interval chrom does not match self chrom: SAGS2 != SAGS1
>
> Thanks
>
> Yec'han
 
 
 
 
 
 