Hi Nirmala,

That is the problem, then. Without "gene" entries, MISO has no way of knowing which transcripts belong to which genes. The GFF3 format is hierarchical: it describes genes as parent nodes that have "mRNA" entries as their children, with each "mRNA" entry having "exon" nodes as its children. A "transcript" entry can substitute for "mRNA", but "gene" entries are required.
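One way to add the missing "gene" entries is sketched below. This is a minimal, untested-against-MISO sketch, not part of MISO itself: it assumes the file is tab-delimited GFF3 and that each "transcript" line carries a geneID attribute, as in the sample lines you pasted. The function names (parse_attributes, add_gene_entries) and the gene_key parameter are made up for illustration. It writes one "gene" line per geneID, spanning all of that gene's transcripts, and adds a Parent attribute pointing each transcript at its gene.

```python
def parse_attributes(field):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def is_transcript(row):
    """True for a 9-column, non-comment feature line of type 'transcript'."""
    cols = row.split("\t")
    return not row.startswith("#") and len(cols) == 9 and cols[2] == "transcript"


def add_gene_entries(lines, gene_key="geneID"):
    """Return GFF3 lines with a 'gene' parent added for each gene_key value.

    Assumption: gene_key ('geneID' here, as in the gffread output above)
    names the attribute that groups transcripts into genes.
    """
    rows = [line.rstrip("\n") for line in lines]

    # First pass: compute the overall start/end of each gene.
    spans = {}
    for row in rows:
        if not is_transcript(row):
            continue
        cols = row.split("\t")
        gene_id = parse_attributes(cols[8]).get(gene_key)
        if gene_id is None:
            continue
        start, end = int(cols[3]), int(cols[4])
        if gene_id in spans:
            old_start, old_end = spans[gene_id]
            spans[gene_id] = (min(old_start, start), max(old_end, end))
        else:
            spans[gene_id] = (start, end)

    # Second pass: emit each gene line just before its first transcript
    # and link every transcript to its gene through Parent.
    out = []
    emitted = set()
    for row in rows:
        if not is_transcript(row):
            out.append(row)
            continue
        cols = row.split("\t")
        attrs = parse_attributes(cols[8])
        gene_id = attrs.get(gene_key)
        if gene_id is None:
            out.append(row)
            continue
        if gene_id not in emitted:
            start, end = spans[gene_id]
            gene_cols = [cols[0], cols[1], "gene", str(start), str(end),
                         ".", cols[6], ".", "ID=" + gene_id]
            out.append("\t".join(gene_cols))
            emitted.add(gene_id)
        if "Parent" not in attrs:
            cols[8] = cols[8].rstrip(";") + ";Parent=" + gene_id
        out.append("\t".join(cols))
    return out


# Possible usage (file names are placeholders):
# with open("merged_gtfToGff3.gff3") as src, open("with_genes.gff3", "w") as dst:
#     dst.write("\n".join(add_gene_entries(src)) + "\n")
```

The result could then be passed to index_gff in place of the original file. It is worth checking the first few lines of the output by eye before indexing, since the sketch assumes the transcripts of a gene all sit on the same chromosome and strand.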
Best,
Yarden

On Jul 20, 2015, at 11:50 AM, Akula, Nirmala (NIH/NIMH) [C] <[email protected]> wrote:

> Hi,
>
> The GFF file has only "transcript and "exon" entries. No "genes".
>
> Thanks,
> Nirmala
>
> -----Original Message-----
> From: Yarden Katz [mailto:[email protected]] On Behalf Of Yarden Katz
> Sent: Friday, July 17, 2015 5:55 PM
> To: Akula, Nirmala (NIH/NIMH) [C]
> Cc: [email protected]
> Subject: Re: [miso-users] problem with GFF file
>
> Hi,
>
> Does your GFF file contain "gene" entries, or just "transcript" and "exon"
> entries?
>
> The "gene" entries are used to determine genes.
>
> Yarden
>
> On Jul 17, 2015, at 5:16 PM, Akula, Nirmala (NIH/NIMH) [C]
> <[email protected]> wrote:
>
>> Hi,
>>
>> I converted GTF file (generated by Cufflinks) to GFF3 format using Cufflinks
>> using the following command:
>>
>> gffread -E merged.gtf -o- >merged_gtfToGff3.gff3
>>
>> When I try to index merged_gtfToGff3.gff3 file using MISO I see that 0 genes
>> were loaded and genes.gff file is 0.
>>
>> [akulan@helix stringtieGtfs_v1-0-3_cuffmerged]$ index_gff --index
>> merged_gtfToGff3.gff3 mergedIndexedGff/
>> Indexing GFF...
>> - GFF: /akulan/merged_gtfToGff3.gff3
>> - Outputting to: / akulan/mergedIndexedGff
>> Loaded 0 genes
>> - Loading of genes from GFF took 199.95 seconds
>> Outputting gene records in GFF format...
>> - Output file:
>> /gpfs/gsfs4/users/akulan/transcriptome/stringtie/stringtieGtfs_v1-0-3_cuffmerged/mergedIndexedGff/genes.gff
>> - Serialization of genes from GFF took 13.24 seconds
>> Indexing of GFF took 213.18 seconds.
>>
>> Here are the top 5 lines from the gff3 file
>> # gffread -E merged.gtf -o-
>> ##gff-version 3
>> chr1 Cufflinks transcript 11869 14409 . + .
>> ID=TCONS_00000001;geneID=XLOC_000001;gene_name=DDX11L1
>> chr1 Cufflinks exon 11869 12227 . + .
>> Parent=TCONS_00000001
>> chr1 Cufflinks exon 12613 12721 . + .
>> Parent=TCONS_00000001
>> chr1 Cufflinks exon 13221 14409 . + .
>> Parent=TCONS_00000001
>> chr1 Cufflinks transcript 11869 29022 . + .
>> ID=TCONS_00000002;geneID=XLOC_000001;gene_name=DDX11L1
>>
>> Any suggestions as to why the genes are not loaded from the gff file would
>> be really helpful.
>>
>> Thank you very much.
>>
>> Regards,
>> Nirmala
>> _______________________________________________
>> miso-users mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/miso-users
>
