Hi Greg,

Thanks for the response. The FAQ on the GTF format doesn't answer my question. As you suggested, I selected "All fields from selected table", but the output is then only available as plain text, not GTF. I really need GTF format with both transcript_id and gene_name in it. I combined the two outputs from the UCSC refseq table and the refFlat table; the result includes all the information I need, and it looks like this:

chr1 protein_coding CDS 67162933 67163102 0.000000 - 0 gene_id "NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
chr1 protein_coding start_codon 67163100 67163102 0.000000 - . gene_id "NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
chr1 protein_coding exon 67162933 67163158 0.000000 - . gene_id "NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
chr1 protein_coding stop_codon 58719225 58719227 0.000000 - . gene_id "NM_145243"; transcript_id "NM_145243"; gene_name "OMA1";
chr1 protein_coding CDS 58719228 58719434 0.000000 - 0 gene_id "NM_145243"; transcript_id "NM_145243"; gene_name "OMA1";

Unfortunately, it didn't work when I tried to use it in my analysis. Do you have any other suggestions?

Thanks,
Li

On 7/11/11 6:35 PM, "Greg Roe" <[email protected]> wrote:

>Hi Li,
>
>Please see this section of our help describing the GTF file format:
>http://genome.ucsc.edu/FAQ/FAQformat.html#format4.
>
>If you want to generate the data exactly like the table schema, for the
>output format in the Table Browser, select "All fields from selected
>table".
>
>Please let us know if you have any additional questions:
>[email protected]
>
>-
>Greg Roe
>UCSC Genome Bioinformatics Group
>
>
>
>On 7/11/11 1:13 PM, Jia, Li (NIH/NCI) [C] wrote:
>> Hi,
>>
>> I am using table browser working on generating annotation GTF format.
>>After selecting assembly of interest select:
>>
>> group: Genes and Gene Prediction Tracks
>> track: refSeq Gene
>> table: refFlat
>> output format: "GTF"--Gene transfer format
>>
>> then give the name and output the GTF file.
>>
>> My question is that my output refFlat.GTF is not exactly same as the
>>described table schema. In table schema, output format is as follows:
>>
>> geneName LOC100288778
>> Name NR_028269
>> chrom chr1
>> strand -
>> txStart 4224
>> txEnd 7502
>> cdsStart 7502
>> cdsEnd 7502
>> exonCount 7
>> exonStarts 4224,4832,5658,6469,6719,70...
>> exonEnds 4692,4901,5810,6631,6918,72...
>>
>> but my output file is:
>> chr1 hg18_refFlat exon 14601 14754 0.000000 - .
>>gene_id "WASH7P"; transcript_id "WASH7P";
>> chr1 hg18_refFlat exon 19184 19233 0.000000 - .
>>gene_id "WASH7P"; transcript_id "WASH7P";
>> chr1 hg18_refFlat exon 24474 25037 0.000000 - .
>>gene_id "FAM138A"; transcript_id "FAM138A";
>> chr1 hg18_refFlat exon 25140 25344 0.000000 - .
>>gene_id "FAM138A"; transcript_id "FAM138A";
>>
>> it has GeneName (gene_id), but there is no transcript_id (in the output,
>>it is same as gene_id). In the example schema, Name should be
>>transcript_id?
>>
>> How do I generate the table exactly like the table schema?
>>
>> Thanks,
>> Li
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
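
For reference, the merge described at the top of this message (taking a GTF whose transcript_id holds the RefSeq accession and adding gene_name from refFlat) can be sketched roughly as below. This is only a sketch under assumptions, not a UCSC tool: the function names load_refflat_names and add_gene_name are made up for illustration, the refFlat input is assumed to be a tab-separated "all fields" dump with geneName in column 1 and name in column 2 (and any header line starting with "#"), and the GTF is assumed to be tab-separated with the attributes in the ninth column, as the GTF format requires.

```python
import re

# Matches the accession inside: transcript_id "NM_207014";
TRANSCRIPT_RE = re.compile(r'transcript_id "([^"]+)";')


def load_refflat_names(lines):
    """Map transcript accession (refFlat 'name') to gene symbol (refFlat 'geneName').

    Assumes tab-separated refFlat rows; lines starting with '#' are skipped.
    """
    names = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        gene_name, transcript = fields[0], fields[1]
        names[transcript] = gene_name
    return names


def add_gene_name(gtf_lines, names):
    """Yield GTF lines with gene_name appended to the attribute column."""
    for line in gtf_lines:
        line = line.rstrip("\n")
        cols = line.split("\t")
        if len(cols) != 9:
            # Not a 9-column, tab-separated GTF record: pass it through unchanged.
            yield line
            continue
        match = TRANSCRIPT_RE.search(cols[8])
        if match and match.group(1) in names and "gene_name" not in cols[8]:
            cols[8] = cols[8].rstrip() + ' gene_name "%s";' % names[match.group(1)]
        yield "\t".join(cols)


if __name__ == "__main__":
    # Values taken from the records quoted in this thread.
    refflat = ["WDR78\tNM_207014\tchr1\t-\n"]
    gtf = ['chr1\tprotein_coding\tCDS\t67162933\t67163102\t0.000000\t-\t0\t'
           'gene_id "NM_207014"; transcript_id "NM_207014";\n']
    for record in add_gene_name(gtf, load_refflat_names(refflat)):
        print(record)
```

Writing the columns back with tabs is deliberate: records that are separated by spaces rather than tabs do not split into nine columns here and are passed through untouched, which makes that kind of formatting problem easy to spot in the output.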
