Hi Greg,

Thanks for the response. The FAQ on GTF format doesn't answer my question.
As you suggested, I selected "All fields from selected table", but the
output is plain text only, not GTF. I really need GTF format with both
transcript_id and gene_name in it. I combined the two outputs from the UCSC
refSeq table and the refFlat table. The result includes all the information
I need and looks like this:

chr1 protein_coding CDS 67162933  67163102  0.000000 - 0  gene_id
"NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
chr1 protein_coding start_codon 67163100  67163102  0.000000 - .  gene_id
"NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
chr1 protein_coding exon 67162933 67163158  0.000000 - .  gene_id
"NM_207014"; transcript_id "NM_207014"; gene_name "WDR78";
chr1 protein_coding stop_codon 58719225   58719227  0.000000 - .  gene_id
"NM_145243"; transcript_id "NM_145243"; gene_name "OMA1";
chr1 protein_coding CDS 58719228  58719434  0.000000 - 0  gene_id
"NM_145243"; transcript_id "NM_145243"; gene_name "OMA1";
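
The merge described above can be sketched roughly as follows. This is only
an illustration, not a UCSC tool: it assumes the GTF lines are tab-separated
with the accession in transcript_id, and that the refFlat dump has geneName
in column 1 and name (the accession) in column 2, as in the table schema.
The function names load_refflat_names and add_gene_name are made up for
this example.

```python
import re

# Matches the transcript_id attribute in the 9th GTF column.
TRANSCRIPT_RE = re.compile(r'transcript_id "([^"]+)"')


def load_refflat_names(refflat_lines):
    """Map transcript accession (refFlat 'name') to gene symbol ('geneName')."""
    names = {}
    for line in refflat_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        cols = line.split("\t")
        if len(cols) < 2:
            continue  # not a usable refFlat row
        gene_name, accession = cols[0], cols[1]
        names[accession] = gene_name
    return names


def add_gene_name(gtf_line, names):
    """Append gene_name to one GTF line when the accession is known."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return gtf_line.rstrip("\n")  # leave malformed lines untouched
    match = TRANSCRIPT_RE.search(fields[8])
    if match and match.group(1) in names and "gene_name" not in fields[8]:
        fields[8] = fields[8].rstrip() + ' gene_name "%s";' % names[match.group(1)]
    return "\t".join(fields)
```

To use it, read the refFlat dump once with load_refflat_names, then pass
each line of the GTF file through add_gene_name and write the results out.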


Unfortunately, it didn't work when I tried to use it in my analysis.
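
The thread does not say why the file was rejected. One thing that can be
checked independently is whether each line is well formed: GTF requires
nine tab-separated fields, and a hand-merged file can easily end up with
spaces instead of tabs. A minimal sketch of such a check follows; the
function name check_gtf_line is made up for this example, and the checks
are deliberately basic.

```python
def check_gtf_line(line):
    """Return a list of problems found in one GTF line (empty if none)."""
    problems = []
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        problems.append("expected 9 tab-separated fields, found %d" % len(fields))
        return problems  # remaining checks need all nine fields
    start, end, strand, attributes = fields[3], fields[4], fields[6], fields[8]
    if not (start.isdigit() and end.isdigit()):
        problems.append("start and end must be integers")
    elif int(start) > int(end):
        problems.append("start is greater than end")
    if strand not in ("+", "-", "."):
        problems.append("strand must be '+', '-' or '.'")
    # gene_id and transcript_id are the two mandatory GTF attributes.
    for key in ("gene_id", "transcript_id"):
        if key + ' "' not in attributes:
            problems.append("missing attribute " + key)
    return problems
```

Running this over every line of the merged file and printing any non-empty
result, together with the line number, narrows down where the format breaks.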

Do you have any other suggestions?

Thanks,
Li

On 7/11/11 6:35 PM, "Greg Roe" <[email protected]> wrote:

>Hi Li,
>
>Please see this section of our help describing the GTF file format:
>http://genome.ucsc.edu/FAQ/FAQformat.html#format4.
>
>If you want to generate the data exactly as in the table schema, select
>"All fields from selected table" as the output format in the Table
>Browser.
>
>Please let us know if you have any additional questions:
>[email protected]
>
>-
>Greg Roe
>UCSC Genome Bioinformatics Group
>
>
>
>On 7/11/11 1:13 PM, Jia, Li (NIH/NCI) [C] wrote:
>> Hi,
>>
>> I am using the Table Browser to generate an annotation file in GTF
>>format. After selecting the assembly of interest, I select:
>>
>> group: Genes and Gene Prediction Tracks
>> track: refSeq Gene
>> table: refFlat
>> output format: "GTF"--Gene transfer format
>>
>> Then I give the file a name and output the GTF file.
>>
>> My question is that my output, refFlat.GTF, is not exactly the same as
>>the described table schema. In the table schema, the fields are as follows:
>>
>> geneName     LOC100288778
>> Name         NR_028269
>> chrom        chr1
>> strand       -
>> txStart      4224
>> txEnd        7502
>> cdsStart     7502
>> cdsEnd       7502
>> exonCount    7
>> exonStarts   4224,4832,5658,6469,6719,70...
>> exonEnds     4692,4901,5810,6631,6918,72...
>>
>> but my output file is:
>> chr1    hg18_refFlat    exon    14601    14754    0.000000    -    .
>>gene_id "WASH7P"; transcript_id "WASH7P";
>> chr1    hg18_refFlat    exon    19184    19233    0.000000    -    .
>>gene_id "WASH7P"; transcript_id "WASH7P";
>> chr1    hg18_refFlat    exon    24474    25037    0.000000    -    .
>>gene_id "FAM138A"; transcript_id "FAM138A";
>> chr1    hg18_refFlat    exon    25140    25344    0.000000    -    .
>>gene_id "FAM138A"; transcript_id "FAM138A";
>>
>> It has the gene name (gene_id), but there is no real transcript_id (in
>>the output, it is the same as gene_id). In the example schema, shouldn't
>>Name be the transcript_id?
>>
>> How do I generate the table exactly like the table schema?
>>
>> Thanks,
>> Li
>> _______________________________________________
>> Genome maillist  -  [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome

