Hi Katrina,

Thanks for your response. I think I was just confused because you had a 
typo in your message that said "dumping the files in gtf format".

I tested the command you gave me, which actually produces the same 
output as the command I used previously with gtfToGenePred (except with 
an index column at the beginning). Consequently, several transcripts are 
still missing compared to what is available on the genome browser for 
wgEncodeGencodeAutoV4.gtf.gz (described in my original email).
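
In case it helps to reproduce this, here is a minimal sketch of one way the two outputs could be compared. It assumes tab-separated genePred lines with the transcript name in the first column, or in the second column when the leading index column is present. The function names are hypothetical and not part of any UCSC tool.

```python
def transcript_names(lines, has_index_column=False):
    """Collect transcript names from tab-separated genePred lines.

    Assumption: the name is column 0, or column 1 when the file
    starts with an extra index column (as in the table browser dump).
    """
    name_index = 1 if has_index_column else 0
    names = set()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        names.add(fields[name_index])
    return names


def missing_transcripts(converted_lines, browser_lines,
                        converted_has_index=False, browser_has_index=True):
    """Return sorted names present in the browser dump but not in the conversion."""
    converted = transcript_names(converted_lines, converted_has_index)
    browser = transcript_names(browser_lines, browser_has_index)
    return sorted(browser - converted)
```

Passing two open file handles (one for the converted file, one for the table browser download) would list the transcripts that were dropped during conversion.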

Thanks,
Pouya

On 08/30/2010 11:13 AM, Katrina Learned wrote:
> Hi Pouya,
>
> I am sorry if my answer wasn't clear. At this time, we do not have these
> files in genePred format available for download. It is, however,
> something our management is considering. The command provided in my
> previous email should convert the files into genePred format.
>
> In the future, please direct your questions to the genome mailing list
> at gen...@soe.ucsc.edu -- our moderated forum for user questions and
> discussion. You will likely get a quicker response to your question.
>
> Katrina Learned
> UCSC Genome Bioinformatics Group
>
> Pouya Kheradpour wrote, On 08/28/10 10:36:
>> Hi Katrina,
>>
>> I will look at that command, thanks. I agree about downloading and
>> working with files locally, but where can I download the raw file in
>> gp format? (That is what I want, not gtf.)
>>
>> Thanks!
>> Pouya
>>
>> On 08/27/2010 07:03 PM, Katrina Learned wrote:
>>> Hi Pouya,
>>>
>>> Please note that Gencode Genes V4 has not been through our QA
>>> process yet, so please keep that in mind as you use the data.
>>> One of our engineers has suggested that this command should work for
>>> you:
>>>
>>> gunzip -c wgEncodeGencodeAutoV4.gtf.gz | ldHgGene -gtf -genePredExt hg19
>>> wgEncodeGencodeAutoV4 stdin -out=wgEncodeGencodeAutoV4.genePred
>>>
>>> Here's the information on -out from ldHgGene usage statement:
>>> -out=gpfile write output, in genePred format, instead of loading table.
>>> Database is ignored.
>>>
>>> Regarding your question about the table browser being fixed, for large
>>> files like this we feel that it is actually better to download them and
>>> work with them locally. However, I have asked our management to consider
>>> dumping the files in gtf format on our download server as we do with
>>> many of our other tables.
>>>
>>> Please don't hesitate to contact the mail list again if you have any
>>> further questions.
>>>
>>> Katrina Learned
>>> UCSC Genome Bioinformatics Group
>>>
>>> Pouya Kheradpour wrote, On 08/24/10 14:08:
>>>> My goal is to have both wgEncodeGencodeManualV4 and
>>>> wgEncodeGencodeAutoV4 in GenePred format.
>>>>
>>>> I tried to download the wgEncodeGencodeManualV4 table from the test
>>>> browser. For some reason the download stalls partway through and the
>>>> file is cut off (chromosomes 17-22 are completely missing and
>>>> chromosome 16 is only partially there; the file ends after exactly
>>>> 409600 bytes = 400 KB). This happens reproducibly across multiple
>>>> computers/networks/operating systems. I also want the
>>>> wgEncodeGencodeAutoV4, which appears to download ok.
>>>>
>>>> I have had this sort of problem before (where downloads from the table
>>>> browser would get stuck). I am not sure what causes them. Is there a
>>>> url from which the data can always be downloaded in flat files?
>>>>
>>>> I can also download the gtf version of these files from:
>>>>
>>>> http://hgdownload-test.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGencode/wgEncodeGencode{Auto,Manual}V4.gtf.gz
>>>>
>>>>
>>>>
>>>> But when I try to convert the AutoV4 file I get several errors:
>>>>
>>>> gunzip -c wgEncodeGencodeAutoV4.gtf.gz | gtfToGenePred -allErrors
>>>> -genePredExt /dev/stdin /dev/stdout
>>>>
>>>> ... [snip]
>>>> no exons defined for 93876
>>>> no exons defined for 93875
>>>> no exons defined for 93874
>>>> no exons defined for 115098
>>>> no exons defined for 27940
>>>> no exons defined for 29602
>>>> no exons defined for 29603
>>>> no exons defined for 10879
>>>> 622 errors
>>>>
>>>> and these genes are missing from the final output (although they are
>>>> present in the wgEncodeGencodeAutoV4 I download from the test browser).
>>>>
>>>> I was wondering what the command used to convert the gtf files above
>>>> to GenePred actually was. Also, can the table browser be repaired?
>>>>
>>>> Right now I am using wgEncodeGencodeAutoV4 from the table browser and
>>>> wgEncodeGencodeManualV4 converted with gtfToGenePred, but it would be
>>>> nice to have a more consistent way to set it all up.
>>>>
>>>> Thanks,
>>>> Pouya
>>>>
>>>> _______________________________________________
>>>> Genome maillist - Genome@lists.soe.ucsc.edu
>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>
>>
>
