Hi Pouya,

Please note that Gencode Genes V4 has not yet been through our QA 
process, so please keep that in mind as you use the data. One of our 
engineers has suggested that this command should work for you:

gunzip -c wgEncodeGencodeAutoV4.gtf.gz | ldHgGene -gtf -genePredExt hg19 \
    wgEncodeGencodeAutoV4 stdin -out=wgEncodeGencodeAutoV4.genePred

Here's the information on -out from the ldHgGene usage statement:
-out=gpfile write output, in genePred format, instead of loading table. 
Database is ignored.
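
Since the goal is to have both tables in genePred format, the same 
command can be wrapped in a small loop. This is only a sketch: it 
assumes both .gtf.gz files are in the current directory and that the 
Manual file converts the same way as the Auto file, which has not been 
checked here. The gtf_to_genepred function and the LDHGGENE variable 
are hypothetical names made up for this example; they are not part of 
the UCSC tools.

```shell
#!/bin/sh
# LDHGGENE is a hypothetical override for the path to ldHgGene.
LDHGGENE=${LDHGGENE:-ldHgGene}

# Hypothetical helper: convert <table>.gtf.gz into <table>.genePred.
gtf_to_genepred() {
    table=$1
    if [ ! -f "$table.gtf.gz" ]; then
        echo "skipping $table: $table.gtf.gz not found" >&2
        return 1
    fi
    # With -out= the result is written to a file instead of being loaded
    # into a table, so the database argument (hg19) is ignored.
    gunzip -c "$table.gtf.gz" |
        "$LDHGGENE" -gtf -genePredExt hg19 "$table" stdin -out="$table.genePred"
}

for table in wgEncodeGencodeAutoV4 wgEncodeGencodeManualV4; do
    gtf_to_genepred "$table" || echo "conversion failed for $table" >&2
done
```

Each table that converts cleanly should leave a matching .genePred 
file next to its .gtf.gz; a table whose input is missing is reported 
on stderr and skipped.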

Regarding your question about the table browser being fixed: for large 
files like this, we feel that it is actually better to download them and 
work with them locally. However, I have asked our management to 
consider dumping the files in gtf format on our download server, as we do 
with many of our other tables.

Please don't hesitate to contact the mail list again if you have any 
further questions.

Katrina Learned
UCSC Genome Bioinformatics Group

Pouya Kheradpour wrote, On 08/24/10 14:08:
> My goal is to have both wgEncodeGencodeManualV4 and 
> wgEncodeGencodeAutoV4 in GenePred format.
>
> I tried to download the wgEncodeGencodeManualV4 table from the test 
> browser. For some reason the download gets stuck and the file is cut 
> off after exactly 409600 bytes = 400 KB (chromosomes 17-22 are 
> completely missing, chromosome 16 is there partially). This happens 
> reproducibly across multiple 
> computers/networks/operating systems. I also want the 
> wgEncodeGencodeAutoV4, which appears to download ok.
>
> I have had this sort of problem before (where downloads from the table 
> browser would get stuck). I am not sure what causes them. Is there a url 
> from which the data can always be downloaded in flat files?
>
> I can also download the gtf version of these files from:
>
> http://hgdownload-test.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeGencode/wgEncodeGencode{Auto,Manual}V4.gtf.gz
>
> But when I try to convert the AutoV4 file I get several errors:
>
> gunzip -c wgEncodeGencodeAutoV4.gtf.gz | gtfToGenePred -allErrors 
> -genePredExt /dev/stdin /dev/stdout
>
> ... [snip]
> no exons defined for 93876
> no exons defined for 93875
> no exons defined for 93874
> no exons defined for 115098
> no exons defined for 27940
> no exons defined for 29602
> no exons defined for 29603
> no exons defined for 10879
> 622 errors
>
> and these genes are missing from the final output (although they are 
> present in the wgEncodeGencodeAutoV4 I download from the test browser).
>
> I was wondering what the command used to convert the gtf files above to 
> GenePred actually was. Also, can the table browser be repaired?
>
> Right now I am using wgEncodeGencodeAutoV4 from the table browser and 
> wgEncodeGencodeManualV4 converted with gtfToGenePred, but it would be 
> nice to have a more consistent way to set it all up.
>
> Thanks,
> Pouya
>
> _______________________________________________
> Genome maillist  -  Genome@lists.soe.ucsc.edu
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>   
