Hi Hiram,

Thank you for responding to my question so quickly. It's exactly what I needed.

Best regards,
Emilie

On Thu, Dec 29, 2011 at 4:49 PM, Hiram Clawson <[email protected]> wrote:

> Good Afternoon Emilie:
>
> I don't have the archive of the knownToEnsembl.txt file, however,
> I do have a copy of the genePred file for the ensGene track version60:
>
> http://genome-test.cse.ucsc.edu/~hiram/ensGene.hg19.v60/hg19.ensGene.gp.gz
>
> The simple awk script included below can convert this genePred file
> to a bed file.
>
> --Hiram
>
>
> Emilie Chautard wrote:
>
>> Hi,
>>
>> I need to use the Ensembl r60 gene annotations for the files
>> knownToEnsembl.txt and ensGene.txt.
>> It seems that the files on
>> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ correspond
>> to a more recent release.
>> Do you have a tool to suggest which could help me to generate these files
>> or are these files still available somewhere?
>> Thanks a lot in advance,
>> Best regards,
>>
>> Emilie
>>
>
> #!/usr/bin/awk -f
>
> #
> # Convert genePred file to a bed file (on stdout)
> #
> BEGIN {
>     FS="\t";
>     OFS="\t";
> }
> {
>     name=$1
>     chrom=$2
>     strand=$3
>     start=$4
>     end=$5
>     cdsStart=$6
>     cdsEnd=$7
>     blkCnt=$8
>
>     delete starts
>     split($9, starts, ",");
>     delete ends
>     split($10, ends, ",");
>     blkStarts=""
>     blkSizes=""
>     for (i = 1; i <= blkCnt; i++) {
>         blkSizes = blkSizes (ends[i]-starts[i]) ",";
>         blkStarts = blkStarts (starts[i]-start) ",";
>     }
>
>     print chrom, start, end, name, 1000, strand, cdsStart, cdsEnd, 0,
>         blkCnt, blkSizes, blkStarts
> }
>

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
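
For anyone who wants to sanity-check the quoted awk conversion on a single record, here is a minimal sketch of the same logic in Python. It assumes the same 10-column genePred layout the script reads (name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds) with no leading bin column. The function name `genepred_to_bed` and the sample record are made up for illustration and do not come from the real ensGene file.

```python
def genepred_to_bed(line):
    """Convert one tab-separated genePred record to a BED12 line.

    Mirrors the quoted awk script: block sizes are exonEnd - exonStart,
    block starts are exonStart - txStart, score is fixed at 1000 and
    itemRgb at 0.
    """
    f = line.rstrip("\n").split("\t")
    name, chrom, strand = f[0], f[1], f[2]
    tx_start, tx_end = int(f[3]), int(f[4])
    cds_start, cds_end = f[5], f[6]
    blk_cnt = int(f[7])

    # exonStarts / exonEnds are comma-separated lists with a trailing comma,
    # so drop the empty last element before converting to integers.
    starts = [int(x) for x in f[8].split(",") if x][:blk_cnt]
    ends = [int(x) for x in f[9].split(",") if x][:blk_cnt]

    # BED keeps the trailing comma on both block lists, as the awk script does.
    blk_sizes = "".join(f"{e - s}," for s, e in zip(starts, ends))
    blk_starts = "".join(f"{s - tx_start}," for s in starts)

    return "\t".join([
        chrom, str(tx_start), str(tx_end), name, "1000", strand,
        cds_start, cds_end, "0", str(blk_cnt), blk_sizes, blk_starts,
    ])


# Hypothetical three-exon transcript, for illustration only.
sample = "exampleTx\tchr1\t+\t1000\t5000\t1200\t4800\t3\t1000,2000,4000,\t1500,2500,5000,"
print(genepred_to_bed(sample))
```

If the input instead comes from a database dump that carries an extra leading bin column, every field index would need to shift by one, both here and in the awk script.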
