Hi Yishay,

Please see the answers to your questions interspersed in your email below. Contact us again at [email protected] if you have any further questions.

Regards,
---
Luvina Guruvadoo
UCSC Genome Bioinformatics Group

On 11/1/2011 1:25 PM, Yishay Pinto wrote:
> Hello,
> I have mice genomic coordinates database (chr, coordinate and
> strand), each record has one nucleotide range.
> I need a little guidance in order to this data, if possible.
> 1. list of the one (and only) nucleotide in each coordinate. for example
> the input will be chr 8 46418501 (-) and the output should be C.

Assuming your coordinates in the input are 1-based (see http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1), you can make a BED6 file from the input, with the name column encoding all 3 input fields and a 0 placeholder for the score column:

    sed -re 's/\(//; s/\)//;' myInput.txt \
      | awk '{print $1, ($2-1), $2, $1":"$2":"$3, 0, $3}' > myInput.bed

Then download hg19.2bit and our twoBitToFa utility to get a FASTA file with position and nucleotide sequence. Note that hg19 is a human assembly; since your coordinates are from mouse, substitute the .2bit file for the mouse assembly your coordinates refer to (for example mm9.2bit) wherever hg19.2bit appears below. Transform the FASTA back into the input format plus the sequence column like this:

    twoBitToFa -bed=myInput.bed hg19.2bit myBases.fa
    perl -we '$/=">"; while (<>) { if (/^(\w+):(\d+):([+-])\n(\w+)/) { print "$1\t$2\t$3\t$4\n"; } }' myBases.fa \
      > myBases.txt

Alternatively, instead of downloading the .2bit file and the twoBitToFa utility, you could upload myInput.bed as a custom track and use the Table Browser to obtain the FASTA.

> 2. if there any SNPs in this coordinate.

Upload myInput.bed from above as a custom track. Using the Table Browser, choose SNPs as the primary track and intersect with the custom track. Unfortunately, performing an intersection loses most of the SNP columns, so if you need other SNP information you can either intersect in Galaxy, or collect the rs IDs resulting from the Table Browser intersection and upload those IDs to get all or selected fields from the SNPs table.

> 3. and maybe if there some way to know the codon (by refseq) that the
> nucleotide belongs to.

We don't offer this directly, but you may want to try Ensembl's Variant Effect Predictor tool, which has a web version and a standalone script version:
http://uswest.ensembl.org/info/docs/variation/vep/index.html

> thanks!
> _______________________________________________
> Genome maillist [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
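
As a quick sanity check on the coordinate conversion described in answer 1, here is a minimal sketch that runs the example record from the question through the same kind of sed/awk pipeline. It assumes the chromosome is written as a single token (chr8 rather than "chr 8") and that positions are 1-based; the function name to_bed6 is made up for illustration, and the output is tab-separated here rather than space-separated.

```shell
#!/bin/sh
# Convert 1-based "chrom position (strand)" records to BED6.
# BED uses a 0-based start and an exclusive end, so a single base at
# 1-based position P becomes the interval start=P-1, end=P.
# to_bed6 is a hypothetical helper name, not a UCSC utility.
to_bed6() {
    # Strip the parentheses around the strand, then emit:
    # chrom, start, end, name (chrom:pos:strand), score placeholder 0, strand
    sed -e 's/[()]//g' |
    awk 'BEGIN { OFS = "\t" }
         NF >= 3 { print $1, $2 - 1, $2, $1 ":" $2 ":" $3, 0, $3 }'
}

# Example record from the question, with the chromosome assumed to be chr8.
printf 'chr8 46418501 (-)\n' | to_bed6
```

The resulting line has start 46418500 and end 46418501, which is the one-base interval that twoBitToFa would be asked to extract; because the strand column is "-", the base should come back reverse-complemented relative to the reference strand.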
