Hi, I want to be able to download genome sequences (without repeat masking, all upper case) for a given chromosome with start and end coordinates.
Here is how it is done in the browser: chrX:151,073,054-151,383,976 with (0,0) flanking sequences upstream and downstream.

http://genome.ucsc.edu/cgi-bin/hgc?hgsid=115827007&o=151073053&g=getDna&i=mixed&c=chrX&l=151073053&r=151383976&db=hg18&hgsid=115827007

Is there a way to stitch together a URL so that this data can be generated (as if one were to press the getDNA button), or via the following link?

http://genome.ucsc.edu/cgi-bin/hgc?hgsid=115827007&g=htcGetDna2&table=&i=mixed&o=151073053&l=151073053&r=151383976&getDnaPos=chrX%3A151%2C073%2C054-151%2C383%2C976&db=hg18&hgSeq.cdsExon=1&hgSeq.padding5=0&hgSeq.padding3=0&hgSeq.casing=upper&boolshad.hgSeq.maskRepeats=1&hgSeq.repMasking=lower&boolshad.hgSeq.revComp=1&submit=get+DNA

The FAQ page suggests not using the hgsid variable, and I am not sure how to query without it. My question is much the same as the one answered at http://www.soe.ucsc.edu/pipermail/genome/2008-August/017039.html; the only difference is that I am not using a web browser, so creating or setting a session would not help. My program accepts a URL and reads the content at that URL in HTML or text format. No browser is involved. All the necessary information has to be given in one or more links/strings, based on the parameters defined in the example above:

>hg18_dna range=chrX:151073054-151383976 5'pad=0 3'pad=0 strand=+ repeatMasking=none

Any help would be much appreciated.

Thanks,
Ashutosh.

PS: Is there a way to get the whole desired sequence on one line?
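To make the request concrete, here is a rough Python sketch of what I have in mind. It rebuilds the second link above from its parameters with hgsid left out, and collapses a FASTA-style reply onto one line. The function names are my own (hypothetical). I have not confirmed that the server accepts the request without hgsid, and I am assuming the reply is plain FASTA text; any HTML wrapping would have to be stripped first.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the links in this post.
BASE_URL = "http://genome.ucsc.edu/cgi-bin/hgc"


def build_getdna_url(chrom, start, end, db="hg18"):
    """Build the get-DNA URL for a 1-based inclusive range, omitting hgsid.

    Parameter names and values mirror the second link in the post;
    whether the server honours them without a session is an assumption.
    """
    params = {
        "g": "htcGetDna2",
        "table": "",
        "i": "mixed",
        "o": start - 1,          # the post's links use start - 1 here
        "l": start - 1,
        "r": end,
        "getDnaPos": f"{chrom}:{start:,}-{end:,}",
        "db": db,
        "hgSeq.cdsExon": 1,
        "hgSeq.padding5": 0,
        "hgSeq.padding3": 0,
        "hgSeq.casing": "upper",
        "boolshad.hgSeq.maskRepeats": 1,
        "hgSeq.repMasking": "lower",
        "boolshad.hgSeq.revComp": 1,
        "submit": "get DNA",
    }
    return BASE_URL + "?" + urlencode(params)


def fasta_to_single_line(text):
    """Drop '>' header lines and join the remaining lines, upper-cased."""
    lines = (line.strip() for line in text.splitlines())
    return "".join(l for l in lines if l and not l.startswith(">")).upper()


def fetch_sequence(chrom, start, end, db="hg18"):
    """Fetch the URL and return the sequence on one line (needs network)."""
    with urlopen(build_getdna_url(chrom, start, end, db)) as response:
        return fasta_to_single_line(response.read().decode("utf-8", "replace"))
```

For the example range, the call would be fetch_sequence("chrX", 151073054, 151383976).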
