Hi Marco,

We are unsure whether your contig names come from the Broad Institute (the organization that performed the sequencing) or from NCBI. Can you please check the Assembly track and see whether your contig names match the ones in this track? Here is a link for equCab2: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=equCab2&g=gold

If they do, you can download the data in this track to convert your coordinates. If not, please send us an example of your contig names and we can see if we have a conversion file.
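If the contig names do match, the conversion itself is a lookup plus an offset. Here is a minimal sketch, assuming the downloaded Assembly (gold) table rows carry the fields chrom, chromStart, chromEnd, frag, fragStart, fragEnd and strand, and that all coordinates are 0-based, half-open (BLAST reports 1-based inclusive positions, so subtract 1 from the hit start first). The `GoldRow` type, the `contig_to_chrom` helper and the contig names are hypothetical and only for illustration.

```python
from collections import namedtuple

# Hypothetical container for the Assembly (gold) table fields used here.
GoldRow = namedtuple(
    "GoldRow", "chrom chromStart chromEnd frag fragStart fragEnd strand"
)


def contig_to_chrom(gold_rows, contig, start, end):
    """Map a 0-based, half-open interval [start, end) on a contig
    to chromosome coordinates. Returns (chrom, start, end, strand),
    or None if no gold row fully contains the interval."""
    for g in gold_rows:
        if g.frag != contig:
            continue
        if start < g.fragStart or end > g.fragEnd:
            continue
        if g.strand == "+":
            offset = g.chromStart - g.fragStart
            return (g.chrom, start + offset, end + offset, "+")
        # Contig placed in reverse orientation: mirror the interval
        # around the end of the placed fragment.
        return (
            g.chrom,
            g.chromStart + (g.fragEnd - end),
            g.chromStart + (g.fragEnd - start),
            "-",
        )
    return None


# Made-up example rows (not real equCab2 data).
gold = [
    GoldRow("chr1", 1000, 6000, "contigA", 0, 5000, "+"),
    GoldRow("chr1", 6100, 9100, "contigB", 0, 3000, "-"),
]

# A BLAST hit at 1-based positions 100..250 becomes [99, 250).
print(contig_to_chrom(gold, "contigA", 99, 250))
print(contig_to_chrom(gold, "contigB", 99, 250))
```

Hits that span more than one gold row are not handled in this sketch and would return None.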
Best,
Mary
------------------
Mary Goldman
UCSC Bioinformatics Group

On 7/5/11 10:09 AM, Marco Santagostino wrote:
> Dear Sirs,
>
> I worked a bit with the RepeatMasker Track, but I found that, oddly, the
> consensus sequence of the transposable element (which we are
> investigating) used to mask the genome (and in general used by
> RepeatMasker) is different from that annotated in RepBase (and which we
> used for some preliminary analysis). I can download the hit list
> generated by BLAST using "our" consensus sequence, but I don't have the
> coordinates in the ordered horse genome for each BLAST hit, I have just
> the coordinates in the contig sequences; is there a way to submit this
> hit list (in csv or txt format, or whatever) to Table Browser and
> retrieve the coordinates in the horse genome for each hit?
>
> Thanks,
>
> Marco
>
>
>
> On 14/06/11 00:25, Greg Roe wrote:
>> Hi Marco,
>>
>> There is some information on the track info page for the RepeatMasker
>> track. Click the track title. There is also some info in the
>> downloads' README file:
>> http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/ (see esp.
>> chromOut.tar.gz)
>>
>> To set up the Table browser so it recovers only the elements with at
>> least 90% of identity with the consensus sequence....
>>
>> (For some background, a definition of RepeatMasker output columns can
>> be found here: http://repeatmasker.org/webrepeatmaskerhelp.html )
>>
>> The 2nd, 3rd and 4th columns of the .out files are useful:
>>
>> 15.6 = % substitutions in matching region compared to the consensus
>> 6.2 = % of bases opposite a gap in the query sequence (deleted bp)
>> 0.0 = % of bases opposite a gap in the repeat consensus
>> (inserted bp)
>>
>> In our database table, those are multiplied by 10 in order to get
>> integer parts-per-thousand, and called milliDiv (substitutions),
>> milliDel and milliIns.
>>
>> The simplest % identity measurement is milliDiv only -- if you wish,
>> you can factor in milliDel and milliIns too.
>>
>> So, to get % identity >= 90% in the Table Browser, create a filter
>> with milliDiv <= 100 (milliDiv is the substitution rate in parts per
>> thousand, so at most 10% divergence means at most 100).
>>
>> Please let us know if you have any additional questions:
>> [email protected]
>>
>> -
>> Greg Roe
>> UCSC Genome Bioinformatics Group
>>
>>
>> On 6/13/11 9:32 AM, Marco Santagostino wrote:
>>> Dear Sirs,
>>>
>>> where can I find the parameters used to generate the RepeatMasker track?
>>> The problem is as follows: I need to take from the horse genome a
>>> certain repetitive element, and I'm supposed to classify all the hits
>>> found according to their identity (with respect to the consensus
>>> sequence). Some colleagues of mine already took all the sequences with at
>>> least 98% of identity by BLAST search, so, now I'm supposed to find
>>> those which have a lower identity, but I can't find out how to set up
>>> the Table Browser so that it finds the elements with the identity that I
>>> chose. How do I set up the table browser so, for example, it recovers
>>> only the elements with at least 90% of identity with the consensus
>>> sequence?
>>>
>>> Thank you,
>>>
>>> Marco Santagostino
>>>
>>>
>>>
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
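
The identity arithmetic discussed in this thread can also be applied to rows downloaded from the RepeatMasker table. Here is a minimal sketch, assuming each row is a dict with the integer fields milliDiv, milliDel and milliIns (parts per thousand). Because milliDiv counts substitutions, an identity of at least 90% corresponds to a milliDiv of at most 100. The helper names and the sample rows are hypothetical.

```python
def percent_identity(milli_div, milli_del=0, milli_ins=0):
    """Approximate % identity from parts-per-thousand fields.
    With the defaults, only substitutions (milliDiv) are counted."""
    return 100.0 - (milli_div + milli_del + milli_ins) / 10.0


def filter_by_identity(rows, min_identity=90.0, include_indels=False):
    """Keep rows whose approximate identity is >= min_identity."""
    kept = []
    for row in rows:
        if include_indels:
            ident = percent_identity(
                row["milliDiv"], row["milliDel"], row["milliIns"]
            )
        else:
            ident = percent_identity(row["milliDiv"])
        if ident >= min_identity:
            kept.append(row)
    return kept


# Made-up rows for illustration only (not real horse genome data).
rows = [
    {"repName": "ELEMENT_A", "milliDiv": 156, "milliDel": 62, "milliIns": 0},
    {"repName": "ELEMENT_A", "milliDiv": 80, "milliDel": 30, "milliIns": 5},
    {"repName": "ELEMENT_A", "milliDiv": 15, "milliDel": 0, "milliIns": 0},
]

for row in filter_by_identity(rows, min_identity=90.0):
    print(row["repName"], row["milliDiv"])
```

With `include_indels=True`, deleted and inserted bases also count against identity, which is the stricter of the two measurements.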
