Hi Katrina,
Thank you for your response. The Seg-dups strand info is
still not clear. When the relative orientation is '+', then it is always +A ~
+B. One never sees -A ~ -B, as this is equivalent to +A ~ +B. And similarly I
never saw -A ~ +B (it was always +A ~ -B). Now the question is: is the 'Other
Position Relative Orientation' based on 'strand', or on the orientation of the
sequences with respect to each other?
Consider the following scenarios and tell me which one is appropriate. My
impression is that relative orientation should be a reverse orientation of the
sequence and has nothing to do with strand (situation 2a). But please clarify.
1) +A ~ +B <==> -A ~ -B <==> relative orientation = +
The sequences are in the same orientation and strand (<- <- / -> -> )
    Block_A                  Block_B
    -------------->          -------------->
(+) ACGTTGACAATGTCA      (+) ACGTTGACAATGTCA
(-) TGCAACTGTTACAGT      (-) TGCAACTGTTACAGT
2) -A ~ +B <==> +A ~ -B <==> relative orientation = -
(a) This is simply a reverse orientation of the sequence (<- -> / -> <- )
    Block_A                  Block_B
    -------------->          <--------------
(+) ACGTTGACAATGTCA      (+) ACTGTAACAGTTGCA
(-) TGCAACTGTTACAGT      (-) TGACATTGTCAACGT
or
(b) This is what you implied, simple flipping of the strands, although in the
same orientation. Similar to 1 (<- <- / -> -> )
    Block_A                  Block_B
    -------------->
(+) ACGTTGACAATGTCA      (+) TGCAACTGTTACAGT
(-) TGCAACTGTTACAGT      (-) ACGTTGACAATGTCA
                             -------------->
or
(c) There is reverse orientation and strand flipping
    Block_A                  Block_B
    -------------->
(+) ACGTTGACAATGTCA      (+) TGACATTGTCAACGT
(-) TGCAACTGTTACAGT      (-) ACTGTAACAGTTGCA
                             <--------------
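
The Block_B (+) rows in the four scenarios above can be derived from Block_A
with three string operations. The sketch below is only an illustration of that
bookkeeping; the function names are made up for this example and are not part
of any UCSC tool. In sequence alignment generally, a hit on the reverse strand
usually means a match to the reverse complement, but whether the track follows
exactly that convention is the question being asked here.

```python
# Illustration only: derive Block_B's (+) row for each scenario from Block_A.
# Function names are hypothetical helpers, not a UCSC API.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse(seq):
    """Same bases read in the opposite direction."""
    return seq[::-1]

def complement(seq):
    """Base-by-base complement, direction unchanged."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    """Complement, then reverse: the opposite strand read 5'->3'."""
    return complement(seq)[::-1]

block_a = "ACGTTGACAATGTCA"

scenarios = {
    "1":  block_a,                      # same orientation, same strand
    "2a": reverse(block_a),             # reversed only
    "2b": complement(block_a),          # strands flipped only
    "2c": reverse_complement(block_a),  # reversed and strands flipped
}

for label, seq in scenarios.items():
    print(label, seq)
```

Each printed sequence can be compared with the (+) row under Block_B in the
corresponding diagram.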
Thank you,
Sri.
On Aug 15, 2011, at 9:48 AM, Katrina Learned wrote:
Hi Sri,
1) The one base pair difference in the start positions is due to our
0-based coordinate system. Please see this FAQ for more information:
http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1
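
As a rough sketch of what the FAQ describes: table rows store a 0-based start
with an exclusive end, while the browser displays a 1-based start with an
inclusive end, so only the start appears to move. The helper names below are
hypothetical and exist only for this illustration; the numbers come from the
row quoted in the original question.

```python
# Sketch of the 0-based (table) vs 1-based (display) conversion.
# Helper names are made up for illustration.

def table_to_browser(chrom, chrom_start, chrom_end):
    """Convert a 0-based, half-open table interval to a 1-based position string."""
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"

def interval_length(chrom_start, chrom_end):
    """Length in bases of a 0-based, half-open interval."""
    return chrom_end - chrom_start

# Values from the genomicSuperDups row in the question below.
print(table_to_browser("chr10", 47833296, 47862414))
print(interval_length(47833296, 47862414))
```

Both forms describe the same bases, so neither is "wrong"; they differ only in
how the start is counted.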
2) The "strand:" information that you see on the item's details page is
there in error; one of our engineers has removed it. Sorry for the
confusion and thank you for bringing it to our attention! The 'strand'
column in the genomicSuperDups table is represented in the item details
by "Other Position Relative Orientation:". The items in this track are
spans of double-stranded genomic DNA -- they don't have an inherent
strand like genes etc. However, if the forward strand sequence of the
region aligns to the reverse strand of the other region, then the
relative orientation is '-'. And that's a reciprocal relationship. So
say region A has other-region B, + and - are the stranded sequences, and
~ means "aligns to":
+A ~ +B <==> -A ~ -B <==> relative orientation = +
-A ~ +B <==> +A ~ -B <==> relative orientation = -
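
The reciprocal relationship above can be sketched with a toy example. This is
an assumption-laden illustration, not how the track was built: exact string
equality stands in for a real alignment, "aligns to the reverse strand" is
taken to mean "matches the reverse complement", and the function names and the
short sequence are invented for the example.

```python
# Toy model of relative orientation; exact matching stands in for alignment.
# Names and the example sequence are hypothetical.
_COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement: the other strand read 5'->3'."""
    return seq.translate(_COMP)[::-1]

def relative_orientation(a_fwd, b_fwd):
    """Return '+' if the forward strands match, '-' if A's forward strand
    matches B's reverse strand, or None if neither holds."""
    if a_fwd == b_fwd:
        return "+"
    if revcomp(a_fwd) == b_fwd:
        return "-"
    return None

a = "ACGTTGACAATGTCA"
b = revcomp(a)

print(relative_orientation(a, a))  # forward vs forward
print(relative_orientation(a, b))  # forward A vs reverse B
print(relative_orientation(b, a))  # same pair seen from B's side
```

Because taking the reverse complement twice returns the original sequence, the
result is the same whichever region is treated as "A", which is why only one of
the two equivalent forms ever needs to be reported.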
3) We suggest checking the references listed in the description page for
more details
(http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=genomicSuperDups) or
contacting the authors (the email is on the description page also), but
one of our engineers thinks that based on this statement in the
description page's Methods section (emphasis added):
The repeats were then reinserted into the pairwise alignments, the
ends of alignments trimmed, and global alignments were generated.
that the trimming of alignment ends might explain the shorter length of
the "optimal global alignment."
I hope this information is helpful! Please contact the mail list
([email protected]) again if you have any further questions.
Katrina Learned
UCSC Genome Bioinformatics Group
On 8/10/11 9:38 AM, Sampath, Srirangan wrote:
Below are some questions regarding hg18 Seg-dup tracks.
1) I am using the Table Browser to download 'genomicSuperDups', with 'all
fields from selected table' to extract all the fields in the schema. There is
a 1 bp shift in the 'chromStart' and 'otherStart' columns between the data
downloaded from the Table Browser and the corresponding information in the
Genome Browser. The start positions of the intervals in the Genome Browser are
shifted one base ahead of those in the Table Browser. The end positions do
match. Why is that, and which one is correct?
Below I have pasted both the Table Browser rows and the corresponding link
to the Genome Browser.
Table Browser:
bin             118
chrom           chr10
chromStart      47833296
chromEnd        47862414
name            chr10:45491750
score           419130
strand          +
otherChrom      chr10
otherStart      45491750
otherEnd        45520834
otherSize       135374737
uid             0
posBasesHit     1000
testResult      N/A
verdict         N/A
chits           N/A
ccov            N/A
alignfile       align_both//0004/both020825
alignL          28953
indelN          63
indelS          298
alignB          15276
matchB          28292
mismatchB       661
transitionsB    457
transversionsB  204
fracMatch       0.97717
fracMatchIndel  0.975048
jcK             0.0231848
k2K             0.0232378
Genome Browser:
http://www.genome.ucsc.edu/cgi-bin/hgc?hgsid=206552079&o=47833296&t=47862414&g=genomicSuperDups&i=chr10%3A45491750
2) There is a column with 'strand' information on the 'genomicSuperDups' schema
(Table Browser). But there is no 'Other Position Relative Orientation' data. Is
the 'strand' column from Table Browser the actual strand information of the
sequence, or relative orientation of the 'other position'? From the Table
Browser output, when I manually checked a few, all '+' seem to have '+' relative
orientation of the other position, and all '-' seem to have '-' relative
orientation of the other position.
3) Sometimes the link to the 'Optimal Global Alignment' does not provide the
alignment for the entire interval. For example, the track for
Item: chr10:48284311
is 428 kb long, but the 'Optimal Global Alignment' provides an alignment for
only 103 kb. Below are the links:
http://www.genome.ucsc.edu/cgi-bin/hgc?hgsid=206563651&o=45896971&t=46325469&g=genomicSuperDups&i=chr10%3A48284311
http://humanparalogy.gs.washington.edu/build36/align_both//0004/both020916
Thank you,
Sri.
Dr. Srirangan Sampath Ph.D.,
ABMG Clinical Cytogenetics Fellow
Medical Genetics Laboratories
Department of Molecular & Human Genetics
John P. McGovern Campus NABS-O250
Baylor College of Medicine,
Houston, TX-77021
Cell: 504-390-5512
_______________________________________________
Genome maillist [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome