Hi Matthew,

We suggest that you report the issue to the Genome Reference Consortium,
which is the organization that provided the assembly data. Here is the link:

http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/ReportAnIssue.shtml

If you have further questions, please email the mailing list:
[email protected].

Vanessa Kirkup Swing
UCSC Genome Bioinformatics Group



---------- Forwarded message ----------
From: Parks, Matthew <[email protected]>
Date: Tue, Dec 6, 2011 at 2:37 PM
Subject: [Genome] Error in the Human Genome: an unidentified base?
To: [email protected]


Hello,

In my studies, I came across the following strange error in the UCSC Genome
Browser:

Use the UCSC Genome Browser and its "get DNA" function to examine the
following short stretch of sequence:   chr10:37412173-37412176

The sequence reads "ANCC".

Notice that the second nucleotide is marked as "N".  Why is this?  This
part of the chromosome is sufficiently far from both the centromere and
telomere, so presumably it has been sequenced fairly well.  Even if there
is uncertainty about the true value of this nucleotide, isn't some sort of
consensus used?  Also, this seems to be the only "N" nucleotide in the area
- from my analysis, there are no other "N" for at least 10,000 nucleotides
before and after the position in question.
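
A check like the one described above can be sketched in a few lines of
Python. This is only an illustration, not the analysis actually used: the
function name find_n_positions and its region_start parameter are
hypothetical, and the sequence is assumed to have already been retrieved
(for example with "get DNA") as a plain string.

```python
def find_n_positions(sequence, region_start=1):
    """Return the 1-based coordinates of every 'N' (or 'n') in sequence.

    region_start is the 1-based chromosome coordinate of sequence[0].
    Both names are hypothetical and used only for this illustration.
    """
    return [
        region_start + offset
        for offset, base in enumerate(sequence)
        if base in "Nn"
    ]


# The four bases quoted above, which start at chr10:37412173.
print(find_n_positions("ANCC", region_start=37412173))  # [37412174]
```

Running the same function on the sequence for the surrounding 10,000
nucleotides on each side would return a single coordinate if the "N" is
indeed isolated.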

Note that this stretch of sequence is part of a repeat (that's how I came
across it in the first place), and I understand that there is ambiguity in
repeat regions - but then why should only one nucleotide be unidentified
("N") ?  Wouldn't there be more ambiguous nucleotides?

Thank you
--
Matthew Parks
PhD candidate, Division of Applied Mathematics
Brown University
_______________________________________________
Genome maillist  -  [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome