Hi Gareth,

Please see this answer to a similar question:
https://lists.soe.ucsc.edu/pipermail/genome/2010-October/023997.html

If you turn on the RepeatMasker track, you will see a low-complexity
repeat overlapping this island. It masks out 7 of the 29 CpGs, which
leaves the 22 reported in the table.
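
To check an island yourself, something like the following rough Python sketch may help. It assumes you have the island's sequence in soft-masked form (repeats in lowercase, as in the UCSC sequence downloads) and that a CpG is skipped whenever either of its bases is masked. The function name and the toy sequence are made up for illustration, and this is not the code used to build the track.

```python
def count_cpg(seq, skip_masked=False):
    """Count CG dinucleotides in a soft-masked DNA sequence.

    skip_masked=False: count every CG regardless of case.
    skip_masked=True:  count only CGs where both bases are uppercase,
                       i.e. neither base lies in a masked repeat
                       (assumed behaviour, see note above).
    """
    if skip_masked:
        # Case-sensitive search: "cg", "Cg" and "cG" are all skipped.
        return seq.count("CG")
    # Fold case first so masked and unmasked CGs are both counted.
    return seq.upper().count("CG")


# Hypothetical toy sequence: lowercase runs stand in for a masked repeat.
toy = "ACGTcgcgACGcgTTCG"
total = count_cpg(toy)
unmasked = count_cpg(toy, skip_masked=True)
print(total, unmasked, total - unmasked)
```

If the difference between the two counts matches the gap between your own count and the cpgNum column, masking is the likely explanation.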

--
Brooke Rhead
UCSC Genome Bioinformatics Group


Gareth Wilson wrote on 11/1/10 6:30 AM:
> Hello,
> 
> I recently downloaded the cpgIslandExt table for Human genome GRCh37. Whilst
> using this file in an analysis, I stumbled upon a problem, the root of which
> seemed to come back to the cpgIsland file. It seems that, for some islands,
> the metadata is incorrect. For example:
> 
> http://genome.ucsc.edu/cgi-bin/hgc?hgsid=171991065&o=33697914&t=33698193&g=cpgIslandExt&i=CpG%3A+22
> 
> This cpg island has a CpG count of 22. However on inspection of the
> sequence, the actual count can be seen to be 29. Hence the percentage CpG
> and ratio of observed/expected will also be incorrect.
> 
> Am I missing something obvious? And if not, have you any idea as to how many
> islands have similar errors?
> 
> Hope you can help!
> 
> Thanks
> 
> Gareth Wilson
> 
> 
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome