Hello Maria,

The field cpgIslandExt.name is not an identifier for a particular 
region. Rather, this name is a concatenation of the fixed word "CpG:", 
a space, and the value of cpgIslandExt.cpgNum.

Format:

CpG: <cpgIslandExt.cpgNum>

Individual CpG Islands in the track are best identified uniquely by their 
genome position (chrom, chromStart, chromEnd).
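
As a rough illustration of the point above, here is a minimal Python 
sketch. It assumes a tab-separated, four-column download (chrom, 
chromStart, chromEnd, name) like the one described in the question. The 
function names and the sample rows are hypothetical and only there to 
show the idea; the coordinates are not real annotation data.

```python
from collections import defaultdict

# Hypothetical sample rows in the assumed 4-column layout:
# chrom, chromStart, chromEnd, name. Values are made up for illustration.
SAMPLE_ROWS = [
    "#chrom\tchromStart\tchromEnd\tname",
    "chr1\t28735\t29810\tCpG: 116",
    "chr1\t135124\t135563\tCpG: 30",
    "chr2\t45000\t46100\tCpG: 116",
]


def parse_cpg_rows(lines):
    """Yield (chrom, start, end, name) tuples, skipping blanks and '#' headers."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        chrom, start, end, name = line.split("\t")[:4]
        yield chrom, int(start), int(end), name


def cpg_num_from_name(name):
    """Recover cpgNum from a name of the form 'CpG: <cpgNum>'."""
    prefix = "CpG: "
    if not name.startswith(prefix):
        raise ValueError("unexpected name format: %r" % name)
    return int(name[len(prefix):])


def group_by_name(rows):
    """Map each name to the list of positions that carry it."""
    by_name = defaultdict(list)
    for chrom, start, end, name in rows:
        by_name[name].append((chrom, start, end))
    return dict(by_name)


if __name__ == "__main__":
    rows = list(parse_cpg_rows(SAMPLE_ROWS))
    for name, positions in group_by_name(rows).items():
        # The same name can appear at several unrelated positions.
        print(name, cpg_num_from_name(name), positions)
```

Two islands that share a name simply contain the same number of CpGs, 
so the (chrom, chromStart, chromEnd) tuple is the safer key.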

Regarding chromosome names, the Gateway page for each assembly explains 
the dataset (Source, Methods, Credits, etc.).

How to locate the gateway page for an assembly:
http://genome.ucsc.edu -> Genome Browser -> set clade, genome, assembly

As an example, below is the relevant information from the credits section 
of the gateway page for hg19. Note that assemblies can be organized 
differently, so it is important to review each one individually.

Chromosome naming scheme
In addition to the "regular" chromosomes, the hg19 browser contains nine 
haplotype chromosomes and 59 unplaced contigs. If an unplaced contig is 
localized to a chromosome, the contig name is appended to the regular 
chromosome name, as in chr1_gl000191_random. If the chromosome is 
unknown, the contig is represented with the name "chrUn" followed by the 
contig identifier, as in chrUn_gl000211. Note that the chrUn contigs 
are no longer placed in a single, artificial chromosome as they have 
been in previous UCSC assemblies. See the sequences* page for a complete 
list of hg19 chromosome names.

* On the gateway page, "sequences" is a link to an HTML page listing the sequences.

To download a text file of the same data, use the Table browser and 
extract the table "chromInfo".
1) http://genome.ucsc.edu/cgi-bin/hgTables
2) choose clade, genome, assembly
3) group = All Tables, table = chromInfo
4) enter a name for the output file and click on "get output"
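
Once the chromInfo file is downloaded, the names can be sorted into the 
categories from the hg19 naming scheme quoted above. The following is a 
minimal Python sketch, not an official tool. It assumes the file is 
tab-separated with the chromosome name in the first column and an 
optional header line starting with "#". The function names, the 
category labels, and the rule that any name without an underscore is a 
"regular" chromosome are assumptions made for this example, and the 
sizes in the sample rows are placeholders. Other assemblies may need 
different rules.

```python
import re

# Haplotype names in hg19 end in "_hap" plus a number, e.g. chr6_dbb_hap3.
HAP_RE = re.compile(r"_hap\d+$")

# Placeholder rows in the assumed chromInfo layout; sizes are NOT real values.
SAMPLE_CHROMINFO = [
    "#chrom\tsize\tfileName",
    "chr1\t1000\tplaceholder",
    "chr6_dbb_hap3\t2000\tplaceholder",
    "chr1_gl000191_random\t3000\tplaceholder",
    "chrUn_gl000211\t4000\tplaceholder",
]


def read_chrom_names(lines):
    """Return the first column of each data row, skipping blanks and headers."""
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.append(line.split("\t")[0])
    return names


def classify_chrom(name):
    """Classify a name using the hg19 conventions described above."""
    if name.startswith("chrUn_"):
        return "unplaced"      # chromosome unknown
    if name.endswith("_random"):
        return "random"        # contig localized to a chromosome
    if HAP_RE.search(name):
        return "haplotype"     # alternate haplotype chromosome
    if "_" not in name:
        return "regular"       # assumption: no underscore means regular
    return "other"


if __name__ == "__main__":
    for chrom in read_chrom_names(SAMPLE_CHROMINFO):
        print(chrom, classify_chrom(chrom))
```

Whether to keep the non-regular entries depends on the analysis; a 
filter such as classify_chrom(name) == "regular" is one way to restrict 
a table to the regular chromosomes.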

We hope this helps! If you need more information, please let us know,
Jennifer

---------------------------------
Jennifer Jackson
UCSC Genome Informatics Group
http://genome.ucsc.edu/

On 4/22/10 4:16 AM, Maria Iglesias wrote:
>
>
>
> HI,
>
> I am not familiar working with genome annotation data so I have
> questions about the output files of CpG island and Refgene from UCSC
> browser.
>
> First I have download a table with all the CpG island annotation. The
> file have 4 columns 1:chrom name : 2nd and 3rd are start and end
> position and the fourth one is the number or name give to each CpG
> island. I notice there are different positions with the same number in
> the fourth column. Why this happened? Sometimes could be due to the
> length of the CpG island (range 201-3000pb) but another times these
> position are more than 10kb distant.
>
> The second question is in both cases with CpG island file and Refgene I
> got all the features from chromosome 1 until chromosome 22 but later in
> the table appear another annotation that I don't really understand.
>
>
> chr6_dbb_hap3
> .
> .
> ChrUN_...
>
> chr17_ctg5_hap1
> ...
> chr1_gl000191_random
>
>
> What they are? Should I used them?
>
>
> Thanks a lot in advance for your help.
>
>
> María Jesús
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome