Dear Dr. Clawson,
Thanks for your reply.
I am aware of the file "ensGene.txt" on the UCSC FTP site, but I cannot
confirm whether the genes in the regions shared between the X and Y
chromosomes are annotated twice (that is, counted as two genes).
Could you please help me clarify this?
Many thanks and best wishes,
Sincerely,
Tom
From: Hiram Clawson
Sent: 2012-06-11 21:29:46
To: genome; wangdp
Cc:
Subject: Re: [Genome] distinctions of human chromosome Y between UCSC and ensembl
Good Morning Tom:
Ensembl explains how they work with the chrY sequence:
> For the human Y chromosome in Ensembl, we have included DNA sequence
> (A/G/C/T) for only the unique region. The rest of the chromosome is masked
> with Ns, which explains how the length of the chromosome matches the GRC
> chromosome but the composition of the sequence is shifted. The reason we only
> include the unique region of Y is to make sure that we represent each region
> of the genome only once.
>
> grep \> Homo_sapiens.GRCh37.67.dna.chromosome.Y.fa
>> Y dna:chromosome chromosome:GRCh37:Y:2649521:59034049:1
>
> To add a bit more detail, the Y chromosome has four regions, two of which are
> unique to Y and two of which are shared with X.
> chromosome:GRCh37:Y:1 - 10000 is unique to Y but it's a string of 10000 Ns
> chromosome:GRCh37:Y:10001 - 2649520 is shared with X
> (Pseudoautosomal region)
> chromosome:GRCh37:Y:2649521- 59034049 is unique to Y
> chromosome:GRCh37:Y:59034050 - 59373566 is shared with X
>
> We store sequence for only the 2 unique regions of Y in our database. The
> full chromosome Y can be generated on-the-fly by our API, where we stitch in
> the shared sequence from X. By default our API will fetch only the unique
> regions of Y; however, you can request to stitch in the X sequence by setting
> the 4th argument in the SliceAdaptor to '1':
> $slice_adaptor->fetch_all('toplevel', undef, 0, 1);
> The relationship between the shared regions of X and Y is stored in the
> assembly_exception table.
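
To make the region layout quoted above easier to check, here is a small Python sketch. It is only an illustration, not part of the Ensembl API: the coordinates are copied from the quoted text (GRCh37, 1-based, inclusive), and the names CHR_Y_REGIONS, region_length, classify and shared_bases are hypothetical helpers made up for this example.

```python
# Illustrative sketch only. Coordinates are copied from the Ensembl text
# quoted above (GRCh37, 1-based, inclusive). All names are hypothetical.
CHR_Y_REGIONS = [
    (1, 10000, "unique to Y (Ns)"),
    (10001, 2649520, "shared with X"),
    (2649521, 59034049, "unique to Y"),
    (59034050, 59373566, "shared with X"),
]


def region_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def classify(pos):
    """Return the label of the chrY region containing a 1-based position."""
    for start, end, label in CHR_Y_REGIONS:
        if start <= pos <= end:
            return label
    raise ValueError(f"position {pos} is outside chrY (1-59373566)")


def shared_bases():
    """Total number of chrY bases in the regions shared with X."""
    return sum(
        region_length(start, end)
        for start, end, label in CHR_Y_REGIONS
        if label == "shared with X"
    )


if __name__ == "__main__":
    for start, end, label in CHR_Y_REGIONS:
        print(f"Y:{start}-{end}\t{region_length(start, end)}\t{label}")
    print("bases shared with X:", shared_bases())
```

The four intervals tile the chromosome without gaps, so their lengths add up to the full chrY length, and classify() can be used to see whether a given coordinate falls in a region that Ensembl masks with Ns.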
--Hiram
On 6/11/12 2:19 AM, wangdp wrote:
> Dear Sir/ Madam,
>
>
> I have found that human chromosome Y differs between UCSC and
> Ensembl, and that the Ensembl version contains more "Ns".
>
>
> Both refer to hg19 or GRCh37.
>
> Could you please help me with this? Which one should I choose as
> the right one for further study?
>
> Many thanks and best wishes,
>
> Sincerely,
>
> Tom
>
>
> 2012-06-11
> _______________________________________________
> Genome maillist - [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome