Hi Brooke,

Thanks for helping me investigate. As a follow-up, would you (or your fellow
engineers and staff) know approximately what percentage of the D.
melanogaster genome is affected by this issue? A rough estimate will
suffice. I'm deciding whether to redo the multiway alignment, and such a
metric will help with my decision.

Additionally, would you know in which revision of Multiz this issue was
resolved? The current version to date is v15.

Many thanks,
Jaaved

--
Jaaved Mohammed,
Ph.D. Student of Computational Biology 
Tri-Institutional Training Program in Computational Biology and Medicine 
(Cornell University - Ithaca, Weill Cornell Medical College, and Memorial
Sloan-Kettering Cancer Center)

-----Original Message-----
From: Brooke Rhead [mailto:[email protected]] 
Sent: Tuesday, February 22, 2011 4:15 PM
To: Jaaved Mohammed
Cc: [email protected]
Subject: Re: [Genome] Huge block of missing data from insect 15way mulitple
alignment

Hi Jaaved,

One of our engineers looked at the region you pointed out and recognized 
the missing alignments as a known (old) bug in multiz.  If you turn on 
the chain and net tracks, you can see that the supposedly missing 
sequence is actually present in the pairwise alignments.

The bug should be fixed in more recent versions of multiz.  The dm3 
15-way multiple alignment is from 2006, and, regrettably, we don't have 
plans to re-do it, as our funding mandates that we focus on vertebrate 
species.

If you suspect some other region is also misaligned, you should be able 
to confirm it by looking at the chains and nets for the organism with 
the supposedly missing sequence and checking whether the sequence is 
aligned in them.

We apologize for the inconvenience this may cause.

--
Brooke Rhead
UCSC Genome Bioinformatics Group


Jaaved Mohammed wrote on 2/21/11 12:59 PM:
> Hello,
>
>
>
> I'm seeing one particular block of missing data from the 11
> non-melanogaster species in the insect 15way multiple alignment. The D.
> melanogaster coordinate I enter on the browser is
> "chr3R:18,118,601-18,118,671" and I get the attached image. I should point
> out the block spanning from 18,118,608 - 18,118,647 is missing in all the
> other 11 species. This interval spans a popular microRNA which is highly
> conserved. Along with other evidence, I suspect this to be an error and
> not genuine INDEL in the alignment.
>
>
>
> Does anyone know what this is attributed to and how/if we can fix this in
> the multiple alignment?
>
>
>
> Thanks for your generous attention.
>
>
>
> Regards,
>
> Jaaved
>
>
>
>
> _______________________________________________
> Genome maillist  -  [email protected]
> https://lists.soe.ucsc.edu/mailman/listinfo/genome
