Thanks, Brooke. I look forward to hearing about any progress.

Thanks,
Jaaved

On Mon, Mar 7, 2011 at 1:58 PM, Brooke Rhead <[email protected]> wrote:

> Hi again Jaaved,
>
> We have decided to re-run multiz for the dm3 alignments, and we have added
> it to our project list. We plan to continue to make the old multiz15way
> available for people who are already working with it and would like to
> continue to use it. I don't have an estimate yet of when the work will be
> done.
>
> Thanks again for alerting us to the error and prompting this re-run.
>
> --
> Brooke Rhead
> UCSC Genome Bioinformatics Group
>
>
> On 03/01/11 16:20, Brooke Rhead wrote:
>
>> Hi Jaaved,
>>
>> Since my first email to you, there has been some talk here of re-running
>> both the multiz and phastcons programs on the dm3 alignments (though
>> we haven't committed to doing it just yet).
>>
>> I asked the engineers here for more information on what percentage of
>> the genome is affected by the bug, and when multiz got fixed. Figuring
>> out what percentage is affected is non-trivial (and hard to do without
>> just re-running the multiz and looking to see how many places change!).
>> Also, the colleagues I spoke to didn't know exactly what the bug was
>> that caused the missing sequence, just that it got fixed when we got a
>> new version of multiz.
>>
>> Thank you for pointing out the problem in the dm3 multiple alignment. We
>> will get back to you within a week or two with what we decide to do about
>> re-running multiz.
>>
>> --
>> Brooke Rhead
>> UCSC Genome Bioinformatics Group
>>
>>
>> On 02/28/11 10:28, Jaaved Mohammed wrote:
>>
>>> Hi Brooke,
>>>
>>> Thanks for helping me investigate. On a follow-up note, would you (or
>>> fellow engineers and staff) know approximately what percentage of the D.
>>> melanogaster genome is affected with this issue? Rough estimate will
>>> suffice. I'm deciding whether I should redo the multiway alignment or
>>> not, and such a metric will help with my decision.
>>>
>>> Additionally, would you know in what revision of Multiz this issue was
>>> resolved? The current version to date is at v15.
>>>
>>> Many thanks,
>>> Jaaved
>>>
>>> --
>>> Jaaved Mohammed,
>>> Ph.D. Student of Computational Biology
>>> Tri-Institutional Training Program in Computational Biology and Medicine
>>> (Cornell University - Ithaca, Weill Cornell Medical College, and Memorial
>>> Sloan-Kettering Cancer Center)
>>>
>>> -----Original Message-----
>>> From: Brooke Rhead [mailto:[email protected]]
>>> Sent: Tuesday, February 22, 2011 4:15 PM
>>> To: Jaaved Mohammed
>>> Cc: [email protected]
>>> Subject: Re: [Genome] Huge block of missing data from insect 15way
>>> mulitple alignment
>>>
>>> Hi Jaaved,
>>>
>>> One of our engineers looked at the region you pointed out and recognized
>>> the missing alignments as a known (old) bug in multiz. If you turn on the
>>> chain and net tracks, you can see that the supposedly missing sequence is
>>> actually present in the pairwise alignments.
>>>
>>> The bug should be fixed in more recent versions of multiz. The dm3
>>> 15-way multiple alignment is from 2006, and, regrettably, we don't have
>>> plans to re-do it, as our funding mandates that we focus on vertebrate
>>> species.
>>>
>>> If you suspect some other region is also misaligned, you should be able
>>> to confirm it by looking at the chains and nets for the organism with the
>>> supposedly missing sequence and seeing whether the sequence is aligned in
>>> them.
>>>
>>> We apologize for the inconvenience this may cause.
>>>
>>> --
>>> Brooke Rhead
>>> UCSC Genome Bioinformatics Group
>>>
>>>
>>> Jaaved Mohammed wrote on 2/21/11 12:59 PM:
>>>
>>>> Hello,
>>>>
>>>> I'm seeing one particular block of missing data from the 11
>>>> non-melanogaster species in the insect 15way multiple alignment. The D.
>>>> melanogaster coordinate I enter on the browser is
>>>> "chr3R:18,118,601-18,118,671" and I get the attached image.
>>>> I should point out that the block spanning 18,118,608 - 18,118,647 is
>>>> missing in all the other 11 species. This interval spans a popular
>>>> microRNA which is highly conserved. Along with other evidence, I
>>>> suspect this to be an error and not a genuine INDEL in the alignment.
>>>>
>>>> Does anyone know what this is attributed to and how/if we can fix this
>>>> in the multiple alignment?
>>>>
>>>> Thanks for your generous attention.
>>>>
>>>> Regards,
>>>>
>>>> Jaaved
>>>>
>>>> _______________________________________________
>>>> Genome maillist - [email protected]
>>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
