I just finished chainSelf on danRer6.
You can see it on hgwdev-test.

-Galt

On Fri, 24 Jul 2009, Xianjun Dong wrote:

> Hi, Galt,
>
> Thanks so much for the work.
>
> Are you saying the contigs unmapped to chromosome (like Zv8_NA***, or
> Zv8_scaffold***)? I am not sure how this will make the display 'noisy'. For
> study like duplicated genes, I guess it's OK to leave them out; I just
> checked the Ensembl genes in Zv8 -- only a small fraction (~5%, 1579 out of
> total 27855 Ensembl genes) are in contig Zv8_***. I am not sure if other
> people have special need for that. But principally, they should be there,
> since they are part of the genome assembly, anyway. Maybe the better way is
> to treat it as how you did for a normal species pair: including both chain
> and net. (see below)
>
> I found that the selfChain idea is very interesting/useful in several ways,
> for example, in re-constructing the evolution history of Whole-genome
> Duplication (WGD, e.g. 2R WGD for human, 3R WGD for teleost fish), in
> comparing the paralog genes within one species, especially when it's hard to
> find a suitable genome as out-group. One question is, if there is only chain,
> it might be still hard to know which alignment blocks (the unit in chain) are
> from the same locus before duplication. Actually, I expected there are
> something like 'joined Chain'. I know the net data is somehow a sorted chain.
> The higher-scoring non-overlapping chains are always put more top. This
> brainchild can help people to 'filter out' the low-scoring chain (for
> example, those by repeat or wrongly assembling, and finally are put into
> unmapped contigs). So, I will be really appreciated if you can make the
> selfNet.
>
> BTW, I suggest you can move the selfChain (and selfNet, if so) to the
> Comparative Genomics. Now they are in "Variation and Repeats". Maybe you
> have some reason for that.
>
> Thanks again
>
> Xianjun
>
> Galt Barber wrote:
>>
>> Hi! I am working on self-chain for Zv8 now.
>> I had a question about whether people want
>> to include the approx. 11,000 contigs
>> in the self-chain, or leave them out.
>> It's more work to leave them out,
>> but it may make the display easier to
>> read. On the other hand, perhaps having
>> the scaffolds included is important to
>> you for some reason.
>>
>> Which way would be better for your work?
>>
>> thanks
>>
>> -Galt
>>
>> On Wed, 27 May 2009, Xianjun Dong wrote:
>>
>>> Hi,
>>>
>>> OK. I am back with the same request now, after the assembly/annotation
>>> of Zv8 is done.
>>>
>>> We (and also the community, I think) know that, Zv8 was expected as a
>>> big improvement for Zv7, and it IS indeed, from the analysis report they
>>> released. So, we eagerly request UCSC, as one of the main hubs of
>>> data/tool for bioinformatics, to
>>> 1. update danRer6 (Zv8) new assembly on UCSC
>>> 2. make hg18:danRer6 chain/net alignment
>>> 3. put zebrafish self-chain alignment.
>>>
>>> Thanks
>>>
>>> Regards,
>>>
>>> Xianjun
>>>
>>> Jennifer Jackson wrote:
>>>> Hello,
>>>> One of our scientists has some specific ideas concerning the zebrafish
>>>> assembly as follows:
>>>>
>>>> They think the main reason the genome is so difficult to assemble was due
>>>> to the DNA collection strategy:-
>>>> http://www.sanger.ac.uk/Projects/D_rerio/Zv3_assembly_information.shtml
>>>>
>>>> The FAQ indicates there should be a finished genome by the end of this
>>>> year:
>>>> http://www.sanger.ac.uk/Projects/D_rerio/faqs.shtml#factsnine
>>>>
>>>> Maybe you could discuss your suggestion with the sequencing project,
>>>> and if it would help them, we could discuss it further.
>>>>
>>>> Thank you for your offer to help improve the data,
>>>> Jennifer Jackson
>>>> UCSC Genome Bioinformatics Group
>>>>
>>>> Xianjun Dong wrote:
>>>>> To those who might concern,
>>>>>
>>>>> Zebrafish has been one of the most studied models in study of whole
>>>>> genome duplication and development, but its genome assembly is not so
>>>>> well (which is naturally difficult also due to the whole genome
>>>>> duplication there). We also noticed much duplication closely mapped
>>>>> in same chromosome, which actually are proved as assembly error in
>>>>> zv7, by BLATing in the new assembly Zv8
>>>>> (http://pre.ensembl.org/Danio_rerio/Info/Index). Before Zv8
>>>>> annotation get done (which might help to some extent, but not all), I
>>>>> am thinking if UCSC could make a self-chain for zebrafish, just like
>>>>> you did for human. If that information offered, we could write a
>>>>> script to quickly check those 'tandem' duplications close in genome,
>>>>> which can eventually help to improve the quality of the current
>>>>> assembly.
>>>>>
>>>>> If you think this might not be done in the coming soon by your plan,
>>>>> I will be appreciated if you can offer any assistance for me to try
>>>>> it myself.
>>>>>
>>>>> Regards,
>>>>>
>>>
>>> --
>>> Sterding (Xianjun) Dong
>>> PhD student, Boris Lenhard's group
>>> Bergen Center of Computational Science
>>> Bergen University, Norway
>>> Mobile: 0047-47361688
>>> Telephone: 0047-55584022
>>> Skype: xianjun.dong
>>>
>>> _______________________________________________
>>> Genome maillist - [email protected]
>>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>>
>
> --
> ==========================================
> Xianjun Dong
> PhD student, Lenhard group
> Computational Biology Unit
> Bergen Center for Computational Science
> University of Bergen
> Hoyteknologisenteret, Thormohlensgate 55
> N-5008 Bergen, Norway
> E-mail: [email protected]
> Tel.: +47 555 84022
> Fax : +47 555 84295
> ==========================================
>

_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
