Hi Galt,

Thanks so much for the work.

Do you mean the contigs that are not mapped to a chromosome (such as Zv8_NA*** or Zv8_scaffold***)? I am not sure how including them would make the display 'noisy'. For studies such as duplicated genes, I guess it is OK to leave them out. I just checked the Ensembl genes in Zv8: only a small fraction (~5.7%, 1579 out of 27855 Ensembl genes) are on Zv8_*** contigs. I do not know whether other people have a special need for them. In principle, though, they should be there, since they are part of the genome assembly. Maybe the better way is to treat the self-alignment the way you treat a normal species pair and include both chain and net (see below).

I find the selfChain idea very interesting and useful in several ways, for example in reconstructing the evolutionary history of whole-genome duplication (WGD, e.g. the 2R WGD for human, the 3R WGD for teleost fish) and in comparing paralogous genes within one species, especially when it is hard to find a suitable genome as an outgroup.

One question: with only the chain, it may still be hard to know which alignment blocks (the units within a chain) come from the same locus before duplication. I had expected something like a 'joined chain'. As I understand it, the net is essentially a sorted set of chains, with the higher-scoring non-overlapping chains placed at the top. That design helps people filter out low-scoring chains (for example, those caused by repeats or by misassembly, whose sequence often ends up in unmapped contigs). So I would really appreciate it if you could make the selfNet.

BTW, I suggest moving the selfChain (and the selfNet, if you make one) to "Comparative Genomics". They are now in "Variation and Repeats". Maybe you have a reason for that.

Thanks again,
Xianjun

Galt Barber wrote:
>
> Hi! I am working on self-chain for Zv8 now.
> I had a question about whether people want
> to include the approx. 11,000 contigs
> in the self-chain, or leave them out.
> It's more work to leave them out,
> but it may make the display easier to
> read.
> On the other hand, perhaps having
> the scaffolds included is important to
> you for some reason.
>
> Which way would be better for your work?
>
> thanks
>
> -Galt
>
>
> On Wed, 27 May 2009, Xianjun Dong wrote:
>
>> Hi,
>>
>> OK. I am back with the same request now, after the assembly/annotation
>> of Zv8 is done.
>>
>> We (and also the community, I think) know that, Zv8 was expected as a
>> big improvement for Zv7, and it IS indeed, from the analysis report they
>> released. So, we eagerly request UCSC, as one of the main hubs of
>> data/tool for bioinformatics, to
>> 1. update danRer6 (Zv8) new assembly on UCSC
>> 2. make hg18:danRer6 chain/net alignment
>> 3. put zebrafish self-chain alignment.
>>
>> Thanks
>>
>> Regards,
>>
>> Xianjun
>>
>>
>> Jennifer Jackson wrote:
>>> Hello,
>>> One of our scientists has some specific ideas concerning the zebrafish
>>> assembly as follows:
>>>
>>> They think the main reason the genome is so difficult to assemble
>>> was due
>>> to the DNA collection strategy:-
>>> http://www.sanger.ac.uk/Projects/D_rerio/Zv3_assembly_information.shtml
>>>
>>> The FAQ indicates there should be a finished genome by the end of this
>>> year:
>>> http://www.sanger.ac.uk/Projects/D_rerio/faqs.shtml#factsnine
>>>
>>> Maybe you could discuss your suggestion with the sequencing project,
>>> and if it would help them, we could discuss it further.
>>>
>>> Thank you for your offer to help improve the data,
>>> Jennifer Jackson
>>> UCSC Genome Bioinformatics Group
>>>
>>> Xianjun Dong wrote:
>>>> To those who might concern,
>>>>
>>>> Zebrafish has been one of the most studied models in study of whole
>>>> genome duplication and development, but its genome assembly is not so
>>>> well (which is naturally difficult also due to the whole genome
>>>> duplication there).
>>>> We also noticed much duplication closely mapped
>>>> in same chromosome, which actually are proved as assembly error in
>>>> zv7, by BLATing in the new assembly Zv8
>>>> (http://pre.ensembl.org/Danio_rerio/Info/Index). Before Zv8
>>>> annotation get done (which might help to some extent, but not all), I
>>>> am thinking if UCSC could make a self-chain for zebrafish, just like
>>>> you did for human. If that information offered, we could write a
>>>> script to quickly check those 'tandem' duplications close in genome,
>>>> which can eventually help to improve the quality of the current
>>>> assembly.
>>>>
>>>> If you think this might not be done in the coming soon by your plan,
>>>> I will be appreciated if you can offer any assistance for me to try
>>>> it myself.
>>>>
>>>> Regards,
>>>>
>>>>
>>
>> --
>> Sterding (Xianjun) Dong
>> PhD student, Boris Lenhard's group
>> Bergen Center of Computational Science
>> Bergen University, Norway
>> Mobile: 0047-47361688
>> Telephone: 0047-55584022
>> Skype: xianjun.dong
>>
>> _______________________________________________
>> Genome maillist - [email protected]
>> https://lists.soe.ucsc.edu/mailman/listinfo/genome
>>

--
==========================================
Xianjun Dong
PhD student, Lenhard group
Computational Biology Unit
Bergen Center for Computational Science
University of Bergen
Hoyteknologisenteret, Thormohlensgate 55
N-5008 Bergen, Norway
E-mail: [email protected]
Tel.: +47 555 84022
Fax : +47 555 84295
==========================================
_______________________________________________
Genome maillist - [email protected]
https://lists.soe.ucsc.edu/mailman/listinfo/genome
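
For readers who want to try the "leave the contigs out" option raised in this thread on their own copy of the data, here is a minimal sketch. It assumes the self-chain is available as a plain-text file in UCSC chain format (header lines of the form "chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id") and that unplaced sequences use the Zv8_NA and Zv8_scaffold name prefixes mentioned above. The function names and the sample records are hypothetical, and this is not how UCSC builds its tracks.

```python
from typing import Iterable, Iterator

# Name prefixes of unplaced sequences, as mentioned in the thread (assumption:
# the chain file uses these same names).
UNPLACED_PREFIXES = ("Zv8_NA", "Zv8_scaffold")


def is_unplaced(name: str) -> bool:
    """True if a sequence name looks like an unplaced contig or scaffold."""
    return name.startswith(UNPLACED_PREFIXES)


def drop_unplaced_chains(lines: Iterable[str]) -> Iterator[str]:
    """Yield chain-file lines, skipping every chain whose target or query
    sequence is an unplaced contig. Lines before the first chain header
    (for example comments) pass through unchanged."""
    keep = True
    for line in lines:
        fields = line.split()
        if fields and fields[0] == "chain":
            # Header layout: chain score tName tSize tStrand tStart tEnd qName ...
            t_name, q_name = fields[2], fields[7]
            keep = not (is_unplaced(t_name) or is_unplaced(q_name))
        if keep:
            yield line


# Made-up sample: one chromosome-to-chromosome chain, one chain onto a contig.
SAMPLE = """\
chain 5000 chr1 1000 + 0 100 chr2 2000 + 50 150 1
100

chain 3000 chr1 1000 + 200 300 Zv8_NA12 500 - 0 100 2
100
"""

filtered = list(drop_unplaced_chains(SAMPLE.splitlines()))
headers = [l for l in filtered if l.startswith("chain")]
print(len(headers), "chain(s) kept")
```

The same filter could be applied in the other direction (keeping only contig chains) to look at what would be lost by excluding them.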
