Hello group and Sébastien,

I use Ray as the default assembler for my project (a chromosome broken into individual BAC clones) because, with default settings, it gives better results by the following criteria:

1) The total assembly length is always close to the physical fingerprinting result;
2) Fewer scaffolds, as the aim is to get a single scaffold for each data set;
3) A longer maximum scaffold when more than one scaffold is created;
4) A longer N50 when many scaffolds exist, although this metric does not make much sense in my case, as there are normally not very many scaffolds (<30).
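For reference, the four criteria above can all be computed from a list of scaffold lengths. The following is a minimal sketch in Python, assuming the lengths (in bp) have already been extracted from the assembler's scaffold output; the function name `assembly_stats` and the example lengths are hypothetical and are not part of Ray.

```python
def assembly_stats(scaffold_lengths):
    """Return scaffold count, total length, longest scaffold and N50."""
    lengths = sorted(scaffold_lengths, reverse=True)
    total = sum(lengths)

    # N50: length of the scaffold at which the running sum of the
    # length-sorted scaffolds first reaches half of the total length.
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            n50 = length
            break

    return {
        "scaffolds": len(lengths),
        "total_length": total,
        "max_scaffold": lengths[0] if lengths else 0,
        "n50": n50,
    }


# Hypothetical scaffold lengths (bp) for one BAC clone.
example = [80_000, 30_000, 10_000, 5_000]
stats = assembly_stats(example)
print(stats)
```

With only a few scaffolds per clone, N50 tends to equal the longest scaffold, which is one way to see why it carries little extra information in this setting.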
I got a single-scaffold assembly for 25% of the cases, but I am hoping to get a single scaffold for most of the clones, as the sequencing depth is quite high (>100x generally). I am now trying to change some of the settings, starting with -merge seeds or -ignore seeds, and -minimum-contig-length.

(a) I am not sure how the seeds will affect the final assembly, as I do not understand what "seeds" really means in the Ray algorithm;
(b) Will setting -minimum-contig-length lose informative contigs when some of the contigs are short, especially in a highly repetitive genome, which is my case?

Any experience with these would be appreciated. Sébastien, can you elaborate on these settings, please?

Thank you very much!

Yifang

-use-minimum-seed-coverage
-minimum-contig-length
-merge-seeds

________________________________________________________________________
Bioinformatics Support Specialist | Spécialiste de soutien en bioinformatiques
National Research Council of Canada | Conseil national de recherches Canada
Government of Canada | Gouvernement du Canada
110 Gymnasium Place | 110, place Gymnasium
Saskatoon, Saskatchewan S7N 0W9
Tel / Tél : 306-975-5279
Fax | Télécopieur : 306-975-4839
________________________________________
From: Sébastien Boisvert [sebastien.boisver...@ulaval.ca]
Sent: Friday, February 06, 2015 8:43 AM
To: Marco van het Hoog
Cc: denovoassembler-us...@lists.sf.net
Subject: [Denovoassembler-users] RE : Ray assembly of Hamster genome.

> ________________________________________
> From: Marco van het Hoog [ma...@vanhethoog.com]
> Sent: January 20, 2015 19:50
> To: Sébastien Boisvert
> Subject: Ray assembly of Hamster genome.
>
> Hello Sébastien ...

Hi,

> I am working for the National Research Council, Biotechnology Research
> Institute in Montreal, and therefore (like you) I have access to all the
> Calcul Quebec servers, including Colosse and Guillimin.

I don't have access to these resources now because I am currently working in the U.S.
> We just received 3 HiSeq lanes of reads to construct a genome assembly
> of a Hamster CHO cell line.
> The estimated genome size of Hamster is about 2.8 Gb, for Human (if I
> remember well) it's about 3.3 Gb, so it's a similar size.
> The 3 lane coverage should be around 30X.
> Could you tell me, if you were to start such a project with Ray, what
> kind of settings you would use?

Take a look at the Ray job scripts that were used for the Assemblathon 2:
https://github.com/sebhtml/assemblathon-2-ray

> Would you use Colosse and Guillimin with 30 cores each, or would you use
> Mammouth Parallèle with 256 or 512 GB in memory?

You would want to use many machines with something like 24 GB RAM each.

> The memory requirements of Ray are a bit confusing to me :)
> Thanks in advance for any suggestion.
> - Marco.

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users