> From: ino4presid...@gmail.com [ino4presid...@gmail.com] on behalf of Ino de Bruijn [ino.debru...@scilifelab.se]
> Sent: Sunday, May 18, 2014 4:06 AM
> To: Boisvert, Sebastien
> Cc: denovoassembler-users@lists.sourceforge.net
> Subject: Re: [Denovoassembler-users] Duplicate contigs?
>
> Hi,
>
> Didn't manage to finish in time with the debugging option. Our Cray XE6
> cluster only allows jobs for 24h unfortunately. I had it running with 2,048
> cores, but no luck. I'll see if I can recreate the problem by only using
> reads from the one genome that is causing the problem.

A general principle in testing is to be able to reproduce a problem rapidly
on the smallest possible dataset. Sometimes this is hard to achieve.

> I was wondering, completely lost as to how this works. If one increases the
> number of cores, do you think it might be able to finish?

Yes, Ray has good scalability. But you'll be consuming your core-hours at the
same time.

> Or could it be that collecting the debug output from all the different ranks
> is actually limiting and increasing the number of cores only increases the
> queue for output collecting.

The option to debug this part does increase the amount of information, but I
don't think this is significant. You can be sure by checking the actual file
size of your standard output file.

> Best regards,
> Ino
>
> On Fri, May 16, 2014 at 4:09 PM, Ino de Bruijn <ino.debru...@scilifelab.se> wrote:
>
> Ok, I'll try that, thanks!
>
> Best regards,
> Ino
>
> On Fri, May 16, 2014 at 4:04 PM, Boisvert, Sebastien <boisv...@anl.gov> wrote:
>
> Graph traversal takes place in parallel, so it is usual that the same path
> is obtained in more than one copy.
>
> But the code in
> https://github.com/sebhtml/ray/blob/master/code/FusionTaskCreator/FusionWorker.cpp
> collapses these into one copy.
>
> If you add the option -debug-fusions, you may see the underlying issue and
> perhaps find a solution.
>
> > From: ino4presid...@gmail.com [ino4presid...@gmail.com] on behalf of Ino de Bruijn [ino.debru...@scilifelab.se]
> > Sent: Friday, May 16, 2014 8:59 AM
> > To: Boisvert, Sebastien
> > Cc: denovoassembler-users@lists.sourceforge.net
> > Subject: Re: [Denovoassembler-users] Duplicate contigs?
> >
> > Hi,
> >
> > Hmm interesting. This is indeed the reverse-complement. But you don't
> > output those normally, right?
> >
> > There are also some where they are exactly the same:
> >
> > 1 825006 1 825006 825006 825006 100.00 825006 825006 100.00 100.00 contig-859 contig-539
> >
> > Best regards,
> > Ino
> >
> > On Fri, May 16, 2014 at 3:44 PM, Boisvert, Sebastien <boisv...@anl.gov> wrote:
> >
> > > From: Ino de Bruijn [ino.debru...@scilifelab.se]
> > > Sent: Friday, May 16, 2014 7:31 AM
> > > To: denovoassembler-users@lists.sourceforge.net
> > > Subject: [Denovoassembler-users] Duplicate contigs?
> > >
> > > Hello,
> > >
> > > I have assembled a simulated metagenome of 20 genomes. I have 64 pairs
> > > each containing 23,5M pairs. I have been using Ray v2.3.1 using 2048
> > > cores.
> > >
> > > For one of the genomes I seem to be getting duplicate contigs.
> > >
> > > I have mapped the assembly against itself using MUMmer and found out that
> > > some of the contigs are exactly the same. MUMmer shows this kind of hit:
> > >
> > > [S1] [E1] [S2] [E2] [LEN 1] [LEN 2] [% IDY] [LEN R] [LEN Q] [COV R] [COV Q] [REF] [QRY]
> > >
> > > 1 825006 825006 1 825006 825006 100.00 825006 825006 100.00 100.00 contig-859 contig-717
> > >
> > > For an explanation of the format:
> > >
> > > http://mummer.sourceforge.net/manual/#coords
> > >
> >
> > I have seen cases where there was a significant overlap between two contigs
> > and for which Ray failed to merge them.
> >
> > I have not seen the case where there are 2 identical contigs (in your case,
> > one is the reverse-complement of the other).
> >
> > >
> > > Do you have any idea how this might happen?
> > >
> > > Best regards,
> > > Ino

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
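
The two kinds of hit quoted in the thread (a forward full-length match and a reverse-complement full-length match) can be told apart from the coordinate columns alone. Below is a minimal sketch, not part of the original thread, assuming the 13-column show-coords layout quoted above ([S1] [E1] [S2] [E2] [LEN 1] [LEN 2] [% IDY] [LEN R] [LEN Q] [COV R] [COV Q] [REF] [QRY]). The names `Hit`, `parse_row`, and `classify` are hypothetical helpers made up for illustration; they are not part of MUMmer or Ray.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One alignment row; field meanings assume the column order quoted in the thread."""
    s1: int
    e1: int
    s2: int
    e2: int
    idy: float
    len_r: int
    len_q: int
    ref: str
    qry: str


def parse_row(line: str) -> Hit:
    """Parse one whitespace-separated coords row (any '|' separators are ignored)."""
    fields = line.replace("|", " ").split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 columns, got {len(fields)}")
    return Hit(
        s1=int(fields[0]), e1=int(fields[1]),
        s2=int(fields[2]), e2=int(fields[3]),
        idy=float(fields[6]),
        len_r=int(fields[7]), len_q=int(fields[8]),
        ref=fields[11], qry=fields[12],
    )


def classify(hit: Hit) -> Optional[str]:
    """Return 'identical', 'reverse-complement', or None for anything else.

    A hit counts as a duplicate only if it joins two different contigs,
    has 100% identity, and covers both sequences end to end.
    """
    if hit.ref == hit.qry or hit.idy < 100.0:
        return None
    covers_ref = min(hit.s1, hit.e1) == 1 and max(hit.s1, hit.e1) == hit.len_r
    covers_qry = min(hit.s2, hit.e2) == 1 and max(hit.s2, hit.e2) == hit.len_q
    if not (covers_ref and covers_qry):
        return None
    # Query coordinates running from high to low indicate the reverse strand.
    same_strand = (hit.s1 < hit.e1) == (hit.s2 < hit.e2)
    return "identical" if same_strand else "reverse-complement"


if __name__ == "__main__":
    rows = [
        "1 825006 825006 1 825006 825006 100.00 825006 825006 100.00 100.00 contig-859 contig-717",
        "1 825006 1 825006 825006 825006 100.00 825006 825006 100.00 100.00 contig-859 contig-539",
    ]
    for row in rows:
        hit = parse_row(row)
        print(hit.ref, hit.qry, classify(hit))
```

Run over a whole coords file, a filter like this would list only the contig pairs that are full-length copies of each other, which may help narrow down which paths the fusion step failed to collapse.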