> From: ino4presid...@gmail.com [ino4presid...@gmail.com] on behalf of Ino de 
> Bruijn [ino.debru...@scilifelab.se]
> Sent: Sunday, May 18, 2014 4:06 AM
> To: Boisvert, Sebastien
> Cc: denovoassembler-users@lists.sourceforge.net
> Subject: Re: [Denovoassembler-users] Duplicate contigs?
> 
> 
> 
> Hi,
> 
> I didn't manage to finish in time with the debugging option. Unfortunately, 
> our Cray XE6 cluster only allows jobs to run for 24 h. I had it running on 
> 2,048 cores, but no luck. I'll see if I can recreate the problem using only 
> reads from the one genome that is causing the problem.
> 

A general principle in testing is to reproduce the problem quickly on the 
smallest dataset that still triggers it.
Sometimes this is hard to achieve.
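
One way to get a smaller test case is to subsample the read pairs before 
assembling. A minimal Python sketch, assuming uncompressed paired FASTQ files 
with four lines per record and mates in the same order in both files. The 
function name and its parameters are placeholders, and whether a subsample 
still produces the duplicate contigs has to be checked:

```python
import random


def subsample_pairs(in1, in2, out1, out2, fraction, seed=0):
    """Keep each read pair with probability `fraction`; return the number kept.

    Assumes plain 4-line FASTQ records with mates in the same order.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is repeatable
    kept = 0
    with open(in1) as f1, open(in2) as f2, \
            open(out1, "w") as o1, open(out2, "w") as o2:
        while True:
            rec1 = [f1.readline() for _ in range(4)]
            rec2 = [f2.readline() for _ in range(4)]
            if not rec1[0] or not rec2[0]:
                break  # end of either input file
            if rng.random() < fraction:
                # Write both mates together so the files stay in sync.
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept
```

Keeping the seed fixed means the same subsample can be regenerated later when 
comparing runs.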

> I am completely lost as to how this works, so I was wondering: if one 
> increases the number of cores, do you think the job might be able to finish?

Yes, Ray scales well. But with more cores you will also use up your 
core-hours faster.

> Or could it be that collecting the debug output from all the different ranks 
> is actually limiting and increasing the number


The -debug-fusions option does increase the amount of output, but I 
don't think the increase is significant.

You can confirm this by checking the size of your standard output file.
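
For example, a short Python sketch that reports a file's size in readable 
units. The function name is a placeholder, and the path to pass in is whatever 
file your job script redirects standard output to:

```python
import os


def human_size(path):
    """Return the size of `path` as a human-readable string."""
    size = float(os.path.getsize(path))
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"
```

Comparing the result for a run with -debug-fusions against a run without it 
shows how much output the option actually adds.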


> of cores only increases the queue for output collecting.
> 
> Best regards,
> Ino
> 
> 
> 
> On Fri, May 16, 2014 at 4:09 PM, Ino de Bruijn
> <ino.debru...@scilifelab.se> wrote:
> 
> OK, I'll try that, thanks!
> 
> 
> Best regards,
> Ino
> 
> 
> 
> 
> On Fri, May 16, 2014 at 4:04 PM, Boisvert, Sebastien
> <boisv...@anl.gov> wrote:
> 
> Graph traversal takes place in parallel, so it is common for
> the same path to be obtained in more than one copy.
> 
> 
> But the code in
> https://github.com/sebhtml/ray/blob/master/code/FusionTaskCreator/FusionWorker.cpp
> collapses these into one copy.
> 
> If you add the option -debug-fusions, you may see the underlying issue and 
> perhaps find a solution.
> 
> > From: ino4presid...@gmail.com [ino4presid...@gmail.com] on behalf of Ino de 
> > Bruijn [ino.debru...@scilifelab.se]
> > Sent: Friday, May 16, 2014 8:59 AM
> > To: Boisvert, Sebastien
> > Cc: denovoassembler-users@lists.sourceforge.net
> > Subject: Re: [Denovoassembler-users] Duplicate contigs?
> >
> >
> >
> >
> > Hi,
> >
> > Hmm, interesting. This is indeed the reverse-complement. But you don't 
> > normally output those, right?
> >
> >
> > There are also some where they are exactly the same:
> >
> > 1       825006  1       825006  825006  825006  100.00  825006  825006  100.00  100.00  contig-859      contig-539
> >
> >
> >
> > Best regards,
> > Ino
> >
> >
> >
> > On Fri, May 16, 2014 at 3:44 PM, Boisvert, Sebastien
> > <boisv...@anl.gov> wrote:
> >
> > > From: Ino de Bruijn [ino.debru...@scilifelab.se]
> > > Sent: Friday, May 16, 2014 7:31 AM
> > > To: denovoassembler-users@lists.sourceforge.net
> > > Subject: [Denovoassembler-users] Duplicate contigs?
> >
> > >
> > >
> > >
> > >
> > > Bonjour,
> > >
> > > I have assembled a simulated metagenome of 20 genomes. I have 64 pairs 
> > > of files, each containing 23.5M read pairs. I have been running Ray 
> > > v2.3.1 on 2048 cores.
> > >
> > > For one of the genomes I seem to be getting duplicate contigs.
> > >
> > >
> > > I have mapped the assembly against itself using MUMmer and found that 
> > > some of the contigs are exactly the same. MUMmer shows this kind of hit:
> > >
> > >
> > > [S1] [E1] [S2] [E2] [LEN 1] [LEN 2] [% IDY] [LEN R] [LEN Q] [COV R] [COV Q] [REF] [QRY]
> > >
> > >
> > > 1       825006  825006  1       825006  825006  100.00  825006  825006  100.00  100.00  contig-859      contig-717
> > >
> > >
> > > For an explanation of the format:
> > > http://mummer.sourceforge.net/manual/#coords
> > >
> >
> >
> > I have seen cases where two contigs overlapped significantly and Ray 
> > failed to merge them.
> >
> > I have not seen a case where there are two identical contigs (in your 
> > case, one is the reverse-complement of the other).
> >
> >
> > >
> > > Do you have any idea how this might happen?
> > >
> > >
> > > Best regards,
> > > Ino
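
As a check that does not depend on MUMmer, exact duplicates like the ones in 
the quoted thread (including reverse-complement copies) can be listed directly 
from the contig FASTA file. A minimal Python sketch under stated assumptions: 
this is not how Ray's FusionWorker works, the function names are placeholders, 
and only full-length identical contigs are found, not partial overlaps:

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(path):
    """Yield (name, sequence) pairs from a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Use the first word of the header as the contig name.
                name, chunks = (line[1:].split() or [""])[0], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def duplicate_groups(records):
    """Group names of contigs that are identical up to reverse complement."""
    groups = defaultdict(list)
    for name, seq in records:
        seq = seq.upper()
        # A sequence and its reverse complement share one canonical key.
        canonical = min(seq, reverse_complement(seq))
        groups[canonical].append(name)
    return [names for names in groups.values() if len(names) > 1]
```

Calling duplicate_groups(read_fasta(path)) on the assembly's contig file 
returns one list of contig names per duplicated sequence. The sketch keeps 
every sequence in memory, which may matter for large assemblies.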
