On 23/04/13 04:34 PM, Daniel Gruner wrote:
> One further comment: The standard output file is huge! This is not very
> good, especially if it is not redirected to an actual file on disk.
This was a known problem in v2.1.0 and before. The verbosity of Ray v2.2.0
is much reduced.

> The standard .o and .e files from torque are on the head node of the job,
> and since the nodes are diskless the files really are on ramdisk. I've
> noticed in Alison's directory that the output files are about 8 GB in
> size. This means that more than 1/2 of the memory on the head node would
> have been consumed by this file alone!

That's silly hehe.

> I've also noticed that the output is extremely verbose - is there a way
> to quiet it down?

Yes, use v2.2.0. ;-)

Relevant changes:

$ git log --oneline v2.1.0..v2.2.0 | grep verbos
173da1a reduce verbosity of components
32b48a0 reduced verbosity
5aa8f70 reduced verbosity
773ab3c VerticesExtractor: reduced verbosity
c5e0eda SequencesLoader: reduced verbosity
1df6943 KmerAcademyBuilder: reduced the verbosity for graph construction
5831bc5 JoinerTaskCreator: reduced the default verbosity
7b777f6 SeedExtender: reduce the verbosity of graph traversal

> We see a lot of I/O happening from some ranks, while others are polling
> (i.e. doing nothing), so we really have a problem here.

Maybe it's the standard output. v2.2.0 fixes that.

> Danny
>
> On Tue, Apr 23, 2013 at 04:17:26PM -0400, Daniel Gruner wrote:
>> Hi Sebastien,
>>
>> Yes, it is on the GPC.
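Until the upgrade to v2.2.0, the verbose stdout can at least be kept off the
diskless head node by redirecting it to the scratch filesystem inside the job
script, so torque's .o file stays small. A minimal sketch only; the LOGDIR
path is a hypothetical illustration (not part of the original job), and most
Ray flags are omitted for brevity:

```shell
#!/bin/bash
#PBS -l nodes=128:ppn=8
#PBS -l walltime=10:00:00
cd $PBS_O_WORKDIR

# Send Ray's stdout/stderr to a file on shared scratch instead of letting
# torque accumulate it in the diskless head node's ramdisk.
# LOGDIR is an assumed path; adjust to your own allocation.
LOGDIR=/scratch/a/abaker/acloutie/ray_logs
mkdir -p $LOGDIR
mpiexec -n 1024 Ray -k 31 -o RayOutput_k31 > $LOGDIR/ray.$PBS_JOBID.log 2>&1
```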
>> Here is the full script that was run:
>>
>> #!/bin/bash
>> #PBS -l nodes=128:ppn=8
>> #PBS -l walltime=10:00:00
>> #PBS -N moa_Ray_flash_assembly_31mer_128nodes_disable_recycling
>> cd $PBS_O_WORKDIR
>> module load gcc
>> module unload openmpi/1.4.4-intel-v12.1
>> module load openmpi/1.4.4-gcc-v4.6.1
>> mpiexec -n 1024 Ray -k 31 -o RayOutput_k31_128nodes_disable-recycling \
>>     -route-messages -connection-type debruijn -routing-graph-degree 32 \
>>     -disable-recycling \
>>     -p /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R1.fastq \
>>        /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R2.fastq \
>>     -s /scratch/a/abaker/acloutie/moa_ray_input/moa_all_single.fastq \
>>     -read-write-checkpoints checkpoints_128nodes_disable_recycling
>>
>> Danny
>>
>> On Tue, Apr 23, 2013 at 04:11:33PM -0400, Sébastien Boisvert wrote:
>>> Hello,
>>>
>>> I CC'ed the mail to the community mailing list.
>>>
>>> To subscribe:
>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>>>
>>> On 23/04/13 04:01 PM, Daniel Gruner wrote:
>>>> Hi Alison,
>>>>
>>>> We've noticed you are running a fairly large Ray assembly job, like
>>>> you did last Saturday. It turns out that this job causes a high load
>>>> on the filesystem, and the effect is to slow down access to files for
>>>> everybody on SciNet.
>>>
>>> On gpc, right?
>>>
>>>> We are not sure why this is, and in fact I'd like to contact the
>>>> developer of Ray, Sebastien Boisvert, about it. Perhaps you can tell
>>>> me some details of your current calculation, using 1024 cores.
>>>>
>>>> There is a new version of Ray out, and it is conceivable that some of
>>>> the problems have been addressed. Would you be able to test it?
>>>> Supposedly it fixes a number of issues.
>>>
>>> The following options can do a lot of input/output operations:
>>>
>>> -write-kmers
>>>
>>> -read-write-checkpoints CheckpointDirectory
>>>
>>>> It may be useful to be able to tell the developer of Ray what exactly
>>>> you are doing in your calculation, so that he may be able to determine
>>>> if there is indeed a problem with his I/O strategy.
>>>
>>> Can you provide the complete command line?
>>>
>>> The .fastq.gz file format has readahead code implemented too. That may
>>> help.
>>>
>>>> I am copying Sebastien in this email.
>>>>
>>>> Thanks and regards,
>>>> Danny
>>
>> --
>> Dr. Daniel Gruner                              dgru...@scinet.utoronto.ca
>> Chief Technical Officer - Software             phone: (416)-978-2775
>> SciNet High Performance Computing Consortium   www.scinethpc.ca
>> Compute/Calcul Canada                          www.computecanada.ca
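Putting the two suggestions above together, one could drop the I/O-heavy
checkpoint option and feed Ray gzip-compressed reads so its .fastq.gz
readahead can help. This is a sketch only: the paths are shortened and the
flag set is trimmed relative to the posted job, so adapt it rather than run
it as-is:

```shell
# Compress the inputs once on scratch (gzip replaces each .fastq with a
# .fastq.gz in place).
gzip moa_all_R1.fastq moa_all_R2.fastq moa_all_single.fastq

# Re-run without -read-write-checkpoints, reading the compressed files;
# Ray's readahead for .fastq.gz should reduce filesystem pressure.
mpiexec -n 1024 Ray -k 31 -o RayOutput_k31_gz \
    -p moa_all_R1.fastq.gz moa_all_R2.fastq.gz \
    -s moa_all_single.fastq.gz
```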