Therefore, the user should split the input into more than 3 files when running on
1024 MPI ranks.
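
One way to do the split is sketched below. It is a minimal example, not something taken from Ray itself: the function name `split_fastq`, the chunk size, and the output prefixes are hypothetical, and it assumes GNU `split` (for `-d` and `--additional-suffix`) and plain FASTQ with exactly 4 lines per read.

```shell
#!/bin/bash
# Split a FASTQ file into chunks that each hold whole 4-line records.
# Usage: split_fastq INPUT READS_PER_CHUNK OUTPUT_PREFIX
# Assumptions: GNU split, and FASTQ records of exactly 4 lines
# (no wrapped sequence or quality lines).
split_fastq() {
    local input=$1 reads_per_chunk=$2 prefix=$3
    # 4 lines per read, so a chunk boundary never falls inside a record.
    # Chunks are named PREFIX0000.fastq, PREFIX0001.fastq, ...
    split -l "$(( reads_per_chunk * 4 ))" -d -a 4 \
        --additional-suffix=.fastq "$input" "$prefix"
}
```

For example, `split_fastq moa_all_R1.fastq 1000000 moa_R1_` followed by `split_fastq moa_all_R2.fastq 1000000 moa_R2_` uses the same chunk size for both files, so mates stay in same-numbered chunks as long as R1 and R2 list the reads in the same order. Each resulting pair of chunks would then be passed to Ray as its own pair of files.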

On 23/04/13 04:37 PM, Daniel Gruner wrote:
> Here is the tail of the stdout:
>
> Rank 358 has 390000 sequence reads
> Rank 358: assembler memory usage: 273852 KiB
> Rank 920 has 330000 sequence reads
> Rank 920: assembler memory usage: 264648 KiB
> Rank 920 has 340000 sequence reads
> Rank 920: assembler memory usage: 264648 KiB
> Rank 358 has 400000 sequence reads
> Rank 358: assembler memory usage: 273852 KiB
> Rank 358 has 400690 sequence reads (completed)
> Rank 358 is writing checkpoint Sequences
> Rank 920 has 350000 sequence reads
> Rank 920: assembler memory usage: 281036 KiB
> Rank 920 has 360000 sequence reads
> Rank 920: assembler memory usage: 281036 KiB
> Rank 920 has 370000 sequence reads
> Rank 920: assembler memory usage: 281036 KiB
> Rank 920 has 380000 sequence reads
> Rank 920: assembler memory usage: 281036 KiB
> Rank 920 has 390000 sequence reads
> Rank 920: assembler memory usage: 281036 KiB
> Rank 920 has 400000 sequence reads
> Rank 920: assembler memory usage: 281036 KiB
> Rank 920 has 400690 sequence reads (completed)
> Rank 920 is writing checkpoint Sequences
> Rank 961 has 0 sequence reads
> Rank 961: assembler memory usage: 265612 KiB
>
> The head node shows about 100 minutes of cpu usage on each rank.
>
> Danny
>
> On Tue, Apr 23, 2013 at 04:31:14PM -0400, Sébastien Boisvert wrote:
>> On 23/04/13 04:17 PM, Daniel Gruner wrote:
>>> Hi Sebastien,
>>>
>>> Yes, it is on the GPC.
>>> Here is the full script that was run:
>>>
>>> #!/bin/bash
>>> #PBS -l nodes=128:ppn=8
>>> #PBS -l walltime=10:00:00
>>> #PBS -N moa_Ray_flash_assembly_31mer_128nodes_disable_recycling
>>> cd $PBS_O_WORKDIR
>>> module load gcc
>>> module unload openmpi/1.4.4-intel-v12.1
>>> module load openmpi/1.4.4-gcc-v4.6.1
>>> mpiexec -n 1024 Ray -k 31 -o RayOutput_k31_128nodes_disable-recycling 
>>> -route-messages -connection-type debruijn -routing-graph-degree 32 
>>> -disable-recycling -p 
>>> /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R1.fastq 
>>> /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R2.fastq -s 
>>> /scratch/a/abaker/acloutie/moa_ray_input/moa_all_single.fastq 
>>> -read-write-checkpoints checkpoints_128nodes_disable_recycling
>>>
>>
>> The polytope graph is much better than the de Bruijn graph for routing 
>> messages. But it is not the reason for the high number of
>> I/O operations.
>>
>>
>>
>> A first possible reason for slow I/O is that the user has only 3 input 
>> files, yet the job runs on 1024 MPI ranks.
>>
>> Sequencing data is definitely not produced as 3 files upstream.
>>
>> It's usually better to use more fastq input files.
>>
>>
>>
>> Another possible reason would be the checkpoints. Some of the checkpointing 
>> code groups I/O operations, but other parts do not.
>>
>> There is an open issue to fix this: 
>> https://github.com/sebhtml/ray/issues/57
>>
>>
>> So a tail on the standard output would (probably) help to zero in on the 
>> real issue here.
>>
>>
>>
>>> Danny
>>>
>>>
>>> On Tue, Apr 23, 2013 at 04:11:33PM -0400, Sébastien Boisvert wrote:
>>>> Hello,
>>>>
>>>> I CC'ed the mail to the community mailing list.
>>>>
>>>> To subscribe:  
>>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>>>>
>>>> On 23/04/13 04:01 PM, Daniel Gruner wrote:
>>>>> Hi Alison,
>>>>>
>>>>> We've noticed you are running a fairly large Ray assembly job, like you
>>>>> did last Saturday.  It turns out that this job causes a high load on the
>>>>> filesystem, and the effect is to slow down access to files for everybody
>>>>> on SciNet.
>>>>>
>>>>
>>>> On the GPC, right?
>>>>
>>>>> We are not sure why this is, and in fact I'd like to contact the
>>>>> developer of Ray, Sebastien Boisvert, about it.  Perhaps you can tell me
>>>>> some details of your current calculation, using 1024 cores.
>>>>>
>>>>> There is a new version of Ray out, and it is conceivable that some of the
>>>>> problems have been addressed. Would you be able to test it?  Supposedly it
>>>>> fixes a number of issues.
>>>>
>>>> The following options can do a lot of input/output operations:
>>>>
>>>> -write-kmers
>>>>
>>>> -read-write-checkpoints CheckpointDirectory
>>>>
>>>>>
>>>>> It may be useful to be able to tell the developer of Ray what exactly you
>>>>> are doing in your calculation, so that he may be able to determine if
>>>>> there is indeed a problem with his I/O strategy.
>>>>
>>>> Can you provide the complete command line?
>>>>
>>>>
>>>> Readahead code is also implemented for the .fastq.gz file format. That may 
>>>> help.
>>>>
>>>>>
>>>>> I am copying Sebastien in this email.
>>>>>
>>>>> Thanks and regards,
>>>>> Danny
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>

