Seeing this post, I am a bit worried about my own jobs. Using
LoadLeveler on an iDataPlex cluster, I don't see output from different
rank numbers; they are all labeled Rank 0. In fact, with two tasks,
only two of the -output-filename files are written to, and both are
labeled Rank 0. Is this just a difference between the clusters, or am
I missing something?

#!/bin/sh
# @ job_name = crambe_ray
# @ output = $(job_name).out
# @ error = $(job_name).err

# @ job_type = mpich
# @ node = 2
# @ total_tasks = 16
# @ restart = yes


# @ wall_clock_limit = 60:00:00,59:55:00
# @ queue

mpirun -np $LOADL_TOTAL_TASKS -output-filename cr -machinefile $LOADL_HOSTFILE \
    /work/OOBMECO/bin/ray/Ray \
    -read-write-checkpoints checkpoints \
    -k 31 -amos -o crambe.3 \
    -s /scratch/suzukim/mira/CCB1a.fastq \
    -search /scratch/suzukim/NCBI-taxonomy/NCBI-Eukaryote-Genomes \
    -search /scratch/suzukim/NCBI-taxonomy/NCBI-Finished-Virus-Genomes \
    -search /scratch/suzukim/NCBI-taxonomy/NCBI-Draft-Bacteria-Genomes \
    -search /scratch/suzukim/NCBI-taxonomy/NCBI-Finished-Bacterial-Genomes \
    -with-taxonomy /scratch/suzukim/ggenes-taxonomy/Genome-to-Taxon.tsv \
        /scratch/suzukim/ggenes-taxonomy/TreeOfLife-Edges.tsv \
        /scratch/suzukim/ggenes-taxonomy/Taxon-Names.tsv \
    -search /scratch/suzukim/Ontology/EMBL_CDS_Sequences \
    -gene-ontology /scratch/suzukim/Ontology/OntologyTerms.txt \
        /scratch/suzukim/Ontology/Annotations.txt \
    -show-memory-usage
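
As a quick sanity check (a rough sketch only; the environment variable
names below are assumptions that depend on the MPI implementation, since
Open MPI exports OMPI_COMM_WORLD_RANK while MPICH-style launchers export
PMI_RANK):

  # do the launched processes actually get distinct ranks?
  mpirun -np 2 sh -c 'echo "host=$(hostname) OMPI=${OMPI_COMM_WORLD_RANK:-unset} PMI=${PMI_RANK:-unset}"'

  # and which MPI library was the Ray binary linked against?
  ldd /work/OOBMECO/bin/ray/Ray | grep -i mpi

If both processes report the same rank (or "unset"), the mpirun being
picked up probably does not match the MPI library Ray was compiled
against, which would also explain why every output file says Rank 0.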

Marcelino

On Apr 23, 2013, at 10:59 PM, Sébastien Boisvert wrote:

> Therefore the user should split the input into more than 3 files when
> running on 1024 MPI ranks.
>
> On 23/04/13 04:37 PM, Daniel Gruner wrote:
>> Here is the tail of the stdout:
>>
>> Rank 358 has 390000 sequence reads
>> Rank 358: assembler memory usage: 273852 KiB
>> Rank 920 has 330000 sequence reads
>> Rank 920: assembler memory usage: 264648 KiB
>> Rank 920 has 340000 sequence reads
>> Rank 920: assembler memory usage: 264648 KiB
>> Rank 358 has 400000 sequence reads
>> Rank 358: assembler memory usage: 273852 KiB
>> Rank 358 has 400690 sequence reads (completed)
>> Rank 358 is writing checkpoint Sequences
>> Rank 920 has 350000 sequence reads
>> Rank 920: assembler memory usage: 281036 KiB
>> Rank 920 has 360000 sequence reads
>> Rank 920: assembler memory usage: 281036 KiB
>> Rank 920 has 370000 sequence reads
>> Rank 920: assembler memory usage: 281036 KiB
>> Rank 920 has 380000 sequence reads
>> Rank 920: assembler memory usage: 281036 KiB
>> Rank 920 has 390000 sequence reads
>> Rank 920: assembler memory usage: 281036 KiB
>> Rank 920 has 400000 sequence reads
>> Rank 920: assembler memory usage: 281036 KiB
>> Rank 920 has 400690 sequence reads (completed)
>> Rank 920 is writing checkpoint Sequences
>> Rank 961 has 0 sequence reads
>> Rank 961: assembler memory usage: 265612 KiB
>>
>> The head node shows about 100 minutes of cpu usage on each rank.
>>
>> Danny
>>
>> On Tue, Apr 23, 2013 at 04:31:14PM -0400, Sébastien Boisvert wrote:
>>> On 23/04/13 04:17 PM, Daniel Gruner wrote:
>>>> Hi Sebastien,
>>>>
>>>> Yes, it is on the GPC.
>>>> Here is the full script that was run:
>>>>
>>>> #!/bin/bash
>>>> #PBS -l nodes=128:ppn=8
>>>> #PBS -l walltime=10:00:00
>>>> #PBS -N moa_Ray_flash_assembly_31mer_128nodes_disable_recycling
>>>> cd $PBS_O_WORKDIR
>>>> module load gcc
>>>> module unload openmpi/1.4.4-intel-v12.1
>>>> module load openmpi/1.4.4-gcc-v4.6.1
>>>> mpiexec -n 1024 Ray -k 31 -o RayOutput_k31_128nodes_disable-recycling \
>>>>     -route-messages -connection-type debruijn -routing-graph-degree 32 \
>>>>     -disable-recycling \
>>>>     -p /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R1.fastq \
>>>>        /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R2.fastq \
>>>>     -s /scratch/a/abaker/acloutie/moa_ray_input/moa_all_single.fastq \
>>>>     -read-write-checkpoints checkpoints_128nodes_disable_recycling
>>>>
>>>
>>> The polytope graph is much better than the de Bruijn graph for  
>>> routing messages. But it is not the reason for the high number of
>>> I/O operations.
>>>
>>>
>>>
>>> A first possible reason for slow I/O is that the user has only 3
>>> input files, yet there are 1024 MPI ranks.
>>>
>>> Sequencing data is definitely not produced as 3 files upstream.
>>>
>>> It's usually better to use more fastq input files.
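>>>
>>> For example (a sketch only, with a hypothetical chunk size, reusing a
>>> file name from the script above; --additional-suffix needs a fairly
>>> recent GNU split), a fastq file can be cut into record-aligned pieces
>>> because each read takes exactly 4 lines:
>>>
>>>   split -l 4000000 -d --additional-suffix=.fastq moa_all_single.fastq part_
>>>
>>> Each resulting part_NN.fastq can then be passed to Ray separately.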
>>>
>>>
>>>
>>> Another possible reason would be the checkpoints. Some of the
>>> checkpointing code groups I/O operations, but other parts do not.
>>>
>>> There is an open issue to fix this:
>>> https://github.com/sebhtml/ray/issues/57
>>>
>>>
>>> So a tail on the standard output would (probably) help to zero in  
>>> on the real issue here.
>>>
>>>
>>>
>>>> Danny
>>>>
>>>>
>>>> On Tue, Apr 23, 2013 at 04:11:33PM -0400, Sébastien Boisvert wrote:
>>>>> Hello,
>>>>>
>>>>> I CC'ed the mail to the community mailing list.
>>>>>
>>>>> To subscribe:  
>>>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>>>>>
>>>>> On 23/04/13 04:01 PM, Daniel Gruner wrote:
>>>>>> Hi Alison,
>>>>>>
>>>>>> We've noticed you are running a fairly large Ray assembly job, like
>>>>>> you did last Saturday.  It turns out that this job causes a high
>>>>>> load on the filesystem, and the effect is to slow down file access
>>>>>> for everybody on SciNet.
>>>>>>
>>>>>
>>>>> On GPC, right?
>>>>>
>>>>>> We are not sure why this is, and in fact I'd like to contact  
>>>>>> the developer of
>>>>>> Ray, Sebastien Boisvert, about it.  Perhaps you can tell me  
>>>>>> some details of
>>>>>> your current calculation, using 1024 cores.
>>>>>>
>>>>>> There is a new version of Ray out, and it is conceivable that  
>>>>>> some of the problems
>>>>>> have been addressed. Would you be able to test it?  Supposedly  
>>>>>> it fixes a number
>>>>>> of issues.
>>>>>
>>>>> The following options can do a lot of input/output operations:
>>>>>
>>>>> -write-kmers
>>>>>
>>>>> -read-write-checkpoints CheckpointDirectory
>>>>>
>>>>>>
>>>>>> It may be useful to tell the developer of Ray exactly what you are
>>>>>> doing in your calculation, so that he can determine whether there
>>>>>> is indeed a problem with his I/O strategy.
>>>>>
>>>>> Can you provide the complete command line?
>>>>>
>>>>>
>>>>> The .fastq.gz file format also has readahead code implemented.
>>>>> That may help.
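>>>>>
>>>>> For example (hypothetical, reusing a file name from the job script,
>>>>> and assuming Ray was built with gzip support):
>>>>>
>>>>>   gzip /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R1.fastq
>>>>>
>>>>> and then point -p at the resulting moa_all_R1.fastq.gz instead.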
>>>>>
>>>>>>
>>>>>> I am copying Sebastien in this email.
>>>>>>
>>>>>> Thanks and regards,
>>>>>> Danny
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>


