On 23/04/13 05:38 PM, Marcelino Suzuki wrote:
> Well, seeing this post, I am a bit worried about my jobs, since using
> loadleveler in an idataplex cluster, I don't see the outputs from
> different rank numbers, but they all are labeled Rank 0. In fact with
> two tasks only two of the -output-filenames are written to and both
> gave labels Rank 0. Is this only a difference of the clusters or am I
> missing something?

In your case, I think you should contact the helpdesk of your supercomputer
center, as the problem really seems to be in the job submission process, not
in Ray.

>
> #!/bin/sh
> # @ job_name = crambe_ray
> # @ output = $(job_name).out
> # @ error = $(job_name).err
>
> # @ job_type = mpich
> # @ node = 2
> # @ total_tasks = 16
> # @ restart = yes
>
>
> # @ wall_clock_limit = 60:00:00,59:55:00
> # @ queue
>
> mpirun -np $LOADL_TOTAL_TASKS -output-filename cr -machinefile
> $LOADL_HOSTFILE /work/OOBMECO/bin/ray/Ray -read-write-checkpoints
> checkpoints -k 31 -amos -o crambe.3 -s /scratch/suzukim/mira/
> CCB1a.fastq -search /scratch/suzukim/NCBI-taxonomy/NCBI-Eukaryote-
> Genomes -search /scratch/suzukim/NCBI-taxonomy/NCBI-Finished-Virus-
> Genomes -search /scratch/suzukim/NCBI-taxonomy/NCBI-Draft-Bacteria-
> Genomes -search /scratch/suzukim/NCBI-taxonomy/NCBI-Finished-
> Bacterial-Genomes -with-taxonomy /scratch/suzukim/ggenes-taxonomy/
> Genome-to-Taxon.tsv /scratch/suzukim/ggenes-taxonomy/TreeOfLife-
> Edges.tsv /scratch/suzukim/ggenes-taxonomy/Taxon-Names.tsv -search /
> scratch/suzukim/Ontology/EMBL_CDS_Sequences -gene-ontology /scratch/
> suzukim/Ontology/OntologyTerms.txt /scratch/suzukim/Ontology/
> Annotations.txt -show-memory-usage
>
> Marcelino
>
> On Apr 23, 2013, at 10:59 PM, Sébastien Boisvert wrote:
>
>> Therefore the user should split in more than 3 input file when
>> running on
>> 1024 MPI ranks.
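
As a rough illustration of the splitting suggested above (this is not part of
Ray and was not posted in this thread), here is a minimal Python sketch that
distributes the records of one uncompressed FASTQ file round-robin over
several smaller files. The function name split_fastq, the output naming
scheme, and the part count are all hypothetical, and the sketch assumes plain
4-line FASTQ records. Running it with the same part count on an R1 file and
its R2 file keeps mates at the same position in matching part files.

```python
import os


def split_fastq(path, parts, out_dir):
    """Distribute 4-line FASTQ records from `path` round-robin over `parts` files.

    Hypothetical helper, not a Ray tool. Assumes an uncompressed FASTQ file
    with exactly 4 lines per record (no wrapped sequence lines).
    Returns the list of output paths, in part order.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(path)
    stem = base[:-len(".fastq")] if base.endswith(".fastq") else base
    out_paths = [
        os.path.join(out_dir, f"{stem}.part{i:04d}.fastq") for i in range(parts)
    ]
    # One open handle per part: keep `parts` below the open-file limit.
    outs = [open(p, "w") for p in out_paths]
    try:
        with open(path) as handle:
            record = []
            index = 0  # number of complete records written so far
            for line in handle:
                record.append(line)
                if len(record) == 4:
                    # Record number `index` goes to part `index % parts`.
                    outs[index % parts].writelines(record)
                    index += 1
                    record = []
            if record:
                raise ValueError("truncated FASTQ record at end of file")
    finally:
        for out in outs:
            out.close()
    return out_paths
```

Each resulting part would then presumably be given to Ray with its own -s
option (or -p for a matching R1/R2 pair), as in the command lines quoted in
this thread. The sketch keeps one file handle open per part, so a very large
part count may hit the per-process open-file limit.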
>>
>> On 23/04/13 04:37 PM, Daniel Gruner wrote:
>>> Here is the tail of the stdout:
>>>
>>> Rank 358 has 390000 sequence reads
>>> Rank 358: assembler memory usage: 273852 KiB
>>> Rank 920 has 330000 sequence reads
>>> Rank 920: assembler memory usage: 264648 KiB
>>> Rank 920 has 340000 sequence reads
>>> Rank 920: assembler memory usage: 264648 KiB
>>> Rank 358 has 400000 sequence reads
>>> Rank 358: assembler memory usage: 273852 KiB
>>> Rank 358 has 400690 sequence reads (completed)
>>> Rank 358 is writing checkpoint Sequences
>>> Rank 920 has 350000 sequence reads
>>> Rank 920: assembler memory usage: 281036 KiB
>>> Rank 920 has 360000 sequence reads
>>> Rank 920: assembler memory usage: 281036 KiB
>>> Rank 920 has 370000 sequence reads
>>> Rank 920: assembler memory usage: 281036 KiB
>>> Rank 920 has 380000 sequence reads
>>> Rank 920: assembler memory usage: 281036 KiB
>>> Rank 920 has 390000 sequence reads
>>> Rank 920: assembler memory usage: 281036 KiB
>>> Rank 920 has 400000 sequence reads
>>> Rank 920: assembler memory usage: 281036 KiB
>>> Rank 920 has 400690 sequence reads (completed)
>>> Rank 920 is writing checkpoint Sequences
>>> Rank 961 has 0 sequence reads
>>> Rank 961: assembler memory usage: 265612 KiB
>>>
>>> The head node shows about 100 minutes of cpu usage on each rank.
>>>
>>> Danny
>>>
>>> On Tue, Apr 23, 2013 at 04:31:14PM -0400, Sébastien Boisvert wrote:
>>>> On 23/04/13 04:17 PM, Daniel Gruner wrote:
>>>>> Hi Sebastien,
>>>>>
>>>>> Yes, it is on the GPC.
>>>>> Here is the full script that was run:
>>>>>
>>>>> #!/bin/bash
>>>>> #PBS -l nodes=128:ppn=8
>>>>> #PBS -l walltime=10:00:00
>>>>> #PBS -N moa_Ray_flash_assembly_31mer_128nodes_disable_recycling
>>>>> cd $PBS_O_WORKDIR
>>>>> module load gcc
>>>>> module unload openmpi/1.4.4-intel-v12.1
>>>>> module load openmpi/1.4.4-gcc-v4.6.1
>>>>> mpiexec -n 1024 Ray -k 31 -o RayOutput_k31_128nodes_disable-
>>>>> recycling -route-messages -connection-type debruijn -routing-
>>>>> graph-degree 32 -disable-recycling -p /scratch/a/abaker/acloutie/
>>>>> moa_ray_input/moa_all_R1.fastq /scratch/a/abaker/acloutie/
>>>>> moa_ray_input/moa_all_R2.fastq -s /scratch/a/abaker/acloutie/
>>>>> moa_ray_input/moa_all_single.fastq -read-write-checkpoints
>>>>> checkpoints_128nodes_disable_recycling
>>>>>
>>>>
>>>> The polytope graph is much better than the de Bruijn graph for
>>>> routing messages. But it is not the reason for the high number of
>>>> I/O operations.
>>>>
>>>>
>>>>
>>>> A first possible reason for slow I/O is that the user has only 3
>>>> input files, yet it has 1024 MPI ranks.
>>>>
>>>> Sequencing data is definitely not produced as 3 files upstream.
>>>>
>>>> It's usually better to use more fastq input files.
>>>>
>>>>
>>>>
>>>> Another possible reason would be the checkpoints. Some of the
>>>> checkpointing code group I/O operations. But some other
>>>> parts do not.
>>>>
>>>> There is an opened issue to fix this:
>>>> https://github.com/sebhtml/ray/issues/57
>>>>
>>>>
>>>> So a tail on the standard output would (probably) help to zero in
>>>> on the real issue here.
>>>>
>>>>
>>>>
>>>>> Danny
>>>>>
>>>>>
>>>>> On Tue, Apr 23, 2013 at 04:11:33PM -0400, Sébastien Boisvert wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I CC'ed the mail to the community mailing list.
>>>>>>
>>>>>> To subscribe:
>>>>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>>>>>>
>>>>>> On 23/04/13 04:01 PM, Daniel Gruner wrote:
>>>>>>> Hi Alison,
>>>>>>>
>>>>>>> We've noticed you are running a fairly large Ray assembly job,
>>>>>>> like you
>>>>>>> did last Saturday. It turns out that this job causes a high
>>>>>>> load on the
>>>>>>> filesystem, and the effect is to slow down access to file for
>>>>>>> everybody
>>>>>>> on SciNet.
>>>>>>>
>>>>>>
>>>>>> On gpc, right ?
>>>>>>
>>>>>>> We are not sure why this is, and in fact I'd like to contact
>>>>>>> the developer of
>>>>>>> Ray, Sebastien Boisvert, about it. Perhaps you can tell me
>>>>>>> some details of
>>>>>>> your current calculation, using 1024 cores.
>>>>>>>
>>>>>>> There is a new version of Ray out, and it is conceivable that
>>>>>>> some of the problems
>>>>>>> have been addressed. Would you be able to test it? Supposedly
>>>>>>> it fixes a number
>>>>>>> of issues.
>>>>>>
>>>>>> The following options can do a lot of input/output operations:
>>>>>>
>>>>>> -write-kmers
>>>>>>
>>>>>> -read-write-checkpoints CheckpointDirectory
>>>>>>
>>>>>>>
>>>>>>> It may be useful to be able to tell the developer of Ray what
>>>>>>> exactly you are doing
>>>>>>> in your calculation, so that he may be able to determine if
>>>>>>> there is indeed a
>>>>>>> problem with his I/O strategy.
>>>>>>
>>>>>> Can you provide the complete command line ?
>>>>>>
>>>>>>
>>>>>> The file format .fastq.gz has a readahead code implemented too.
>>>>>> That may help.
>>>>>>
>>>>>>>
>>>>>>> I am copying Sebastien in this email.
>>>>>>>
>>>>>>> Thanks and regards,
>>>>>>> Danny
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>> ------------------------------------------------------------------------------
>> Try New Relic Now & We'll Send You this Cool Shirt
>> New Relic is the only SaaS-based application performance monitoring
>> service
>> that delivers powerful full stack analytics.
>> Optimize and monitor your
>> browser, app, & servers with just a few lines of code. Try New Relic
>> and get this awesome Nerd Life shirt! http://p.sf.net/sfu/newrelic_d2d_apr
>> _______________________________________________
>> Denovoassembler-users mailing list
>> Denovoassembler-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>