On 23/04/13 04:34 PM, Daniel Gruner wrote:
> One further comment: The standard output file is huge! This is not very
> good, especially if it is not redirected to an actual file on disk.
This was a known problem in v2.1.0 and before. The verbosity of Ray v2.2.0
is much reduced.

> The standard .o and .e files from torque are on the head node of the job,
> and since the nodes are diskless the files really are on ramdisk. I've
> noticed in Alison's directory that the output files are about 8 GB in
> size. This means that more than 1/2 of the memory on the head node would
> have been consumed by this file alone!

That's silly hehe.

> I've also noticed that the output is extremely verbose - is there a way
> to quiet it down?

Yes, use v2.2.0. ;-)

Relevant changes:

$ git log --oneline v2.1.0..v2.2.0 | grep verbos
173da1a reduce verbosity of components
32b48a0 reduced verbosity
5aa8f70 reduced verbosity
773ab3c VerticesExtractor: reduced verbosity
c5e0eda SequencesLoader: reduced verbosity
1df6943 KmerAcademyBuilder: reduced the verbosity for graph construction
5831bc5 JoinerTaskCreator: reduced the default verbosity
7b777f6 SeedExtender: reduce the verbosity of graph traversal

> We see a lot of I/O happening from some ranks, while others are polling
> (i.e. doing nothing), so we really have a problem here.

Maybe it's the standard output. v2.2.0 fixes that.

> Danny
>
> On Tue, Apr 23, 2013 at 04:17:26PM -0400, Daniel Gruner wrote:
>> Hi Sebastien,
>>
>> Yes, it is on the GPC.
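Until the upgrade to v2.2.0, the verbose stdout can at least be kept off the
diskless head node by redirecting it to the scratch filesystem inside the job
script, so torque's .o file stays small. A minimal sketch only; the LOGDIR
path is a hypothetical illustration (not part of the original job), and most
Ray flags are omitted for brevity:

```shell
#!/bin/bash
#PBS -l nodes=128:ppn=8
#PBS -l walltime=10:00:00
cd $PBS_O_WORKDIR

# Send Ray's stdout/stderr to a file on shared scratch instead of letting
# torque accumulate it in the diskless head node's ramdisk.
# LOGDIR is an assumed path; adjust to your own allocation.
LOGDIR=/scratch/a/abaker/acloutie/ray_logs
mkdir -p $LOGDIR
mpiexec -n 1024 Ray -k 31 -o RayOutput_k31 > $LOGDIR/ray.$PBS_JOBID.log 2>&1
```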
>> Here is the full script that was run:
>>
>> #!/bin/bash
>> #PBS -l nodes=128:ppn=8
>> #PBS -l walltime=10:00:00
>> #PBS -N moa_Ray_flash_assembly_31mer_128nodes_disable_recycling
>> cd $PBS_O_WORKDIR
>> module load gcc
>> module unload openmpi/1.4.4-intel-v12.1
>> module load openmpi/1.4.4-gcc-v4.6.1
>> mpiexec -n 1024 Ray -k 31 -o RayOutput_k31_128nodes_disable-recycling \
>>     -route-messages -connection-type debruijn -routing-graph-degree 32 \
>>     -disable-recycling \
>>     -p /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R1.fastq \
>>        /scratch/a/abaker/acloutie/moa_ray_input/moa_all_R2.fastq \
>>     -s /scratch/a/abaker/acloutie/moa_ray_input/moa_all_single.fastq \
>>     -read-write-checkpoints checkpoints_128nodes_disable_recycling
>>
>> Danny
>>
>> On Tue, Apr 23, 2013 at 04:11:33PM -0400, Sébastien Boisvert wrote:
>>> Hello,
>>>
>>> I CC'ed the mail to the community mailing list.
>>>
>>> To subscribe:
>>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>>>
>>> On 23/04/13 04:01 PM, Daniel Gruner wrote:
>>>> Hi Alison,
>>>>
>>>> We've noticed you are running a fairly large Ray assembly job, like
>>>> you did last Saturday. It turns out that this job causes a high load
>>>> on the filesystem, and the effect is to slow down access to files for
>>>> everybody on SciNet.
>>>
>>> On gpc, right?
>>>
>>>> We are not sure why this is, and in fact I'd like to contact the
>>>> developer of Ray, Sebastien Boisvert, about it. Perhaps you can tell
>>>> me some details of your current calculation, using 1024 cores.
>>>>
>>>> There is a new version of Ray out, and it is conceivable that some of
>>>> the problems have been addressed. Would you be able to test it?
>>>> Supposedly it fixes a number of issues.
>>>
>>> The following options can do a lot of input/output operations:
>>>
>>> -write-kmers
>>>
>>> -read-write-checkpoints CheckpointDirectory
>>>
>>>> It may be useful to be able to tell the developer of Ray what exactly
>>>> you are doing in your calculation, so that he may be able to determine
>>>> if there is indeed a problem with his I/O strategy.
>>>
>>> Can you provide the complete command line?
>>>
>>> The .fastq.gz file format has readahead code implemented too. That may
>>> help.
>>>
>>>> I am copying Sebastien in this email.
>>>>
>>>> Thanks and regards,
>>>> Danny
>>
>> --
>> Dr. Daniel Gruner                              dgru...@scinet.utoronto.ca
>> Chief Technical Officer - Software             phone: (416)-978-2775
>> SciNet High Performance Computing Consortium   www.scinethpc.ca
>> Compute/Calcul Canada                          www.computecanada.ca
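Putting the two suggestions above together, one could drop the I/O-heavy
checkpoint option and feed Ray gzip-compressed reads so its .fastq.gz
readahead can help. This is a sketch only: the paths are shortened and the
flag set is trimmed relative to the posted job, so adapt it rather than run
it as-is:

```shell
# Compress the inputs once on scratch (gzip replaces each .fastq with a
# .fastq.gz in place).
gzip moa_all_R1.fastq moa_all_R2.fastq moa_all_single.fastq

# Re-run without -read-write-checkpoints, reading the compressed files;
# Ray's readahead for .fastq.gz should reduce filesystem pressure.
mpiexec -n 1024 Ray -k 31 -o RayOutput_k31_gz \
    -p moa_all_R1.fastq.gz moa_all_R2.fastq.gz \
    -s moa_all_single.fastq.gz
```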