On 22/11/13 05:37 AM, Hornung, Bastian wrote:
> Sorry, me again :/.
>
> Again, another problem, and again no idea what could cause it.
> Besides that I'm running ray on another server.
>
> I started ray with:
> mpiexec -n 3 Ray -k 49 -p /home/bastian/data/XXX/Raw_data/NG-XXX_1.fastq
> /home/bastian/data/XXX/Raw_data/NG-XXX_2.fastq -o
> /home/bastian/data/XXX/NG-XXX_rayk49
> -write-kmers -enable-neighbourhoods -write-seeds -write-read-markers
> -write-extensions -write-contig-paths
>
> As error message I get:
>
> Rank 2: assembler memory usage: 781872 KiB
> Got a seed, peak coverage: 72, adding seed.
> Rank 2 has 547 seeds
> Rank 2 is creating seeds [5258290/5258290] (completed)
> Rank 2: peak number of workers: 179, maximum: 32768
> Rank 2 : VirtualCommunicator (service provided by VirtualCommunicator):
> 35999329 virtual messages generated 1263205 real messages (3.50897%)
> Rank 2 runtime statistics for seeding algorithm:
> Rank 2 Skipped paths because of dead end for head: 0
> Rank 2 Skipped paths because of dead end for tail: 0
> Rank 2 Skipped paths because of two dead ends: 0
> Rank 2 Skipped paths because of bubble weak component: 0
> Rank 2 Skipped paths because of short length: 5257176
> Rank 2 Skipped paths because of bad ownership: 567
> Rank 2 Skipped paths because of low coverage: 0
> Rank 2 Eligible paths: 547
> Rank 2: assembler memory usage: 781872 KiB
> Rank 0 has 556 seeds to register.
> Rank 1 has 573 seeds to register.
> Rank 2 has 547 seeds to register.
> Rank 1 registered 0/573
> Rank 2 registered 0/547
> Rank 0 registered 0/556
> Rank 1 registered 572/573
> Rank 1 registered its seeds
> Rank 2 registered 546/547
> Rank 2 registered its seeds
> Rank 0 registered 555/556
> Rank 0 registered its seeds
> VirtualProcessor: completed jobs: 0
> Rank 0 : VirtualCommunicator (service provided by VirtualCommunicator): 0
> virtual messages generated 0 real messages (0%)
> VirtualProcessor: completed jobs: 0
> Rank 1 : VirtualCommunicator (service provided by VirtualCommunicator): 0
> virtual messages generated 0 real messages (0%)
> VirtualProcessor: completed jobs: 0
> Rank 2 : VirtualCommunicator (service provided by VirtualCommunicator): 0
> virtual messages generated 0 real messages (0%)
> Rank 2 freed 62914560 bytes from the path memory pool (chunks: 15)
> Rank 0 freed 62914560 bytes from the path memory pool (chunks: 15)
> Rank 1 freed 62914560 bytes from the path memory pool (chunks: 15)
> Fatal error in PMPI_Isend: Invalid rank, error stack:
> PMPI_Isend(148): MPI_Isend(buf=0x7fa36f4e4250, count=8, MPI_BYTE, dest=3,
> tag=224, MPI_COMM_WORLD, request=0x1082eb0) failed
> PMPI_Isend(95).: Invalid rank has value 3 but must be nonnegative and less
> than 3
> Fatal error in PMPI_Isend: Invalid rank, error stack:
> PMPI_Isend(148): MPI_Isend(buf=0x7f738c14bd10, count=8, MPI_BYTE, dest=3,
> tag=224, MPI_COMM_WORLD, request=0x2301ec8) failed
> PMPI_Isend(95).: Invalid rank has value 3 but must be nonnegative and less
> than 3
> Fatal error in PMPI_Isend: Invalid rank, error stack:
> PMPI_Isend(148): MPI_Isend(buf=0x7f0534dc9350, count=8, MPI_BYTE, dest=3,
> tag=224, MPI_COMM_WORLD, request=0x10d1c70) failed
> PMPI_Isend(95).: Invalid rank has value 3 but must be nonnegative and less
> than 3
>
>
> Only difference to my previous runs: As from the email before, this time I
> correctly wrote "-write-kmers".
> I just ran the assembly again, this time without this option (and with 9
> cores instead of 3),

In this part of the software, one rank is elected to be the arbiter. The
code picks the last rank that is a prime number or 1. There was a bug in
that code: with 3 ranks, the arbiter was rank 3, but the ranks are
numbered 0, 1 and 2. That is why MPI_Isend fails with dest=3.

I can confirm that this problem is now solved:

https://github.com/sebhtml/ray/commit/b5811e82a4b0fddfbe82962e77c668e69391ded9

If you try with 4 ranks, it will work. 9 cores also work because 9 is not
a prime number.

> and it finished.
> I doubt it's an issue with 3 cores (...well...you never know...as you've seen
> with 2 cores), so I guess there's something wrong with that option.

This is now fixed. Meanwhile, you need to use a number of ranks that is
not a prime and not 1 (and not 2, because of the other problem you
reported, which is also solved in the git repository).

>
> Best regards,

Thanks!

> Bastian
>
> _______________________________________________
> Denovoassembler-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
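To illustrate the arbiter election described in the reply, here is a minimal Python sketch. It is not Ray's actual code (Ray is written in C++), and the function names `is_prime`, `elect_arbiter_buggy`, `elect_arbiter_fixed` and `is_valid_rank` are hypothetical. It assumes the bug amounts to scanning candidates starting from the number of ranks instead of from the highest valid rank; the real fix in the commit above may be implemented differently.

```python
def is_prime(n: int) -> bool:
    """Return True if n is a prime number (trial division)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def is_valid_rank(rank: int, size: int) -> bool:
    """MPI ranks in a communicator of `size` processes are 0 .. size-1."""
    return 0 <= rank < size


def elect_arbiter_buggy(size: int) -> int:
    """Assumed form of the bug: the scan starts at `size`, which is not a rank."""
    for candidate in range(size, 0, -1):
        if candidate == 1 or is_prime(candidate):
            return candidate
    raise ValueError("size must be at least 1")


def elect_arbiter_fixed(size: int) -> int:
    """Hypothetical fix: only consider candidates that are valid ranks."""
    for candidate in range(size - 1, 0, -1):
        if candidate == 1 or is_prime(candidate):
            return candidate
    return 0  # single-rank run: rank 0 is the only choice


if __name__ == "__main__":
    for size in (1, 2, 3, 4, 9):
        buggy = elect_arbiter_buggy(size)
        fixed = elect_arbiter_fixed(size)
        print(size, buggy, is_valid_rank(buggy, size),
              fixed, is_valid_rank(fixed, size))
```

Under these assumptions the buggy election returns an invalid rank exactly when the number of ranks is prime or 1, which matches the advice above: 3 ranks fail with dest=3, while 4 and 9 ranks work.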
