On 22/11/13 05:37 AM, Hornung, Bastian wrote:
> Sorry, me again :/.
>
> Again, another problem, and again no idea what could cause it.
> Besides that I'm running ray on another server.
>
> I started ray with:
> mpiexec -n 3 Ray -k 49 -p /home/bastian/data/XXX/Raw_data/NG-XXX_1.fastq 
> /home/bastian/data/XXX/Raw_data/NG-XXX_2.fastq -o 
> /home/bastian/data/XXX/NG-XXX_rayk49
> -write-kmers -enable-neighbourhoods -write-seeds -write-read-markers
> -write-extensions -write-contig-paths
>
> As error message I get:
>
> Rank 2: assembler memory usage: 781872 KiB
> Got a seed, peak coverage: 72, adding seed.
> Rank 2 has 547 seeds
> Rank 2 is creating seeds [5258290/5258290] (completed)
> Rank 2: peak number of workers: 179, maximum: 32768
> Rank 2 : VirtualCommunicator (service provided by VirtualCommunicator): 
> 35999329 virtual messages generated 1263205 real messages (3.50897%)
> Rank 2 runtime statistics for seeding algorithm:
> Rank 2 Skipped paths because of dead end for head: 0
> Rank 2 Skipped paths because of dead end for tail: 0
> Rank 2 Skipped paths because of two dead ends: 0
> Rank 2 Skipped paths because of bubble weak component: 0
> Rank 2 Skipped paths because of short length: 5257176
> Rank 2 Skipped paths because of bad ownership: 567
> Rank 2 Skipped paths because of low coverage: 0
> Rank 2 Eligible paths: 547
> Rank 2: assembler memory usage: 781872 KiB
> Rank 0 has 556 seeds to register.
> Rank 1 has 573 seeds to register.
> Rank 2 has 547 seeds to register.
> Rank 1 registered 0/573
> Rank 2 registered 0/547
> Rank 0 registered 0/556
> Rank 1 registered 572/573
> Rank 1 registered its seeds
> Rank 2 registered 546/547
> Rank 2 registered its seeds
> Rank 0 registered 555/556
> Rank 0 registered its seeds
> VirtualProcessor: completed jobs: 0
> Rank 0 : VirtualCommunicator (service provided by VirtualCommunicator): 0 
> virtual messages generated 0 real messages (0%)
> VirtualProcessor: completed jobs: 0
> Rank 1 : VirtualCommunicator (service provided by VirtualCommunicator): 0 
> virtual messages generated 0 real messages (0%)
> VirtualProcessor: completed jobs: 0
> Rank 2 : VirtualCommunicator (service provided by VirtualCommunicator): 0 
> virtual messages generated 0 real messages (0%)
> Rank 2 freed 62914560 bytes from the path memory pool (chunks: 15)
> Rank 0 freed 62914560 bytes from the path memory pool (chunks: 15)
> Rank 1 freed 62914560 bytes from the path memory pool (chunks: 15)
> Fatal error in PMPI_Isend: Invalid rank, error stack:
> PMPI_Isend(148): MPI_Isend(buf=0x7fa36f4e4250, count=8, MPI_BYTE, dest=3, 
> tag=224, MPI_COMM_WORLD, request=0x1082eb0) failed
> PMPI_Isend(95).: Invalid rank has value 3 but must be nonnegative and less 
> than 3
> Fatal error in PMPI_Isend: Invalid rank, error stack:
> PMPI_Isend(148): MPI_Isend(buf=0x7f738c14bd10, count=8, MPI_BYTE, dest=3, 
> tag=224, MPI_COMM_WORLD, request=0x2301ec8) failed
> PMPI_Isend(95).: Invalid rank has value 3 but must be nonnegative and less 
> than 3
> Fatal error in PMPI_Isend: Invalid rank, error stack:
> PMPI_Isend(148): MPI_Isend(buf=0x7f0534dc9350, count=8, MPI_BYTE, dest=3, 
> tag=224, MPI_COMM_WORLD, request=0x10d1c70) failed
> PMPI_Isend(95).: Invalid rank has value 3 but must be nonnegative and less 
> than 3
>
>
> Only difference to my previous runs: As from the email before, this time I 
> correctly wrote "-write-kmers".
> I just ran the assembly again, this time without this option (and with 9 
> cores instead of 3),

In this part of the software, one rank is elected to be the arbiter. The code
picks the last rank that is a prime number or 1. There was a bug in that code.

The bug was that with 3 ranks, the arbiter would be rank 3 (but ranks are
numbered 0, 1 and 2). That is why MPI_Isend fails with dest=3 in your log.
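
To illustrate the kind of off-by-one involved, here is a minimal sketch in
Python. It is not the actual Ray code (Ray is written in C++, and the real fix
is in the commit linked in this message); the function names are made up for
this example, and I am assuming the faulty search started at the number of
ranks instead of at the highest valid rank.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, good enough for small rank counts."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def pick_arbiter_buggy(size: int) -> int:
    # Hypothetical reconstruction of the bug: the search starts at `size`,
    # which is one past the highest valid rank (ranks are 0 .. size - 1).
    for candidate in range(size, 0, -1):
        if candidate == 1 or is_prime(candidate):
            return candidate
    return 0


def pick_arbiter_fixed(size: int) -> int:
    # Same rule, but the search starts at the highest valid rank.
    for candidate in range(size - 1, 0, -1):
        if candidate == 1 or is_prime(candidate):
            return candidate
    return 0  # with a single rank, rank 0 is the only choice


for size in (3, 4, 9):
    print(size, pick_arbiter_buggy(size), pick_arbiter_fixed(size))
```

Under that assumption, 3 ranks give arbiter 3 in the faulty version (out of
range) and 2 in the corrected one, while 4 and 9 ranks give a valid arbiter
either way.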

I can confirm that this problem is now solved: 
https://github.com/sebhtml/ray/commit/b5811e82a4b0fddfbe82962e77c668e69391ded9

If you try with 4 ranks, it will work. 9 cores also works because 9 is not a
prime number.



> and it finished.
> I doubt it's an issue with 3 cores (...well...you never know...as you've seen 
> with 2 cores), so I guess there's something wrong with that option.

This is now fixed. Meanwhile, you need to use a number of ranks that is not a
prime and not 1 (and not 2, because of the other problem you reported, which
is also solved in the git repository).
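
If it helps until you update, here is a small Python sketch of that rule for
choosing the value to pass to mpiexec -n. The helper names are made up for
this example and are not part of Ray.

```python
from typing import Optional


def is_prime(n: int) -> bool:
    """Trial-division primality test, good enough for small rank counts."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def is_safe_rank_count(n: int) -> bool:
    # Workaround rule from this message: not 1, not 2, and not a prime.
    return n > 2 and not is_prime(n)


def largest_safe_rank_count(max_ranks: int) -> Optional[int]:
    # Largest rank count <= max_ranks that satisfies the workaround rule,
    # or None if there is none (that is, max_ranks < 4).
    for n in range(max_ranks, 3, -1):
        if is_safe_rank_count(n):
            return n
    return None


for cores in (3, 4, 7, 9):
    print(cores, largest_safe_rank_count(cores))
```

For example, with 7 cores available this suggests 6 ranks, and with only 3
cores there is no safe value, so updating from git is the only option.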

>
> Best regards,
>

Thanks !

> Bastian
>
>
>
> ------------------------------------------------------------------------------
> Shape the Mobile Experience: Free Subscription
> Software experts and developers: Be at the forefront of tech innovation.
> Intel(R) Software Adrenaline delivers strategic insight and game-changing
> conversations that shape the rapidly evolving mobile landscape. Sign up now.
> http://pubads.g.doubleclick.net/gampad/clk?id=63431311&iu=/4140/ostg.clktrk
> _______________________________________________
> Denovoassembler-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>

