Is the code public for me to test?
--Junchao Zhang

On Mon, Jul 8, 2019 at 3:06 PM Fande Kong wrote:

Thanks Junchao,

I tried your code. I did not hit the seg fault this time, but the assembly was still slow.

    time mpirun -n 2 ./matrix_sparsity-opt -matstash_legacy
    Close matrix for np = 2 ...
    Matrix successfully closed

    real 0m2.009s
    user 0m3.324s
    sys 0m0.575s

    time mpirun -n 2 ./matrix_sparsity-opt
    Close matrix for np = 2 ...
    Matrix successfully closed

    real 3m39.235s
    user 6m42.184s
    sys 0m35.084s

Fande,

On Mon, Jul 8, 2019 at 8:47 AM Fande Kong wrote:

Will let you know soon.

Thanks,

Fande,

On Mon, Jul 8, 2019 at 8:41 AM Zhang, Junchao wrote:

Fande or John,
Could either of you give it a try? Thanks
--Junchao Zhang

---------- Forwarded message ---------
From: Junchao Zhang
Date: Thu, Jul 4, 2019 at 8:21 AM
Subject: Re: [petsc-dev] Slowness of PetscSortIntWithArrayPair in MatAssembly
To: Fande Kong

Fande,
I wrote tests but could not reproduce the error. I pushed a commit that changed the MEDIAN macro to a function to make it easier to debug. Could you run and debug it again? It should be easy to see what is wrong in gdb.
Thanks.
--Junchao Zhang

On Wed, Jul 3, 2019 at 6:48 PM Fande Kong wrote:

    Process 3915 resuming
    Process 3915 stopped
    * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x7ffee9b91fc8)
        frame #0: 0x000000010cbaa031 libpetsc.3.011.dylib`PetscSortIntWithArrayPair_Private(L=0x0000000119fc5480, J=0x000000011bfaa480, K=0x000000011ff74480, right=13291) at sorti.c:298
       295     }
       296     PetscFunctionReturn(0);
       297   }
    -> 298   i = MEDIAN(L,right);
       299   SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
       300   vl = L[0];
       301   last = 0;
    (lldb)

On Wed, Jul 3, 2019 at 4:32 PM Zhang, Junchao wrote:

Could you debug it or paste the stack trace? Since it is a segfault, it should be easy.
--Junchao Zhang

On Wed, Jul 3, 2019 at 5:16 PM Fande Kong wrote:

Thanks Junchao,

But there is still a segmentation fault. I guess you could write some continuous integers to test your changes.

Fande

On Wed, Jul 3, 2019 at 12:57 PM Zhang, Junchao wrote:

Fande and John,
Could you try jczhang/feature-better-quicksort-pivot? It passed the Jenkins tests and I cannot imagine why it failed on yours.
A hash table has its own cost. We'd better get quicksort right and see how it performs before rewriting code.
--Junchao Zhang

On Tue, Jul 2, 2019 at 2:37 PM Fande Kong wrote:

    YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault: 11 (signal 11)

Segmentation fault :-)

As Jed said, it might be a good idea to rewrite the code using a hash table.

Fande,

On Tue, Jul 2, 2019 at 1:27 PM Zhang, Junchao wrote:

Try this to see if it helps:

    diff --git a/src/sys/utils/sorti.c b/src/sys/utils/sorti.c
    index 1b07205a..90779891 100644
    --- a/src/sys/utils/sorti.c
    +++ b/src/sys/utils/sorti.c
    @@ -294,7 +294,8 @@ static PetscErrorCode PetscSortIntWithArrayPair_Private(PetscInt *L,PetscInt *J,
         }
         PetscFunctionReturn(0);
       }
    -  SWAP3(L[0],L[right/2],J[0],J[right/2],K[0],K[right/2],tmp);
    +  i = MEDIAN(L,right);
    +  SWAP3(L[0],L[i],J[0],J[i],K[0],K[i],tmp);
       vl = L[0];
       last = 0;
       for (i=1; i<=right; i++) {

On Tue, Jul 2, 2019 at 12:14 PM Fande Kong via petsc-dev wrote:

BTW, PetscSortIntWithArrayPair is used in MatStashSortCompress_Private. Is there any way to avoid using PetscSortIntWithArrayPair in MatStashSortCompress_Private?

Fande,

On Tue, Jul 2, 2019 at 11:09 AM Fande Kong wrote:

Hi Developers,

John just noticed that the matrix assembly was slow when there was a sufficient number of off-diagonal entries. It was not an MPI issue, since I was able to reproduce it using two cores on my desktop, that is, "mpirun -n 2". I turned on profiling, and 99.99% of the time was spent in PetscSortIntWithArrayPair (calling itself recursively). It took THREE MINUTES to get the assembly done. I then used the option "-matstash_legacy" to restore the old assembly routine, and the same code took ONE SECOND to get the matrix assembly done.

Should we write a better sorting algorithm?

Fande,
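
For readers following the pivot discussion, here is a minimal standalone sketch of the idea behind the patch in this thread: a quicksort that permutes three parallel arrays together and picks its pivot as the median of the first, middle, and last keys. It is not the PETSc implementation. The thread never shows how MEDIAN is defined, so `median3`, `swap3`, and `sort_triples` are hypothetical names, and plain `int` stands in for `PetscInt`. The sketch also recurses only into the smaller partition and loops over the larger one, which keeps the recursion depth logarithmic in the array length. Whether stack depth caused the segfault reported above is not established here.

```c
#include <assert.h>
#include <stddef.h>

/* Swap entries i and j in all three arrays so the triples stay aligned. */
static void swap3(int *key, int *a, int *b, ptrdiff_t i, ptrdiff_t j)
{
  int t;
  t = key[i]; key[i] = key[j]; key[j] = t;
  t = a[i];   a[i]   = a[j];   a[j]   = t;
  t = b[i];   b[i]   = b[j];   b[j]   = t;
}

/* Hypothetical helper: index of the median of key[lo], key[mid], key[hi]. */
static ptrdiff_t median3(const int *key, ptrdiff_t lo, ptrdiff_t hi)
{
  ptrdiff_t mid = lo + (hi - lo) / 2;
  int x = key[lo], y = key[mid], z = key[hi];
  if (x < y) {
    if (y < z) return mid;        /* x < y < z */
    return (x < z) ? hi : lo;     /* y is the largest: median is max(x, z) */
  }
  if (x < z) return lo;           /* y <= x < z */
  return (y < z) ? hi : mid;      /* x is the largest: median is max(y, z) */
}

/* Sort key[0..n-1] ascending and apply the same permutation to a[] and b[]. */
void sort_triples(int *key, int *a, int *b, ptrdiff_t n)
{
  ptrdiff_t lo = 0, hi = n - 1;
  assert(n >= 0);
  while (lo < hi) {
    ptrdiff_t i, last = lo;
    int pivot;
    /* Move the median-of-three pivot to the front of the current range. */
    swap3(key, a, b, lo, median3(key, lo, hi));
    pivot = key[lo];
    /* Partition: keys smaller than the pivot are gathered in lo+1..last. */
    for (i = lo + 1; i <= hi; i++) {
      if (key[i] < pivot) swap3(key, a, b, ++last, i);
    }
    swap3(key, a, b, lo, last);   /* the pivot is now in its final position */
    /* Recurse into the smaller side, keep looping over the larger side. */
    if (last - lo < hi - last) {
      sort_triples(key + lo, a + lo, b + lo, last - lo);
      lo = last + 1;
    } else {
      sort_triples(key + last + 1, a + last + 1, b + last + 1, hi - last);
      hi = last - 1;
    }
  }
}
```

Calling `sort_triples(rows, cols, vals, n)` on three equally sized arrays leaves `rows` in ascending order with `cols` and `vals` permuted the same way. One caveat: with this partition scheme, long runs of equal keys still produce very unbalanced splits and quadratic running time, regardless of how the pivot is chosen. A three-way partition would be needed to handle that case. Whether duplicates play a role in the slowdown reported in this thread is an open question.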
