On Wed, May 24, 2017 at 1:13 PM, Michał Dereziński <[email protected]> wrote:
> On 24.05.2017, at 10:44, Matthew Knepley <[email protected]> wrote:
>
>> On Wed, May 24, 2017 at 12:37 PM, Michał Dereziński
>> <[email protected]> wrote:
>>
>>> Great! Then I have a follow-up question:
>>>
>>> My goal is to be able to load the full matrix X from disk, while at
>>> the same time, in parallel, performing computations on the submatrices
>>> that have already been loaded. Essentially, I want to think of X as a
>>> block matrix (where the blocks are horizontal, spanning the full width
>>> of the matrix), where I'm loading one block at a time, and all the
>>> blocks that have already been loaded are combined using MatCreateNest,
>>> so that I can perform computations on that portion of the matrix.
>>
>> I need to understand better. So
>>
>> 1) You want to load a sparse matrix from disk
>
> Yes, the matrix is sparse, stored on disk in row-wise chunks (one per
> process), with a total size of around 3TB.
>
>> 2) You are imagining that it is loaded row-wise, since you can do a
>> calculation with some rows before others are loaded.
>>
>> What calculation, a MatMult?
>> How long does that MatMult take compared to loading?
>
> Yes, a MatMult.
> I already have a more straightforward implementation where the matrix
> is loaded completely at the beginning, and then all of the
> multiplications are performed.
> Based on the loading time and computation time with the current
> implementation, it appears that most of the computation time could be
> subsumed into the loading time.
>
>> 3) If you are talking about a dense matrix, you should be loading in
>> parallel using MPI-I/O. We do this for Vec.
>>
>> Before you do complicated programming, I would assure myself that the
>> performance gain is worth it.
>>
>>> In this scenario, every process needs to be simultaneously loading
>>> the next block of X and performing computations on the previously
>>> loaded portion. My strategy is for each MPI process to spawn a thread
>>> for data loading (so that the memory between the process and the
>>> thread is shared), while the process does computations. My concern is
>>> that the data-loading thread may be using up computational resources
>>> of the processor, even though it is mainly doing I/O. Will this be an
>>> issue? What is the best way to minimize the CPU time of this parallel
>>> data-loading scheme?
>>
>> Oh, you want to load each block in parallel, but there are many
>> blocks. I would really caution you against using threads. They are
>> death to clean code. Use non-blocking reads.
>
> I see. Could you expand on your suggestion regarding non-blocking
> reads? Are you proposing that each process makes an asynchronous read
> request in between every, say, MatMult operation?

Check this out: http://beige.ucs.indiana.edu/I590/node109.html

PETSc does not do this currently, but it sounds like you are handling
the load.

  Thanks,

     Matt
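To make the non-blocking-read suggestion above concrete, here is a minimal
sketch (not something PETSc provides, and not code from this thread) of how
each rank could overlap reading its next chunk with a MatMult on the rows it
has already assembled, using MPI-I/O's nonblocking MPI_File_iread_at; the
reads themselves also go through MPI-I/O, as suggested earlier in the thread
for the parallel load. The single file "X.dat", the chunk size, the per-rank
offsets, and the elided assembly step are illustrative assumptions.

#include <petscmat.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  Mat         Xloaded = NULL;          /* rows of X assembled so far (creation elided) */
  Vec         x = NULL, y = NULL;      /* work vectors (creation elided)               */
  MPI_File    fh;
  MPI_Request req;
  MPI_Status  status;
  double     *buf;
  MPI_Offset  my_offset = 0;           /* this rank's first chunk offset (assumed)     */
  int         chunk_doubles = 1 << 20; /* chunk size in doubles (assumed)              */
  int         nchunks = 8;             /* chunks per rank (assumed)                    */

  PetscInitialize(&argc, &argv, NULL, NULL);
  MPI_File_open(PETSC_COMM_WORLD, "X.dat", MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
  buf = (double *)malloc((size_t)chunk_doubles * sizeof(double));

  for (int i = 0; i < nchunks; i++) {
    /* 1) post the read of the next chunk; this call returns immediately */
    MPI_File_iread_at(fh, my_offset, buf, chunk_doubles, MPI_DOUBLE, &req);

    /* 2) compute on what has already been assembled while the read runs */
    if (Xloaded) MatMult(Xloaded, x, y);

    /* 3) only now block until this chunk has actually arrived */
    MPI_Wait(&req, &status);

    /* 4) turn buf into a new row block, fold it into Xloaded
          (e.g. via MatCreateNest), and advance to the next chunk */
    my_offset += (MPI_Offset)chunk_doubles * sizeof(double);
  }

  MPI_File_close(&fh);
  free(buf);
  PetscFinalize();
  return 0;
}

The essential pattern is only that the read is posted before the MatMult and
waited on afterwards, so the file system and the floating-point work proceed
at the same time.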
>> Thanks,
>>
>>    Matt
>
>>> Thanks,
>>> Michal.
>>>
>>> On 24.05.2017, at 04:55, Matthew Knepley <[email protected]> wrote:
>>>
>>>> On Wed, May 24, 2017 at 1:09 AM, Michal Derezinski
>>>> <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I want to be able to perform matrix operations on several contiguous
>>>>> submatrices of a full matrix, without redundantly allocating memory
>>>>> for the submatrices (in addition to the memory that is already
>>>>> allocated for the full matrix).
>>>>> I tried using MatGetSubMatrix, but this function appears to allocate
>>>>> the additional memory.
>>>>> The other way I found to do this is to create the smallest
>>>>> submatrices I need first, then use MatCreateNest to combine them
>>>>> into bigger ones (including the full matrix).
>>>>> The documentation of MatCreateNest seems to indicate that it does
>>>>> not allocate additional memory for storing the new matrix.
>>>>> Is this the right approach, or is there a better one?
>>>>
>>>> Yes, that is the right approach.
>>>>
>>>>   Thanks,
>>>>
>>>>      Matt
>>>>
>>>>> Thanks,
>>>>> Michal Derezinski.

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

http://www.caam.rice.edu/~mk51/
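On the MatCreateNest point discussed earlier in the thread: the nest stores
references to its sub-matrices rather than copying them, so combining the
row blocks loaded so far costs no extra matrix storage. Below is a minimal
sketch of what that combination step could look like; the helper name
CombineLoadedBlocks and the nloaded-by-1 block layout (each block spanning
the full width of X) are illustrative assumptions, not code from the thread.

#include <petscmat.h>

/* blocks[0..nloaded-1] hold the horizontal slices of X loaded so far,
   each already assembled on its own set of rows (assumed setup). */
PetscErrorCode CombineLoadedBlocks(Mat blocks[], PetscInt nloaded, Mat *Xsofar)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  /* nloaded x 1 grid: the blocks are stacked vertically and each spans the
     full column width of X; passing NULL index sets lets PETSc derive the
     row/column layouts from the blocks themselves */
  ierr = MatCreateNest(PETSC_COMM_WORLD, nloaded, NULL, 1, NULL, blocks, Xsofar);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

MatCreateVecs on the resulting nest then gives vectors with a compatible
layout for a MatMult over everything loaded so far, and the nest can be
rebuilt each time a new block arrives, since only references are held.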
