Okay, do you have more parameters than observations? And each segment of the matrix will be fully distributed? Do you have a parallel file system? Is your matrix sparse or dense?
Michał Dereziński <[email protected]> writes:

> It is an optimization problem minimizing a convex objective for a binary
> classification task, which I’m solving using a Tao solver.
> The multiplication operations are performing gradient computation for each
> step of the optimization.
> So I’m performing both a MatMult and a MatMultTranspose, in both cases the
> vector may be a dense vector.
>
> The crucial part of the implementation is that at the beginning I am not
> running on the entire dataset (rows of the full matrix).
> As a consequence I don’t need to have the entire matrix loaded right away. In
> fact, in some cases I may choose to stop the optimization before the entire
> matrix has been loaded (I already verified that this scenario may come up as
> a use case). That is why it is important that I don’t load it at the
> beginning.
>
> Parallel loading is not a necessary part of the implementation. Initially, I
> intend to alternate between loading a portion of the matrix, then doing
> computations, then loading more of the matrix, etc. But, given that I
> observed large loading times for some datasets, parallel loading may make
> sense, if done efficiently.
>
> Thanks,
> Michal.
>
>> On 24.05.2017, at 11:32, Jed Brown <[email protected]> wrote:
>>
>> Michał Dereziński <[email protected]> writes:
>>
>>> Great! Then I have a follow-up question:
>>>
>>> My goal is to be able to load the full matrix X from disk, while at
>>> the same time in parallel, performing computations on the submatrices
>>> that have already been loaded. Essentially, I want to think of X as a
>>> block matrix (where the blocks are horizontal, spanning the full width
>>> of the matrix),
>>
>> What would be the distribution of the vector that this non-square
>> submatrix (probably with many empty columns) is applied to?
>>
>> Could you back up and explain what problem you're trying to solve? It
>> sounds like you're about to code yourself into a dungeon.
