> On 24.05.2017, at 12:06, Jed Brown <[email protected]> wrote:
>
> Okay, do you have more parameters than observations?

No (not necessarily). The biggest matrix is 50M observations and 12M parameters.

> And each segment
> of the matrix will be fully distributed?

Yes.

> Do you have a parallel file
> system?

Yes.

> Is your matrix sparse or dense?

Yes.

>
> Michał Dereziński <[email protected]> writes:
>
>> It is an optimization problem minimizing a convex objective for a binary
>> classification task, which I’m solving using a Tao solver.
>> The multiplication operations are performing gradient computation for each
>> step of the optimization.
>> So I’m performing both a MatMult and a MatMultTranspose; in both cases the
>> vector may be a dense vector.
>>
>> The crucial part of the implementation is that at the beginning I am not
>> running on the entire dataset (rows of the full matrix).
>> As a consequence I don’t need to have the entire matrix loaded right away.
>> In fact, in some cases I may choose to stop the optimization before the
>> entire matrix has been loaded (I already verified that this scenario may
>> come up as a use case). That is why it is important that I don’t load it at
>> the beginning.
>>
>> Parallel loading is not a necessary part of the implementation. Initially, I
>> intend to alternate between loading a portion of the matrix, then doing
>> computations, then loading more of the matrix, etc. But, given that I
>> observed large loading times for some datasets, parallel loading may make
>> sense, if done efficiently.
>>
>> Thanks,
>> Michal.
>>
>>> On 24.05.2017, at 11:32, Jed Brown <[email protected]> wrote:
>>>
>>> Michał Dereziński <[email protected]> writes:
>>>
>>>> Great! Then I have a follow-up question:
>>>>
>>>> My goal is to be able to load the full matrix X from disk, while at
>>>> the same time in parallel, performing computations on the submatrices
>>>> that have already been loaded.
>>>> Essentially, I want to think of X as a
>>>> block matrix (where the blocks are horizontal, spanning the full width
>>>> of the matrix),
>>>
>>> What would be the distribution of the vector that this non-square
>>> submatrix (probably with many empty columns) is applied to?
>>>
>>> Could you back up and explain what problem you're trying to solve? It
>>> sounds like you're about to code yourself into a dungeon.
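
To make the blockwise gradient computation described above concrete, here is a minimal sketch in plain Python. It is not PETSc code, and it rests on assumptions: the loss is taken to be logistic (the thread only says "convex objective for a binary classification task"), each horizontal block is stored as a list of sparse rows of (column, value) pairs, and all function names are hypothetical. The two helpers stand in for what MatMult and MatMultTranspose would do on one loaded block.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def matmult(block, w):
    """y = X_b w for one row block (stand-in for MatMult on that block)."""
    return [sum(val * w[col] for col, val in row) for row in block]


def matmult_transpose_add(block, r, g):
    """g += X_b^T r (stand-in for MatMultTranspose, accumulated into g)."""
    for row, r_i in zip(block, r):
        for col, val in row:
            g[col] += val * r_i


def logistic_gradient(blocks, labels, w):
    """Gradient of sum_i log(1 + exp(-y_i * <x_i, w>)) over the loaded blocks.

    Assumption: logistic loss with labels in {-1, +1}. Only the blocks
    passed in contribute, so the caller can append blocks as they arrive.
    """
    g = [0.0] * len(w)
    for block, y in zip(blocks, labels):
        margins = matmult(block, w)
        # Derivative of the loss with respect to each margin.
        r = [-y_i * sigmoid(-y_i * m) for y_i, m in zip(y, margins)]
        matmult_transpose_add(block, r, g)
    return g
```

Because every block spans the full width of the matrix, each block's contribution has the same length as w, and the contributions simply add up. That is why the optimization can start on the first blocks and pick up later ones as they finish loading.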
