> On 24 May 2017, at 12:06, Jed Brown <[email protected]> wrote:
> 
> Okay, do you have more parameters than observations?  

No (not necessarily). The biggest matrix is 50M observations and 12M parameters.

> And each segment
> of the matrix will be fully distributed?

Yes.

>  Do you have a parallel file
> system?

Yes.

>  Is your matrix sparse or dense?

Yes.

> 
> Michał Dereziński <[email protected]> writes:
> 
>> It is an optimization problem minimizing a convex objective for a binary 
>> classification task, which I’m solving using a Tao solver.
>> The multiplication operations are performing gradient computation for each 
>> step of the optimization.
>> So I’m performing both a MatMult and a MatMultTranspose; in both cases the 
>> vector may be dense.
>> 
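A minimal sketch of the gradient computation described above: one product with the matrix, then one with its transpose. The thread only says the objective is convex and the task is binary classification, so the logistic loss here is an assumption, and the helper names (`mat_mult`, `mat_mult_transpose`, `logistic_gradient`) are hypothetical pure-Python stand-ins for illustration, not PETSc calls.

```python
import math


def mat_mult(rows, x):
    """y = X x, with X stored as a list of sparse rows {column: value}."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]


def mat_mult_transpose(rows, r, n_cols):
    """g = X^T r for the same row-wise sparse storage."""
    g = [0.0] * n_cols
    for row, ri in zip(rows, r):
        for j, v in row.items():
            g[j] += v * ri
    return g


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logistic_gradient(rows, labels, w):
    """Gradient of sum_i log(1 + exp(-y_i * <x_i, w>)), labels y_i in {-1, +1}.

    Assumption: the logistic loss is only one possible convex objective.
    """
    margins = mat_mult(rows, w)                       # the "MatMult" step
    residual = [-y * sigmoid(-y * m) for y, m in zip(labels, margins)]
    return mat_mult_transpose(rows, residual, len(w))  # the "MatMultTranspose" step
```

The vector `w` is dense even though the rows are sparse, which matches the case mentioned above where the vector being multiplied may be dense.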
>> The crucial part of the implementation is that at the beginning I am not 
>> running on the entire dataset (rows of the full matrix).
>> As a consequence I don’t need to have the entire matrix loaded right away. 
>> In fact, in some cases I may choose to stop the optimization before the 
>> entire matrix has been loaded (I already verified that this scenario may 
>> come up as a use case). That is why it is important that I don’t load it at 
>> the beginning.
>> 
>> Parallel loading is not a necessary part of the implementation. Initially, I 
>> intend to alternate between loading a portion of the matrix, then doing 
>> computations, then loading more of the matrix, etc. But, given that I 
>> observed large loading times for some datasets, parallel loading may make 
>> sense, if done efficiently.
>> 
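The alternating schedule described above (load a portion, compute, load more, possibly stop early) could be sketched as follows. Everything here is an illustrative assumption: plain gradient descent stands in for the Tao solver, a least-squares gradient stands in for the real objective, and `run_incrementally`, `should_stop`, and `block_source` are hypothetical names. Because the blocks come from a lazy iterator, blocks after an early stop are never read.

```python
def least_squares_gradient(rows, targets, w):
    """Stand-in objective: gradient of 0.5 * ||X w - t||^2 over the loaded rows."""
    residual = [sum(a * b for a, b in zip(row, w)) - t
                for row, t in zip(rows, targets)]
    return [sum(row[j] * r for row, r in zip(rows, residual))
            for j in range(len(w))]


def run_incrementally(block_source, gradient, w, steps_per_block,
                      step_size, should_stop):
    """Alternate between loading one row block and taking a few descent steps.

    block_source yields (rows, targets) pairs lazily, so stopping early
    means the remaining blocks are never loaded.
    """
    loaded_rows, loaded_targets = [], []
    for rows, targets in block_source:
        loaded_rows.extend(rows)          # grow the working set by one block
        loaded_targets.extend(targets)
        for _ in range(steps_per_block):
            g = gradient(loaded_rows, loaded_targets, w)
            if should_stop(w, g, len(loaded_rows)):
                return w, len(loaded_rows)
            w = [wi - step_size * gi for wi, gi in zip(w, g)]
    return w, len(loaded_rows)
```

This version is strictly sequential. Overlapping the load of the next block with computation on the current ones would need a background reader, which this sketch does not attempt.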
>> Thanks,
>> Michal.
>> 
>>> On 24 May 2017, at 11:32, Jed Brown <[email protected]> wrote:
>>> 
>>> Michał Dereziński <[email protected]> writes:
>>> 
>>>> Great! Then I have a follow-up question:
>>>> 
>>>> My goal is to be able to load the full matrix X from disk, while at
>>>> the same time in parallel, performing computations on the submatrices
>>>> that have already been loaded. Essentially, I want to think of X as a
>>>> block matrix (where the blocks are horizontal, spanning the full width
>>>> of the matrix), 
>>> 
>>> What would be the distribution of the vector that this non-square
>>> submatrix (probably with many empty columns) is applied to?
>>> 
>>> Could you back up and explain what problem you're trying to solve?  It
>>> sounds like you're about to code yourself into a dungeon.
