Baunsgaard commented on pull request #931:
URL: https://github.com/apache/systemds/pull/931#issuecomment-658081701


   @mboehm7 
   
   As requested, here is a performance comparison of before and after the 
changes. With this, I will stop committing to this branch so that it can be 
reviewed.
   
   I have disabled two key features that should improve performance once 
re-implemented; I intend to slightly change the way they work.
   
   - Dictionary sharing. I intend to enable sharing across different ColGroup 
types, since we now have a shared representation for dictionaries. I also 
intend to move this step to before the construction of the ColGroups. This will 
allow the CompressedMatrixBlock object to store pointers to all the 
dictionaries, which speeds up value-only computations, and the ColGroups will 
then be oblivious to whether their dictionaries are shared.
   - CoCoding. This is currently disabled because (1) it increases compression 
time and (2) it does not improve the compression ratio on the covtype dataset.
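
A minimal sketch of the planned dictionary-sharing layout described in the first bullet, not the actual SystemDS implementation. The class names `Dict`, `ColGroup`, and `CompressedBlock`, and the rule of pooling dictionaries by exact value equality, are assumptions made for illustration only. It shows dictionaries being pooled before the column groups are constructed, with the block keeping one pointer per distinct dictionary so a value-only operation touches each dictionary once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; these are not the SystemDS classes.
class DictionarySharingSketch {

    /** A dictionary of distinct values, shared by reference. */
    static final class Dict {
        final double[] values;
        Dict(double[] values) { this.values = values; }
    }

    /** A column group that only holds a reference to its dictionary. */
    static final class ColGroup {
        final int[] colIndexes;
        final Dict dict;
        ColGroup(int[] colIndexes, Dict dict) {
            this.colIndexes = colIndexes;
            this.dict = dict;
        }
    }

    /** Holds the column groups plus one pointer per distinct dictionary. */
    static final class CompressedBlock {
        final List<ColGroup> groups = new ArrayList<>();
        final List<Dict> dicts = new ArrayList<>();

        /** Value-only operation: each distinct dictionary is updated once. */
        void scalarMultiply(double s) {
            for (Dict d : dicts)
                for (int i = 0; i < d.values.length; i++)
                    d.values[i] *= s;
        }
    }

    /** Pools dictionaries first, then constructs the column groups. */
    static CompressedBlock build(int[][] groupCols, double[][] groupValues) {
        CompressedBlock block = new CompressedBlock();
        Map<String, Dict> pool = new HashMap<>();
        for (int g = 0; g < groupCols.length; g++) {
            // Assumption: two groups share a dictionary if their values match exactly.
            String key = Arrays.toString(groupValues[g]);
            Dict d = pool.get(key);
            if (d == null) {
                d = new Dict(groupValues[g].clone());
                pool.put(key, d);
                block.dicts.add(d);
            }
            // The group cannot tell whether its dictionary is shared.
            block.groups.add(new ColGroup(groupCols[g], d));
        }
        return block;
    }

    public static void main(String[] args) {
        CompressedBlock block = build(
            new int[][] {{0}, {1}, {2}},
            new double[][] {{0.0, 1.0}, {0.0, 1.0}, {2.0, 3.0}});
        block.scalarMultiply(2.0);
        System.out.println("groups=" + block.groups.size()
            + " distinct dictionaries=" + block.dicts.size());
    }
}
```

Because the sketch mutates the shared dictionaries in place, it only illustrates where the work is saved; a real operation would presumably produce new dictionaries rather than modify existing ones.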
   
   Before (on master branch)
   ```code
   DATA      , RUN                      , TYPE                , TIME ms   , REP
   covtype   , MatrixVector mv          , cla                 ,     1.980 ,   100
   covtype   , MatrixVector vm          , cla                 ,     3.310 ,   100
   covtype   , scalar mult              , cla                 ,     3.900 ,   100
   covtype   , scalar plus              , cla                 ,    13.180 ,   100
   covtype   , unaryAggregate sum       , cla                 ,     1.992 ,   500
   covtype   , unaryAggregate rowsum    , cla                 ,    23.740 ,   500
   covtype   , unaryAggregate colsum    , cla                 ,    24.556 ,   500
   covtype   , unaryAggregate colmax    , cla                 ,     0.122 ,   500
   covtype   , unaryAggregate max       , cla                 ,       nan ,     0
   covtype   , unaryAggregate min       , cla                 ,     0.100 ,   500
   covtype   , unaryAggregate rowmax    , cla                 ,    44.208 ,   500
   ```
   
   After:
   ```code
   DATA      , RUN                      , TYPE                , TIME ms   , REP
   covtype   , MatrixVector mv          , cla                 ,     1.916 ,  1000
   covtype   , MatrixVector mv          , lcla                ,     1.752 ,  1000
   covtype   , MatrixVector vm          , cla                 ,     4.138 ,  1000
   covtype   , MatrixVector vm          , lcla                ,     3.764 ,  1000
   covtype   , scalar mult              , cla                 ,     0.157 ,  1000
   covtype   , scalar mult              , lcla                ,     0.129 ,  1000
   covtype   , scalar plus              , cla                 ,     0.249 ,  1000
   covtype   , scalar plus              , lcla                ,     0.212 ,  1000
   covtype   , unaryAggregate sum       , cla                 ,     0.828 ,   500
   covtype   , unaryAggregate sum       , lcla                ,     2.790 ,   500
   covtype   , unaryAggregate rowsum    , cla                 ,    12.075 ,  3000
   covtype   , unaryAggregate rowsum    , lcla                ,    33.120 ,  3000
   covtype   , unaryAggregate colsum    , cla                 ,     0.834 ,   500
   covtype   , unaryAggregate colsum    , lcla                ,     2.886 ,   500
   covtype   , unaryAggregate colmax    , cla                 ,     0.259 ,  3000
   covtype   , unaryAggregate colmax    , lcla                ,     0.039 ,  3000
   covtype   , unaryAggregate max       , cla                 ,     0.142 ,   500
   covtype   , unaryAggregate max       , lcla                ,     0.064 ,   500
   covtype   , unaryAggregate min       , cla                 ,     0.170 ,   500
   covtype   , unaryAggregate min       , lcla                ,     0.118 ,   500
   covtype   , unaryAggregate rowmax    , cla                 ,    31.253 ,  3000
   covtype   , unaryAggregate rowmax    , lcla                ,    69.297 ,  3000
   ```
   
   Uncompressed Performance:
   
   ```code
   DATA      , RUN                      , TYPE                , TIME ms   , REP
   covtype   , MatrixVector mv          , ula                 ,     6.230 ,  1000
   covtype   , MatrixVector vm          , ula                 ,     8.895 ,  1000
   covtype   , scalar mult              , ula                 ,    34.050 ,   300
   covtype   , scalar plus              , ula                 ,    63.683 ,   300
   covtype   , unaryAggregate sum       , ula                 ,     7.146 ,   500
   covtype   , unaryAggregate rowsum    , ula                 ,    10.895 ,  3000
   covtype   , unaryAggregate colsum    , ula                 ,     8.268 ,   500
   covtype   , unaryAggregate colmax    , ula                 ,     7.886 ,  3000
   covtype   , unaryAggregate max       , ula                 ,     7.116 ,   500
   covtype   , unaryAggregate min       , ula                 ,     7.508 ,   500
   covtype   , unaryAggregate rowmax    , ula                 ,     8.403 ,  3000
   ```
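
To read the three tables side by side, the ratio of two timings gives a rough speedup factor. The sketch below computes it for a few `cla` rows copied from the tables above; the class and method names are illustrative assumptions, and since the repetition counts differ between runs the factors are only indicative.

```java
import java.util.Locale;

// Illustrative helper, not part of the benchmark harness.
class SpeedupSketch {

    /** Speedup factor: values above 1 mean the second timing is faster. */
    static double speedup(double baselineMs, double candidateMs) {
        return baselineMs / candidateMs;
    }

    public static void main(String[] args) {
        // {before cla (master), after cla, ula} in ms, copied from the tables.
        String[] runs = {"scalar mult", "scalar plus", "unaryAggregate colsum", "unaryAggregate rowmax"};
        double[][] ms = {
            { 3.900,  0.157, 34.050},
            {13.180,  0.249, 63.683},
            {24.556,  0.834,  8.268},
            {44.208, 31.253,  8.403},
        };
        for (int i = 0; i < runs.length; i++) {
            System.out.println(String.format(Locale.ROOT,
                "%-24s after vs before: %.2fx, after vs ula: %.2fx",
                runs[i],
                speedup(ms[i][0], ms[i][1]),
                speedup(ms[i][2], ms[i][1])));
        }
    }
}
```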



