Matthias Boehm created SYSTEMML-2031:
----------------------------------------

             Summary: Perftest: Unnecessary compression of incompressible blocks
                 Key: SYSTEMML-2031
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2031
             Project: SystemML
          Issue Type: Bug
            Reporter: Matthias Boehm


By default, we apply compression for data sets that are known to exceed 
aggregate memory if all operations of the given script that touch the 
respective input are supported over compressed matrices.

On the perftest 800GB dense scenario, this leads to a slight slowdown and an 
increase in matrix size because the data is incompressible. Each block is 
represented as follows:
{code}
--col groups sizes (OLE,RLE,DDC1,DDC2,UC): 0,0,0,0,1000
--compression ratio: 0.999475777837746
{code}

We should investigate the set of incompressible columns as well as the final 
representation, and simply return the uncompressed block in such scenarios.
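
The proposed fallback could look roughly like the following sketch. It is 
written in Python for brevity and is not SystemML's actual API: the 
BlockCompressionStats class, its fields, and should_keep_uncompressed are 
hypothetical names, the byte sizes in the usage note are illustrative, and 
the ratio is assumed to be uncompressed size divided by compressed size 
(consistent with a ratio below 1 accompanying the reported size increase).

```python
from dataclasses import dataclass


# Hypothetical stand-in for the per-block statistics printed in the log above.
# None of these names are taken from the SystemML code base.
@dataclass
class BlockCompressionStats:
    num_cols: int            # total number of columns in the block
    uncompressed_cols: int   # columns that ended up in the UC column group
    uncompressed_bytes: int  # in-memory size of the original block
    compressed_bytes: int    # in-memory size after compression

    @property
    def ratio(self) -> float:
        # Assumed definition: uncompressed size / compressed size,
        # so a value <= 1.0 means compression did not save any space.
        return self.uncompressed_bytes / self.compressed_bytes


def should_keep_uncompressed(stats: BlockCompressionStats,
                             min_ratio: float = 1.0) -> bool:
    """Return True if the original uncompressed block should be returned."""
    # Case 1: every column landed in the uncompressed (UC) group.
    all_uc = stats.uncompressed_cols == stats.num_cols
    # Case 2: the final representation is no smaller than the input.
    no_gain = stats.ratio <= min_ratio
    return all_uc or no_gain
```

For a block like the one above (1000 columns, all in the UC group, ratio 
slightly below 1), should_keep_uncompressed returns True, so the caller 
would hand back the original block instead of the compressed one. The 
min_ratio threshold is an assumption; a value above 1.0 would also reject 
blocks whose savings are too small to offset the compression overhead.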



