mboehm7 commented on pull request #931:
URL: https://github.com/apache/systemml/pull/931#issuecomment-637094426
Related to this sparsity handling, another factor might be the fix of the
memory estimates for the new dictionaries. Previously, we did the following:
```
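// call site: the result of getValuesSize() is passed as the number of values
return ColGroupSizes.estimateInMemorySizeGroupValue(_colIndexes.length, getValuesSize());

// returns a size in bytes, not a count of values
public long getValuesSize() {
	return (_values != null) ? 32 + _values.length * 8 : 0;
}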
```
This mistakenly fed the size in bytes, rather than the number of values, into
the column group estimates, which might have favored the sparse representations
more. Could we run a couple of experiments and compare the compression ratios
and plans across the original framework, the reintroduced framework, and our
current implementation? Then we can narrow down where the differences are and
whether they are good or bad.
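
To make the unit mismatch concrete, here is a minimal sketch. The class
`DictSizeSketch` and the method `estimateDictBytes` are hypothetical stand-ins,
not the actual `ColGroupSizes` code; the sketch only assumes an estimator that
expects a count of values and charges 8 bytes per value plus a fixed overhead,
mirroring the constants in the snippet above.
```java
// Hypothetical illustration only; not the SystemML implementation.
final class DictSizeSketch {

    // Mirrors the quoted getValuesSize(): a size in BYTES
    // (32 bytes assumed overhead plus 8 bytes per double).
    static long valuesSizeInBytes(double[] values) {
        return (values != null) ? 32 + values.length * 8L : 0;
    }

    // What an estimator that takes "number of values" would expect instead.
    static long numValues(double[] values) {
        return (values != null) ? values.length : 0;
    }

    // Assumed estimator: fixed 32-byte overhead plus 8 bytes per value.
    static long estimateDictBytes(long nrValues) {
        return 32 + 8 * nrValues;
    }

    public static void main(String[] args) {
        double[] values = new double[100];
        // Bug pattern: a byte size is passed where a count is expected.
        long wrong = estimateDictBytes(valuesSizeInBytes(values));
        // Intended pattern: pass the number of values.
        long right = estimateDictBytes(numValues(values));
        System.out.println("bytes passed as count: " + wrong);
        System.out.println("count passed as count: " + right);
    }
}
```
Under these assumptions the dictionary part of the estimate is inflated by
roughly a factor of 8, which could plausibly shift the plan selection between
encodings. Whether that explains the observed differences is exactly what the
experiments should tell us.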