Github user AnthonyTruchet commented on the issue:
https://github.com/apache/spark/pull/16038
No, this does not do the trick, as the result of the aggregation IS dense.
And the zero in (tree)aggregate has the same type as the result. Put
differently, in L-BFGS we aggregate vectors that are each pretty sparse but
whose aggregation is dense, as they have different supports. So taking a dense
vector as the zero of the aggregator makes perfect sense: adding a sparse
contribution to a dense aggregator yields a dense aggregator, and this is what
is desired. We just need to avoid wasting bandwidth by sending this huge zero
to executors.
If you do not want to change or add a public API in core, we might
contribute a treeAggregateWithZeroGenerator to MLlib using a third way to solve
the issue: use an Option[DenseVector] as the aggregate type and unwrap it as a
dense zero by wrapping the seqOp and combOp. We have a draft of that
locally...
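
To make the Option-based idea concrete, here is a rough sketch of what the wrapping could look like. This is not the draft mentioned above, just an illustration under assumptions: plain Scala collections stand in for the RDD, Array[Double] stands in for DenseVector, a sequence of (index, value) pairs stands in for a sparse vector, and every name in it (LazyZeroSketch, wrapSeqOp, wrapCombOp, aggregateWithLazyZero) is hypothetical.

```scala
// Sketch only: local collections simulate partitions; no Spark dependency.
object LazyZeroSketch {
  // A sparse contribution as (index, value) pairs (stand-in for SparseVector).
  type Sparse = Seq[(Int, Double)]
  type Acc = Option[Array[Double]]

  // Wrap seqOp so the dense zero is allocated where the fold runs, on first
  // use, instead of being serialized and shipped as the zero value.
  def wrapSeqOp(dim: Int)(
      seqOp: (Array[Double], Sparse) => Array[Double]): (Acc, Sparse) => Acc =
    (acc, x) => Some(seqOp(acc.getOrElse(new Array[Double](dim)), x))

  // Wrap combOp: None means "this side has seen no data yet".
  def wrapCombOp(
      combOp: (Array[Double], Array[Double]) => Array[Double]): (Acc, Acc) => Acc =
    (a, b) => (a, b) match {
      case (Some(x), Some(y)) => Some(combOp(x, y))
      case (x, None)          => x
      case (None, y)          => y
    }

  // Simulates aggregate over partitions with None as the (tiny) zero value.
  def aggregateWithLazyZero(partitions: Seq[Seq[Sparse]], dim: Int): Array[Double] = {
    // Add a sparse contribution into the dense accumulator, in place.
    val seqOp = wrapSeqOp(dim) { (acc, x) =>
      x.foreach { case (i, v) => acc(i) += v }
      acc
    }
    // Element-wise sum of two dense accumulators, in place on the left one.
    val combOp = wrapCombOp { (a, b) =>
      var i = 0
      while (i < a.length) { a(i) += b(i); i += 1 }
      a
    }
    val none: Acc = None
    partitions
      .map(_.foldLeft(none)(seqOp)) // per-partition fold, as seqOp would run
      .foldLeft(none)(combOp)       // merge partition results, as combOp would
      .getOrElse(new Array[Double](dim)) // empty input still yields a dense zero
  }
}
```

With an actual RDD, the same wrapped functions would presumably be passed to treeAggregate with Option.empty as the zero value, so only a None travels to the executors rather than the full dense vector.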