Github user AnthonyTruchet commented on the issue:

    https://github.com/apache/spark/pull/16038
  
    No, this does not do the trick, because the result of the aggregation IS 
dense, and the zero in (tree)aggregate has the same type as the result. In 
other words, in L-BFGS we aggregate vectors that are each pretty sparse but 
whose aggregation is dense, because they have different supports. So taking a 
dense vector as the zero of the aggregator makes perfect sense: adding a sparse 
contribution to a dense aggregator yields a dense aggregator, and this is what 
is desired. We just need to avoid wasting bandwidth by sending this huge zero 
to executors.
    
    If you do not want to change or add a public API in core, we might 
contribute a treeAggregateWithZeroGenerator to MLlib, using a third way to 
solve the issue: use an Option[DenseVector] as the aggregate type and unwrap it 
as a dense zero by wrapping the seqOp and combOp. We have a draft of that 
locally...
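
    A minimal sketch of what that Option-based wrapping could look like. This is an illustration only, not the draft mentioned above: `LazyZero`, `liftSeqOp`, `liftCombOp` and `Demo` are hypothetical names, a plain `Array[Double]` stands in for a DenseVector, and the partitions are simulated with local collections so that the snippet runs without a Spark dependency.

```scala
// Hypothetical helper, not part of Spark or MLlib: lifts the aggregation
// operators to Option[U], so the zero that gets shipped is the tiny `None`
// rather than a huge dense vector.
object LazyZero {
  // Wrap seqOp: the dense zero is only materialized where the first
  // element of a partition is folded in.
  def liftSeqOp[U, T](zero: () => U)(seqOp: (U, T) => U): (Option[U], T) => Option[U] =
    (acc, x) => Some(seqOp(acc.getOrElse(zero()), x))

  // Wrap combOp: `None` acts as the identity, so empty partitions
  // never allocate or transfer a dense vector.
  def liftCombOp[U](combOp: (U, U) => U): (Option[U], Option[U]) => Option[U] = {
    case (Some(a), Some(b)) => Some(combOp(a, b))
    case (a, None)          => a
    case (None, b)          => b
  }
}

object Demo {
  // Assumption for the sketch: a sparse contribution is a list of (index, value) pairs.
  type Sparse = Seq[(Int, Double)]
  val dim = 5

  // Add a sparse contribution into a dense accumulator, in place.
  val seqOp: (Array[Double], Sparse) => Array[Double] = (acc, sv) => {
    sv.foreach { case (i, v) => acc(i) += v }
    acc
  }

  // Merge two dense accumulators, in place on the first one.
  val combOp: (Array[Double], Array[Double]) => Array[Double] = (a, b) => {
    var i = 0
    while (i < a.length) { a(i) += b(i); i += 1 }
    a
  }

  val liftedSeq  = LazyZero.liftSeqOp[Array[Double], Sparse](() => new Array[Double](dim))(seqOp)
  val liftedComb = LazyZero.liftCombOp(combOp)

  // Three simulated partitions, one of them empty.
  val partitions: Seq[Seq[Sparse]] = Seq(
    Seq(Seq(0 -> 1.0), Seq(2 -> 2.0)),
    Seq.empty,
    Seq(Seq(4 -> 3.0, 0 -> 0.5))
  )

  // Local stand-in for the aggregation. On an RDD the equivalent call would be
  //   rdd.treeAggregate(Option.empty[Array[Double]])(liftedSeq, liftedComb)
  val result: Array[Double] = partitions
    .map(_.foldLeft(Option.empty[Array[Double]])(liftedSeq))
    .foldLeft(Option.empty[Array[Double]])(liftedComb)
    .getOrElse(new Array[Double](dim)) // all partitions empty: fall back to a dense zero
}
```

    With this wrapping the dense zero is built from the generator on the executor side, and the final `getOrElse` unwraps the result on the driver. The generator closure has to be serializable when used with a real RDD.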

