Github user dbtsai commented on the issue:
    I'm benchmarking LOR with 14M features on an internal company dataset 
(unfortunately, it's not public). 
    Regarding using a sparse data structure for aggregation, I'm not sure how 
much this will improve performance. After computing the gradient sum over all 
the data in one partition, the gradient vector will no longer be very sparse. 
Even if it starts out sparse, after a couple of levels of tree aggregation it 
will be very dense. Also, we compress data in the shuffle phase, so if the 
vectors are sparse, they should take around the same size after compression 
even in a dense representation as they would in a sparse one. We may need to 
investigate this further to understand how much performance we can gain in 
practice by using sparse vectors for aggregating the gradients.
