hongbin ma created KYLIN-943:
--------------------------------

             Summary: Approximate TopN supported by Cube
                 Key: KYLIN-943
                 URL: https://issues.apache.org/jira/browse/KYLIN-943
             Project: Kylin
          Issue Type: New Feature
    Affects Versions: v0.8.1
            Reporter: hongbin ma
            Assignee: hongbin ma


The SpaceSaving (TopN algorithm) code could be copied from 
https://github.com/addthis/stream-lib/blob/master/src/main/java/com/clearspring/analytics/stream/StreamSummary.java
We don't need the whole stream-lib; just one (or two) classes are enough. 
Make sure you give credit to stream-lib in the class comment.
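
For orientation, here is a minimal sketch of the SpaceSaving idea. It is not the stream-lib StreamSummary code: the class name SpaceSavingSketch and its methods are made up for illustration, and it uses a plain HashMap with a linear scan for the minimum instead of StreamSummary's bucket list, so it only shows the counting rule.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative SpaceSaving counter; hypothetical class, not taken from stream-lib. */
public class SpaceSavingSketch {
    private final int capacity;                          // max number of monitored items
    private final Map<String, Long> counts = new HashMap<>();

    public SpaceSavingSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Feed one occurrence of an item into the summary. */
    public void offer(String item) {
        Long current = counts.get(item);
        if (current != null) {
            counts.put(item, current + 1);               // already monitored: increment
        } else if (counts.size() < capacity) {
            counts.put(item, 1L);                        // free slot: start monitoring
        } else {
            // Summary is full: evict the item with the smallest counter and let
            // the newcomer take over that counter plus one (O(capacity) scan here).
            String victim = null;
            long min = Long.MAX_VALUE;
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                if (e.getValue() < min) {
                    min = e.getValue();
                    victim = e.getKey();
                }
            }
            counts.remove(victim);
            counts.put(item, min + 1);
        }
    }

    /** Estimated count of an item, or 0 if it is not currently monitored. */
    public long count(String item) {
        Long c = counts.get(item);
        return c == null ? 0L : c;
    }

    /** Up to k monitored items, ordered by estimated count, highest first. */
    public List<String> topK(int k) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : entries.subList(0, Math.min(k, entries.size()))) {
            result.add(e.getKey());
        }
        return result;
    }
}
```

The estimated counts can overestimate the true counts, which is why the result is an approximate TopN.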
 
In order to run SpaceSaving in parallel, the TopN summaries have to be merged 
using the approach in http://arxiv.org/pdf/1401.0702.pdf.  I searched and found 
no existing implementation, so we have to implement it ourselves.
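
As a starting point for discussion, below is a rough sketch of one way two SpaceSaving summaries could be merged. The rule used here is an assumption and should be checked against the paper before we rely on it: an item present in both summaries gets the sum of its counters, an item missing from a full summary is credited with that summary's minimum counter (0 if the summary is not full), and only the largest `capacity` counters are kept. The class name TopNMerge and the Map-based representation are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative merge of two SpaceSaving summaries; hypothetical helper class. */
public class TopNMerge {

    /** Smallest counter of a full summary, or 0 if the summary still has free slots. */
    static long minCounter(Map<String, Long> summary, int capacity) {
        if (summary.size() < capacity) {
            return 0L;
        }
        return Collections.min(summary.values());
    }

    /** Merge two summaries built with the same capacity into one of that capacity. */
    static Map<String, Long> merge(Map<String, Long> s1, Map<String, Long> s2, int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        long m1 = minCounter(s1, capacity);
        long m2 = minCounter(s2, capacity);

        Set<String> items = new HashSet<>(s1.keySet());
        items.addAll(s2.keySet());

        // Assumed rule: an item missing from one side may have been seen there up
        // to that side's minimum counter, so it is credited with that minimum.
        Map<String, Long> combined = new HashMap<>();
        for (String item : items) {
            long c1 = s1.containsKey(item) ? s1.get(item) : m1;
            long c2 = s2.containsKey(item) ? s2.get(item) : m2;
            combined.put(item, c1 + c2);
        }

        // Keep only the `capacity` largest counters, highest first.
        List<Map.Entry<String, Long>> entries = new ArrayList<>(combined.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries.subList(0, Math.min(capacity, entries.size()))) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

Because the merge takes two summaries and returns a summary of the same capacity, it can be applied pairwise in a reduce step.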
 
Cheers
Yang
 
From: Li, Yang 
Sent: August 7, 2015 12:43
To: DL-eBay-Kylin
Subject: Distributed TopN papers
 
The basic algorithm
[1] https://icmi.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf
 
Its application in distributed systems
[2] http://www.cs.utah.edu/~jeffp/papers/merge-summ-TODS.pdf
[3] http://www.crm.umontreal.ca/pub/Rapports/3300-3399/3322.pdf
 
 
Cheers
Yang



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
