GitHub user myui opened a pull request:

    https://github.com/apache/incubator-hivemall/pull/125

    approx_distinct_count UDAF using HyperLogLog++

    ## What changes were proposed in this pull request?
    
    This PR introduces `approx_distinct_count` using 
[HyperLogLog++](https://en.wikipedia.org/wiki/HyperLogLog#HLL.2B.2B) as 
implemented in 
[Oracle](https://docs.oracle.com/database/121/SQLRF/functions013.htm#SQLRF56900)
 and 
[BigQuery](https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators?hl=en#approx_count_distinct).
     
    [stream-lib](https://github.com/addthis/stream-lib) provides the underlying HyperLogLog++ implementation.
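
For readers unfamiliar with the technique, the idea behind this kind of approximate distinct count can be sketched as below. This is a simplified, plain HyperLogLog with a linear-counting correction for small cardinalities, not HyperLogLog++ (it has no sparse representation or empirical bias correction), and it is not the stream-lib code used by this PR. The class name `SimpleHyperLogLog`, the precision parameter `p`, and the choice of SHA-1 as the hash are assumptions made for illustration only.

```python
import hashlib
import math


class SimpleHyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical; not the stream-lib API)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; the alpha formula below
        # is the one valid for m >= 128, hence the lower bound of 7.
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Derive a 64-bit hash from the item's string form.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # The top p bits select a register.
        idx = h >> (64 - self.p)
        # The remaining bits determine the rank: the position of the
        # leftmost 1-bit, counting from 1.
        width = 64 - self.p
        rest = h & ((1 << width) - 1)
        rank = width - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Raw estimate: normalized harmonic mean of 2**register values.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting while
        # some registers are still empty.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            return round(self.m * math.log(self.m / zeros))
        return round(estimate)


if __name__ == "__main__":
    hll = SimpleHyperLogLog(p=12)
    for i in range(100000):
        hll.add("user-%d" % (i % 10000))  # 10,000 distinct values
    print(hll.count())
```

With `p=12` the sketch keeps 4,096 small registers regardless of input size, and the expected relative error is roughly 1.04 / sqrt(4096), about 1.6%. That fixed memory footprint is what makes the approach attractive for a UDAF.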
    
    ## What type of PR is it?
    
    Feature
    
    ## What is the Jira issue?
    
    https://issues.apache.org/jira/browse/HIVEMALL-18
    
    ## How was this patch tested?
    
    Manual tests
    
    ## How to use this feature?
    
    As described in [this markdown 
document](https://github.com/myui/incubator-hivemall/blob/e52fda9699c14687d62d5bfcd13459982f09193c/docs/gitbook/misc/approx.md).
    
    ## Checklist
    
    - [x] Did you apply source code formatter, i.e., `mvn formatter:format`, 
for your commit?
    - [x] Did you run system tests on Hive (or Spark)?


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/myui/incubator-hivemall sketch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hivemall/pull/125.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #125
    
----
commit 05a0d714f5199ecdd6159f1e45817b74ccb7f0b8
Author: Makoto Yui <[email protected]>
Date:   2017-11-21T12:03:01Z

    Minor bugfix in fmeasure UDAF

commit 3ae28623c08550cf1e6edc99e23204f61f3df074
Author: Makoto Yui <[email protected]>
Date:   2017-11-21T12:03:31Z

    Added approx_count_distinct using HyperLogLogPlus

commit a75285383bd4190cf7ba156d8968e4872666382b
Author: Makoto Yui <[email protected]>
Date:   2017-11-21T12:40:35Z

    Added TOC

commit e52fda9699c14687d62d5bfcd13459982f09193c
Author: Makoto Yui <[email protected]>
Date:   2017-11-21T12:40:57Z

    Added documentation about Hyperloglog

----

