[ 
https://issues.apache.org/jira/browse/PHOENIX-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463382#comment-16463382
 ] 

Ethan Wang commented on PHOENIX-4724:
-------------------------------------

[~vincentpoon]

If I understand correctly, with this feature implemented, when you build an 
index table you will at the same time record some info into this histogram, so 
that at some point in the future you can conveniently get the distribution info 
of the index table. Correct?

So do you store a histogram object for each index table, like a shadow object 
somewhere offline? Also, will there ever be a case where you need to mutate an 
index or remove an index from an existing index table?

Cool idea!

> Efficient Equi-Depth histogram for streaming data
> -------------------------------------------------
>
>                 Key: PHOENIX-4724
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4724
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Vincent Poon
>            Assignee: Vincent Poon
>            Priority: Major
>         Attachments: PHOENIX-4724.v1.patch
>
>
> Equi-Depth histogram from 
> http://web.cs.ucla.edu/~zaniolo/papers/Histogram-EDBT2011-CamReady.pdf, but 
> without the sliding window - we assume a single window over the entire data 
> set.
> Used to generate the bucket boundaries of a histogram where each bucket has 
> the same number of items.
> This is useful, for example, for pre-splitting an index table, by feeding in 
> data from the indexed column.
> Works on streaming data - the histogram is dynamically updated for each new 
> value.
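
To make "each bucket has the same number of items" concrete, here is a minimal
offline sketch in Java. It is not the streaming algorithm from the paper or
from the attached patch: it simply sorts all values and reads off the
boundaries, which is the result a streaming equi-depth histogram approximates
incrementally. The class name EquiDepthBoundaries and the method splitPoints
are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Phoenix.
class EquiDepthBoundaries {
    /**
     * Returns up to numBuckets - 1 split points such that each resulting
     * bucket holds roughly values.length / numBuckets items.
     * Offline approach: sorts a copy of the full data set.
     */
    static List<Long> splitPoints(long[] values, int numBuckets) {
        if (numBuckets < 1) {
            throw new IllegalArgumentException("numBuckets must be >= 1");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        List<Long> points = new ArrayList<>();
        for (int i = 1; i < numBuckets; i++) {
            // Index of the first item that belongs to bucket i in sorted order.
            int idx = (int) ((long) i * sorted.length / numBuckets);
            if (idx < sorted.length) {
                points.add(sorted[idx]);
            }
        }
        return points;
    }

    public static void main(String[] args) {
        // Skewed data: equal-width buckets would put almost everything
        // into the first bucket, while equal-depth buckets stay balanced.
        long[] skewed = {1, 2, 3, 4, 5, 6, 7, 8, 50, 100, 500, 1000};
        System.out.println(splitPoints(skewed, 4)); // prints [4, 7, 100]
    }
}
```

With the sample data, the three split points produce four buckets of three
items each (values below 4, 4 to 6, 7 to 99, and 100 and above). Boundaries
like these are what one would use as split keys when pre-splitting an index
table. Heavily duplicated values can make the buckets uneven in this simple
version.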



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
