Thanks for all the replies.

I have considered integrating the java-hll package 
(https://github.com/aggregateknowledge/java-hll), which uses the murmur_23 hash 
function from Google, but I am getting a lot of exceptions when I include it. I 
am wondering whether this hash is compatible with Storm's distributed mechanism 
(I might be naive). 
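
To spell out what I mean by "compatible": as far as I understand, what matters 
is that the hash is deterministic, so that partial sketches built on different 
workers can be merged into one. Below is a minimal sketch of that idea. It is 
only an illustration under assumptions: it uses a plain linear-counting bitmap 
in place of HyperLogLog, it does not use the java-hll API at all, and the class 
and method names (MergeableBitmapCounter, add, merge, estimate, snapshot) are 
made up for this example.

```java
import java.util.BitSet;

// Hypothetical illustration only: a linear-counting bitmap, NOT HyperLogLog
// and NOT the java-hll API.
class MergeableBitmapCounter {
    private final int bits;
    private final BitSet bitmap;

    MergeableBitmapCounter(int bits) {
        if (bits < 1) throw new IllegalArgumentException("bits must be >= 1");
        this.bits = bits;
        this.bitmap = new BitSet(bits);
    }

    // Deterministic hash: String.hashCode() is fixed by the Java spec, then
    // mixed with the 32-bit finalizer step from MurmurHash3. The same input
    // gives the same value on every worker JVM.
    static int hash(String s) {
        int h = s.hashCode();
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Mark the bit selected by the item's hash.
    void add(String item) {
        bitmap.set(Math.floorMod(hash(item), bits));
    }

    // Merging two partial counters is a bitwise OR of their bitmaps.
    void merge(MergeableBitmapCounter other) {
        if (other.bits != bits) throw new IllegalArgumentException("size mismatch");
        bitmap.or(other.bitmap);
    }

    // Linear-counting estimate: -m * ln(fraction of bits still zero).
    double estimate() {
        int zeros = bits - bitmap.cardinality();
        if (zeros == 0) return Double.POSITIVE_INFINITY; // bitmap saturated
        return -bits * Math.log((double) zeros / bits);
    }

    // Copy of the internal bitmap, for comparison or serialization.
    BitSet snapshot() {
        return (BitSet) bitmap.clone();
    }
}
```

If two workers each fill their own counter and the results are merged, the 
merged bitmap is identical to the one a single worker would have built from 
all the items, which is the property I would expect an HLL-based bolt to rely 
on as well.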

Another thing I am considering is TridentReach, which counts the unique people 
exposed to a URL. I am thinking of combining TridentReach with KafkaSpout. My 
question: should I create a fixed-size HashMap that holds each URL and an array 
of its visitors? That would mean the fixed size of the hash map represents the 
size of the sliding window. I wonder if this is correct?
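
To make the question concrete, here is a rough sketch of one way the window 
could be kept. It is only a sketch under assumptions, not something taken from 
Storm or Trident: the window is bounded by a number of time slots rather than 
by the size of the map, exact HashSets stand in for whatever sketch structure 
would be used in practice, and the names (SlidingDistinctCounter, add, advance, 
distinct) are hypothetical. I assume advance() would be driven by a timer, for 
example a tick tuple.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Storm or Trident.
class SlidingDistinctCounter {
    private final int windowSlots;
    // One map per time slot: URL -> visitors seen in that slot.
    // The oldest slot is first, the current slot is last.
    private final Deque<Map<String, Set<String>>> slots = new ArrayDeque<>();

    SlidingDistinctCounter(int windowSlots) {
        if (windowSlots < 1) throw new IllegalArgumentException("need >= 1 slot");
        this.windowSlots = windowSlots;
        slots.addLast(new HashMap<>());
    }

    // Record a visit in the current (newest) slot.
    void add(String url, String visitor) {
        slots.peekLast().computeIfAbsent(url, k -> new HashSet<>()).add(visitor);
    }

    // Slide the window by one slot: open a new slot, drop the oldest if full.
    void advance() {
        slots.addLast(new HashMap<>());
        if (slots.size() > windowSlots) {
            slots.removeFirst();
        }
    }

    // Distinct visitors of a URL across all slots still inside the window.
    int distinct(String url) {
        Set<String> union = new HashSet<>();
        for (Map<String, Set<String>> slot : slots) {
            Set<String> seen = slot.get(url);
            if (seen != null) {
                union.addAll(seen);
            }
        }
        return union.size();
    }

    public static void main(String[] args) {
        SlidingDistinctCounter counter = new SlidingDistinctCounter(2);
        counter.add("/a", "u1");
        counter.add("/a", "u2");
        counter.advance();
        counter.add("/a", "u3");
        System.out.println(counter.distinct("/a"));
    }
}
```

With this layout the number of URLs in each map can grow freely. What is fixed 
is the number of slots, so old visitors fall out of the count once their slot 
leaves the window.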


Thanks,

Alec

On Aug 21, 2014, at 11:18 AM, Nima Movafaghrad <[email protected]> 
wrote:

> Alec,
>  
> You can use something like HyperLogLog or Bloom filters to do unique and/or 
> distinct counting. Just create a bolt that does that.
>  
> Nima
>  
> From: Sa Li [mailto:[email protected]] 
> Sent: Wednesday, August 20, 2014 2:45 PM
> To: [email protected]
> Subject: distinct counting
>  
> Hi, all
>  
> I know Storm does a good job on counting and other aggregation jobs. I wonder 
> if anyone has done distinct counting in Storm, and how you would set the 
> sliding time window?
>  
> thanks
>  
> 
> Alec
