You are right, my plan was to store the cardinalities, and maybe a bitmap
string as well, in the database, which would surely save a lot of space.
However, we already have a channel that populates the web events into
postgres for some other analytics use, and it runs more or less in parallel
with the Kafka process listening to the web server. My question is: if I can
do the distinct counting in postgres, which already exists, what would be
the advantage of doing the same thing in storm? Of course, implementing it
would help me learn the storm and kafka stuff. Maybe it would even be faster
because of the parallelism in storm?

thanks

Alec


On Fri, Aug 15, 2014 at 3:58 PM, Sam Goodwin <[email protected]>
wrote:

> I'm not too sure how postgres hll works, but I'm assuming you would
> have to send every tuple to the remote Postgres DB, which is very
> expensive. Whereas if you build your hll data structure in storm, you
> only have to persist the fixed-size serialized version of the hll to the
> database once per transaction. This sort of solution scales much better.
>
>
> On Fri, Aug 15, 2014 at 1:42 PM, Sa Li <[email protected]> wrote:
>
>> postgresql-hll, the PostgreSQL extension that adds HyperLogLog data
>> structures, seems pretty good if we do the counting directly in postgresDB.
>>
>>
>> On Fri, Aug 15, 2014 at 1:38 PM, Sa Li <[email protected]> wrote:
>>
>>> Hi, all
>>>
>>> Continuing this topic, I am a bit confused about whether I should
>>> implement hyperloglog in storm or use the postgresql-hll extension in
>>> postgresDB. If I can effectively count the uniques with postgresql-hll
>>> and write them into a separate distinct count table, why would I
>>> implement that in storm? I know some developers are implementing hll in
>>> storm, and I am just unclear what the advantage is of doing that in
>>> storm rather than in the database with the hll extension.
>>>
>>> thanks
>>>
>>> Alec
>>>
>>>
>>> On Wed, Aug 13, 2014 at 4:37 PM, Sa Li <[email protected]> wrote:
>>>
>>>> Hi, All
>>>>
>>>> I am thinking of implementing HyperLogLog in storm with KafkaSpout,
>>>> and outputting not only the distinct counts but also some kind of
>>>> bitmap string. Has anyone done a similar job? A guide for getting
>>>> started would be highly appreciated.
>>>>
>>>> thanks
>>>>
>>>> Alec
>>>>
>>>
>>>
>>
>
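
The trade-off discussed in the thread above (sending every tuple to the
database versus persisting one fixed-size sketch per batch) can be
illustrated with a minimal HyperLogLog sketch. This is only an assumption
of how such a structure might look: the class and method names below are
hypothetical and are not taken from Storm, Kafka, or postgresql-hll, and
the hash choice (SHA-1) and precision (p=12) are arbitrary.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog (illustrative only, not tuned for production)."""

    def __init__(self, p=12):
        self.p = p                          # number of index bits
        self.m = 1 << p                     # number of registers (4096 for p=12)
        self.registers = bytearray(self.m)  # one byte per register

    def add(self, item):
        # Derive a 64-bit hash; any well-mixed hash function would do here.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64-p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Union of two sketches: register-wise maximum.
        assert self.p == other.p
        for i, r in enumerate(other.registers):
            if r > self.registers[i]:
                self.registers[i] = r

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)      # bias constant for m >= 128
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            est = self.m * math.log(self.m / zeros)
        return round(est)

    def to_bytes(self):
        # Fixed-size payload that would be persisted once per batch.
        return bytes(self.registers)


# Two hypothetical workers see overlapping slices of the event stream.
worker_a, worker_b = HyperLogLog(), HyperLogLog()
for i in range(0, 30000):
    worker_a.add(f"user-{i}")
for i in range(20000, 50000):
    worker_b.add(f"user-{i}")

worker_a.merge(worker_b)
print(len(worker_a.to_bytes()))   # 4096 bytes, however many events were seen
print(worker_a.count())           # approximate number of distinct users
```

Because merging is a register-wise maximum, partial sketches built in
parallel can be combined in any order, and only the merged byte string
needs to reach the database, rather than one row per event.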
