Hi

I have a general question. I want to do real-time aggregation using
Spark. I have Kinesis as the source and am planning to use ES as the data
store. There might be close to 2000 distinct events. I want to keep a
running count of how many times each event occurs.

Currently, upon receiving an event, I look up the backend by the event
code (which is used as the document ID, so the lookup is fast) and add 1 to
the current value.

I am worried because this process is not idempotent. To solve it, I can
keep writing each event and let ES aggregate while querying. But this seems
wasteful. Am I correct in this assumption?
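
A minimal sketch of the concern above, in plain Python with dictionaries standing in for the ES index. The helper names and the use of a per-record sequence number as the document ID are assumptions made for illustration, not something taken from the setup described here. It shows that a redelivered record inflates a read-increment-write counter, while storing each event under a unique ID and counting at query time does not.

```python
from collections import Counter

def apply_increment(store, event_code):
    # Read-modify-write: replaying the same record adds 1 again.
    store[event_code] = store.get(event_code, 0) + 1

def apply_append(events, sequence_number, event_code):
    # Keyed by a unique record ID: a replay overwrites the same entry.
    events[sequence_number] = event_code

def count_events(events):
    # Aggregate at query time over the stored events.
    return Counter(events.values())

# (sequence_number, event_code); the last record is a redelivery of the first.
records = [("seq-1", "login"), ("seq-2", "click"), ("seq-1", "login")]

counter_store = {}  # hypothetical stand-in for one counter document per event code
event_store = {}    # hypothetical stand-in for one document per event

for seq, code in records:
    apply_increment(counter_store, code)
    apply_append(event_store, seq, code)

print(counter_store)                    # "login" is counted twice
print(dict(count_events(event_store)))  # "login" is counted once
```

The second approach trades storage and query cost for safety under replays, which is the trade-off the question is about.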

I know about the update and the new track-by-state functions, but I was
wondering what the general approach to solving this issue is. Any pointers
would be very helpful.
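
One possible direction, sketched in plain Python without a Spark dependency: pre-aggregate each micro-batch, fold the batch counts into a running total, and write the absolute total rather than an increment. The `update_count` function has the same shape as the update function passed to PySpark's `updateStateByKey` (new values for a key, then the previous state); `process_batch` and the `state` dictionary are hypothetical stand-ins for the streaming state and are assumptions made for illustration.

```python
from collections import Counter

def update_count(new_values, running_count):
    # new_values: counts seen for one key in the current batch.
    # running_count: previous total for that key, or None the first time.
    return sum(new_values) + (running_count or 0)

def process_batch(state, batch):
    # Pre-aggregate the batch so each distinct event code is updated once.
    per_batch = Counter(batch)
    for code, n in per_batch.items():
        state[code] = update_count([n], state.get(code))
    return per_batch

state = {}  # hypothetical stand-in for the state kept by the streaming job
process_batch(state, ["login", "click", "login"])
process_batch(state, ["click"])

print(state)
```

With roughly 2000 distinct event codes, this bounds the writes per batch by the number of distinct codes rather than the number of events. Writing the absolute total under the event code as the document ID means repeating the same write leaves the stored value unchanged, although whether the state itself survives a failure correctly depends on how the job is checkpointed.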

Best
Ayan

On Sun, Dec 6, 2015 at 6:17 PM, Nick Pentreath <nick.pentre...@gmail.com>
wrote:

> I've had great success using Elasticsearch with Spark - the integration
> works well (both ways - reading and indexing) and ES + Kibana makes a
> powerful event / time-series storage, aggregation and data visualization
> stack.
>
>
>
>
> On Sun, Dec 6, 2015 at 9:07 AM, manasdebashiskar <poorinsp...@gmail.com>
> wrote:
>
>> Depends on your need.
>> Have you looked at Elasticsearch, Accumulo, or Cassandra?
>> If post-processing your data is not your goal and you just want to
>> retrieve the data later, Greenplum (based on PostgreSQL) can be an
>> alternative.
>>
>> In short, there are many NoSQL databases out there, each with different
>> project maturity and feature sets.
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Experiences-about-NoSQL-databases-with-Spark-tp25462p25594.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>>
>>
>


-- 
Best Regards,
Ayan Guha
