Re: Architecture recommendations for a tricky use case

2016-09-29 Thread Ali Akhtar
On Thu, Sep 29, 2016 at 10:40 AM, Deepak Sharma wrote:
> > If you use Spark direct streams, it ensures an end-to-end guarantee for messages.
> >
> > On Thu, Sep 29, 2016 at 9:05 PM, Ali Akhtar wrote:
> >>
> >> My concern with Post

Re: Architecture recommendations for a tricky use case

2016-09-29 Thread Ali Akhtar
> >> ost / duplicated data? Are your writes idempotent?
> >>
> >> Absent any other information about the problem, I'd stay away from cassandra/flume/hdfs/hbase/whatever, and use a spark direct stream feeding postgres.
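A minimal sketch of what "idempotent writes" could look like for a stream feeding a relational store. Everything here is an assumption for illustration, not something stated in the thread: the `events` table, its columns, and the `write_batch` helper are hypothetical, and the standard-library `sqlite3` module stands in for Postgres so the example runs anywhere (the `INSERT ... ON CONFLICT ... DO UPDATE` form shown is also accepted by Postgres 9.5+). The idea is to key each row on its Kafka coordinates so a redelivered batch overwrites rather than duplicates.

```python
import sqlite3


def write_batch(conn, rows):
    """Upsert rows keyed by (topic, partition, offset); replays are harmless."""
    conn.executemany(
        """
        INSERT INTO events (topic, kafka_partition, kafka_offset, payload)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (topic, kafka_partition, kafka_offset)
        DO UPDATE SET payload = excluded.payload
        """,
        rows,
    )
    conn.commit()


# Hypothetical table; sqlite3 is only a stand-in for Postgres here.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        topic TEXT,
        kafka_partition INTEGER,
        kafka_offset INTEGER,
        payload TEXT,
        PRIMARY KEY (topic, kafka_partition, kafka_offset)
    )
    """
)

batch = [("raw", 0, 1, "a"), ("raw", 0, 2, "b")]
write_batch(conn, batch)
write_batch(conn, batch)  # simulated redelivery of the same batch

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

With a key like this, at-least-once delivery from the stream plus the upsert gives the same table contents as exactly-once delivery would.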

Re: Architecture recommendations for a tricky use case

2016-09-29 Thread Ali Akhtar
On Thu, Sep 29, 2016 at 8:24 PM, Ali Akhtar wrote:
>
>> I don't think I need separate speed storage and batch storage. Just taking in raw data from Kafka, standardizing it, and storing it somewhere the web UI can query it seems like it will be enough.
>>

Re: Architecture recommendations for a tricky use case

2016-09-29 Thread Ali Akhtar
> laimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages ar

Re: Architecture recommendations for a tricky use case

2016-09-29 Thread Ali Akhtar
> ny loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>
> On

Re: Architecture recommendations for a tricky use case

2016-09-29 Thread Ali Akhtar
It needs to be able to scale to a very large amount of data, yes.

On Thu, Sep 29, 2016 at 7:00 PM, Deepak Sharma wrote:
> What is the message inflow?
> If it's really high, Spark will definitely be of great use.
>
> Thanks
> Deepak
>
> On Sep 29, 2016 19:24, "

Architecture recommendations for a tricky use case

2016-09-29 Thread Ali Akhtar
I have a somewhat tricky use case, and I'm looking for ideas.

I have 5-6 Kafka producers, reading various APIs, and writing their raw data into Kafka.

I need to:

- Do ETL on the data, and standardize it.
- Store the standardized data somewhere (HBase / Cassandra / Raw HDFS / ElasticSearch / Pos
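To make the "standardize" step above concrete, here is a rough sketch under stated assumptions. The source names (`api_a`, `api_b`), the raw field names, and the target schema are all hypothetical; the thread does not describe the actual APIs. The approach is one per-source field map feeding a single function, so every producer's raw record ends up in the same shape before it is stored.

```python
from datetime import datetime, timezone

# Hypothetical per-source mapping: standardized field -> raw field name.
FIELD_MAPS = {
    "api_a": {"id": "id", "ts": "created_at", "value": "amount"},
    "api_b": {"id": "uuid", "ts": "timestamp", "value": "val"},
}


def standardize(source, raw):
    """Convert one raw record from `source` into the common schema."""
    fields = FIELD_MAPS[source]
    return {
        "source": source,
        "id": str(raw[fields["id"]]),
        # Assumes raw timestamps are Unix epoch seconds.
        "ts": datetime.fromtimestamp(raw[fields["ts"]], tz=timezone.utc).isoformat(),
        "value": float(raw[fields["value"]]),
    }


record = standardize("api_b", {"uuid": 7, "timestamp": 0, "val": "1.5"})
```

Adding a new producer then only means adding one entry to `FIELD_MAPS`, whichever storage backend is chosen.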