Re: Use case question
Streaming would be easy to implement: create the stream, apply whatever transformations your use case needs, and finally write the result to your dashboard's backend. What kind of dashboards are you building? For d3.js-based ones, you can open a WebSocket and write the stream output to the socket; for QlikView/Tableau-based ones, you can push the stream to a database.

Thanks
Best Regards

On Mon, Nov 24, 2014 at 4:34 PM, Gordon Benjamin gordon.benjami...@gmail.com wrote:

Hi, we are building an analytics dashboard. Data will be updated every 5 minutes for now and eventually every 1 minute, maybe more frequently. The amount of incoming data is not huge: maybe 30 records per minute per customer, although we could have 500 customers. Is streaming the right approach for this, instead of reading from multiple partitions for the incremental data?
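The three steps in the reply above (create the stream, transform it, write to the dashboard backend) can be sketched in plain Python. This is only an illustration of the micro-batch idea, not Spark Streaming code: the names `micro_batches`, `count_per_customer`, and `run_pipeline` are hypothetical, the per-customer count is an assumed transformation, and batches are cut by record count here, whereas Spark Streaming cuts them by time interval. At the volumes mentioned (about 30 records per minute for up to 500 customers, roughly 15,000 records per minute in total), each batch would stay small.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]  # assumed record shape, e.g. {"customer": "a"}


def micro_batches(records: Iterable[Record], batch_size: int) -> Iterator[List[Record]]:
    """Step 1, 'create the stream': group incoming records into small batches."""
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def count_per_customer(batch: List[Record]) -> Dict[str, int]:
    """Step 2, 'do some transformation': here, count records per customer."""
    return dict(Counter(record["customer"] for record in batch))


def run_pipeline(
    records: Iterable[Record],
    batch_size: int,
    sink: Callable[[Dict[str, int]], None],
) -> None:
    """Step 3, 'write it to your dashboard's backend': call the sink once per batch."""
    for batch in micro_batches(records, batch_size):
        sink(count_per_customer(batch))


# The sink stands in for a WebSocket send or a database insert.
sample = [{"customer": c} for c in ["a", "b", "a", "b", "a"]]
outbox: List[Dict[str, int]] = []
run_pipeline(sample, batch_size=2, sink=outbox.append)
print(outbox)
```

Keeping the sink as a plain callable means the same pipeline could feed a socket for a d3.js front end or a database for QlikView/Tableau without changing the first two steps.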
Re: Use case question
Thanks. Yes, d3 ones. Just to clarify: we could take our current system, which incrementally adds partitions, and overlay an Apache streaming layer to ingest these partitions? Then nightly we could coalesce these partitions, for example? I presume that while we are carrying out a coalesce, the end user would not lose access to the underlying data? Let me know if I'm off the mark here.

In reply to Akhil Das ak...@sigmoidanalytics.com, Monday, November 24, 2014.
Re: Use case question
I'm not quite sure I understood you correctly, but here's the thing: if you use Spark Streaming, your dashboard will most likely be refreshed on every batch, so each batch updates it with the new data. And yes, the end user won't notice anything while you do the coalesce/repartition; once it finishes, your dashboards will simply be refreshed with the new data.

Thanks
Best Regards

In reply to Gordon Benjamin gordon.benjami...@gmail.com, Mon, Nov 24, 2014 at 4:54 PM.
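One way to picture why readers keep access during a coalesce is a plain-Python analogy. In Spark, `coalesce` returns a new RDD rather than modifying the existing one; the sketch below mimics that with immutable tuples and a reference swap. The `PartitionedStore` class and `read_all` helper are hypothetical, and how this maps onto your actual storage layer is an assumption.

```python
from typing import Iterable, List, Tuple

Partition = Tuple[int, ...]


class PartitionedStore:
    """Holds an immutable tuple of partitions; updates swap the reference."""

    def __init__(self) -> None:
        self._partitions: Tuple[Partition, ...] = ()

    def append_partition(self, rows: Iterable[int]) -> None:
        # Incremental load: build a new tuple with one extra partition.
        self._partitions = self._partitions + (tuple(rows),)

    def snapshot(self) -> Tuple[Partition, ...]:
        # Readers take a reference to the current, immutable state.
        return self._partitions

    def coalesce(self) -> None:
        # Build the merged partition first, then swap the reference.
        merged = tuple(row for part in self._partitions for row in part)
        self._partitions = (merged,)


def read_all(snapshot: Tuple[Partition, ...]) -> List[int]:
    """Flatten a snapshot into the rows a dashboard query would see."""
    return [row for part in snapshot for row in part]


store = PartitionedStore()
store.append_partition([1, 2])
store.append_partition([3])

before = store.snapshot()  # a reader that started before the nightly job
store.coalesce()
after = store.snapshot()   # a reader that started after it

print(len(before), read_all(before))
print(len(after), read_all(after))
```

A reader holding the earlier snapshot still sees the same rows spread over two partitions, while later readers see one merged partition with identical content.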
Re: Use case question
Great, thanks.

In reply to Akhil Das ak...@sigmoidanalytics.com, Monday, November 24, 2014.