Yizheng, one thing you can do is use Spark Streaming to ingest the incoming data and store it in HDFS, then use Shark to impose a schema on that data. Spark/Shark can easily handle 30 GB/day.
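A minimal sketch of that ingest path, assuming readings arrive as text lines over a socket (the host, port, batch interval, and HDFS path below are placeholders for your setup):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SensorIngest {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("SensorIngest")
    // Batch the stream into 10-second micro-batches.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Receive sensor readings as text, one reading per line.
    val readings = ssc.socketTextStream("collector-host", 9999)

    // Persist each batch to HDFS as text files; Shark can later
    // impose a schema on this directory via an external table.
    readings.saveAsTextFiles("hdfs:///sensor/readings")

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Once the files are in HDFS, a Shark `CREATE EXTERNAL TABLE ... LOCATION 'hdfs:///sensor/readings'` over that directory gives you a queryable schema without moving the data. This is a sketch against a running Spark cluster, not a standalone program.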
For visualization and analysis, you may want to take a look at Adatao pAnalytics for R, which is built on top of Spark/Shark.

Sent while mobile. Please excuse typos etc.

On Nov 21, 2013 3:08 PM, "Yizheng Liao" <[email protected]> wrote:

> Hi, everyone:
>
> I am new to the Spark project. We are working on a wireless sensor
> network, and we would like to know whether Spark/Shark is a good fit for
> time series data storage and processing. Our maximum data input is about
> 30 GB/day. We also hope to visualize the collected data in real time.
>
> In addition, I would like to know whether there is a solution for using R
> with Spark/Shark. We know there is an R library for Hive. Is there a plan
> for the Spark/Shark project to provide an R API/library?
>
> Thanks!
>
> Yizheng
