Yizheng, one thing you can do is to use Spark Streaming to ingest the
incoming data and store it in HDFS. You can use Shark to impose a schema on
this data. Spark/Shark can easily handle 30GB/day.
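The ingest step above could be sketched roughly as follows with the Spark Streaming API of that era. This is only an illustrative sketch, not code from the thread: the socket source, host, port, batch interval, and HDFS path are all placeholder assumptions.

```scala
// Sketch: ingest sensor readings with Spark Streaming and persist each
// micro-batch to HDFS, where Shark can later impose a schema on the files.
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SensorIngest {
  def main(args: Array[String]): Unit = {
    // 10-second micro-batches; master and app name are placeholders.
    val ssc = new StreamingContext("local[2]", "SensorIngest", Seconds(10))

    // Assumed source: sensors push newline-delimited readings to a TCP socket.
    val lines = ssc.socketTextStream("collector-host", 9999)

    // Each batch is written as a directory of part files under the prefix,
    // e.g. hdfs://namenode:8020/sensors/readings-<batch-timestamp>.
    lines.saveAsTextFiles("hdfs://namenode:8020/sensors/readings")

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Once the data is on HDFS, Shark can expose it with an external table (e.g. `CREATE EXTERNAL TABLE ... LOCATION 'hdfs://.../sensors'`), so queries see a schema without moving the raw files.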

For visualization and analysis, you may want to take a look at Adatao
pAnalytics for R, which is built on top of Spark/Shark.

Sent while mobile. Pls excuse typos etc.
On Nov 21, 2013 3:08 PM, "Yizheng Liao" <[email protected]> wrote:

> Hi, everyone:
>
> I am new to the Spark project. We are working on a wireless sensor
> network, and we would like to know whether Spark/Shark is a good fit for
> time-series data storage and processing. Our maximum data input is about
> 30GB/day. We also hope to visualize the collected data in real time.
>
> In addition, I would like to know if there is a solution for using R
> with Spark/Shark. We know there is an R library for Hive. Is there a
> plan for the Spark/Shark project to provide an R API/library?
>
> Thanks!
>
> Yizheng
>
