I have been using a Logstash alternative, Fluentd, to ingest the data into HDFS.
I had to configure Fluentd not to append to existing files, so that Spark Streaming is able to pick up the new logs.

-Liming

On 2 Feb, 2015, at 6:05 am, NORD SC <jan.algermis...@nordsc.com> wrote:

> Hi,
>
> I plan to have logstash send log events (as key-value pairs) to Spark
> Streaming using Spark on Cassandra.
>
> Being completely fresh to Spark, I have a couple of questions:
>
> - Is that a good idea at all, or would it be better to put e.g. Kafka in
>   between to handle traffic peaks?
>   (IOW: how, and how well, would Spark Streaming handle peaks?)
>
> - Is there already a logstash-source implementation for Spark Streaming?
>
> - Assuming there is none yet, and assuming it is a good idea: I'd dive into
>   writing it myself - what would the core advice be to avoid beginner traps?
>
> Jan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> ---------------------------------------------------------------------
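The no-append setting matters because Spark Streaming's file-based source (textFileStream) only notices files when they first appear in the monitored directory; data appended to an already-seen file is not re-read. A minimal sketch of what such a Fluentd output section might look like, assuming the fluent-plugin-webhdfs output plugin is installed (the tag, host, port, and path here are illustrative placeholders, not values from my setup):

```
<match logs.**>
  @type webhdfs
  host namenode.example.com
  port 50070
  # Each buffer flush writes a brand-new file instead of appending,
  # so Spark Streaming sees it as a new file and ingests it.
  append false
  # A unique element in the path keeps flushes from colliding
  # when append is disabled.
  path /logs/access.%Y%m%d.${chunk_id}.log
  <buffer>
    flush_interval 10s
  </buffer>
</match>
```

The trade-off is many small files on HDFS; a longer flush_interval produces fewer, larger files at the cost of higher ingestion latency in Spark.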