Re: Alternative to camus

2015-03-19 Thread Koert Kuipers
er. >> It >> > really is fairly easy to set up, and seems to be quite good so far. >> > > >> > > -Thunder >> > > >> > > >> > > -Original Message----- >> > > From: amiori...@gmail.com [mailto:amiori...@gmail.co

Re: Alternative to camus

2015-03-19 Thread sunil kalva
> It > > really is fairly easy to set up, and seems to be quite good so far. > > > > > > -Thunder > > > > > > > > > -Original Message- > > > From: amiori...@gmail.com [mailto:amiori...@gmail.com] On Behalf Of > > Alberto Miorin > > > Se

Re: Alternative to camus

2015-03-19 Thread Koert Kuipers
March 13, 2015 12:15 PM > > To: users@kafka.apache.org > > Cc: otis.gospodne...@gmail.com > > Subject: Re: Alternative to camus > > > > We use spark on mesos. I don't want to partition our cluster because of > one YARN job (camus). > > > > Best > > >

Re: Alternative to camus

2015-03-13 Thread Gwen Shapira
to > Miorin > Sent: Friday, March 13, 2015 12:15 PM > To: users@kafka.apache.org > Cc: otis.gospodne...@gmail.com > Subject: Re: Alternative to camus > > We use spark on mesos. I don't want to partition our cluster because of one > YARN job (camus). > > Best >

RE: Alternative to camus

2015-03-13 Thread Thunder Stumpges
...@gmail.com] On Behalf Of Alberto Miorin Sent: Friday, March 13, 2015 12:15 PM To: users@kafka.apache.org Cc: otis.gospodne...@gmail.com Subject: Re: Alternative to camus We use spark on mesos. I don't want to partition our cluster because of one YARN job (camus). Best Alberto On Fri, M

Re: Alternative to camus

2015-03-13 Thread William Briggs
It seemed really counter-intuitive; I can only imagine that it happened because nobody wanted to refactor the existing KafkaInputDStream to use the SimpleConsumer instead of the High Level Consumer (unless I'm misreading the source - it looks like that's what the new DirectKafkaInputDStream is doin

Re: Alternative to camus

2015-03-13 Thread Gwen Shapira
Also very interested in hearing about them. I prefer war stories in the form of a Jira for the relevant project ;) There's a good chance we can make things less horrible if issues are reported. Gwen On Fri, Mar 13, 2015 at 12:48 PM, Andrew Otto wrote: >> We are currently using spark streaming 1.2.1

Re: Alternative to camus

2015-03-13 Thread Alberto Miorin
1) You save everything twice (in Kafka and in HDFS). 2) You need to enable the checkpoint feature, which means you cannot change the configuration of the job, because the Spark streaming context is deserialized from HDFS every time you restart the job. 3) It is not clear what happens if HDFS is unavailable.

Re: Alternative to camus

2015-03-13 Thread Andrew Otto
> We are currently using spark streaming 1.2.1 with kafka and write-ahead log. > I will only say one thing : "a nightmare". ;-) I’d be really interested in hearing about your experience here. I’m exploring streaming frameworks a bit, and Spark Streaming is just so easy to use and set up. I’d be

Re: Alternative to camus

2015-03-13 Thread Gwen Shapira
I really like the new approach. The WAL in HDFS never made much sense to me (I mean, Kafka is a log. I know they don't want the Kafka dependency, but a log for a log makes no sense). Still experimental, but I think that's the right direction. On Fri, Mar 13, 2015 at 12:38 PM, Alberto Miorin wrote

Re: Alternative to camus

2015-03-13 Thread William Briggs
Thanks for the heads-up, Alberto, that's good to know. We were about to start a few projects working with Spark Streaming + Kafka; sounds like there's still quite a bit of work to be done there. -Will On Fri, Mar 13, 2015 at 3:38 PM, Alberto Miorin wrote: > We are currently using spark streamin

Re: Alternative to camus

2015-03-13 Thread Alberto Miorin
We are currently using Spark Streaming 1.2.1 with Kafka and the write-ahead log. I will only say one thing: "a nightmare". ;-) Let's see if things are better with 1.3.0: http://spark.apache.org/docs/1.3.0/streaming-kafka-integration.html On Fri, Mar 13, 2015 at 8:33 PM, William Briggs wrote: > Sp

Re: Alternative to camus

2015-03-13 Thread William Briggs
Spark Streaming also has built-in support for Kafka, and as of Spark 1.2, it supports using an HDFS write-ahead log to ensure zero data loss while streaming: https://databricks.com/blog/2015/01/15/improved-driver-fault-tolerance-and-zero-data-loss-in-spark-streaming.html -Will On Fri, Mar 13, 201

Re: Alternative to camus

2015-03-13 Thread Alberto Miorin
I'll try this too. It looks very promising. Thx On Fri, Mar 13, 2015 at 8:25 PM, Gwen Shapira wrote: > There's a KafkaRDD that can be used in Spark: > https://github.com/tresata/spark-kafka. It doesn't exactly replace > Camus, but should be useful in building Camus-like system in Spark. > > On

Re: Alternative to camus

2015-03-13 Thread Gwen Shapira
There's a KafkaRDD that can be used in Spark: https://github.com/tresata/spark-kafka. It doesn't exactly replace Camus, but should be useful in building a Camus-like system in Spark. On Fri, Mar 13, 2015 at 12:15 PM, Alberto Miorin wrote: > We use spark on mesos. I don't want to partition our clust

Re: Alternative to camus

2015-03-13 Thread Alberto Miorin
The Flume solution looks very good. Thx. On Fri, Mar 13, 2015 at 8:15 PM, William Briggs wrote: > I would think that this is not a particularly great solution, as you will > end up running into quite a few edge cases, and I can't see this scaling > particularly well - how do you know which server t

Re: Alternative to camus

2015-03-13 Thread William Briggs
I would think that this is not a particularly great solution, as you will end up running into quite a few edge cases, and I can't see this scaling particularly well - how do you know which server to copy logs from in a clustered and replicated environment? What happens when Kafka detects a failure

Re: Alternative to camus

2015-03-13 Thread Alberto Miorin
We use spark on mesos. I don't want to partition our cluster because of one YARN job (camus). Best Alberto On Fri, Mar 13, 2015 at 7:43 PM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Just curious - why - is Camus not suitable/working? > > Thanks, > Otis > -- > Monitoring * Alerting

Re: Alternative to camus

2015-03-13 Thread Otis Gospodnetic
Just curious - why - is Camus not suitable/working? Thanks, Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Fri, Mar 13, 2015 at 2:33 PM, Alberto Miorin wrote: > I was wondering if anybody has already tried t

Alternative to camus

2015-03-13 Thread Alberto Miorin
I was wondering if anybody has already tried to mirror a Kafka topic to HDFS by just copying the log files from the topic directory of the broker (like 23244237.log). The file format is very simple: https://twitter.com/amiorin/status/576448691139121152/photo/1 Implementing an InputForma
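
To make the "very simple" file format concrete, here is a minimal sketch of reading such a segment file. It assumes the Kafka 0.8.x on-disk layout (message format with magic byte 0): an 8-byte offset, a 4-byte message size, then a message made of a CRC32, a magic byte, an attributes byte, a length-prefixed key and a length-prefixed value. The names `LogEntry`, `read_segment` and `encode_entry` are hypothetical and exist only for this illustration. The sketch does not unpack compressed message sets, and it is not a Hadoop InputFormat; it also does nothing about the replication and failover concerns raised in the replies.

```python
import io
import struct
import zlib
from typing import BinaryIO, Iterator, NamedTuple, Optional

MIN_MESSAGE_SIZE = 14  # crc(4) + magic(1) + attributes(1) + key len(4) + value len(4)


class LogEntry(NamedTuple):
    """One decoded entry (hypothetical type, for illustration only)."""
    offset: int
    key: Optional[bytes]
    value: Optional[bytes]
    crc_ok: bool


def _read_field(buf: bytes, pos: int):
    """Read a 4-byte big-endian length followed by that many bytes (-1 means null)."""
    (length,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    if length < 0:
        return None, pos
    return buf[pos:pos + length], pos + length


def read_segment(stream: BinaryIO) -> Iterator[LogEntry]:
    """Yield entries from a segment, assuming the Kafka 0.8.x (magic 0) layout."""
    while True:
        header = stream.read(12)
        if len(header) < 12:
            return  # end of file, or a partially written header
        offset, size = struct.unpack(">qi", header)
        if size < MIN_MESSAGE_SIZE:
            return  # padding or corrupt data; stop rather than guess
        message = stream.read(size)
        if len(message) < size:
            return  # partially written message at the tail of the segment
        (stored_crc,) = struct.unpack_from(">I", message, 0)
        # The CRC covers everything after the CRC field itself.
        crc_ok = (zlib.crc32(message[4:]) & 0xFFFFFFFF) == stored_crc
        # message[4] is the magic byte and message[5] the attributes (codec bits);
        # this sketch ignores both and treats the value as an opaque payload.
        key, pos = _read_field(message, 6)
        value, _ = _read_field(message, pos)
        yield LogEntry(offset, key, value, crc_ok)


def encode_entry(offset: int, key: Optional[bytes], value: Optional[bytes]) -> bytes:
    """Build one uncompressed entry in the same assumed layout (for the demo only)."""
    def field(data: Optional[bytes]) -> bytes:
        if data is None:
            return struct.pack(">i", -1)
        return struct.pack(">i", len(data)) + data

    body = struct.pack(">bb", 0, 0) + field(key) + field(value)
    message = struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF) + body
    return struct.pack(">qi", offset, len(message)) + message


# Usage: parse a synthetic in-memory segment instead of a real broker file.
segment = io.BytesIO(encode_entry(23244237, None, b"hello"))
for entry in read_segment(segment):
    print(entry.offset, entry.key, entry.value, entry.crc_ok)
```

Reading the bytes is the easy part. Knowing which broker holds an up-to-date copy of each partition, and which segment is still being written, is where a file-copy approach would need the most care.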