Re: hive output to kafka

2015-04-28 Thread Harut Martirosyan
I may be wrong, but try Flume (http://flume.apache.org); I'm just not sure if it has a Hive source. On 28 April 2015 at 15:09, Svante Karlsson svante.karls...@csi.se wrote: What's the best way of exporting contents (Avro encoded) from Hive queries to Kafka? Kind of Camus, the other way

Re: New Producer API - batched sync mode support

2015-04-28 Thread Roshan Naik
@Ewen, no, I did not use compression in my measurements.

Re: New Producer API - batched sync mode support

2015-04-28 Thread Ivan Balashov
I must agree with @Roshan – it's hard to imagine anything more intuitive and easier to use for atomic batching than the old sync batch API. Also, it's fast. Coupled with a separate producer instance per broker:port:topic:partition, it works very well. I would be glad if it finds its way into new
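For reference, a minimal sketch of the old Scala producer's batched sync send that Ivan describes; the broker address, topic name, and message contents are hypothetical:

```scala
import java.util.Properties
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

val props = new Properties()
props.put("metadata.broker.list", "broker1:9092")   // hypothetical broker
props.put("producer.type", "sync")                  // synchronous sends
props.put("serializer.class", "kafka.serializer.StringEncoder")

val producer = new Producer[String, String](new ProducerConfig(props))

// The old API accepts a whole batch in a single blocking call,
// which is the atomic batched-sync behavior discussed above.
val batch = (1 to 100).map(i => new KeyedMessage[String, String]("my-topic", s"msg-$i"))
producer.send(batch: _*)
producer.close()
```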

Re: SimpleConsumer not fetching messages

2015-04-28 Thread Ivan Balashov
Does increasing PartitionFetchInfo.fetchSize help? Speaking of the Kafka API, it looks like throwing an exception would be less confusing if fetchSize is not enough to get at least one message at the requested offset. 2015-04-28 21:12 GMT+03:00 Laran Evans laran.ev...@nominum.com: I’ve got a simple
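For context, a sketch of where the fetch size is set in the SimpleConsumer API; host, topic, and offset are placeholders, and the fetch size must be large enough to hold at least one whole message at the requested offset:

```scala
import kafka.api.FetchRequestBuilder
import kafka.consumer.SimpleConsumer

val consumer = new SimpleConsumer("broker1", 9092, 100000, 64 * 1024, "test-client")
val startOffset = 0L
val request = new FetchRequestBuilder()
  .clientId("test-client")
  .addFetch("my-topic", 0, startOffset, 2 * 1024 * 1024) // last arg is fetchSize in bytes
  .build()
val response = consumer.fetch(request)
// The message set comes back empty (no error) if fetchSize was too
// small to cover one message, which is the confusing behavior noted above.
val messages = response.messageSet("my-topic", 0)
```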

Re: New Producer API - batched sync mode support

2015-04-28 Thread Jay Kreps
Hey guys, The locking argument is correct for very small records (< 50 bytes); batching will help here because for small records locking becomes the big bottleneck. I think these use cases are rare but not unreasonable. Overall I'd emphasize that the new producer is way faster at virtually all

Could you answer the following kafka stackoverflow question?

2015-04-28 Thread Gomathivinayagam Muthuvinayagam
I have just posted the following question on Stack Overflow. Could you answer it? I would like to use the Kafka high-level consumer API, and at the same time I would like to disable auto commit of offsets. I tried to achieve this through the following steps. 1) auto.commit.enable
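A minimal sketch of the setup being described, assuming the 0.8.x high-level consumer; the ZooKeeper address and group name are placeholders:

```scala
import java.util.Properties
import kafka.consumer.{Consumer, ConsumerConfig}

val props = new Properties()
props.put("zookeeper.connect", "localhost:2181")
props.put("group.id", "my-group")
props.put("auto.commit.enable", "false") // turn off automatic offset commits

val connector = Consumer.create(new ConsumerConfig(props))
// ... consume via connector.createMessageStreams(...) ...
connector.commitOffsets() // commit manually once a batch has been processed
```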

Kafka - preventing message loss

2015-04-28 Thread Gomathivinayagam Muthuvinayagam
I am trying to set up a cluster where messages are never lost once they are published. Say I have 3 brokers, I configure the replicas to be 3 as well, and I consider the max failures to be 1; then I can achieve the above requirement. But when I post a message, how do I prevent Kafka from

subscribe please

2015-04-28 Thread Manish Malhotra
subscribe

hive output to kafka

2015-04-28 Thread Svante Karlsson
What's the best way of exporting contents (Avro encoded) from Hive queries to Kafka? Kind of Camus, the other way around. Best regards, Svante

Re: Why fetching meta-data for topic is done three times?

2015-04-28 Thread Madhukar Bharti
Hi Zakee, Thanks for your reply. These are my properties: message.send.max.retries=3, retry.backoff.ms=100, topic.metadata.refresh.interval.ms=600*1000. Regards, Madhukar On Tue, Apr 28, 2015 at 3:26 AM, Zakee kzak...@netzero.net wrote: What values do you have for the below properties? Or are

zookeeper restart fatal error

2015-04-28 Thread Emley, Andrew
Hi, I have had ZooKeeper and Kafka (2.8.0-0.8.1) set up and running nicely for a week or so. I decided to stop ZooKeeper and the Kafka brokers and restart them; since stopping ZooKeeper I can't start it again! It gives me a fatal exception related to one of my test topics, multinode1partition4reptopic!?

Re: New producer: metadata update problem on 2 Node cluster.

2015-04-28 Thread Manikumar Reddy
Hi Ewen, Thanks for the response. I agree with you; in some cases we should use the bootstrap servers. If you have logs at debug level, are you seeing this message in between the connection attempts: "Give up sending metadata request since no node is available"? Yes, this log came for a couple

RE: Kafka - preventing message loss

2015-04-28 Thread Aditya Auradkar
You can use the min.insync.replicas topic-level configuration in this case. It must be used with acks=-1, which is a producer config. http://kafka.apache.org/documentation.html#topic-config Aditya From: Gomathivinayagam Muthuvinayagam
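A sketch of the producer side of this setup, assuming the new (0.8.2+) producer; the topic would additionally carry min.insync.replicas=2 as a topic-level config, and broker addresses and names are placeholders:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092")
props.put("acks", "-1") // wait for all in-sync replicas; pairs with min.insync.replicas on the topic
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)
// Blocking on the future makes a failed write surface as an exception
// instead of being silently dropped.
producer.send(new ProducerRecord[String, String]("my-topic", "key", "value")).get()
producer.close()
```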

Re: Kafka commit offset

2015-04-28 Thread Jiangjie Qin
Yes, if you set the offset storage to Kafka, the high-level consumer will use Kafka for all offset-related operations. Jiangjie (Becket) Qin On 4/27/15, 7:03 PM, Gomathivinayagam Muthuvinayagam sankarm...@gmail.com wrote: I am trying to send an offset commit request in a background thread. I am able
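A sketch of the consumer configuration this refers to, assuming Kafka 0.8.2+ where Kafka-based offset storage is available; connection details are placeholders:

```scala
import java.util.Properties
import kafka.consumer.{Consumer, ConsumerConfig}

val props = new Properties()
props.put("zookeeper.connect", "localhost:2181")
props.put("group.id", "my-group")
props.put("offsets.storage", "kafka")     // store offsets in Kafka rather than ZooKeeper
props.put("dual.commit.enabled", "false") // skip the ZooKeeper double-write once migrated

val connector = Consumer.create(new ConsumerConfig(props))
```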

Re: Unclaimed partitions

2015-04-28 Thread Dave Hamilton
1. We’re using version 0.8.1.1. 2. No failures in the consumer logs. 3. We’re using the ConsumerOffsetChecker to see what partitions are assigned to the consumer group and what their offsets are. 8 of the 12 processes have each been assigned two partitions, and they’re keeping up with the topic. The

Re: New Producer API - batched sync mode support

2015-04-28 Thread Roshan Naik
@Joel, If flush() works for this use case, it may be an acceptable starting point (although not as clean as a native batched sync). I am not yet clear about some aspects of flush's batch semantics and its suitability for this mode of operation. Allow me to explore it with you folks. 1) flush()
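A sketch of the flush()-based batched sync pattern under discussion, assuming flush() behaves as proposed in this thread (block until every buffered record has been sent or has failed); the producer and record types are placeholders:

```scala
import java.util.concurrent.Future
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord, RecordMetadata}

def sendBatchSync(producer: KafkaProducer[String, String],
                  records: Seq[ProducerRecord[String, String]]): Unit = {
  // Queue the whole batch, then force it out and check each result.
  val futures: Seq[Future[RecordMetadata]] = records.map(r => producer.send(r))
  producer.flush()         // returns once every buffered record is sent or has failed
  futures.foreach(_.get()) // surfaces any per-record failure as an exception
}
```

Note this gives "everything sent" semantics rather than the old API's single atomic call, which is exactly the gap the rest of this thread debates.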

Re: Unclaimed partitions

2015-04-28 Thread Dave Hamilton
I’m sorry, I forgot to specify that these processes are in the same consumer group. Thanks, Dave On 4/28/15, 1:15 PM, Aditya Auradkar aaurad...@linkedin.com.INVALID wrote: Hi Dave, The simple consumer doesn't do any state management across consumer instances. So I'm not sure how you are

Re: Topic missing Leader and Isr

2015-04-28 Thread Buntu Dev
Also note that the metadata for the topic is missing. I tried creating a few more topics, and all have the same issue. Using the Kafka console producer on the topic, I see these error messages indicating the missing metadata: WARN Error while fetching metadata [{TopicMetadata for topic my-topic -

RE: Unclaimed partitions

2015-04-28 Thread Aditya Auradkar
A couple of questions: - What version of the consumer API are you using? - Are you seeing any rebalance failures in the consumer logs? - How do you determine that some partitions are unassigned? Just confirming that you have partitions that are not being consumed from, as opposed to consumer

Re: hive output to kafka

2015-04-28 Thread Gwen Shapira
Kind of what you need, but not quite: Sqoop2 is capable of getting data from HDFS to Kafka. AFAIK it doesn't support Hive queries, but feel free to open a JIRA for Sqoop :) Gwen On Tue, Apr 28, 2015 at 4:09 AM, Svante Karlsson svante.karls...@csi.se wrote: What's the best way of exporting

Re: New producer: metadata update problem on 2 Node cluster.

2015-04-28 Thread Ewen Cheslack-Postava
Ok, all of that makes sense. The only way to possibly recover from that state is either for K2 to come back up, allowing the metadata refresh to eventually succeed, or to eventually try some other node in the cluster. Reusing the bootstrap nodes is one possibility. Another would be for the client to
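Until the client can fall back on its own, one mitigation is simply to list every node of the cluster in bootstrap.servers; a minimal sketch, using hypothetical addresses for the two nodes (K1/K2) discussed in this thread:

```scala
import java.util.Properties

val props = new Properties()
// Listing both brokers gives the client a second candidate for its
// initial metadata fetch when one node is down.
props.put("bootstrap.servers", "k1:9092,k2:9092")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
```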

Writing Spark RDDs into Kafka

2015-04-28 Thread Ming Zhao
Hi, I wonder if anyone has a good example of how to write Spark RDDs into Kafka. Specifically, my question is whether there is an advantage to sending a list of messages each time over sending one message at a time. Sample code for sending one message at a time: dStream.foreachRDD(rdd => {
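A common sketch of the pattern, assuming dStream is a DStream[String] and the new producer; broker and topic names are hypothetical. Since the new producer batches internally, per-message send() calls are still grouped into larger requests:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

dStream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // One producer per partition task, not one per message; the producer
    // is created inside foreachPartition because it is not serializable
    // and cannot be shipped with the closure.
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](props)
    records.foreach(msg => producer.send(new ProducerRecord[String, String]("my-topic", msg)))
    producer.close() // flushes outstanding sends before returning
  }
}
```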