Re: integrate Camus and Hive?

2015-03-11 Thread Andrew Otto
E.g. file produced by the Camus job: /user/[hive.user]/output/partition_month_utc=2015-03/partition_day_utc=2015-03-11/partition_minute_bucket=2015-03-11-02-09/ Bhavesh, how do you get Camus to write into a directory hierarchy like this? Is it reading the partition values from your

Re: Does consumer support combination of whitelist and blacklist topic filtering

2015-03-11 Thread Guozhang Wang
Tao, In MM people can pass in consumer configs, in which they can specify the topics to consume, either as a regular topic list or as a whitelist / blacklist. So I think it already does what you need? Guozhang On Tue, Mar 10, 2015 at 10:09 PM, tao xiao xiaotao...@gmail.com wrote: Thank you
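For reference, a minimal sketch of whitelist topic filtering with the old high-level consumer (the same filter style MirrorMaker accepts); the ZooKeeper address, group id, and regex are placeholder assumptions, and kafka.consumer.Blacklist can be swapped in to exclude matching topics instead.

import java.util.List;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.consumer.Whitelist;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class WhitelistConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // assumed address
        props.put("group.id", "filter-demo");             // assumed group id

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // A regex whitelist selects matching topics; Blacklist works the same
        // way but excludes matches instead.
        List<KafkaStream<byte[], byte[]>> streams =
            connector.createMessageStreamsByFilter(new Whitelist("metrics\\..*"), 1);

        // Each stream iterates over MessageAndMetadata records.
        for (MessageAndMetadata<byte[], byte[]> m : streams.get(0)) {
            System.out.println(m.topic() + " @ " + m.offset());
        }
    }
}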

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread Guozhang Wang
Hi James, What I meant before is that a single fetcher may be responsible for putting fetched data into multiple queues, according to how the streams were constructed, where each queue may be consumed by a different thread. And the queues are actually bounded. Now say there are two queues
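To make the stream/queue relationship concrete, here is a hedged sketch that asks for several streams of one topic and drains each stream's bounded queue on its own thread; the topic name, ZooKeeper address, and queued.max.message.chunks value are placeholder assumptions.

import java.util.Collections;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class MultiStreamConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181");  // assumed address
        props.put("group.id", "multi-stream-demo");        // assumed group id
        // Bounds the number of buffered message chunks per stream queue.
        props.put("queued.max.message.chunks", "2");

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        int numStreams = 3;
        // Ask for three streams of the same topic; the fetcher threads fill
        // each stream's bounded queue independently.
        List<KafkaStream<byte[], byte[]>> streams = connector
            .createMessageStreams(Collections.singletonMap("my_topic", numStreams))
            .get("my_topic");

        ExecutorService pool = Executors.newFixedThreadPool(numStreams);
        for (final KafkaStream<byte[], byte[]> stream : streams) {
            pool.submit(new Runnable() {
                public void run() {
                    // Each thread consumes one stream (one queue).
                    ConsumerIterator<byte[], byte[]> it = stream.iterator();
                    while (it.hasNext()) {
                        System.out.println(new String(it.next().message()));
                    }
                }
            });
        }
    }
}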

Re: High Level Consumer Example in 0.8.2

2015-03-11 Thread Ewen Cheslack-Postava
That example still works; the high-level consumer interface hasn't changed. There is a new high level consumer on the way and an initial version has been checked into trunk, but it won't be ready to use until 0.9. On Wed, Mar 11, 2015 at 9:05 AM, ankit tyagi ankittyagi.mn...@gmail.com wrote:

Re: Idle/dead producer connections on broker

2015-03-11 Thread Guozhang Wang
Hmm, this sounds like a serious bug. I do remember we had a ticket reporting similar issues before, but I cannot find it now. Let me dig a bit deeper later. BTW, could you try out the 0.8.2 broker version and see if this is still easily reproducible, i.e. starting a bunch of producers to send

Re: integrate Camus and Hive?

2015-03-11 Thread Bhavesh Mistry
Hi Andrew, You have to implement a custom partitioner, and you will also have to create whatever path you want (based on the message, e.g. a log line timestamp, or however you choose to build a directory hierarchy from each message). You will need to implement your own Partitioner class:

Re: integrate Camus and Hive?

2015-03-11 Thread Bhavesh Mistry
Hi Andrew, I would say Camus is generic enough (but you can propose this to the Camus team). Here is sample code with methods that you can use to create any path or directory structure, and a corresponding Hive table schema for it. public class UTCLogPartitioner extends Partitioner {
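Since the snippet only shows the class declaration, the sketch below sticks to the path-formatting piece: a hypothetical helper (not part of Camus's Partitioner API) that a custom partitioner implementation might call to build a Hive-style partition path like the one quoted earlier. The partition key names and the minute-bucket granularity are assumptions taken from that example path.

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper: formats a Hive-friendly partition path from a message
// timestamp; a Camus Partitioner implementation could delegate to it.
public class HivePartitionPathHelper {

    public static String partitionPath(long timestampMs) {
        SimpleDateFormat month = utcFormat("yyyy-MM");
        SimpleDateFormat day = utcFormat("yyyy-MM-dd");
        SimpleDateFormat minuteBucket = utcFormat("yyyy-MM-dd-HH-mm");

        Date ts = new Date(timestampMs);
        return "partition_month_utc=" + month.format(ts)
             + "/partition_day_utc=" + day.format(ts)
             + "/partition_minute_bucket=" + minuteBucket.format(ts);
    }

    private static SimpleDateFormat utcFormat(String pattern) {
        SimpleDateFormat f = new SimpleDateFormat(pattern);
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    }

    public static void main(String[] args) {
        // Prints something like:
        // partition_month_utc=2015-03/partition_day_utc=2015-03-11/partition_minute_bucket=2015-03-11-02-09
        System.out.println(partitionPath(System.currentTimeMillis()));
    }
}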

Examples of kafka based architectures?

2015-03-11 Thread Joseph Pachod
Hi all, In December Adrian Cockcroft presented some big names' distributed architectures in his talk State of the Art in Microservices at DockerCon. For each he put tooling/configuration/discovery/routing/observability on top, and underneath, datastores, orchestration and development. One can see

[ANNOUNCEMENT] Apache Kafka 0.8.2.1 Released

2015-03-11 Thread Jun Rao
The Apache Kafka community is pleased to announce the release of Apache Kafka 0.8.2.1. The 0.8.2.1 release fixes 4 critical issues in 0.8.2.0. All of the changes in this release can be found at: https://archive.apache.org/dist/kafka/0.8.2.1/RELEASE_NOTES.html Apache Kafka is a high-throughput,

Re: integrate Camus and Hive?

2015-03-11 Thread Andrew Otto
Thanks, Do you have this partitioner implemented? Perhaps it would be good to try to get this into Camus as a built-in option. HivePartitioner? :) -Ao On Mar 11, 2015, at 13:11, Bhavesh Mistry mistry.p.bhav...@gmail.com wrote: Hi Andrew, You have to implement a custom partitioner, and also

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread tao xiao
The fetcher thread is per-broker; it is ensured that there is at least one fetcher thread per broker. The fetcher thread sends the broker a fetch request asking for all partitions. So if A, B, C are on the same broker, the fetcher thread is still able to fetch data from A, B, C even though A returns no data.

Out of Disk Space - Infinite loop

2015-03-11 Thread Saladi Naidu
We have 3 DCs and created a 5-node Kafka cluster in each DC, connecting the 3 DCs using MirrorMaker for replication. We were conducting performance testing using the Kafka producer performance tool to load 100 million rows into 7 topics. We expected that the data would be loaded evenly across the 7 topics

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread James Cheng
On Mar 11, 2015, at 9:12 AM, Guozhang Wang wangg...@gmail.com wrote: Hi James, What I meant before is that a single fetcher may be responsible for putting fetched data into multiple queues, according to how the streams were constructed, where each queue may be consumed by a different

Re: How replicas catch up the leader

2015-03-11 Thread sy.pan
Hi, @Jiangjie Qin this is the related info from controller.log: [2015-03-11 10:54:11,962] ERROR [Controller 0]: Error completing reassignment of partition [ad_click_sts,3] (kafka.controller.KafkaController) kafka.common.KafkaException: Partition [ad_click_sts,3] to be reassigned is already

Re: Database Replication Question

2015-03-11 Thread Xiao
Hi Pete, Thank you for sharing your experience with me! sendfile and mmap are common system calls, but it sounds like we still need to consider at least the file-system differences when deploying Kafka. Cross-platform support is a headache. :) Best wishes, Xiao Li On Mar 10, 2015,

Re: Database Replication Question

2015-03-11 Thread Xiao
Hi Jiangjie and Guozhang, Native z/OS support is what I need. I wrote a few native z/OS applications using C/C++ before. Based on my current understanding, I have two alternatives: 1) Write native z/OS producers in C/C++ to feed data to Kafka clusters that are running on LUW servers.

Consuming messages on the same backend

2015-03-11 Thread jelmer
Hi. I have the following problem: we are building a system that will generate any number of different events, published to different topics. Events have an associated client, and a client can express interest in these events. When they do, then for each event we will execute a callback to a remote

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread tao xiao
consumer.timeout.ms only affects how the stream reads data from the internal chunk queue that is used to buffer received data. The actual data fetching is done by another thread, kafka.consumer.ConsumerFetcherThread. The fetcher thread keeps reading data from the broker and putting it into the
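A small sketch of that behaviour, with placeholder topic and connection settings: consumer.timeout.ms only makes the stream iterator throw kafka.consumer.ConsumerTimeoutException when its queue stays empty past the timeout, while the fetcher threads keep running in the background.

import java.util.Collections;
import java.util.List;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.ConsumerTimeoutException;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class TimeoutConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // assumed address
        props.put("group.id", "timeout-demo");            // assumed group id
        // Throw ConsumerTimeoutException if no message arrives in the
        // stream's internal queue within 5 seconds; fetching continues.
        props.put("consumer.timeout.ms", "5000");

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        List<KafkaStream<byte[], byte[]>> streams = connector
            .createMessageStreams(Collections.singletonMap("my_topic", 1))
            .get("my_topic");

        ConsumerIterator<byte[], byte[]> it = streams.get(0).iterator();
        try {
            while (it.hasNext()) {
                System.out.println(new String(it.next().message()));
            }
        } catch (ConsumerTimeoutException e) {
            // The queue stayed empty past the timeout; the fetcher threads
            // are still running and the loop could be resumed later.
            System.out.println("No messages for 5s, taking a break");
        }
    }
}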

Re: integrate Camus and Hive?

2015-03-11 Thread Andrew Otto
Hive provides the ability to define custom patterns for partitions. You can use this in combination with MSCK REPAIR TABLE to automatically detect and load the partitions into the metastore. I tried this yesterday, and as far as I can tell it doesn’t work with a custom partition layout. At

Re: High Level Consumer Example in 0.8.2

2015-03-11 Thread ankit tyagi
Hi Ewen, I am using the kafka-clients-0.8.2.0.jar client jar and I don't see the above interface in it. Which dependency do I need to include? On Wed, Mar 11, 2015 at 10:17 PM, Ewen Cheslack-Postava e...@confluent.io wrote: That example still works; the high-level consumer interface hasn't

Re: High Level Consumer Example in 0.8.2

2015-03-11 Thread Ewen Cheslack-Postava
Ah, I see the confusion now. The kafka-clients jar was introduced only recently and is meant to hold the new clients, which are pure Java implementations and cleanly isolated from the server code. The old clients (including what we call the old consumer since a new consumer is being developed, but

Re: High Level Consumer Example in 0.8.2

2015-03-11 Thread ankit tyagi
Hi Ewen, just one question: if I want to use the new producer implementation, in that case do I need to include the dependencies below? 1. kafka-clients-0.8.2.0.jar (for the new producer implementation) 2. kafka_2.10-0.8.2.1.jar (for the consumer) As I see, the package of the producer is different in
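For comparison, a minimal sketch of the new producer that ships in kafka-clients; the broker address, topic name, and serializer choices are placeholders. The old high-level consumer shown earlier in this digest still comes from the core kafka_2.10 jar, as Ewen describes above.

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NewProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        // The new producer lives in org.apache.kafka.clients.producer
        // (kafka-clients jar), unlike the old Scala producer in kafka.javaapi.
        Producer<String, String> producer = new KafkaProducer<String, String>(props);
        producer.send(new ProducerRecord<String, String>("my_topic", "key", "hello"));
        producer.close();
    }
}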