about java.io.EOFException / java.lang.ClassNotFoundException: kafka.common.OffsetOutOfRangeException

2017-03-17 Thread Selina Tech
Hi, I am processing a new Kafka topic with Spark and got the error below. I googled this question, and it looks like a lot of people have had similar problems before, but I have not found a clue yet. Does anyone know how to fix this issue? Sincerely, Selina 00:39:58,004 WARN - 2017-03-18

Re: Capacity planning for Kafka Streams

2017-03-17 Thread Mahendra Kariya
Thanks for the heads up Guozhang! The problem is our brokers are on 0.10.0.x. So we will have to upgrade them. On Sat, Mar 18, 2017 at 12:30 AM, Guozhang Wang wrote: > Hi Mahendra, > > Just a kind reminder that upgrading Streams to 0.10.2 does not necessarily > require you

Re: kafka-topics[.sh]: fail to support connecting via broker / v0.10 style

2017-03-17 Thread Hans Jespersen
It can be updated once the Kafka AdminAPI is available and does everything over the Kafka wire protocol that the current kafka-topics command does by talking directly with ZooKeeper. For example, create a topic or delete a topic. Unfortunately it has to remain this way for just a little while

Re: Streams RocksDBException with no message?

2017-03-17 Thread Guozhang Wang
Hi Mathieu, We have been aware of that for a long time, and I have been looking into this issue. It turns out to be a known issue in RocksDB: https://github.com/facebook/rocksdb/issues/1688 And the corresponding fix (https://github.com/facebook/rocksdb/pull/1714) has been merged in master but marked

Re: Reg: Kafka HDFS Connector with (HDFS SSL enabled)

2017-03-17 Thread BigData dev
Hi Colin, I have configured SSL in HDFS and used SWebHDFS. I am able to make it work with Kafka HDFS Connector. Thanks, Bharat On Fri, Feb 17, 2017 at 1:47 PM, Colin McCabe wrote: > Hi, > > Just to be clear, HDFS doesn't use HTTP or HTTPS as its primary > transport

Re: Kafka Streams: lockException

2017-03-17 Thread Guozhang Wang
Tianji and Sachin (and also cc'ing people who I remember have reported similar RocksDB memory issues), Sharing my experience with RocksDB tuning and also chatting with the RocksDB community: 1. If you are frequently flushing the state stores (e.g. with high commit frequency) then you will end up

kafka-topics[.sh]: fail to support connecting via broker / v0.10 style

2017-03-17 Thread Andrew Pennebaker
If I understand Kafka correctly, since v0.9 / v0.10, users are often recommended to connect consumers to the Kafka cluster via bootstrap.servers, AKA broker node addresses. However, the kafka-topics shell script fails to support this interface, still requiring the legacy zookeeper connect string.

Re: Kafka Streams: ReadOnlyKeyValueStore range behavior

2017-03-17 Thread Damian Guy
Thanks Dmitry. Please do create a JIRA for the range scan. On Fri, 17 Mar 2017 at 18:01, Dmitry Minkovsky wrote: > Regarding the null bug: I had time to open a JIRA today. Looks like an > issue already exists: https://issues.apache.org/jira/browse/KAFKA-4750 > > Regarding

Re: Kafka Streams: ReadOnlyKeyValueStore range behavior

2017-03-17 Thread Dmitry Minkovsky
Regarding the null bug: I had time to open a JIRA today. Looks like an issue already exists: https://issues.apache.org/jira/browse/KAFKA-4750 Regarding scan order: I would gladly produce a sample that replicates this behavior if you can confirm that you will perceive this as a defect. I would

Streams RocksDBException with no message?

2017-03-17 Thread Mathieu Fenniak
Hey all, So... what does it mean to have a RocksDBException with a message that just has a single character? "e", "q", "]"... I've seen a few. Has anyone seen this before? Two example exceptions: https://gist.github.com/mfenniak/c56beb6d5058e2b21df0309aea224f12 Kafka Streams 0.10.2.0. Both

Re: Kafka Streams: ReadOnlyKeyValueStore range behavior

2017-03-17 Thread Dmitry Minkovsky
Ah! Yes. Thank you! That makes sense. Anyway, I _think_ that's not what I was doing given that all items were being routed to and then read from a partition identified by one key. On Fri, Mar 17, 2017 at 12:50 PM, Damian Guy wrote: > > When you use Queryable State you are

Re: Kafka Streams: ReadOnlyKeyValueStore range behavior

2017-03-17 Thread Damian Guy
> When you use Queryable State you are actually querying multiple > underlying stores, i.e., one per partition. > > Huh? I was only querying one partition. In my example, I have a user's > posts. Upon creation, they are routed to a particular partition using a > partitioner that hashes the post's
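The point about one underlying store per partition can be illustrated with a small sketch. It is not Kafka Streams code: `range_query` is a hypothetical helper, each store is modeled as a plain dict, and the inclusive bounds and store-by-store concatenation are assumptions made for illustration. It shows why a range scan can come back sorted within each store yet unsorted overall.

```python
def range_query(stores, lo, hi):
    """Scan each per-partition store in turn and concatenate the matches.

    `stores` is a list of dicts standing in for the underlying stores
    (hypothetical model, not the Kafka Streams API). Bounds are inclusive
    here by assumption.
    """
    results = []
    for store in stores:
        # Keys are ordered within one store only.
        for key in sorted(store):
            if lo <= key <= hi:
                results.append((key, store[key]))
    return results


# Two stores, as if the keys had been spread over two partitions.
stores = [{"b": 1, "d": 2}, {"a": 3, "c": 4}]
merged = range_query(stores, "a", "d")
```

With these inputs `merged` lists the keys as b, d, a, c: ordered inside each store, not across stores. If all keys of interest really live in a single partition, as described in the thread, this effect alone would not explain an unordered result.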

Re: Kafka Streams: ReadOnlyKeyValueStore range behavior

2017-03-17 Thread Dmitry Minkovsky
Matthias, Damian: Thank you for your replies. > Can you check if the problem exists for 0.10.2, too? I will upgrade to 0.10.2 after this development cycle. I'm still in development so compatibility is not as big an issue as getting to production. > range() should return ordered data, In my

Re: Kafka Streams: lockException

2017-03-17 Thread Eno Thereska
Sachin, you also have a PR for this that could help, right? https://github.com/apache/kafka/pull/2642#issuecomment-287372367 Thanks, Eno > On 17 Mar 2017, at 15:19, Sachin Mittal wrote: > > We also face

Re: Increasing partition count and preserving local order for a key

2017-03-17 Thread Ian Wrigley
Hi You can’t move existing records between partitions, but one possibility would be to create a new topic with the required number of partitions, then copy the data from the original topic to the new one. The default partitioning algorithm would ensure that all records with the same key in the
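A minimal sketch of the copy-to-a-new-topic idea follows. It is a plain in-memory model, not a Kafka client: `partition_for` and `repartition` are hypothetical names, and `zlib.crc32` stands in for the hash (Kafka's default partitioner uses murmur2 on the serialized key). It assumes the records are replayed in their original per-key order, which holds when each key's records are read from a single source partition in offset order.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in hash for illustration only; Kafka's default partitioner
    # uses murmur2, not CRC32.
    return zlib.crc32(key) % num_partitions


def repartition(records, num_partitions):
    """Copy (key, value) records, in order, into a new topic's partitions."""
    new_topic = [[] for _ in range(num_partitions)]
    for key, value in records:
        new_topic[partition_for(key, num_partitions)].append((key, value))
    return new_topic


# Records as they would be replayed from the original topic.
records = [(b"user-1", 1), (b"user-2", 1), (b"user-1", 2), (b"user-1", 3)]
new_topic = repartition(records, 8)
```

Because the target partition depends only on the key and the partition count, every `user-1` record lands in the same partition of the new topic, and the append order keeps the values in sequence 1, 2, 3.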

Re: Kafka Streams: lockException

2017-03-17 Thread Sachin Mittal
We also face the same issues. What we have found is that RocksDB is the issue. With many instances of RocksDB per machine, over time it slows down due to I/O operations, resulting in threads getting evicted because max.poll.interval exceeds the set limit. Try running rocksdb in memory

Re: Offset commit request failing

2017-03-17 Thread Robert Quinlivan
Thanks for the response. Reading through that thread, it appears that this issue was addressed with KAFKA-3810. This change eases the restriction on fetch size between replicas. However, should the outcome be a more comprehensive change to the

Real Time Streaming With Multiple Data Sources

2017-03-17 Thread 6yvu7u+1evsxnxv0
Hi, We are planning to build a real-time monitoring system with Apache Kafka. The overall idea is to push data from multiple data sources to Kafka and perform data quality checks. I have a few questions about this architecture: 1. What are the best possible approaches of streaming data from

Re: Offset commit request failing

2017-03-17 Thread James Cheng
I think it's due to the high number of partitions and the high number of consumers in the group. The group coordination info to keep track of the assignments actually happens via a message that travels through the __consumer_offsets topic. So with so many partitions and consumers, the message