[ https://issues.apache.org/jira/browse/SPARK-20036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Nuriyev updated SPARK-20036: ----------------------------------- Description: I use kafka 0.10.1 and java code with the following dependencies: <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka_2.11</artifactId> <version>0.10.1.1</version> </dependency> <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka-clients</artifactId> <version>0.10.1.1</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-streaming_2.11</artifactId> <version>2.0.0</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-streaming-kafka-0-10_2.11</artifactId> <version>2.0.0</version> </dependency> The code tries to read the whole topic using: kafkaParams.put("auto.offset.reset", "earliest"); Using 5 second batches: jssc = new JavaStreamingContext(conf, Durations.seconds(5)); Each batch returns empty. I debugged the code I noticed that KafkaUtils.fixKafkaParams is called that overrides "earliest" with "none". Whether this is related or not, when I used kafka 0.8 on the client with kafka 0.10.1 on the server, I could read the whole topic. was: I use kafka 0.10.1 and java code with the following dependencies: <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka_2.11</artifactId> <version>0.10.1.1</version> </dependency> <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka-clients</artifactId> <version>0.10.1.1</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-streaming_2.11</artifactId> <version>2.0.0</version> </dependency> <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-streaming-kafka-0-10_2.11</artifactId> <version>2.0.0</version> </dependency> I code tries to read the whole topic using: kafkaParams.put("auto.offset.reset", "earliest"); Using 5 second batches: jssc = new JavaStreamingContext(conf, Durations.seconds(5)); Each batch returns empty. I debugged the code I noticed that KafkaUtils.fixKafkaParams is called that overrides "earliest" with "none". Whether this is related or not, when I used kafka 0.8 on the client with kafka 0.10.1 on the server, I could read the whole topic. > impossible to read a whole kafka topic using kafka 0.10 and spark 2.0.0 > ------------------------------------------------------------------------ > > Key: SPARK-20036 > URL: https://issues.apache.org/jira/browse/SPARK-20036 > Project: Spark > Issue Type: Bug > Components: Input/Output > Affects Versions: 2.0.0 > Reporter: Daniel Nuriyev > Priority: Critical > Fix For: 2.0.3 > > > I use kafka 0.10.1 and java code with the following dependencies: > <dependency> > <groupId>org.apache.kafka</groupId> > <artifactId>kafka_2.11</artifactId> > <version>0.10.1.1</version> > </dependency> > <dependency> > <groupId>org.apache.kafka</groupId> > <artifactId>kafka-clients</artifactId> > <version>0.10.1.1</version> > </dependency> > <dependency> > <groupId>org.apache.spark</groupId> > <artifactId>spark-streaming_2.11</artifactId> > <version>2.0.0</version> > </dependency> > <dependency> > <groupId>org.apache.spark</groupId> > <artifactId>spark-streaming-kafka-0-10_2.11</artifactId> > <version>2.0.0</version> > </dependency> > The code tries to read the whole topic using: > kafkaParams.put("auto.offset.reset", "earliest"); > Using 5 second batches: > jssc = new JavaStreamingContext(conf, Durations.seconds(5)); > Each batch returns empty. > I debugged the code I noticed that KafkaUtils.fixKafkaParams is called that > overrides "earliest" with "none". > Whether this is related or not, when I used kafka 0.8 on the client with > kafka 0.10.1 on the server, I could read the whole topic. -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org