OK. I see the following snippet for reading the offsets. In our Kafka stream, the
offsets are stored in ZooKeeper, and I am not updating the offsets in ZooKeeper.

How does Kafka Direct know which offsets to query? Does it automatically
calculate which offsets to query? I have "auto.offset.reset" ->
"largest".

It looks like it is trying to query from the latest offset each time, and
those offsets are no longer available in the Kafka stream. Other consumers
that use the same stream and have the ZooKeeper quorum configured seem to be
working fine.


 import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}

 // Hold a reference to the current offset ranges, so it can be used downstream
 var offsetRanges = Array[OffsetRange]()

 directKafkaStream.transform { rdd =>
   // The cast only works on the RDD produced directly by the direct stream
   offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
   rdd
 }.map {
   ...
 }.foreachRDD { rdd =>
   for (o <- offsetRanges) {
     println(s"${o.topic} ${o.partition} ${o.fromOffset} ${o.untilOffset}")
   }
   ...
 }
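
To cross-check ranges like the ones logged above against what the broker
still retains, here is a minimal sketch in plain Scala. It assumes a stand-in
case class (SimpleOffsetRange, hypothetical) instead of Spark's OffsetRange so
that it runs without Spark, and it assumes the earliest and latest broker
offsets are obtained separately. isAvailable is a hypothetical helper, not a
Spark or Kafka API, and the broker offsets in main are made-up values for
illustration only.

```scala
object OffsetCheck {
  // Hypothetical stand-in for Spark's OffsetRange:
  // fromOffset is inclusive, untilOffset is exclusive.
  final case class SimpleOffsetRange(
      topic: String,
      partition: Int,
      fromOffset: Long,
      untilOffset: Long) {
    // Number of messages the batch expects to read from this partition
    def count: Long = untilOffset - fromOffset
  }

  // Hypothetical helper: true when the whole range lies inside the offsets
  // the broker still holds (earliest inclusive, latest exclusive).
  def isAvailable(r: SimpleOffsetRange, earliest: Long, latest: Long): Boolean =
    r.fromOffset >= earliest && r.untilOffset <= latest

  def main(args: Array[String]): Unit = {
    // Range taken from the assertion error quoted in this thread
    val failed = SimpleOffsetRange("hubble_stream", 55, 225467496L, 225474235L)

    // Assumed broker offsets, for illustration only; substitute real values
    val earliest = 225470000L
    val latest = 225480000L

    println(s"${failed.topic} ${failed.partition}: ${failed.count} messages expected")
    if (!isAvailable(failed, earliest, latest)) {
      println(s"range [${failed.fromOffset}, ${failed.untilOffset}) " +
        s"is not fully available in [$earliest, $latest)")
    }
  }
}
```

If the start of a logged range is below the broker's earliest offset, the
messages have probably been removed by retention before the batch ran, which
would be consistent with the "Ran out of messages" assertion.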


On Mon, Nov 23, 2015 at 6:31 PM, swetha kasireddy <swethakasire...@gmail.com
> wrote:

> Also, does Kafka Direct query the offsets from ZooKeeper directly?
> Where does it get the offsets from? There is data at those offsets, but
> somehow Kafka Direct does not seem to pick it up.
>
> On Mon, Nov 23, 2015 at 6:18 PM, swetha kasireddy <
> swethakasire...@gmail.com> wrote:
>
>> I mean showing the Spark Kafka Direct consumers in the Kafka Stream UI.
>> Usually we create a consumer and the consumer shows up in the Kafka
>> Stream UI. How do I log the offsets in the Spark job?
>>
>> On Mon, Nov 23, 2015 at 6:11 PM, Cody Koeninger <c...@koeninger.org>
>> wrote:
>>
>>> What exactly do you mean by kafka consumer reporting?
>>>
>>> I'd log the offsets in your spark job and try running
>>>
>>> kafka-simple-consumer-shell.sh --partition $yourbadpartition
>>> --print-offsets
>>>
>>> at the same time your spark job is running
>>>
>>> On Mon, Nov 23, 2015 at 7:37 PM, swetha <swethakasire...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> We see a bunch of issues like the following in our Spark Kafka Direct
>>>> job. Any idea how to make Kafka Direct consumers show up in Kafka
>>>> consumer reporting, to debug this issue?
>>>>
>>>>
>>>> Job aborted due to stage failure: Task 47 in stage 336.0 failed 4 times,
>>>> most recent failure: Lost task 47.3 in stage 336.0 (TID 5283, 10.227.64.52):
>>>> java.lang.AssertionError: assertion failed: Ran out of messages before
>>>> reaching ending offset 225474235 for topic hubble_stream partition 55 start
>>>> 225467496. This should not happen, and indicates that messages may have been
>>>> lost
>>>>
>>>
>>
>
