Thanks Dhaval, that fixed the issue. The constant resetting of Kafka
offsets had misled me about the root cause. Please feel free to answer
the SO question here
<https://stackoverflow.com/questions/57874681/spark-kafka-streaming-making-progress-but-there-is-no-data-to-be-consumed>
if you would like.
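
For anyone who hits the same red herring: as far as I can tell, the
"Resetting offset for partition ..." INFO lines come from the consumer the
Kafka source creates on the driver, and they do not by themselves mean the
query lost its place; the real position lives in the checkpoint and is
reported through lastProgress. A minimal sketch of watching it (the broker
address, app name, and checkpoint path are placeholders, not details from
this thread):

import org.apache.spark.sql.SparkSession

object ProgressCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("kafka-progress-check").getOrCreate()

    val df = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092") // placeholder brokers
      .option("subscribe", "my_kafka_topic")
      .load()

    val query = df.writeStream
      .format("parquet")
      .option("path", "s3://my-s3-bucket/data/kafka/my_kafka_topic")
      .option("checkpointLocation", "s3://my-s3-bucket/checkpoints/my_kafka_topic") // hypothetical path
      .start()

    // lastProgress is the same JSON quoted below; watch numInputRows and the
    // startOffset/endOffset pairs per batch instead of the consumer logs.
    while (query.isActive) {
      Option(query.lastProgress).foreach(p => println(p.json))
      Thread.sleep(10000)
    }
  }
}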
>>>>>>         "18" : 1069224544,
>>>>>>         "12" : 1256018541,
>>>>>>         "3" : 1251150202,
>>>>>>         "21" : 1256774117,
>>>>>>         "15" : 1170591375,
>>>>>>         "6" : 1185108169,
>>>>>>         "24" : 1202342095,
>>>>>>         "0" : 1165356330
>>>>>>       }
>>>>>>     },
>>>>>>     "endOffset" : {
>>>>>>       "my_kafka_topic" : {
>>>>>>         "23" : 1206928043,
>>>>>>         "8" : 1158516721,
>>>>>>         "17" : 1258389219,
>>>>>>         "11" : 1263093490,
>>>>>>         "2" : 1226743225,
>>>>>>         "20" : 1229562962,
>>>>>>         "5" : 1170307882,
>>>>>>         "14" : 1207335736,
>>>>>>         "4" : 1274245585,
>>>>>>         "13" : 1336388570,
>>>>>>         "22" : 1260213582,
>>>>>>         "7" : 1288641384,
>>>>>>         "16" : 1247464311,
>>>>>>         "10" : 1093159186,
>>>>>>         "1" : 1219906407,
>>>>>>         "19" : 1116271435,
>>>>>>         "9" : 1238936994,
>>>>>>         "18" : 1069226913,
>>>>>>         "12" : 1256020926,
>>>>>>         "3" : 1251152579,
>>>>>>         "21" : 1256776910,
>>>>>>         "15" : 1170593216,
>>>>>>         "6" : 1185110032,
>>>>>>         "24" : 1202344538,
>>>>>>         "0" : 1165358262
>>>>>>       }
>>>>>>     },
>>>>>>     "numInputRows" : 0,
>>>>>>     "inputRowsPerSecond" : 0.0,
>>>>>>     "processedRowsPerSecond" : 0.0
>>>>>>   } ],
>>>>>>   "sink" : {
>>>>>>     "description" : "FileSink[s3://my-s3-bucket/data/kafka/my_kafka_topic]"
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> In the StreamingQueryProgress event above, the numInputRows field is
>>>>>> zero, and that is the case for every micro-batch execution; no data is
>>>>>> being produced whatsoever. So for each batch my offsets are being reset
>>>>>> and each batch produces zero rows. Since no work is being done, and
>>>>>> since dynamic allocation is enabled, all my executors get killed. I
>>>>>> have tried deleting my checkpoint and starting my application from
>>>>>> scratch, and I am still facing the same issue. What could possibly be
>>>>>> wrong with this, and what lines of investigation should I take? If you
>>>>>> are interested in getting Stack Overflow points, you can answer my
>>>>>> question on SO here
>>>>>> <https://stackoverflow.com/questions/57874681/spark-kafka-streaming-making-progress-but-there-is-no-data-to-be-consumed>.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Charles
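
Re the "what lines of investigation should I take" question above, for
anyone who lands on this thread later: a StreamingQueryListener makes the
per-batch evidence easy to capture without scraping driver logs. A minimal
sketch (the listener class name is mine; the API is the stock Spark
Structured Streaming Scala one):

import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Logs per-batch input volume so a long run of numInputRows == 0, as in the
// progress event quoted above, is easy to spot and correlate with offsets.
class ProgressLogger extends StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit =
    println(s"query started: ${event.id}")

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val p = event.progress
    val offsets = p.sources
      .map(s => s"${s.startOffset} -> ${s.endOffset}")
      .mkString("; ")
    println(s"batch=${p.batchId} numInputRows=${p.numInputRows} offsets=$offsets")
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit =
    println(s"query terminated: ${event.id}")
}

// Register on the driver before starting the query:
// spark.streams.addListener(new ProgressLogger())

Since the listener runs on the driver, its output also survives executor
churn from dynamic allocation.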