Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Charles vinodh
Thanks Dhaval, that fixed the issue. The constant resetting of the Kafka offsets misled me about the actual problem. Please feel free to answer the SO question here <https://stackoverflow.com/questions/57874681/spark-kafka-streaming-making-progress-but-there-is-no-data-to-be-consumed> if you would like.
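For anyone chasing the same symptom, the offset churn that misled Charles is visible in the per-batch progress events themselves. Below is a minimal sketch, not taken from the thread, of a StreamingQueryListener that prints each source's offset range whenever a micro-batch reads zero rows; the listener API and the SourceProgress fields are standard Spark Structured Streaming, while the app name and log format are illustrative:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.streaming.StreamingQueryListener
    import org.apache.spark.sql.streaming.StreamingQueryListener.{
      QueryProgressEvent, QueryStartedEvent, QueryTerminatedEvent}

    val spark = SparkSession.builder().appName("offset-watch").getOrCreate()

    spark.streams.addListener(new StreamingQueryListener {
      override def onQueryStarted(event: QueryStartedEvent): Unit = ()
      override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()

      // Fires once per completed micro-batch, carrying the same JSON quoted below.
      override def onQueryProgress(event: QueryProgressEvent): Unit = {
        val p = event.progress
        if (p.numInputRows == 0) {
          // startOffset/endOffset are JSON strings; if they move backwards between
          // batches while numInputRows stays 0, something else is resetting the
          // consumer's position.
          p.sources.foreach { s =>
            println(s"batch=${p.batchId} source=${s.description} " +
              s"start=${s.startOffset} end=${s.endOffset}")
          }
        }
      }
    })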

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Dhaval Patel
t;>> "18" : 1069224544, >>>>>> "12" : 1256018541, >>>>>> "3" : 1251150202, >>>>>> "21" : 1256774117, >>>>>> "15" : 1170591375, >>>>>> "6" : 1185108169, >>>>>> "24" : 1202342095, >>>>>> "0" : 1165356330 >>>>>> } >>>>>> }, >>>>>> "endOffset" : { >>>>>> "my_kafka_topic" : { >>>>>> "23" : 1206928043, >>>>>> "8" : 1158516721, >>>>>> "17" : 1258389219, >>>>>> "11" : 1263093490, >>>>>> "2" : 1226743225, >>>>>> "20" : 1229562962, >>>>>> "5" : 1170307882, >>>>>> "14" : 1207335736, >>>>>> "4" : 1274245585, >>>>>> "13" : 1336388570, >>>>>> "22" : 1260213582, >>>>>> "7" : 1288641384, >>>>>> "16" : 1247464311, >>>>>> "10" : 1093159186, >>>>>> "1" : 1219906407, >>>>>> "19" : 1116271435, >>>>>> "9" : 1238936994, >>>>>> "18" : 1069226913, >>>>>> "12" : 1256020926, >>>>>> "3" : 1251152579, >>>>>> "21" : 1256776910, >>>>>> "15" : 1170593216, >>>>>> "6" : 1185110032, >>>>>> "24" : 1202344538, >>>>>> "0" : 1165358262 >>>>>> } >>>>>> }, >>>>>> "numInputRows" : 0, >>>>>> "inputRowsPerSecond" : 0.0, >>>>>> "processedRowsPerSecond" : 0.0 >>>>>> } ], >>>>>> "sink" : { >>>>>> "description" : >>>>>> "FileSink[s3://my-s3-bucket/data/kafka/my_kafka_topic]" >>>>>> } >>>>>> } >>>>>> >>>>>> >>>>>> In the above StreamingQueryProgress event the numInputRows fields is >>>>>> zero and this is the case for all micro batch executions and no data is >>>>>> being produced whatsoever. So basically for each batch my offsets are >>>>>> being >>>>>> reset and each batch is producing zero rows. Since there is no work being >>>>>> done and since dynamic allocation is enabled all my executors killed... I >>>>>> have tried deleting my checkpoint and started my application from scratch >>>>>> and I am still facing the same issue. What could possibly be wrong >>>>>> this?... >>>>>> what lines of investigation should I take? If you are interested in >>>>>> getting Stackoverflow point you can answer my question in SO here >>>>>> <https://stackoverflow.com/questions/57874681/spark-kafka-streaming-making-progress-but-there-is-no-data-to-be-consumed>. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> Charles >>>>>> >>>>>> >>>>> -- >>> Sent from Gmail Mobile >>> >>

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Burak Yavuz
5, >>>>> "6" : 1185108169, >>>>> "24" : 1202342095, >>>>> "0" : 1165356330 >>>>> } >>>>> }, >>>>> "endOffset" : { >>>>> "my_kafka_topic" : { >>>>> "23" : 1206928043, >>>>> "8" : 1158516721, >>>>> "17" : 1258389219, >>>>> "11" : 1263093490, >>>>> "2" : 1226743225, >>>>> "20" : 1229562962, >>>>> "5" : 1170307882, >>>>> "14" : 1207335736, >>>>> "4" : 1274245585, >>>>> "13" : 1336388570, >>>>> "22" : 1260213582, >>>>> "7" : 1288641384, >>>>> "16" : 1247464311, >>>>> "10" : 1093159186, >>>>> "1" : 1219906407, >>>>> "19" : 1116271435, >>>>> "9" : 1238936994, >>>>> "18" : 1069226913, >>>>> "12" : 1256020926, >>>>> "3" : 1251152579, >>>>> "21" : 1256776910, >>>>> "15" : 1170593216, >>>>> "6" : 1185110032, >>>>> "24" : 1202344538, >>>>> "0" : 1165358262 >>>>> } >>>>> }, >>>>> "numInputRows" : 0, >>>>> "inputRowsPerSecond" : 0.0, >>>>> "processedRowsPerSecond" : 0.0 >>>>> } ], >>>>> "sink" : { >>>>> "description" : >>>>> "FileSink[s3://my-s3-bucket/data/kafka/my_kafka_topic]" >>>>> } >>>>> } >>>>> >>>>> >>>>> In the above StreamingQueryProgress event the numInputRows fields is >>>>> zero and this is the case for all micro batch executions and no data is >>>>> being produced whatsoever. So basically for each batch my offsets are >>>>> being >>>>> reset and each batch is producing zero rows. Since there is no work being >>>>> done and since dynamic allocation is enabled all my executors killed... I >>>>> have tried deleting my checkpoint and started my application from scratch >>>>> and I am still facing the same issue. What could possibly be wrong >>>>> this?... >>>>> what lines of investigation should I take? If you are interested in >>>>> getting Stackoverflow point you can answer my question in SO here >>>>> <https://stackoverflow.com/questions/57874681/spark-kafka-streaming-making-progress-but-there-is-no-data-to-be-consumed>. >>>>> >>>>> >>>>> Thanks, >>>>> Charles >>>>> >>>>> >>>> -- >> Sent from Gmail Mobile >> >

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Charles vinodh
928043, >>>> "8" : 1158516721, >>>> "17" : 1258389219, >>>> "11" : 1263093490, >>>> "2" : 1226743225, >>>> "20" : 1229562962, >>>> "5" : 1170307882, >>>> "14" : 1207335736, >>>> "4" : 1274245585, >>>> "13" : 1336388570, >>>> "22" : 1260213582, >>>> "7" : 1288641384, >>>> "16" : 1247464311, >>>> "10" : 1093159186, >>>> "1" : 1219906407, >>>> "19" : 1116271435, >>>> "9" : 1238936994, >>>> "18" : 1069226913, >>>> "12" : 1256020926, >>>> "3" : 1251152579, >>>> "21" : 1256776910, >>>> "15" : 1170593216, >>>> "6" : 1185110032, >>>> "24" : 1202344538, >>>> "0" : 1165358262 >>>> } >>>> }, >>>> "numInputRows" : 0, >>>> "inputRowsPerSecond" : 0.0, >>>> "processedRowsPerSecond" : 0.0 >>>> } ], >>>> "sink" : { >>>> "description" : "FileSink[s3://my-s3-bucket/data/kafka/my_kafka_topic]" >>>> } >>>> } >>>> >>>> >>>> In the above StreamingQueryProgress event the numInputRows fields is >>>> zero and this is the case for all micro batch executions and no data is >>>> being produced whatsoever. So basically for each batch my offsets are being >>>> reset and each batch is producing zero rows. Since there is no work being >>>> done and since dynamic allocation is enabled all my executors killed... I >>>> have tried deleting my checkpoint and started my application from scratch >>>> and I am still facing the same issue. What could possibly be wrong this?... >>>> what lines of investigation should I take? If you are interested in >>>> getting Stackoverflow point you can answer my question in SO here >>>> <https://stackoverflow.com/questions/57874681/spark-kafka-streaming-making-progress-but-there-is-no-data-to-be-consumed>. >>>> >>>> >>>> Thanks, >>>> Charles >>>> >>>> >>> -- > Sent from Gmail Mobile >

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Sandish Kumar HN
, >>> "4" : 1274245585, >>> "13" : 1336388570, >>> "22" : 1260213582, >>> "7" : 1288641384, >>> "16" : 1247464311, >>> "10" : 1093159186, >>> "1" : 1219906407, >>> "19" : 1116271435, >>> "9" : 1238936994, >>> "18" : 1069226913, >>> "12" : 1256020926, >>> "3" : 1251152579, >>> "21" : 1256776910, >>> "15" : 1170593216, >>> "6" : 1185110032, >>> "24" : 1202344538, >>> "0" : 1165358262 >>> } >>> }, >>> "numInputRows" : 0, >>> "inputRowsPerSecond" : 0.0, >>> "processedRowsPerSecond" : 0.0 >>> } ], >>> "sink" : { >>> "description" : "FileSink[s3://my-s3-bucket/data/kafka/my_kafka_topic]" >>> } >>> } >>> >>> >>> In the above StreamingQueryProgress event the numInputRows fields is >>> zero and this is the case for all micro batch executions and no data is >>> being produced whatsoever. So basically for each batch my offsets are being >>> reset and each batch is producing zero rows. Since there is no work being >>> done and since dynamic allocation is enabled all my executors killed... I >>> have tried deleting my checkpoint and started my application from scratch >>> and I am still facing the same issue. What could possibly be wrong this?... >>> what lines of investigation should I take? If you are interested in >>> getting Stackoverflow point you can answer my question in SO here >>> <https://stackoverflow.com/questions/57874681/spark-kafka-streaming-making-progress-but-there-is-no-data-to-be-consumed>. >>> >>> >>> Thanks, >>> Charles >>> >>> >> -- Sent from Gmail Mobile

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Charles vinodh

Re: Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Burak Yavuz
"9" : 1238936994, > "18" : 1069226913, > "12" : 1256020926, > "3" : 1251152579, > "21" : 1256776910, > "15" : 1170593216, > "6" : 1185110032, >

Spark Kafka Streaming making progress but there is no data to be consumed

2019-09-11 Thread Charles vinodh
rSecond" : 0.0, "processedRowsPerSecond" : 0.0 } ], "sink" : { "description" : "FileSink[s3://my-s3-bucket/data/kafka/my_kafka_topic]" } } In the above StreamingQueryProgress event the numInputRows fields is zero and this is the case for all micro batch executions and no data is being produced whatsoever. So basically for each batch my offsets are being reset and each batch is producing zero rows. Since there is no work being done and since dynamic allocation is enabled all my executors killed... I have tried deleting my checkpoint and started my application from scratch and I am still facing the same issue. What could possibly be wrong this?... what lines of investigation should I take? If you are interested in getting Stackoverflow point you can answer my question in SO here <https://stackoverflow.com/questions/57874681/spark-kafka-streaming-making-progress-but-there-is-no-data-to-be-consumed>. Thanks, Charles