Github user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/16114#discussion_r90758322
--- Diff:
external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala
---
@@ -56,6 +56,27 @@ private[kinesis] class KinesisRecordProcessor[T](receiver: KinesisReceiver[T], w
logInfo(s"Initialized workerId $workerId with shardId $shardId")
}
+ private def addRecords(batch: List[Record], checkpointer: IRecordProcessorCheckpointer): Unit = {
+ receiver.addRecords(shardId, batch)
+ logDebug(s"Stored: Worker $workerId stored ${batch.size} records for shardId $shardId")
+ receiver.setCheckpointer(shardId, checkpointer)
--- End diff --
Yes, you're right: this code overwrites `checkpointer` every time the
callback function is called (perhaps every second). I'm not sure what the
original author intended, but the repeated overwrite looks redundant. That
said, I'm also not sure it is worth fixing, and such a fix is out of scope
for this JIRA. If necessary, I'd be happy to fix it in a follow-up.
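
To illustrate the overwrite being discussed, here is a minimal sketch. It does
not use the real classes: `Checkpointer`, `SketchReceiver`, `writes`,
`setCheckpointerIfChanged`, and `Demo` are hypothetical stand-ins, and it
assumes (without having verified this against the KCL) that the same
checkpointer instance is passed on every callback.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the checkpointer type; not the real KCL interface.
trait Checkpointer

// Hypothetical, simplified receiver that only tracks the latest checkpointer per shard.
class SketchReceiver {
  private val checkpointers = new ConcurrentHashMap[String, Checkpointer]()
  var writes = 0 // counts map writes, for illustration only

  // Mirrors the pattern in the diff: unconditionally overwrite on every call.
  def setCheckpointer(shardId: String, cp: Checkpointer): Unit = {
    checkpointers.put(shardId, cp)
    writes += 1
  }

  // Possible alternative: write only when the stored reference differs.
  def setCheckpointerIfChanged(shardId: String, cp: Checkpointer): Unit = {
    if (checkpointers.get(shardId) ne cp) {
      checkpointers.put(shardId, cp)
      writes += 1
    }
  }
}

object Demo {
  // Simulates three callbacks that all receive the same checkpointer instance
  // and returns the number of map writes for each variant.
  def run(): (Int, Int) = {
    val cp = new Checkpointer {}
    val always = new SketchReceiver
    (1 to 3).foreach(_ => always.setCheckpointer("shard-0", cp))
    val onChange = new SketchReceiver
    (1 to 3).foreach(_ => onChange.setCheckpointerIfChanged("shard-0", cp))
    (always.writes, onChange.writes)
  }
}
```

Under that assumption the unconditional variant writes on every callback while
the guarded one writes once. Since a map `put` is cheap, the saving is likely
small, which is consistent with leaving this for a follow-up.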