Hi Jorn,
Just wanted to check if you got a chance to look at this problem. I couldn't
figure out why this is happening. Any help would be
appreciated.
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
Hi Jorn,
Thanks for your kind reply. I accept that there might be something in my
code. Any help would be appreciated.
To give you some insight: I checked at the source whether the message had
been written to Kafka twice, but I could only find it once. Also, it would
have been convincing
Do you have some code that you can share?
Maybe it is something in your code that unintentionally duplicates it?
Maybe your source (e.g. the application putting it on Kafka?) already
duplicates them?
Exactly-once processing needs to be guaranteed end to end.
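The end-to-end point above can be illustrated with a toy idempotent sink: if each record carries a unique identity (for instance its Kafka topic, partition, and offset), the sink can discard replays even when an upstream stage delivers the same record twice. This is a minimal, Spark-free Python sketch; the `IdempotentSink` class and its key scheme are illustrative assumptions, not part of any Spark or Kafka API.

```python
class IdempotentSink:
    """Toy sink that writes each record at most once.

    Records are keyed by (topic, partition, offset); a replayed
    record whose key was already written is silently skipped.
    """

    def __init__(self):
        self._seen = set()   # keys already written
        self.rows = []       # the "database" contents

    def write(self, topic, partition, offset, value):
        key = (topic, partition, offset)
        if key in self._seen:
            return False     # duplicate delivery: ignore it
        self._seen.add(key)
        self.rows.append(value)
        return True


sink = IdempotentSink()
# Normal delivery of two records.
sink.write("events", 0, 41, "a")
sink.write("events", 0, 42, "b")
# An upstream retry redelivers offset 42; the sink drops it.
sink.write("events", 0, 42, "b")
print(sink.rows)  # ['a', 'b']
```

The same idea applies to a real non-updatable store: dedup by message identity at the write boundary, since neither Spark nor Kafka alone can guarantee exactly-once across the whole pipeline.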
> On 27.10.2018, at 02:10, wrote:
Hi All,
My problem is as follows.
Environment: Spark 2.2.0 installed on CDH
Use case: reading from Kafka, cleansing the data, and ingesting it into a
non-updatable database.
Problem: My streaming batch duration is 1 minute and I am receiving 3,000
messages/min. I am observing a weird case where,