Have you tried to join both datasets, filter accordingly and then write the
full dataset to your filesystem?
Alternatively, work with a NoSQL database that you update by key (e.g. it sounds
like a key/value store could be useful for you).
However, you may also need to do more than this, depending on your exact requirements.
Hi,
Any advice on how to do this in Spark SQL?
I have a scenario as below:
dataframe1 = loaded from an HDFS Parquet file.
dataframe2 = read from a Kafka Stream.
If the column1 value of dataframe1 appears in columnX of dataframe2, then I need
to replace that column1 value in dataframe1.