Re: Any advice how to do this usecase in spark sql ?

2019-08-13 Thread Jörn Franke
Have you tried joining both datasets, filtering accordingly, and then writing the full dataset to your filesystem? Alternatively, work with a NoSQL database that you update by key (e.g. it sounds like a key/value store could be useful for you). However, you may also need to do more depending on
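A minimal illustration of the key-based update Jörn mentions, using a plain Python dict as a stand-in for a key/value store (the store contents and keys here are invented for the sketch; a real setup would use something like HBase, Cassandra, or Redis):

```python
# Stand-in key/value store; in practice this would be an external
# NoSQL database (e.g. HBase, Cassandra, Redis) updated by key.
store = {"k1": "a", "k2": "b", "k3": "c"}

# Updates arriving from the stream, keyed the same way.
updates = {"k2": "B", "k3": "C"}

# Updating by key overwrites only the matching entries;
# every other entry in the store is left untouched.
store.update(updates)
# store is now {"k1": "a", "k2": "B", "k3": "C"}
```

The point of the key/value approach is that you never rewrite the whole dataset: each streamed record touches only the entry with the matching key.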

Any advice how to do this usecase in spark sql ?

2019-08-13 Thread Shyam P
Hi, any advice how to do this in Spark SQL? I have a scenario as below: dataframe1 = loaded from an HDFS Parquet file. dataframe2 = read from a Kafka stream. If a column1 value of dataframe1 appears among the columnX values of dataframe2, then I need to replace that column1 value of dataframe1.