I read data from Oracle tables and flat files (a new Excel file every week) and write it to Teradata weekly using PySpark.
The initial run loads all the data into Teradata. In later runs, I want to read only the new records from Oracle and the flat files and append them to the Teradata tables. How can I do this with PySpark, without touching the Oracle and Teradata tables? Please post sample code if possible. Thanks.
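
For illustration, one common way to do this kind of incremental load is a "high watermark" kept in a small state file outside both databases. The sketch below is only that, a sketch under assumptions not stated in the question: "without touching" is read as "without changing the table definitions", the Oracle table has an increasing numeric column (called `ID` here; a timestamp column would need a proper date literal in the query), and each weekly Excel file has a distinct name. The names `SRC_TABLE`, `TGT_TABLE`, `TGT_FILES`, `ID` and `load_state.json` are hypothetical placeholders. Spark has no built-in Excel reader, so the sketch goes through `pandas.read_excel`, which needs pandas and an Excel engine such as openpyxl installed.

```python
import json
from pathlib import Path

STATE_FILE = Path("load_state.json")  # hypothetical location of the state file


def load_state(path):
    """Return the saved state, or an empty state on the first run."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text())
    return {"watermark": None, "files": []}


def save_state(path, state):
    """Persist the watermark and the list of files already loaded."""
    Path(path).write_text(json.dumps(state))


def oracle_query(table, column, watermark):
    """Build the read-only query for Oracle; no filter on the first run."""
    if watermark is None:
        return f"SELECT * FROM {table}"
    return f"SELECT * FROM {table} WHERE {column} > {watermark}"


def new_files(folder, seen):
    """List the .xlsx files in `folder` that no earlier run has loaded."""
    return sorted(p.name for p in Path(folder).glob("*.xlsx") if p.name not in seen)


def run_weekly_load(spark, ora, td, folder, state_path=STATE_FILE):
    """One weekly run. `ora` and `td` are dicts of JDBC options
    (url, user, password, driver) supplied by the caller."""
    import pandas as pd  # imported here so the helpers above work without Spark
    from pyspark.sql import functions as F

    state = load_state(state_path)

    # Oracle: fetch only rows above the saved watermark (all rows on the first run).
    query = oracle_query("SRC_TABLE", "ID", state["watermark"])
    ora_df = spark.read.format("jdbc").options(**ora).option("query", query).load()
    ora_df.persist()  # try to reuse the fetched rows for both the max and the write
    newest = ora_df.agg(F.max("ID")).first()[0]
    (ora_df.write.format("jdbc").options(**td)
        .option("dbtable", "TGT_TABLE").mode("append").save())
    if newest is not None:  # None means no new rows; keep the old watermark
        state["watermark"] = int(newest)
        save_state(state_path, state)

    # Flat files: load only the files not recorded in the state file.
    for name in new_files(folder, state["files"]):
        pdf = pd.read_excel(Path(folder) / name)
        (spark.createDataFrame(pdf).write.format("jdbc").options(**td)
            .option("dbtable", "TGT_FILES").mode("append").save())
        state["files"].append(name)
        save_state(state_path, state)
```

A call might look like `run_weekly_load(spark, ora_opts, td_opts, "/data/weekly")`, with the JDBC option dicts and folder path filled in for the actual environment. The state is saved after each successful write so that a failure partway through does not cause already loaded rows or files to be appended again on the next run.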