[Spark-SQL] : Incremental load in Pyspark

Vamsi Makkena Tue, 11 Apr 2017 12:24:03 -0700

I am reading data from Oracle tables and flat files (a new Excel file
every week) and writing it to Teradata weekly using PySpark.


On the initial run, it loads all the data into Teradata. On later runs,
I want to read only the new records from Oracle and the flat files and
append them to the Teradata tables.

How can I do this using PySpark, without touching the Oracle and Teradata
tables?

Please post sample code if possible.

Thanks
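
One possible approach, sketched here as an illustration only: keep a
"high-water mark" (the largest key value already loaded) in a small state
file outside both databases, and push a filter on that value down to Oracle
through the JDBC source. Everything named below is an assumption and not
part of the original question: the state file path, the helper names, and
above all the existence of a monotonically increasing numeric column in the
source table (passed in as key_col). The Oracle and Teradata JDBC driver
jars are also assumed to be on the Spark classpath.

```python
import json
import os


def load_watermark(path, default=0):
    """Return the last loaded key value, or `default` on the first run."""
    if not os.path.exists(path):
        return default
    with open(path) as f:
        return json.load(f)["last_id"]


def save_watermark(path, value):
    """Persist the watermark via a temp file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"last_id": value}, f)
    os.replace(tmp, path)


def incremental_query(table, key_col, last_value):
    """Build a subquery for the JDBC `dbtable` option so Oracle does the filtering."""
    return "(SELECT * FROM {t} WHERE {c} > {v}) src".format(
        t=table, c=key_col, v=int(last_value))


def run_incremental_load(spark, oracle_url, teradata_url,
                         table, key_col, target, state_path):
    """Read rows newer than the watermark from Oracle and append them to Teradata."""
    from pyspark.sql import functions as F  # imported lazily; needs PySpark installed

    last = load_watermark(state_path)
    df = (spark.read.format("jdbc")
          .option("url", oracle_url)
          .option("dbtable", incremental_query(table, key_col, last))
          .load())
    df.cache()  # ask Spark to reuse the rows for both the max() and the write

    new_max = df.agg(F.max(key_col)).collect()[0][0]
    if new_max is None:
        return last  # no new rows this week; leave the watermark unchanged

    (df.write.format("jdbc")
       .option("url", teradata_url)
       .option("dbtable", target)
       .mode("append")
       .save())

    # Advance the watermark only after the append has succeeded.
    save_watermark(state_path, int(new_max))
    return int(new_max)
```

On the first run the watermark defaults to 0, so the whole table is loaded
(assuming all key values are positive); later runs pick up only rows with a
larger key. The weekly flat files could be handled in a similar way, for
example by recording the names of files already processed in the same state
file, though that part is not shown here.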
