Hi,

I have been using Sqoop 1.4.2 for the past few days on a Hadoop cluster of 10
nodes.
According to section 9.4, "Export & Transactions", of the Sqoop documentation,
the export operation is not atomic in the database because it creates separate
transactions to insert records.

For example, if one map task fails to export its records while the others
succeed, the database tables are left with partial, incomplete results.

I wrote a bash script that loads daily CSV files of 500 thousand records each
into the database. Before loading a day's CSV, the script deletes that day's
records from the database, so that if there is a problem while loading a day's
CSV, we get correct results by simply running the job again.
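
The delete-then-reload approach described above can be sketched roughly as
follows. This is only an illustration in Python with an in-memory SQLite
database, not the actual bash script: the table daily_records, its columns,
the function reload_day and the sample date are all hypothetical. Running
both steps inside one transaction is also an assumption of this sketch.

```python
import csv
import io
import sqlite3


def reload_day(conn, day, csv_text):
    """Delete all rows for `day`, then insert the rows from that day's CSV.

    Both steps run in one transaction (an assumption of this sketch), so a
    failure leaves the table unchanged and the job can simply be run again.
    """
    # Hypothetical CSV layout: two columns, id and value.
    rows = [(day, r[0], r[1]) for r in csv.reader(io.StringIO(csv_text)) if r]
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM daily_records WHERE day = ?", (day,))
        conn.executemany(
            "INSERT INTO daily_records (day, id, value) VALUES (?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_records (day TEXT, id TEXT, value TEXT)")

sample = "1,a\n2,b\n3,c\n"               # made-up sample data
reload_day(conn, "2013-01-15", sample)
reload_day(conn, "2013-01-15", sample)   # re-run of the same day
count = conn.execute("SELECT COUNT(*) FROM daily_records").fetchone()[0]
print(count)  # 3, not 6: the re-run replaced the day's rows
```

Because the rows for the day are removed before every load, running the job
a second time for the same day does not create duplicates.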

Can we achieve the same behaviour in Sqoop, so that if some map tasks of a
Sqoop job fail, we still end up with correct and complete (no duplicates)
records in the database?


Thanks
