Hi all, I am fetching data from Netezza using GenerateTableFetch -> RPG -> ExecuteSQL -> PutHDFS. It works fine most of the time, but for some tables with more than a million rows, it fetches duplicate rows.
The Partition Size varies from 3 million to 30 million depending on the table size. For a table with ~300 million rows, the Partition Size is 30 million, and similarly for other tables.

For example:
Table: abc
Netezza count: 3265421
Hive count: 3265421
Duplicate rows in Hive: 97070

Is this the expected behaviour when fetching from Netezza?

Regards,
Mohit
