Hi all,

I am fetching data from Netezza using GenerateTableFetch -> RPG ->
ExecuteSQL -> PutHDFS. It works fine most of the time, but for
some tables with more than a million rows, it fetches duplicate rows.

 

The Partition Size varies from 3 million to 30 million depending on the
table size. For example, for a table with ~300 million rows the Partition
Size is 30 million, and similarly for other tables.
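
To make the paging arithmetic concrete, here is a minimal sketch of how many pages such a Partition Size implies. The function name page_offsets and the (offset, limit) representation are hypothetical and only illustrate the arithmetic; this is not taken from the processor itself.

```python
import math

def page_offsets(row_count, partition_size):
    """Return hypothetical (offset, limit) pairs covering row_count rows."""
    # Number of pages needed, rounding up so the last partial page is included.
    pages = math.ceil(row_count / partition_size)
    return [(i * partition_size, partition_size) for i in range(pages)]

# A table with ~300 million rows and a Partition Size of 30 million.
offsets = page_offsets(300_000_000, 30_000_000)
print(len(offsets))
```

Each pair corresponds to one fetch. Rows are only guaranteed to land in exactly one page if every fetch sees the rows in the same order.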

 

For example:

 

Table : abc

Netezza count -  3265421

Hive Count - 3265421

Duplicate rows in Hive -  97070
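
For reference, a minimal sketch of how a duplicate figure like the one above can be derived from a list of row keys. It assumes "duplicate rows" means extra copies beyond the first occurrence of each row; the function name duplicate_summary and the sample keys are hypothetical.

```python
from collections import Counter

def duplicate_summary(keys):
    """Summarise total rows, distinct rows, and extra copies for a list of row keys."""
    counts = Counter(keys)
    total = len(keys)
    distinct = len(counts)
    # Extra copies are the rows left over once each distinct key is counted once.
    return {"total": total, "distinct": distinct, "extra_copies": total - distinct}

# Hypothetical sample: six rows, three distinct keys.
print(duplicate_summary(["a", "b", "b", "c", "c", "c"]))
```

Under that assumption, if the total matches the source count while extra copies exist, the number of distinct rows is lower than the source count, so some source rows may be missing as well.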

 

Is this the expected behaviour while fetching from Netezza?

 

Regards,

Mohit
