Hi everyone,

Let's say I have a Hive table in two datacenters. The table format can be
TextFile or ORC.
A Sqoop job runs every day and adds data to the table.

Each datacenter has its own instance of the Sqoop job.
In the ideal case, the data in these two tables should be the same.

"The same" means that the row counts are equal and the tables contain the
same rows. However, row order can differ. The number of files and their
sizes can also differ.

Is there a way to scan each table and get a hash code that can be used to
compare the two tables?
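
To make the idea concrete, here is a rough Python sketch of the kind of
fingerprint I have in mind: the row count plus a sum of per-row hashes, so
that row order and file layout do not matter. The function name
table_fingerprint, the "\x01" separator, and the "\N" rendering of NULL are
assumptions made only for this illustration; this is not something Hive
provides.

```python
import hashlib

MASK = (1 << 64) - 1  # keep the running checksum within 64 bits


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of rows (tuples of values)."""
    count = 0
    checksum = 0
    for row in rows:
        # Encode the row deterministically. The separator and NULL marker are
        # illustrative choices; a value containing the separator could collide.
        encoded = "\x01".join("\\N" if v is None else str(v) for v in row)
        digest = hashlib.sha256(encoded.encode("utf-8")).digest()
        # Addition is commutative, so row order does not affect the result.
        # Unlike XOR, a row that appears twice does not cancel itself out.
        checksum = (checksum + int.from_bytes(digest[:8], "big")) & MASK
        count += 1
    return count, checksum
```

Each datacenter would compute its own (row_count, checksum) pair, and the
two pairs would then be compared. Equal pairs would only suggest, not prove,
that the tables hold the same rows, since hash collisions are possible.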

Thank you
Alex
