Hi all,

I want to replicate my Hive metadata to another location, but I found
that a large portion of it looks like garbage.

As I understand it, the Hive metastore uses the 'Storage Descriptor' to
keep the relationship between tables and columns. However, the number of
distinct 'SD_ID' values differs greatly between the 'TBLS' and 'COLUMNS'
tables, as shown below:

mysql> select count(distinct SD_ID) from tbls;
+-----------------------+
| count(distinct SD_ID) |
+-----------------------+
|                   764 |
+-----------------------+
1 row in set (0.00 sec)

mysql> select count(distinct SD_ID) from columns;
+-----------------------+
| count(distinct SD_ID) |
+-----------------------+
|                  5219 |
+-----------------------+
1 row in set (0.05 sec)

Does that mean the 'columns' table contains garbage data? If so, how is
it generated?
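
For reference, a query along these lines would count the SD_IDs in
'columns' that no row in 'tbls' references. It is only a sketch and
assumes nothing beyond the table and column names shown above:

```sql
-- Count the distinct SD_ID values that appear in columns but are not
-- referenced by any row in tbls (anti-join via LEFT JOIN ... IS NULL).
SELECT COUNT(DISTINCT c.SD_ID) AS unreferenced_sd_ids
FROM columns c
LEFT JOIN tbls t ON t.SD_ID = c.SD_ID
WHERE t.SD_ID IS NULL;
```

This only shows which SD_IDs are not referenced from 'tbls'. It does not
by itself tell whether some other metastore table references them.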

-- 
Best Regards,
Ted Xu
