Hi all, I want to replicate the Hive metadata to another location, but I found that a large portion of it looks like garbage.
As I understand it, the Hive metastore uses the 'Storage Descriptor' to keep the relationship between tables and columns. But the distinct 'SD_ID' counts in tables 'TBLS' and 'COLUMNS' do not match, as shown below:

mysql> select count(distinct SD_ID) from tbls;
+-----------------------+
| count(distinct SD_ID) |
+-----------------------+
|                   764 |
+-----------------------+
1 row in set (0.00 sec)

mysql> select count(distinct SD_ID) from columns;
+-----------------------+
| count(distinct SD_ID) |
+-----------------------+
|                  5219 |
+-----------------------+
1 row in set (0.05 sec)

Does that mean the table 'columns' contains garbage data? If so, how is it generated?

--
Best Regards,
Ted Xu
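One way to check whether the extra SD_IDs are really orphaned, rather than just counting them, might be an anti-join between 'columns' and 'tbls'. The sketch below is untested against this metastore. The 'partitions' table and its SD_ID column are an assumption that does not appear in the output above: if partitions carry their own storage descriptors, they could account for some of the difference.

```sql
-- SD_IDs that appear in columns but are not referenced by any row in tbls.
SELECT COUNT(DISTINCT c.SD_ID)
FROM columns c
LEFT JOIN tbls t ON c.SD_ID = t.SD_ID
WHERE t.SD_ID IS NULL;

-- Assumption: a `partitions` table with an SD_ID column exists in this schema.
-- Counts the SD_IDs in columns that are referenced by at least one partition.
SELECT COUNT(DISTINCT c.SD_ID)
FROM columns c
JOIN `partitions` p ON c.SD_ID = p.SD_ID;
```

If the two counts add up to the 5219 - 764 gap, the extra rows would not be garbage; anything left over would be the candidates worth a closer look.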
