Hi Ashish,

Thank you for your reply; that explains my problem.
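
A rough way to confirm that the extra SD_IDs in 'columns' come from
partitions rather than garbage is to look for storage descriptors that
neither a table nor a partition references. This is only a sketch; it
assumes the metastore has a 'partitions' table with an SD_ID column, as
'tbls' does:

```sql
-- Count storage descriptors in 'columns' that are referenced by neither
-- a table nor a partition (assumes partitions.SD_ID exists).
select count(distinct c.SD_ID) as orphaned_sds
from columns c
left join tbls t on t.SD_ID = c.SD_ID
left join partitions p on p.SD_ID = c.SD_ID
where t.SD_ID is null
  and p.SD_ID is null;
```

A result of 0 would suggest that every row in 'columns' still belongs to
a table or a partition.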

I also found that the columns stored for one partition are identical to the
columns stored for the other partitions of the same table. What is the
benefit of such a redundant design?

2010/5/27 Ashish Thusoo <[email protected]>

>  Do you have partitions in the table? Storage descriptors can also be
> associated with partitions.
>
> Ashish
>
>  ------------------------------
> *From:* Ted Xu [mailto:[email protected]]
> *Sent:* Wednesday, May 26, 2010 5:26 AM
> *To:* [email protected]
> *Subject:* Garbage data in metadata store?
>
> Hi all,
>
> I want to replicate the Hive metadata to another place, but I found that a
> large portion of my Hive metadata looks like garbage.
>
> As I understand it, the Hive metadata store uses a 'Storage Descriptor' to
> keep the relationship between tables and columns. But the 'SD_ID' columns in
> tables 'TBLS' and 'COLUMNS' are unbalanced in count, as shown below:
>
> mysql> select count(distinct SD_ID) from tbls;
> +-----------------------+
> | count(distinct SD_ID) |
> +-----------------------+
> |                   764 |
> +-----------------------+
> 1 row in set (0.00 sec)
>
> mysql> select count(distinct SD_ID) from columns;
> +-----------------------+
> | count(distinct SD_ID) |
> +-----------------------+
> |                  5219 |
> +-----------------------+
> 1 row in set (0.05 sec)
>
> Does that mean table 'columns' contains garbage data? If so, how is it
> generated?
>
> --
> Best Regards,
> Ted Xu
>



-- 
Best Regards,
Ted Xu
