HCatalog is definitely not designed for this purpose.  Could you explain your 
use case more fully?  Is this indexing for better query planning or faster file 
access?  If so, you might look at some of the work going on in ORC, which 
stores indices of its data in the file format itself for these purposes.  Also, 
how much data do you need to store?  At Hadoop scale, even the index alone can 
quickly overwhelm MySQL or Postgres (which is what most people use for their 
metastores) if you are keeping per-row information.  If you truly want to 
access an RDBMS as if it were an external data store, you could implement a 
HiveStorageHandler for your RDBMS.
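
To make the sizing concern concrete, here is a rough back-of-the-envelope 
sketch.  Every number in it (row count, partition count, bytes per index 
entry) is a hypothetical assumption chosen only for illustration, not a 
measurement from any real cluster or metastore.

```python
# Back-of-the-envelope index sizing. All constants are hypothetical
# assumptions for illustration; substitute your own numbers.

def index_size_bytes(entries: int, bytes_per_entry: int) -> int:
    """Raw size of an index holding one fixed-size entry per item."""
    return entries * bytes_per_entry

ROWS = 100_000_000_000   # assumed: 100 billion rows in the table
PARTITIONS = 50_000      # assumed: number of partitions in the table
BYTES_PER_ENTRY = 64     # assumed: key plus partition reference, no overhead

per_row = index_size_bytes(ROWS, BYTES_PER_ENTRY)
per_partition = index_size_bytes(PARTITIONS, BYTES_PER_ENTRY)

print(f"per-row index:       {per_row / 1e12:.1f} TB")
print(f"per-partition index: {per_partition / 1e6:.1f} MB")
```

Under these assumptions a per-row index comes to several terabytes before 
any database overhead, while one entry per partition stays in the megabyte 
range, which is why the granularity of what you keep matters so much.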

Alan.

On Jan 22, 2014, at 2:02 AM, Petter von Dolwitz (Hem) 
<petter.von.dolw...@gmail.com> wrote:

> Hi,
> 
> I have a case where I would like to extend Hive to use information from a 
> regular RDBMS. To limit the complexity of the installation I thought I could 
> piggyback on the already existing metastore.
> 
> As I understand it, HCatalog is not built for this purpose. Is there someone 
> out there who has a similar use case or has any input on how this is done, or 
> whether it should be avoided?
> 
> The use case is to look up which partitions contain certain data.
> 
> Thanks,
> Petter
> 
> 

