We have done some preliminary work on indexing, but that's not the focus 
right now and no code for it is available in the open source trunk. I 
think it's fair to say that Hive is not optimized for online processing right 
now (and we are quite some way off from columnar storage).

________________________________
From: Martin Matula [mailto:[email protected]]
Sent: Sunday, December 14, 2008 6:54 AM
To: [email protected]
Subject: OLAP with Hive

Hi,
Is Hive capable of indexing the data and storing them in a way optimized for 
querying (like a columnar database - bitmap indexes, compression, etc.)?
I need to be able to get decent response times for queries (up to a few 
seconds) over huge amounts of analytical data. Is that achievable (with an 
appropriate number of machines in a cluster)? I saw the 
serialization/deserialization of tables is pluggable. Is that the way to make 
the storage more efficient? Any existing implementation (either ready or in 
progress) that would be targeted at this? Or any hints on what I may want to 
take a look at among the things that are currently available in Hive/Hadoop?
Thanks,
Martin
