Note: I am a newbie to Hive. Can someone please answer the following questions?
1) Does Hive provide APIs (like HBase does) that can be used to retrieve data from Hive tables from a Java program? I heard somewhere that the data can be accessed with JDBC-style APIs. Is that true?

2) I don't see how to add indexes on tables. Does that mean a query such as the following will trigger a MapReduce (MR) job that scans the files on HDFS sequentially?

    hive> SELECT a.foo FROM invites a WHERE a.ds='2008-08-15';

3) Has anyone compared the performance of Hive against other NoSQL databases such as HBase or MongoDB? I understand it's not exactly an apples-to-apples comparison, but still...

Thanks.