The patch contributor of https://issues.apache.org/jira/browse/PIG-6 is a student here at our institute, though in another laboratory. If the Hive community is interested in this, I will get in touch with him to see whether he would like to make a similar contribution to Hive.

On 09-7-29 8:10 AM, "Peter Skomoroch" <[email protected]> wrote:

> +1 for Hive queries on HBase - that would be a powerful combination.
>
> On Tue, Jul 28, 2009 at 8:05 PM, Amr Awadallah <[email protected]> wrote:
>> Saurabh, I think you are better off with HBase for this kind of use, see:
>>
>> http://hadoop.apache.org/hbase/
>>
>> In a nutshell, HBase is a layer on top of HDFS which supports two things: (1)
>> quick lookups based on keys (e.g. a userid), and (2) transaction semantics at
>> the row-level (update/delete/insert values for a given key).
>>
>> Ashish, is there any way to run Hive queries on top of HBase? Pig has support
>> for that via this patch:
>>
>> https://issues.apache.org/jira/browse/PIG-6
>>
>> -- amr
>>
>>
>> Ashish Thusoo wrote:
>>> There is no update statement at this time. Since files in Hadoop cannot be
>>> updated in place, an update in Hive, though possible, would just be
>>> syntactic sugar for merging the new values into the old data in the table
>>> and then rewriting the table with the merged output. This can be achieved
>>> by doing an insert overwrite on the old table from the results of the
>>> merge, done by a left outer join on the old table and the new data staged
>>> in another table. Also note that while you are updating the table, queries
>>> currently running on the table may fail.
>>>
>>> Another option is to change your schema so that the table actually contains
>>> the changes to the row instead of the row values themselves and then change
>>> the query to take the new schema into account.
>>>
>>> Ashish
>>>
>>> ________________________________________
>>> From: Saurabh Nanda [[email protected]]
>>> Sent: Tuesday, July 28, 2009 3:41 AM
>>> To: [email protected]
>>> Subject: UPDATE statement in Hive?
>>>
>>> Is there an UPDATE statement in Hive? If not, are there any plans for adding
>>> support for it in the future?
>>>
>>> This is why I ask: I want to maintain a table which, against each user ID,
>>> stores the first visit & last visit time. This is across the entire year,
>>> not a day -- basically to understand how many visitors we got in last 1/3/6
>>> months, etc.
>>>
>>> I can add new users into a separate partition to get around the limitation
>>> of not being able to append rows to a table. However, I don't know how to
>>> update the last_visited_at column for each user?
>>>
>>> Is this best achieved by storing this table outside of Hive in a traditional
>>> RDBMS? Using JDBC, query Hive for a list of distinct visitors today and,
>>> based on that list, update the 'external' table.
>>>
>>> Saurabh.
>>> --
>>> http://nandz.blogspot.com
>>> http://foodieforlife.blogspot.com
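
For readers following the thread, the merge-and-rewrite approach Ashish describes (left outer join of the old table against the staged new data, then overwrite the old table with the result) can be sketched roughly as below. This is only an illustration in plain Python, not HiveQL and not anything from Hive itself: the names `merge_visits`, `old_table`, and `staged_today`, the sample rows, and the ISO date strings are all assumptions made up for the example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table layout: user_id -> (first_visit, last_visit).
# Dates are ISO strings, so plain string comparison orders them correctly.
Visits = Dict[str, Tuple[str, str]]


def merge_visits(old: Visits, staged: Dict[str, str]) -> Visits:
    """Mimic a left outer join of `old` against `staged`.

    Every row of `old` is kept; a staged visit only replaces last_visit
    when the user matches and the staged time is later.
    """
    merged: Visits = {}
    for user_id, (first_visit, last_visit) in old.items():
        new_visit: Optional[str] = staged.get(user_id)  # None when no match
        if new_visit is not None and new_visit > last_visit:
            last_visit = new_visit
        merged[user_id] = (first_visit, last_visit)
    return merged


# Made-up sample data for illustration only.
old_table: Visits = {
    "u1": ("2009-01-05", "2009-06-30"),
    "u2": ("2009-03-12", "2009-07-01"),
}
staged_today = {"u1": "2009-07-28", "u3": "2009-07-28"}

# The "insert overwrite" step: the old table is replaced by the merged output.
old_table = merge_visits(old_table, staged_today)
print(old_table)
```

Because the join is driven from the old table, a user who appears only in the staged data (`u3` here) is not carried into the result; in Saurabh's setup such new users would be added separately, for example through the extra partition he mentions.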
