I talked with Samuel Guo, and I am sure he will work on it soon.

On 09-7-29 10:15 AM, "Ashish Thusoo" <[email protected]> wrote:
> That would be great, Yongqiang.
>
> Amr, we don't have that kind of support but would love to add it.
>
> Ashish
>
>
> From: He Yongqiang [mailto:[email protected]]
> Sent: Tuesday, July 28, 2009 7:03 PM
> To: [email protected]
> Subject: Re: UPDATE statement in Hive?
>
> The patch contributor of https://issues.apache.org/jira/browse/PIG-6 is a
> student here in our institute, but in another laboratory.
> If Hive is interested in this, I will get in touch with him to see if he would
> like to do a similar contribution for Hive.
>
> On 09-7-29 8:10 AM, "Peter Skomoroch" <[email protected]> wrote:
>
>> +1 for Hive queries on HBase - that would be a powerful combination.
>>
>> On Tue, Jul 28, 2009 at 8:05 PM, Amr Awadallah <[email protected]> wrote:
>>
>>> Saurabh, I think you are better off with HBase for this kind of use, see:
>>>
>>> http://hadoop.apache.org/hbase/
>>>
>>> In a nutshell, HBase is a layer on top of HDFS which supports two things:
>>> (1) quick lookups based on keys (e.g. a userid), and (2) transaction
>>> semantics at the row-level (update/delete/insert values for a given key).
>>>
>>> Ashish, is there any way to run Hive queries on top of HBase? Pig has
>>> support for that via this patch:
>>>
>>> https://issues.apache.org/jira/browse/PIG-6
>>>
>>> -- amr
>>>
>>>
>>> Ashish Thusoo wrote:
>>>
>>>> There is no update statement at this time. Since there is no update of a
>>>> file in Hadoop, an update in Hive, though possible, would just be syntactic
>>>> sugar for merging the new values into the old data in the table and then
>>>> rewriting the table with the merged output. This can be achieved by doing
>>>> an insert overwrite on the old table from the results of the merge done by
>>>> a left outer join on the old table and the new data staged in another
>>>> table. Also note that while you are updating the table, queries currently
>>>> running on the table may fail.
>>>>
>>>> Another option is to change your schema so that the table actually
>>>> contains the changes to the row instead of the row values themselves and
>>>> then change the query to take the new schema into account.
>>>>
>>>> Ashish
>>>>
>>>> ________________________________________
>>>> From: Saurabh Nanda [[email protected]]
>>>> Sent: Tuesday, July 28, 2009 3:41 AM
>>>> To: [email protected]
>>>> Subject: UPDATE statement in Hive?
>>>>
>>>> Is there an UPDATE statement in Hive? If not, are there any plans for
>>>> adding support for it in the future?
>>>>
>>>> This is why I ask: I want to maintain a table which, against each user ID,
>>>> stores the first visit & last visit time. This is across the entire year,
>>>> not a day -- basically to understand how many visitors we got in the last
>>>> 1/3/6 months, etc.
>>>>
>>>> I can add new users into a separate partition to get around the limitation
>>>> of not being able to append rows to a table. However, I don't know how to
>>>> update the last_visited_at column for each user.
>>>>
>>>> Is this best achieved by storing this table outside of Hive in a
>>>> traditional RDBMS? Using JDBC, query Hive for a list of distinct visitors
>>>> today and, based on that list, update the 'external' table.
>>>>
>>>> Saurabh.
>>>> --
>>>> http://nandz.blogspot.com
>>>> http://foodieforlife.blogspot.com
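
For reference, the two workarounds Ashish describes in the thread above (merge the staged data into the old table and rewrite it, or store the changes and aggregate at query time) can be sketched roughly as follows. This is only an illustration, not Hive code: it uses Python's built-in sqlite3 as a stand-in, and the table names user_visits, staged_visits and visit_log are hypothetical (only last_visited_at comes from the thread). SQLite has no INSERT OVERWRITE, so a DELETE followed by an INSERT inside one transaction plays that role here.

```python
import sqlite3


def create_tables(conn):
    """Create the hypothetical tables used in this sketch."""
    conn.execute("CREATE TABLE user_visits ("
                 "user_id INTEGER, first_visited_at TEXT, last_visited_at TEXT)")
    # Assumption: staged_visits holds at most one row per user.
    conn.execute("CREATE TABLE staged_visits ("
                 "user_id INTEGER, last_visited_at TEXT)")
    # Append-only log for the second option (store changes, not row values).
    conn.execute("CREATE TABLE visit_log (user_id INTEGER, visited_at TEXT)")


def merge_last_visits(conn):
    """Option 1: merge staged data into the old table, then rewrite the table."""
    with conn:  # commits on success, rolls back if an exception is raised
        # Merge step: the left outer join keeps every old row and picks the
        # staged timestamp only when it is newer. Users that exist only in
        # staged_visits are NOT added by this join.
        conn.execute("""
            CREATE TEMP TABLE merged AS
            SELECT o.user_id,
                   o.first_visited_at,
                   CASE WHEN s.last_visited_at > o.last_visited_at
                        THEN s.last_visited_at
                        ELSE o.last_visited_at
                   END AS last_visited_at
            FROM user_visits o
            LEFT OUTER JOIN staged_visits s ON s.user_id = o.user_id
        """)
        # Rewrite step: stand-in for Hive's INSERT OVERWRITE on the old table.
        conn.execute("DELETE FROM user_visits")
        conn.execute("INSERT INTO user_visits SELECT * FROM merged")
        conn.execute("DROP TABLE merged")


def visit_ranges(conn):
    """Option 2: derive first/last visit per user from the append-only log."""
    cursor = conn.execute("""
        SELECT user_id, MIN(visited_at), MAX(visited_at)
        FROM visit_log
        GROUP BY user_id
        ORDER BY user_id
    """)
    return cursor.fetchall()
```

With the second option nothing is ever updated in place: each day's visits are appended to visit_log, and the first and last visit times are computed when the query runs. The timestamps are assumed to be ISO-formatted strings so that plain string comparison orders them correctly.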
