Re: UPDATE statement in Hive?

Abhijit Pol Tue, 28 Jul 2009 21:09:44 -0700

+1 if more support is needed for this feature. I think this would be a very
powerful and useful addition to Hive.


2009/7/28 He Yongqiang <[email protected]>:
> Talked with Samuel Guo, and I am sure he will work on it soon.
>
> On 09-7-29 10:15 AM, "Ashish Thusoo" <[email protected]> wrote:
>
> That would be great, Yongqiang.
>
> Amr, we don't have that kind of support but would love to add it.
>
> Ashish
>
> ________________________________
> From: He Yongqiang [mailto:[email protected]]
> Sent: Tuesday, July 28, 2009 7:03 PM
> To: [email protected]
> Subject: Re: UPDATE statement in Hive?
>
> The patch contributor of https://issues.apache.org/jira/browse/PIG-6 is a
> student here at our institute, though in a different laboratory.
> If Hive is interested in this, I will get in touch with him to see if he
> would like to make a similar contribution to Hive.
>
> On 09-7-29 8:10 AM, "Peter Skomoroch" <[email protected]> wrote:
>
> +1 for Hive queries on HBase - that would be a powerful combination.
>
> On Tue, Jul 28, 2009 at 8:05 PM, Amr Awadallah <[email protected]> wrote:
>
> Saurabh, I think you are better off with HBase for this kind of use; see:
>
> http://hadoop.apache.org/hbase/
>
> In a nutshell, HBase is a layer on top of HDFS which supports two things:
> (1) quick lookups based on keys (e.g. a userid), and (2) transaction
> semantics at the row level (update/delete/insert values for a given key).
>
> Ashish, is there any way to run Hive queries on top of HBase? Pig has
> support for that via this patch:
>
> https://issues.apache.org/jira/browse/PIG-6
>
> -- amr
>
>
> Ashish Thusoo wrote:
>
>
> There is no UPDATE statement at this time. Since files in Hadoop cannot be
> updated in place, an update in Hive, though possible, would just be
> syntactic sugar for merging the new values into the old data in the table
> and then rewriting the table with the merged output. This can be achieved
> by doing an INSERT OVERWRITE on the old table from the results of a merge
> done by a left outer join between the old table and the new data staged in
> another table. Also note that while you are updating the table, queries
> currently running on it may fail.
>
> Another option is to change your schema so that the table actually contains
> the changes to the row instead of the row values themselves, and then change
> the query to take the new schema into account.
>
> Ashish
>
> ________________________________________
> From: Saurabh Nanda [[email protected]]
> Sent: Tuesday, July 28, 2009 3:41 AM
> To: [email protected]
> Subject: UPDATE statement in Hive?
>
> Is there an UPDATE statement in Hive? If not, are there any plans for
> adding support for it in the future?
>
> This is why I ask: I want to maintain a table which, against each user ID,
> stores the first visit & last visit time. This is across the entire year,
> not a day -- basically to understand how many visitors we got in the last
> 1/3/6 months, etc.
>
> I can add new users into a separate partition to get around the limitation
> of not being able to append rows to a table. However, I don't know how to
> update the last_visited_at column for each user.
>
> Is this best achieved by storing this table outside of Hive in a
> traditional RDBMS? That is, use JDBC to query Hive for a list of distinct
> visitors today and, based on that list, update the 'external' table.
>
> Saurabh.
> --
> http://nandz.blogspot.com
> http://foodieforlife.blogspot.com
>

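Ashish's second suggestion in the quoted thread, a table that stores the changes to a row rather than the row values, could look roughly like the Python sketch below for Saurabh's use case. It assumes an append-only list of (user_id, visit_time) events, which is a hypothetical layout and not something specified in the thread, and derives first and last visit at query time, much like a GROUP BY with MIN and MAX.

```python
def summarize_visits(events):
    """Derive (first_visit, last_visit) per user from append-only events.

    `events` is an iterable of (user_id, visit_time) pairs; visit_time is
    assumed to be an ISO-formatted string so that min/max follow time order.
    """
    summary = {}
    for user_id, visit_time in events:
        # Start from this event if the user has not been seen yet.
        first, last = summary.get(user_id, (visit_time, visit_time))
        summary[user_id] = (min(first, visit_time), max(last, visit_time))
    return summary
```

With this layout nothing is ever updated in place: each day's visits are simply appended, and the cost moves to the query that aggregates them.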