On Sun, Dec 9, 2012 at 10:32 PM, Bertrand Dechoux <[email protected]> wrote:
> I will reopen the subject a bit.
>
> I don't know the details of the RCFile implementation in Hive, but if the
> data were stored that way it is theoretically possible to add the column
> data even without append and without rewriting the whole file. Does anyone
> have more information on that matter?
>
> Regards
>
> Bertrand
>
>
> On Mon, Dec 10, 2012 at 2:02 AM, <[email protected]> wrote:
>
>> Hello Shreepadma,
>>
>> That's definitely very helpful. I doubted that this would be the case,
>> but I was thinking that maybe there's a way to do it using a merge task. I
>> will change my data structure to make it a bit like HBase, and I hope Hive
>> would still be the right choice for me.. it can be backed by HBase anyway
>> :). Thank you very much, your quick reply saved me a lot of time!
>>
>> Sincerely,
>> Younos
>>
>>
>> Quoting Shreepadma Venugopalan <[email protected]>:
>>
>> Hi Younos,
>>>
>>> Since HiveQL doesn't support an insert..values statement, you can't insert
>>> values into a specific column. Let's assume your table had the following
>>> structure before the alter table..add columns statement was executed,
>>>
>>> tab (a string, b bigint, c double)
>>>
>>> Furthermore, let's assume that it had 100 rows. Now, let's assume you did
>>> an alter table tab add columns (d binary). The new table structure will
>>> look like below,
>>>
>>> tab (a string, b bigint, c double, d binary)
>>>
>>> You can't insert binary data into the 100 rows that were present prior to
>>> the alter table statement by executing a HiveQL statement. HiveQL doesn't
>>> support an insert..values statement like most RDBMSs. However, you can
>>> delete the existing files and add new files that contain records
>>> corresponding to the new table structure. Alternatively, you can skip the
>>> deletion step and just add new files that correspond to the new table
>>> structure. When you execute a HiveQL query, null will be returned for
>>> those columns for which the data doesn't exist.
>>>
>>> Hope this helps.
>>>
>>> Thanks.
>>> Shreepadma
>>>
>>>
>>> On Sun, Dec 9, 2012 at 4:35 PM, <[email protected]> wrote:
>>>
>>> Hello,
>>>>
>>>> I couldn't find any example of how to populate columns that were added
>>>> to a table. How would Hive tell which row to append by each value of the
>>>> newly added columns? Does it do a column name matching?
>>>>
>>>> Sincerely,
>>>> Younos
>>>>
>>>
>>
>>
>> Best regards,
>> Younos Aboulnaga
>>
>> Masters candidate
>> David Cheriton school of computer science
>> University of Waterloo
>> http://cs.uwaterloo.ca
>>
>> E-Mail: [email protected]
>> Mobile: +1 (519) 497-5669
>>
>
>
> --
> Bertrand Dechoux
>
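
To illustrate Shreepadma's point that rows written before the alter table
statement return null for the newly added column, here is a minimal sketch in
Python. It is only a simulation and does not use Hive or any Hive API. It
assumes the data files are tab-delimited text and that the table structure is
applied when a file is read. The names read_row, OLD_SCHEMA, NEW_SCHEMA and the
sample rows are hypothetical and chosen just for the example.

```python
# Illustrative simulation only: this is not Hive code and calls no Hive API.
# Assumption: rows are stored as tab-delimited text, and the table structure
# is applied at read time rather than at write time.

OLD_SCHEMA = ["a", "b", "c"]
NEW_SCHEMA = ["a", "b", "c", "d"]  # after: alter table tab add columns (d binary)


def read_row(line, schema, delimiter="\t"):
    """Map one delimited line onto the schema, padding missing columns with None."""
    fields = line.rstrip("\n").split(delimiter)
    # Files written before the column was added have fewer fields than the schema.
    padded = fields + [None] * (len(schema) - len(fields))
    return dict(zip(schema, padded))


old_file = ["x\t1\t1.5"]        # written with the old structure (a, b, c)
new_file = ["y\t2\t2.5\tblob"]  # written with the new structure (a, b, c, d)

# Both files are read with the new table structure.
rows = [read_row(line, NEW_SCHEMA) for line in old_file + new_file]
for row in rows:
    print(row)
```

In this sketch the old row keeps its values for a, b and c and gets None for d,
while the row from the newly added file carries a value for every column. This
mirrors the "add new files that correspond to the new table structure" option
described in the reply; populating d for the old rows would still require
rewriting the files that contain them.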
