Hello Shreepadma,
That's definitely very helpful. I doubted that this would be the case,
but I was thinking that maybe there's a way to do it using a merge
task. I will change my data structure to make it a bit like HBase, and
I hope Hive would still be the right choice for me.. it can be backed
by HBase anyway :). Thank you very much, your quick reply saved me a
lot of time!
Sincerely,
Younos
Quoting Shreepadma Venugopalan <[email protected]>:
Hi Younos,
Since HiveQL doesn't support an insert..value statement, you can't insert
values into a specific column. Let's assume your table had the following
structure before the alter table..add columns statement was executed,
tab (a string, b bigint, c double)
Furthermore, let's assume that it had 100 rows. Now, let's assume you did
an alter table tab add columns (d binary). The new table structure will
look like below,
tab (a string, b bigint, c double, d binary)
You can't insert binary data into the 100 rows that were present prior to
the alter table statement by executing a HiveQL statement. HiveQL doesn't
support an insert..values statement like most RDBMSs. However, you can
delete the existing files and add new files that contain records
corresponding to the new table structure. Alternatively, you can skip the
deletion step and just add new files that correspond to the new table
structure. When you execute a HiveQL query, null will be returned for those
columns for which the data doesn't exist.
Hope this helps.
Thanks.
Shreepadma
On Sun, Dec 9, 2012 at 4:35 PM, <[email protected]> wrote:
Hello,
I couldn't find any example of how to populate columns that were added to
a table. How would Hive tell which row to append by each value of the newly
added columns? Does it do a column name matching?
Sincerely,
Younos
Best regards,
Younos Aboulnaga
Masters candidate
David Cheriton school of computer science
University of Waterloo
http://cs.uwaterloo.ca
E-Mail: [email protected]
Mobile: +1 (519) 497-5669