I updated the patch: http://issues.apache.org/jira/secure/attachment/12447307/HIVE-1364.2.patch.txt
Thanks,
Carl

On Wed, Jun 16, 2010 at 7:31 PM, John Sichi <[email protected]> wrote:

> Looks like that patch has a bug. It should not be changing
> PARTITIONS.PART_NAME, which is an indexed column. Try again, undoing that
> line of the patch first.
>
> JVS
> ________________________________________
> From: Ray Duong [[email protected]]
> Sent: Wednesday, June 16, 2010 5:24 PM
> To: [email protected]
> Subject: Re: Hive-Hbase with large number of columns
>
> Hi,
>
> I applied the patch HIVE-1364 and rebuilt the metastore. I was able to
> create an external table in Hive with a large number of columns (up to
> 4000 bytes).
>
> Now when I try to drop the external table, I get the following error
> message. Is there another file that I need to modify in order to drop
> the table?
>
> hive> drop table hbase_test;
>
> FAILED: Error in metadata: javax.jdo.JDODataStoreException: Error(s) were
> found while auto-creating/validating the datastore for classes. The errors
> are printed in the log, and are attached to this exception.
> NestedThrowables:
> com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Specified key was too
> long; max key length is 767 bytes
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask
>
> Thanks,
> -ray
>
> On Tue, Jun 15, 2010 at 4:05 PM, John Sichi <[email protected]> wrote:
>
> Column-level properties are attractive for other reasons, but I don't
> think we should do it as a workaround for underlying limits. I've noted
> in JIRA that I think a LOB would be more appropriate here.
>
> Note that while you're waiting for a permanent resolution, you can use
> ALTER TABLE on your metastore DBMS to widen the precision.
>
> JVS
>
> On Jun 15, 2010, at 3:37 PM, Ray Duong wrote:
>
> Yes, I think I might have to do that. I was trying to avoid multiple
> HBase scans with one big table.
>
> BTW, would it be better to store the SERDE properties per column rather
> than at the table level, to avoid the 767- or 4000-byte limitation?
>
> Thanks again,
> -ray
>
> On Tue, Jun 15, 2010 at 2:42 PM, Edward Capriolo <[email protected]> wrote:
>
> On Tue, Jun 15, 2010 at 5:04 PM, Ray Duong <[email protected]> wrote:
>
> Thanks for all the help.
>
> -ray
>
> On Tue, Jun 15, 2010 at 1:26 PM, Carl Steinbach <[email protected]> wrote:
>
> Hi Ray,
>
> 4000 bytes is the maximum VARCHAR size allowed on Oracle 9i/10g/11g. As
> far as I know, this is the smallest maximum VARCHAR size among the
> databases we currently try to support (MySQL, Oracle, Derby, etc.).
>
> Carl
>
> On Tue, Jun 15, 2010 at 1:15 PM, Ray Duong <[email protected]> wrote:
>
> Thanks, John/Carl.
>
> Yep, there seems to be a limit at 767 bytes, and I see the patch
> HIVE-1364 raises it to 4000 bytes. I'm using Derby; do you know if there
> is a limit beyond 4000 bytes?
>
> -ray
>
> Error:
> Caused by: ERROR 22001: A truncation error was encountered trying to shrink
> VARCHAR 'segment:ITC_10#ITC_1001,segment:CITC_10#ITC_1001,segment:ITC&' to
> length 767.
>
> On Tue, Jun 15, 2010 at 12:26 PM, Carl Steinbach <[email protected]> wrote:
>
> Hi Ray,
>
> There is currently a 767-byte size limit on SERDEPROPERTIES values (see
> http://issues.apache.org/jira/browse/HIVE-1364). It's possible that you're
> bumping into this limitation (assuming you abbreviated the column names in
> your example).
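For reference, the interim "ALTER TABLE on your metastore DBMS" workaround
JVS mentions above might look like the sketch below. The SERDE_PARAMS and
PARAM_VALUE names are taken from the INSERT statement in the error message
quoted further down; the exact metastore schema varies by Hive version, so
verify against your own database first. Per JVS's note, do not widen indexed
columns such as PARTITIONS.PART_NAME this way, since MySQL caps index keys
at 767 bytes.

  -- Sketch: widen the SERDEPROPERTIES value column in a MySQL metastore.
  ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE VARCHAR(4000);

  -- Rough Derby equivalent (Derby allows increasing a VARCHAR's length
  -- in place with SET DATA TYPE).
  ALTER TABLE SERDE_PARAMS ALTER COLUMN PARAM_VALUE SET DATA TYPE VARCHAR(4000);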
> On Tue, Jun 15, 2010 at 12:03 PM, John Sichi <[email protected]> wrote:
>
> That exception is coming from the metastore (trying to write the table
> definition). Could you dig down into the Hive logs to see if you can get
> the underlying cause?
>
> You can get the logs to spew on the console by adding "-hiveconf
> hive.root.logger=DEBUG,console" to your Hive CLI invocation.
>
> JVS
>
> On Jun 15, 2010, at 11:57 AM, Ray Duong wrote:
>
> Hi,
>
> I'm trying to map an HBase table in Hive that contains a large number of
> columns. Since HBase is designed to be a wide table, does the Hive/HBase
> integration have any set limitation on the number of columns it can map
> in one table? I seem to hit a limit at 10 columns.
>
> Thanks,
> -ray
>
> create external table hbase_t1
> (
>   key string,
>   f1_a string,
>   f2_a string,
>   f1_b string,
>   f2_b string,
>   ...
>   f1_m string,
>   f2_m string
> )
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" =
> ":key,f1:a,f2:a,f1:b,f2:b,f1:c,f2:c,f1:d,f2:d,f1:e,f2:e,f1:f,f2:f,f1:g,f2:g,f1:h,f2:h,f1:i,f2:i,f1:j,f2:j,f1:k,f2:k,f1:l,f2:l,f1:m,f2:m")
> TBLPROPERTIES ("hbase.table.name" = "t1");
>
> Error Message:
>
> FAILED: Error in metadata: javax.jdo.JDODataStoreException: Put request
> failed : INSERT INTO `SERDE_PARAMS` (`PARAM_VALUE`,`SERDE_ID`,`PARAM_KEY`)
> VALUES (?,?,?)
> NestedThrowables:
> org.datanucleus.store.mapped.exceptions.MappedDatastoreException: INSERT
> INTO `SERDE_PARAMS` (`PARAM_VALUE`,`SERDE_ID`,`PARAM_KEY`) VALUES (?,?,?)
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask
>
> You have probably thought of this, but in the short term you can create
> two vertically partitioned tables and do a 1-to-1 join on their key.
>
> Edward
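A sketch of the two-table workaround Edward describes: map each column
family (or any subset of columns) into its own external table over the same
HBase table, then reassemble rows with a 1-to-1 join on the row key. The
table names and the shortened column lists below are hypothetical.

  create external table hbase_t1_f1 (
    key string,
    a string,
    b string
  )
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f1:a,f1:b")
  TBLPROPERTIES ("hbase.table.name" = "t1");

  create external table hbase_t1_f2 (
    key string,
    a string,
    b string
  )
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f2:a,f2:b")
  TBLPROPERTIES ("hbase.table.name" = "t1");

  -- 1-to-1 join on the shared row key to reassemble the wide row.
  SELECT t1.key, t1.a AS f1_a, t1.b AS f1_b, t2.a AS f2_a, t2.b AS f2_b
  FROM hbase_t1_f1 t1
  JOIN hbase_t1_f2 t2 ON (t1.key = t2.key);

Splitting the mapping this way keeps each table's hbase.columns.mapping
string short, so each SERDEPROPERTIES value stays under the metastore's
767-byte limit, at the cost of the extra join (and extra HBase scans).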
