Column-level properties are attractive for other reasons, but I don't think we should adopt them as a workaround for underlying limits. I've noted in JIRA that I think a LOB would be more appropriate here.
Note that while you're waiting for a permanent resolution, you can use ALTER TABLE on your metastore DBMS to widen the precision (a sketch appears at the end of this thread).

JVS

On Jun 15, 2010, at 3:37 PM, Ray Duong wrote:

Yes, I think I might have to do that. I was trying to avoid multiple HBase scans with one big table.

BTW, would it be better to store the SERDE properties for each column, rather than at the table level, to avoid the 767- or 4000-byte limitation?

Thanks again,
-ray

On Tue, Jun 15, 2010 at 2:42 PM, Edward Capriolo <[email protected]> wrote:

On Tue, Jun 15, 2010 at 5:04 PM, Ray Duong <[email protected]> wrote:

Thanks for all the help.
-ray

On Tue, Jun 15, 2010 at 1:26 PM, Carl Steinbach <[email protected]> wrote:

Hi Ray,

4000 bytes is the maximum VARCHAR size allowed on Oracle 9i/10g/11g. As far as I know, this is the smallest maximum VARCHAR size among the databases we currently try to support (MySQL, Oracle, Derby, etc.).

Carl

On Tue, Jun 15, 2010 at 1:15 PM, Ray Duong <[email protected]> wrote:

Thanks, John/Carl.

Yep, there seems to be a limit at the 767-byte size, so I see the patch in HIVE-1364 raises it to 4000 bytes. I'm using Derby; do you know if there is a limit beyond 4000 bytes?

-ray

Error:

    Caused by: ERROR 22001: A truncation error was encountered trying to shrink VARCHAR
    'segment:ITC_10#ITC_1001,segment:CITC_10#ITC_1001,segment:ITC&' to length 767.

On Tue, Jun 15, 2010 at 12:26 PM, Carl Steinbach <[email protected]> wrote:

Hi Ray,

There is currently a 767-byte size limit on SERDEPROPERTIES values (see http://issues.apache.org/jira/browse/HIVE-1364). It's possible that you're bumping into this limitation (assuming you abbreviated the column names in your example).

On Tue, Jun 15, 2010 at 12:03 PM, John Sichi <[email protected]> wrote:

That exception is coming from the metastore (trying to write the table definition). Could you dig down into the Hive logs to see if you can find the underlying cause? You can get the logs to spew on the console by adding "-hiveconf hive.root.logger=DEBUG,console" to your Hive CLI invocation.

JVS

On Jun 15, 2010, at 11:57 AM, Ray Duong wrote:

Hi,

I'm trying to map an HBase table in Hive that contains a large number of columns. Since HBase is designed for wide tables, does the Hive/HBase integration have any set limitation on the number of columns it can map in one table? I seem to hit a limit at 10 columns.

Thanks,
-ray

    create external table hbase_t1 (
      key string,
      f1_a string,
      f2_a string,
      f1_b string,
      f2_b string,
      ...
      ...
      f1_m string,
      f2_m string
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" =
      ":key,f1:a,f2:a,f1:b,f2:b,f1:c,f2:c,f1:d,f2:d,f1:e,f2:e,f1:f,f2:f,f1:g,f2:g,f1:h,f2:h,f1:i,f2:i,f1:j,f2:j,f1:k,f2:k,f1:l,f2:l,f1:m,f2:m")
    TBLPROPERTIES ("hbase.table.name" = "t1");

Error message:

    FAILED: Error in metadata: javax.jdo.JDODataStoreException: Put request failed :
    INSERT INTO `SERDE_PARAMS` (`PARAM_VALUE`,`SERDE_ID`,`PARAM_KEY`) VALUES (?,?,?)
    NestedThrowables:
    org.datanucleus.store.mapped.exceptions.MappedDatastoreException:
    INSERT INTO `SERDE_PARAMS` (`PARAM_VALUE`,`SERDE_ID`,`PARAM_KEY`) VALUES (?,?,?)
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

You have probably thought of this, but in the short term you can create two vertically partitioned tables and do a 1-to-1 join on their key (sketch below).

Edward
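For anyone hitting this later, a minimal sketch of the metastore-side ALTER TABLE that JVS suggests at the top of the thread. The SERDE_PARAMS table and PARAM_VALUE column are taken from the INSERT statement in the error message above; the exact names and current size can vary with the metastore schema version, so verify them against your own schema (and back up the metastore) before running anything.

    -- Derby metastore: widen PARAM_VALUE to the 4000-byte size HIVE-1364 uses.
    -- Table/column names assumed from the error message above; confirm against
    -- your schema version before running.
    ALTER TABLE SERDE_PARAMS ALTER COLUMN PARAM_VALUE SET DATA TYPE VARCHAR(4000);

    -- Rough MySQL equivalent:
    -- ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE VARCHAR(4000);

As I understand it, the HIVE-1364 patch makes essentially this change part of the shipped schema, so the manual ALTER is only a stopgap until you upgrade.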
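And a sketch of Edward's vertical-partitioning workaround, using the table from Ray's original message. The names hbase_t1_a and hbase_t1_b are hypothetical; the idea is that each Hive table maps only a slice of the HBase columns, so every hbase.columns.mapping string stays well under the metastore limit, at the cost of the extra scan and join Ray was trying to avoid.

    -- Two narrower external tables over the same HBase table "t1"
    -- (hypothetical names; split the column families however you like).
    create external table hbase_t1_a (
      key string,
      f1_a string, f2_a string,
      f1_b string, f2_b string
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f1:a,f2:a,f1:b,f2:b")
    TBLPROPERTIES ("hbase.table.name" = "t1");

    create external table hbase_t1_b (
      key string,
      f1_c string, f2_c string,
      f1_d string, f2_d string
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f1:c,f2:c,f1:d,f2:d")
    TBLPROPERTIES ("hbase.table.name" = "t1");

    -- A 1-to-1 join on the row key reassembles the wide row.
    SELECT a.key, a.f1_a, a.f2_a, b.f1_c, b.f2_c
    FROM hbase_t1_a a
    JOIN hbase_t1_b b ON (a.key = b.key);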
