Thanks, guys. I found a workaround for the size limitation: MySQL actually supports up to 65,535 bytes for VARCHAR columns. I manually widened the column without the patch code, and the metastore no longer hits the size limit.
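Concretely, the change is a plain ALTER TABLE run against the metastore schema, something along these lines (SERDE_PARAMS.PARAM_VALUE is the column named in the insert error quoted further down; VARCHAR(20000) is only an example width, since the practical ceiling depends on the charset and MySQL's 65,535-byte row-size budget):

    -- widen the serde property value column in the MySQL metastore
    ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE VARCHAR(20000);

As John points out below, the same widening must not be applied to indexed columns such as PARTITIONS.PART_NAME, since InnoDB caps index keys at 767 bytes, which is exactly the "max key length is 767 bytes" error quoted in the thread.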
-ray

On Wed, Jun 16, 2010 at 8:02 PM, Carl Steinbach <[email protected]> wrote:

> I updated the patch:
> http://issues.apache.org/jira/secure/attachment/12447307/HIVE-1364.2.patch.txt
>
> Thanks.
>
> Carl
>
> On Wed, Jun 16, 2010 at 7:31 PM, John Sichi <[email protected]> wrote:
>
>> Looks like that patch has a bug. It should not be changing
>> PARTITIONS.PART_NAME, which is an indexed column. Try again, undoing that
>> line of the patch first.
>>
>> JVS
>> ________________________________________
>> From: Ray Duong [[email protected]]
>> Sent: Wednesday, June 16, 2010 5:24 PM
>> To: [email protected]
>> Subject: Re: Hive-Hbase with large number of columns
>>
>> Hi,
>>
>> I applied the patch HIVE-1364 and rebuilt the metastore. I was able to
>> create an external table in Hive with a large number of columns (up to
>> 4000 bytes).
>>
>> Now when I try to drop the external table I get the following error
>> message. Is there another file that I need to modify in order to drop
>> the table?
>>
>> hive> drop table hbase_test;
>>
>> FAILED: Error in metadata: javax.jdo.JDODataStoreException: Error(s) were
>> found while auto-creating/validating the datastore for classes. The errors
>> are printed in the log, and are attached to this exception.
>> NestedThrowables:
>> com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: Specified key was too
>> long; max key length is 767 bytes
>> FAILED: Execution Error, return code 1 from
>> org.apache.hadoop.hive.ql.exec.DDLTask
>>
>> Thanks,
>> -ray
>>
>> On Tue, Jun 15, 2010 at 4:05 PM, John Sichi <[email protected]> wrote:
>>
>> Column-level properties are attractive for other reasons, but I don't
>> think we should do it as a workaround for underlying limits. I've noted
>> in JIRA that I think a LOB would be more appropriate here.
>>
>> Note that while you're waiting for a permanent resolution, you can use
>> ALTER TABLE on your metastore DBMS to widen the precision.
>>
>> JVS
>>
>> On Jun 15, 2010, at 3:37 PM, Ray Duong wrote:
>>
>> Yes, I think I might have to do that. I was trying to avoid multiple
>> HBase scans with one big table.
>>
>> BTW, would it be better to store the column SERDE properties per column
>> versus at the table level, to avoid the 767 or 4000 byte limitation?
>>
>> Thanks again,
>> -ray
>>
>> On Tue, Jun 15, 2010 at 2:42 PM, Edward Capriolo <[email protected]> wrote:
>>
>> On Tue, Jun 15, 2010 at 5:04 PM, Ray Duong <[email protected]> wrote:
>>
>> Thanks for all the help.
>>
>> -ray
>>
>> On Tue, Jun 15, 2010 at 1:26 PM, Carl Steinbach <[email protected]> wrote:
>>
>> Hi Ray,
>>
>> 4000 bytes is the maximum VARCHAR size allowed on Oracle 9i/10g/11g. As
>> far as I know this is the smallest maximum VARCHAR size out of the
>> databases we currently try to support (MySQL, Oracle, Derby, etc.).
>>
>> Carl
>>
>> On Tue, Jun 15, 2010 at 1:15 PM, Ray Duong <[email protected]> wrote:
>>
>> Thanks, John/Carl,
>>
>> Yep, there does seem to be a limit at 767 bytes, so I see that the
>> HIVE-1364 patch raises it to 4000 bytes. I'm using Derby; do you know if
>> there is a limit beyond 4000 bytes?
>>
>> -ray
>>
>> Error:
>> Caused by: ERROR 22001: A truncation error was encountered trying to
>> shrink VARCHAR
>> 'segment:ITC_10#ITC_1001,segment:CITC_10#ITC_1001,segment:ITC&' to length
>> 767.
>> On Tue, Jun 15, 2010 at 12:26 PM, Carl Steinbach <[email protected]> wrote:
>>
>> Hi Ray,
>>
>> There is currently a 767 byte size limit on SERDEPROPERTIES values (see
>> http://issues.apache.org/jira/browse/HIVE-1364). It's possible that
>> you're bumping into this limitation (assuming you abbreviated the column
>> names in your example).
>>
>> On Tue, Jun 15, 2010 at 12:03 PM, John Sichi <[email protected]> wrote:
>>
>> That exception is coming from the metastore (trying to write the table
>> definition). Could you dig down into the Hive logs to see if you can get
>> the underlying cause?
>>
>> You can get the logs to spew on console by adding "-hiveconf
>> hive.root.logger=DEBUG,console" to your Hive CLI invocation.
>>
>> JVS
>>
>> On Jun 15, 2010, at 11:57 AM, Ray Duong wrote:
>>
>> Hi,
>>
>> I'm trying to map an HBase table in Hive that contains a large number of
>> columns. Since HBase is designed for wide tables, does the Hive/HBase
>> integration have any set limit on the number of columns it can map in
>> one table? I seem to hit a limit at 10 columns.
>>
>> Thanks,
>> -ray
>>
>> create external table hbase_t1
>> (
>> key string,
>> f1_a string,
>> f2_a string,
>> f1_b string,
>> f2_b string,
>> ...
>> ...
>> f1_m string,
>> f2_m string
>> )
>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> WITH SERDEPROPERTIES ("hbase.columns.mapping" =
>> ":key,f1:a,f2:a,f1:b,f2:b,f1:c,f2:c,f1:d,f2:d,f1:e,f2:e,f1:f,f2:f,f1:g,f2:g,f1:h,f2:h,f1:i,f2:i,f1:j,f2:j,f1:k,f2:k,f1:l,f2:l,f1:m,f2:m")
>> TBLPROPERTIES ("hbase.table.name" = "t1");
>>
>> Error Message:
>>
>> FAILED: Error in metadata: javax.jdo.JDODataStoreException: Put request
>> failed : INSERT INTO `SERDE_PARAMS` (`PARAM_VALUE`,`SERDE_ID`,`PARAM_KEY`)
>> VALUES (?,?,?)
>> NestedThrowables:
>> org.datanucleus.store.mapped.exceptions.MappedDatastoreException: INSERT
>> INTO `SERDE_PARAMS` (`PARAM_VALUE`,`SERDE_ID`,`PARAM_KEY`) VALUES (?,?,?)
>> FAILED: Execution Error, return code 1 from
>> org.apache.hadoop.hive.ql.exec.DDLTask
>>
>> You have probably thought of this, but in the short term you can create
>> two vertically partitioned tables and do a 1-to-1 join on their key.
>>
>> Edward
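P.S. To spell out Edward's vertical-partitioning workaround for the archives, a minimal sketch (hypothetical table and column names): each Hive external table maps only a subset of the column families of the same HBase table, which keeps each hbase.columns.mapping string short, and a join on the row key reassembles the wide row.

    create external table hbase_t1_f1 (key string, a string, b string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f1:a,f1:b")
    TBLPROPERTIES ("hbase.table.name" = "t1");

    create external table hbase_t1_f2 (key string, a string, b string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f2:a,f2:b")
    TBLPROPERTIES ("hbase.table.name" = "t1");

    -- reassemble the wide row with a 1-to-1 join on the row key
    SELECT f1.key, f1.a, f1.b, f2.a, f2.b
    FROM hbase_t1_f1 f1 JOIN hbase_t1_f2 f2 ON (f1.key = f2.key);

The tradeoff is the extra join (and the multiple HBase scans I was hoping to avoid), but it stays under the 767-byte limit without touching the metastore schema.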

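P.P.S. On John's earlier suggestion of using ALTER TABLE on the metastore DBMS to widen the precision: for anyone still on Derby, that should translate to something like the following (an untested sketch; Derby's VARCHAR tops out at 32,672 characters, so there is headroom well beyond 4000 bytes):

    -- widen the serde property value column in a Derby metastore
    ALTER TABLE SERDE_PARAMS ALTER COLUMN PARAM_VALUE SET DATA TYPE VARCHAR(32672);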