I think the official manual for hive can be accessed @ https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-string
Selva *From:* Qifan Chen [mailto:[email protected]] *Sent:* Thursday, May 26, 2016 8:55 AM *To:* [email protected] *Subject:* Re: 答复: Hive STRING and VARCHAR types To drive the support of Hive semantics, sounds like we need to know better the operations allowed on STRING and VARCHAR(n) in hive? On Thu, May 26, 2016 at 10:53 AM, Liu, Ming (Ming) <[email protected]> wrote: Agree with QiFan. I think we should map closest data type from Hive to Trafodion. Current mapping of hive STRING to Trafodion VARCHAR may not be proper. Since Hive can save up to 2G in a String column, but Trafodion VARCHAR has much smaller max size. So if there is a 2G hive string data, how can we convert it into VARCHAR? We can implicitly truncate like Impala, but that not seems good. But why I cannot find an official Hive manual that describes the max length of STRING? This seems a big semantic change, if map Hive STRING into Trafodion CLOB, and there will be no confusing anymore, since then STRING and VARCHAR are two very different types. VARCHAR(n) will be treated as n Characters. Thanks, Ming *发件人**:* Dave Birdsall [mailto:[email protected]] *发送时间:* 2016年5月26日 23:15 *收件人:* [email protected] *主题:* RE: Hive STRING and VARCHAR types But CLOB would limit what predicates and functions one could use on the column, right? *From:* Qifan Chen [mailto:[email protected]] *Sent:* Thursday, May 26, 2016 5:43 AM *To:* [email protected] *Subject:* Re: Hive STRING and VARCHAR types I wonder if we should consider the same length limit as Hive for a STRING type, which is 2GB (http://www.folkstalk.com/2011/11/data-types-in-hive.html). If so, the Trafodion mapping should be CLOB? --Qifan On Wed, May 25, 2016 at 11:49 PM, Selva Govindarajan < [email protected]> wrote: >From the Cloudera documentation *Text table considerations:* Text data files can contain values that are longer than allowed by the VARCHAR(n) length limit. Any extra trailing characters are ignored when Impala processes those values during a query Will Trafodion behave the same way? Having the maximum limit for the individual column provides the flexibility and optimal resource utilization. However, having the limit in number of bytes for String and number of characters for Varchar could be quite confusing for the user. Selva *From:* Hans Zeller [mailto:[email protected]] *Sent:* Wednesday, May 25, 2016 6:12 PM *To:* [email protected] *Subject:* Hive STRING and VARCHAR types Hi, Here is a question on Hive data types. Ming is about to add support for Hive VARCHAR data types, in addition to the existing STRING type, but we hit a question we wanted to pose to the user community. Here is a proposed type mapping from Hive to Trafodion: *Hive type* *Trafodion type* *Max # of chars* *Size in bytes* *Existing/new* *Comments* STRING VARCHAR(n BYTES) n/4 to n n existing n is determined by the HIVE_MAX_STRING_LENGTH CQD VARCHAR(m) VARCHAR(m) m 4*m proposed Is it ok if we treat STRING and VARCHAR differently? Thanks, Ming and Hans -- Regards, --Qifan -- Regards, --Qifan
