To drive the support of Hive semantics, sounds like we need to know better the operations allowed on STRING and VARCHAR(n) in hive?
On Thu, May 26, 2016 at 10:53 AM, Liu, Ming (Ming) <[email protected]> wrote: > Agree with QiFan. > > I think we should map closest data type from Hive to Trafodion. Current > mapping of hive STRING to Trafodion VARCHAR may not be proper. Since Hive > can save up to 2G in a String column, but Trafodion VARCHAR has much > smaller max size. So if there is a 2G hive string data, how can we convert > it into VARCHAR? We can implicitly truncate like Impala, but that not seems > good. > > But why I cannot find an official Hive manual that describes the max > length of STRING? > > > > This seems a big semantic change, if map Hive STRING into Trafodion CLOB, > and there will be no confusing anymore, since then STRING and VARCHAR are > two very different types. VARCHAR(n) will be treated as n Characters. > > > > Thanks, > > Ming > > > > > > *发件人:* Dave Birdsall [mailto:[email protected]] > *发送时间:* 2016年5月26日 23:15 > *收件人:* [email protected] > *主题:* RE: Hive STRING and VARCHAR types > > > > But CLOB would limit what predicates and functions one could use on the > column, right? > > > > *From:* Qifan Chen [mailto:[email protected]] > *Sent:* Thursday, May 26, 2016 5:43 AM > *To:* [email protected] > *Subject:* Re: Hive STRING and VARCHAR types > > > > I wonder if we should consider the same length limit as Hive for a STRING > type, which is 2GB ( > http://www.folkstalk.com/2011/11/data-types-in-hive.html). If so, the > Trafodion mapping should be CLOB? > > > > --Qifan > > > > On Wed, May 25, 2016 at 11:49 PM, Selva Govindarajan < > [email protected]> wrote: > > From the Cloudera documentation > > *Text table considerations:* > > Text data files can contain values that are longer than allowed by the > VARCHAR(n) length limit. Any extra trailing characters are ignored when > Impala processes those values during a query > > Will Trafodion behave the same way? Having the maximum limit for the > individual column provides the flexibility and optimal resource > utilization. However, having the limit in number of bytes for String and > number of characters for Varchar could be quite confusing for the user. > > Selva > > > > > > *From:* Hans Zeller [mailto:[email protected]] > *Sent:* Wednesday, May 25, 2016 6:12 PM > *To:* [email protected] > *Subject:* Hive STRING and VARCHAR types > > > > Hi, > > > > Here is a question on Hive data types. Ming is about to add support for > Hive VARCHAR data types, in addition to the existing STRING type, but we > hit a question we wanted to pose to the user community. Here is a proposed > type mapping from Hive to Trafodion: > > > > *Hive type* > > *Trafodion type* > > *Max # of chars* > > *Size in bytes* > > *Existing/new* > > *Comments* > > STRING > > VARCHAR(n BYTES) > > n/4 to n > > n > > existing > > n is determined by the HIVE_MAX_STRING_LENGTH CQD > > VARCHAR(m) > > VARCHAR(m) > > m > > 4*m > > proposed > > > > Is it ok if we treat STRING and VARCHAR differently? > > > > Thanks, > > > Ming and Hans > > > > > > -- > > Regards, --Qifan > > > -- Regards, --Qifan
