[ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12767678#action_12767678 ]
Zheng Shao commented on HIVE-867:
---------------------------------

If you almost always need a String parameter, we can just use "String" as the parameter type in the UDF definition. If you almost always need to return a String, we can also just return "String".

So for UDFLeft and UDFRight, we can do:
{code}
public String evaluate(String s, IntWritable r);
{code}
instead of
{code}
public Text evaluate(Text s, IntWritable r);
{code}

This will save a lot of conversions if a user does "left(right(col, 10), 3)".

The same applies to the SerDe. For example, RegexSerDe returns "String" instead of "Text", so "left(col, 3)", where col comes from a RegexSerDe table, does not need a conversion from "String" -> "Text" to pass to the Left function, and then from "Text" -> "String" inside the Left function.

Of course, the most efficient way is to do the char counting without any UTF-8 encoding/decoding (in which case we would still prefer Text, because we don't need to create new objects), but I think we can do that later unless you want to do it now.

> Add add UDFs found in mysq
> --------------------------
>
>                 Key: HIVE-867
>                 URL: https://issues.apache.org/jira/browse/HIVE-867
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>         Attachments: hive-867-1.diff, hive-867-2.diff, hive-867-3.diff,
>                      hive-867-7.diff
>
>
> Some UDF's that mysql has that hive does not.
> atan
> aes_decrypt
> aes_encrypt
> bit_and
> bit_count
> bit_length
> bit_or
> bit_xor
> char_length
> char
> character_length
> collation
> compress
> crc32
> encode
> encrypt
> format
> greatest
> in
> inet_oton
> inet_ntoa
> match
> md5
> oct
> ord
> pi
> radians
> sha1 _sha
> sign
> sleep
> truncate

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.