kgyrtkirk commented on a change in pull request #992:
URL: https://github.com/apache/hive/pull/992#discussion_r413748151
##########
File path:
ql/src/test/org/apache/hadoop/hive/ql/parse/type/TestTypeCheckProcFactory.java
##########
@@ -140,4 +147,37 @@ public void testWithNonZeroFraction() throws Exception {
}
}
+ @Test
+ public void testValidateUDFOnTypeCheck() throws Exception {
Review comment:
* this is a "@Parameterized" testclass; please don't add testcases which
are not use the parameterized nature... move these to a new testclass
* it might make sense to split the testcases into separate methods
##########
File path:
ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java
##########
@@ -789,12 +791,25 @@ protected void validateUDF(ASTNode expr, boolean
isFunction, TypeCheckCtx ctx, F
LogHelper console = new LogHelper(LOG);
+ Set<PrimitiveObjectInspector.PrimitiveCategory> unsafeConventionTyps =
Sets.newHashSet(
+ PrimitiveObjectInspector.PrimitiveCategory.STRING,
+ PrimitiveObjectInspector.PrimitiveCategory.VARCHAR,
+ PrimitiveObjectInspector.PrimitiveCategory.CHAR);
// For now, if a bigint is going to be cast to a double throw an error
or warning
- if ((oiTypeInfo0.equals(TypeInfoFactory.stringTypeInfo) &&
oiTypeInfo1.equals(TypeInfoFactory.longTypeInfo)) ||
- (oiTypeInfo0.equals(TypeInfoFactory.longTypeInfo) &&
oiTypeInfo1.equals(TypeInfoFactory.stringTypeInfo))) {
+ if ((oiTypeInfo0 instanceof PrimitiveTypeInfo &&
Review comment:
you could move the oiTypeInfo0 conditions into a method (along with the
Set) and then reuse the method 2 lines down
might make this more readable
##########
File path:
ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java
##########
@@ -789,12 +791,25 @@ protected void validateUDF(ASTNode expr, boolean
isFunction, TypeCheckCtx ctx, F
LogHelper console = new LogHelper(LOG);
+ Set<PrimitiveObjectInspector.PrimitiveCategory> unsafeConventionTyps =
Sets.newHashSet(
+ PrimitiveObjectInspector.PrimitiveCategory.STRING,
+ PrimitiveObjectInspector.PrimitiveCategory.VARCHAR,
+ PrimitiveObjectInspector.PrimitiveCategory.CHAR);
// For now, if a bigint is going to be cast to a double throw an error
or warning
- if ((oiTypeInfo0.equals(TypeInfoFactory.stringTypeInfo) &&
oiTypeInfo1.equals(TypeInfoFactory.longTypeInfo)) ||
- (oiTypeInfo0.equals(TypeInfoFactory.longTypeInfo) &&
oiTypeInfo1.equals(TypeInfoFactory.stringTypeInfo))) {
+ if ((oiTypeInfo0 instanceof PrimitiveTypeInfo &&
+
unsafeConventionTyps.contains(((PrimitiveTypeInfo)oiTypeInfo0).getPrimitiveCategory())
&&
+ oiTypeInfo1.equals(TypeInfoFactory.longTypeInfo)) || (oiTypeInfo1
instanceof PrimitiveTypeInfo &&
+
unsafeConventionTyps.contains(((PrimitiveTypeInfo)oiTypeInfo1).getPrimitiveCategory())
&&
+ oiTypeInfo0.equals(TypeInfoFactory.longTypeInfo))) {
String error = StrictChecks.checkTypeSafety(conf);
- if (error != null) throw new UDFArgumentException(error);
- console.printError("WARNING: Comparing a bigint and a string may
result in a loss of precision.");
+ if (error != null) {
+ throw new UDFArgumentException(error);
+ }
+ String type = oiTypeInfo0.getTypeName();
+ if (oiTypeInfo0.equals(TypeInfoFactory.longTypeInfo)) {
+ type = oiTypeInfo1.getTypeName();
+ }
+ console.printError("WARNING: Comparing a bigint and a " + type + "
may result in a loss of precision.");
Review comment:
I don't think the `type` variable is needed - you could just get both
types when you are generating the warning messages.
##########
File path: ql/src/test/results/clientpositive/llap/unsafe_compare.q.out
##########
@@ -0,0 +1,40 @@
+PREHOOK: query: CREATE TABLE test_a (appid1 varchar(256), appid2 char(20))
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@test_a
+POSTHOOK: query: CREATE TABLE test_a (appid1 varchar(256), appid2 char(20))
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@test_a
+PREHOOK: query: INSERT INTO test_a VALUES ('2882303761517473127',
'2882303761517473127'), ('2882303761517473276','2882303761517473276')
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+PREHOOK: Output: default@test_a
+POSTHOOK: query: INSERT INTO test_a VALUES ('2882303761517473127',
'2882303761517473127'), ('2882303761517473276','2882303761517473276')
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+POSTHOOK: Output: default@test_a
+POSTHOOK: Lineage: test_a.appid1 SCRIPT []
+POSTHOOK: Lineage: test_a.appid2 SCRIPT []
+WARNING: Comparing a bigint and a varchar(256) may result in a loss of
precision.
+PREHOOK: query: SELECT appid1 FROM test_a WHERE appid1 = 2882303761517473127
+PREHOOK: type: QUERY
+PREHOOK: Input: default@test_a
+#### A masked pattern was here ####
+POSTHOOK: query: SELECT appid1 FROM test_a WHERE appid1 = 2882303761517473127
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@test_a
+#### A masked pattern was here ####
+2882303761517473127
+2882303761517473276
Review comment:
I think these both '2882303761517473127' and '2882303761517473276'
should fit into char(20) / varchar(20)
and because of that this result leaves me a bit puzzled; is this row
expected with the `appid1 = 2882303761517473127` condition?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]