----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/35968/#review89635 -----------------------------------------------------------
.reviewboardrc <https://reviews.apache.org/r/35968/#comment142287> this file should not be in the patch ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java <https://reviews.apache.org/r/35968/#comment142288> Try to minimize the changes in existing classes. I do not think we shoul replace list of imports with * ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 4) <https://reviews.apache.org/r/35968/#comment142297> I can not find "import org.apache.commons.math3" in other classes. I'm not sure hive-exec has explicit dependency on commons-math3 jar ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 12) <https://reviews.apache.org/r/35968/#comment142289> do not use * for import ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 26) <https://reviews.apache.org/r/35968/#comment142290> it should be full stop at the end. remove trailing space ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 28) <https://reviews.apache.org/r/35968/#comment142291> I do not think we need new line in the middle of the sentence. ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 29) <https://reviews.apache.org/r/35968/#comment142292> Can you add Example? ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 32) <https://reviews.apache.org/r/35968/#comment142293> why final? ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 39) <https://reviews.apache.org/r/35968/#comment142294> add check for null and return null. In most cases Hive UDFs do not throw exception if args are null. UDF should just return null. ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 47) <https://reviews.apache.org/r/35968/#comment142295> Use class field FloatWritable and use set() method instead of creating new FloatWritable on evry wor ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 50) <https://reviews.apache.org/r/35968/#comment142296> why private? I recommend to use protected to have an ability to extend your UDF in future and create another UDF with slightly different behavious ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 60) <https://reviews.apache.org/r/35968/#comment142298> Why you cast value to float in the middle of the calculation. I recommend to use double internally and convert final result of calculation to float ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 63) <https://reviews.apache.org/r/35968/#comment142301> Probably it's better to make it protected ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 72) <https://reviews.apache.org/r/35968/#comment142300> Hive UDF should return null in case input data is null or invalid ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 81) <https://reviews.apache.org/r/35968/#comment142306> It can be top level class or at least "public static" ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 88) <https://reviews.apache.org/r/35968/#comment142304> UDF should not throw exceptions in evaluate method ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 93) <https://reviews.apache.org/r/35968/#comment142307> forgot generic. Probably it should be HashSet<String> ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 109) <https://reviews.apache.org/r/35968/#comment142303> UDF should not throw exceptions in evaluate method ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java (line 122) <https://reviews.apache.org/r/35968/#comment142302> UDF should not throw exceptions in evaluate method - Alexander Pivovarov On June 27, 2015, 8:32 p.m., Nishant Kelkar wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/35968/ > ----------------------------------------------------------- > > (Updated June 27, 2015, 8:32 p.m.) > > > Review request for hive and Alexander Pivovarov. > > > Repository: hive-git > > > Description > ------- > > 1. Added preliminary UDF code for cosine similarity. 2. Added unit tests and > integration tests. 3. Registered the UDF in the FunctionRegistry class. > > > Diffs > ----- > > .reviewboardrc abc33f91a44b76573cbba334c33417307c63956f > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java > fabc21e2092561cbf98c35a406e4ee40e71fe1de > ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCosineSimilarity.java > PRE-CREATION > ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFCosineSimilarity.java > PRE-CREATION > ql/src/test/queries/clientnegative/udf_cosine_similarity_error_1.q > PRE-CREATION > ql/src/test/queries/clientnegative/udf_cosine_similarity_error_2.q > PRE-CREATION > ql/src/test/queries/clientpositive/udf_cosine_similarity.q PRE-CREATION > ql/src/test/results/clientnegative/udf_cosine_similarity_error_1.q.out > PRE-CREATION > ql/src/test/results/clientnegative/udf_cosine_similarity_error_2.q.out > PRE-CREATION > ql/src/test/results/clientpositive/show_functions.q.out > 5de4ffcd1ace477af026b83fb7bfb8068fc192b3 > ql/src/test/results/clientpositive/udf_cosine_similarity.q.out PRE-CREATION > > Diff: https://reviews.apache.org/r/35968/diff/ > > > Testing > ------- > > > Thanks, > > Nishant Kelkar > >