[ https://issues.apache.org/jira/browse/HIVE-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-4044: ------------------------------ Attachment: HIVE-4044.HIVE-4044.HIVE-4044.D8799.1.patch sxyuan requested code review of "HIVE-4044 [jira] Add URL type". Reviewers: kevinwilfong Having a separate type for URLs would enable improvements in storage efficiency based on breaking up a URL into its components. The new type will be named "URL" and made a non-reserved keyword (see HIVE-701). The change involves adding new classes and object inspectors for representing URLs, modifying existing serdes to read and write URLs, and adding the supporting grammar. In addition, some UDFs were modified to handle the new type. TEST PLAN Added queries testing various ways of using the new type: reading/writing URLs, comparing URLs, and applying UDFs to URLs. REVISION DETAIL https://reviews.facebook.net/D8799 AFFECTED FILES metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Partition.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SkewedInfo.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/EnvironmentContext.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Schema.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/PrincipalPrivilegeSet.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SerDeInfo.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Table.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StorageDescriptor.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Database.java metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Index.java data/files/url.txt data/files/primitive_type_arrays.txt serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyURL.java serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyURLObjectInspector.java serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryUtils.java serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryURL.java serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/TypeInfoFactory.java serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/PrimitiveObjectInspector.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantURLObjectInspector.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableURLObjectInspector.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/SettableURLObjectInspector.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorFactory.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/URLObjectInspector.java serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaURLObjectInspector.java serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java serde/src/java/org/apache/hadoop/hive/serde2/io/URLWritable.java serde/src/gen/thrift/gen-py/org_apache_hadoop_hive_serde/constants.py serde/src/gen/thrift/gen-cpp/serde_constants.cpp serde/src/gen/thrift/gen-cpp/serde_constants.h serde/src/gen/thrift/gen-rb/serde_constants.rb serde/src/gen/thrift/gen-php/org/apache/hadoop/hive/serde/Types.php serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/Complex.java serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/MegaStruct.java serde/if/serde.thrift ql/src/test/results/clientpositive/url_udf.q.out ql/src/test/results/clientpositive/url.q.out ql/src/test/results/clientpositive/udf_sort_array.q.out ql/src/test/results/clientpositive/url_comparison.q.out ql/src/test/results/clientpositive/compute_stats_url.q.out ql/src/test/queries/clientpositive/url_comparison.q ql/src/test/queries/clientpositive/url_udf.q ql/src/test/queries/clientpositive/url.q ql/src/test/queries/clientpositive/udf_sort_array.q ql/src/test/queries/clientpositive/compute_stats_url.q ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToURL.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStd.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFHistogramNumeric.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCorrelation.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCovariance.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFParseUrlTuple.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStdSample.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVarianceSample.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCovarianceSample.java ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseCompare.java ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Task.java ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Stage.java ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Query.java ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Operator.java MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/21417/ To: kevinwilfong, sxyuan Cc: JIRA > Add URL type > ------------ > > Key: HIVE-4044 > URL: https://issues.apache.org/jira/browse/HIVE-4044 > Project: Hive > Issue Type: Improvement > Reporter: Samuel Yuan > Assignee: Samuel Yuan > Attachments: HIVE-4044.HIVE-4044.HIVE-4044.D8799.1.patch > > > Having a separate type for URLs would enable improvements in storage > efficiency based on breaking up a URL into its components. The new type will > be named "URL" and made a non-reserved keyword (see HIVE-701). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira