> On Aug. 16, 2017, 5:26 p.m., Ashutosh Chauhan wrote: > >
I have replied to the comments. I will update the q files and upload the new RB. > On Aug. 16, 2017, 5:26 p.m., Ashutosh Chauhan wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java > > Line 234 (original), 239 (patched) > > <https://reviews.apache.org/r/61586/diff/2/?file=1796121#file1796121line239> > > > > We should call deserialize and set estimator here and make inspector > > immutable. The problem is that this is the common path for retrieval of stats in all cases. Thus, making the inspector immutable does not seem to be an option if we do not duplicate this path (I tried to make it and we end up hitting the issue when we create/update stats for a given table). I look if it was possible somehow to make the original classes abstract, but this is not even an option. > On Aug. 16, 2017, 5:26 p.m., Ashutosh Chauhan wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java > > Line 409 (original), 414 (patched) > > <https://reviews.apache.org/r/61586/diff/2/?file=1796121#file1796121line414> > > > > Set estimator here and make inspector immutable. Same as above. > On Aug. 16, 2017, 5:26 p.m., Ashutosh Chauhan wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/cache/DateColumnStatsDataInspector.java > > Lines 35-36 (patched) > > <https://reviews.apache.org/r/61586/diff/2/?file=1796128#file1796128line35> > > > > There are other fields also in class. Any reason to override this one > > and not other which has all fields? This is the constructor of the super class. I guess it is the minimal information that you will have for a given column, i.e., not optional fields. Rest of fields are set via setter methods. > On Aug. 16, 2017, 5:26 p.m., Ashutosh Chauhan wrote: > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/cache/DateColumnStatsDataInspector.java > > Lines 94-110 (patched) > > <https://reviews.apache.org/r/61586/diff/2/?file=1796128#file1796128line94> > > > > Another (cleaner) way to handle these 2 fields is to make this class > > immutable and than at construction time set both estimator and bit vector. > > That way we never have to worry about whether bit vector is set or > > estimator. > > Since this class is only used in metastore to hold stats data for > > planning, having it immutable makes sense.. This is what I tried to do, but found the issue described above. Only way to fix this would be to create to paths to retrieve column stats for a given table, but I am not sure that is a better solution, since we will have two paths to maintain, update, etc. > On Aug. 16, 2017, 5:26 p.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java > > Line 258 (original), 258 (patched) > > <https://reviews.apache.org/r/61586/diff/2/?file=1796140#file1796140line265> > > > > Any reason for this change. At bit vector compute time, we shall > > continue to use *ColumnStatsData classes and not inspector ones. Actually we might hit the mergeColStats path from here too. This is ClassCastException I get if I do not change it here: ERROR [7c5ea39b-9495-4e95-a787-dd22efb8567e main] metastore.RetryingHMSHandler: java.lang.ClassCastException: org.apache.hadoop.hive.metastore.api.LongColumnStatsData cannot be cast to org.apache.hadoop.hive.metastore.columnstats.cache.LongColumnStatsDataInspector at org.apache.hadoop.hive.metastore.columnstats.merge.LongColumnStatsMerger.merge(LongColumnStatsMerger.java:30) at org.apache.hadoop.hive.metastore.MetaStoreUtils.mergeColStats(MetaStoreUtils.java:1932) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7012) at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy37.set_aggr_stats_for(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.setPartitionColumnStatistics(HiveMetaStoreClient.java:1727) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.setPartitionColumnStatistics(SessionHiveMetaStoreClient.java:374) at sun.reflect.GeneratedMethodAccessor45.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) at com.sun.proxy.$Proxy38.setPartitionColumnStatistics(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:3911) at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistColumnStats(ColumnStatsTask.java:425) at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(ColumnStatsTask.java:436) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2172) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1831) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1548) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1303) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1293) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:342) ... > On Aug. 16, 2017, 5:26 p.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java > > Line 104 (original), 104 (patched) > > <https://reviews.apache.org/r/61586/diff/2/?file=1796141#file1796141line110> > > > > This change is not needed. As above. - Jesús ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61586/#review183040 ----------------------------------------------------------- On Aug. 12, 2017, 12:16 a.m., Jesús Camacho Rodríguez wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/61586/ > ----------------------------------------------------------- > > (Updated Aug. 12, 2017, 12:16 a.m.) > > > Review request for hive and Ashutosh Chauhan. > > > Bugs: HIVE-17286 > https://issues.apache.org/jira/browse/HIVE-17286 > > > Repository: hive-git > > > Description > ------- > > HIVE-17286-2 > > > Diffs > ----- > > common/src/java/org/apache/hadoop/hive/common/ndv/FMSketch.java > 160ce663ba5068ad029d51f431b87a40f97341a5 > > common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimator.java > 4517b694ee38fd02ff32430ef24720d91c092f3a > > common/src/java/org/apache/hadoop/hive/common/ndv/NumDistinctValueEstimatorFactory.java > 6a29859df56f7d41b87d95242367f6ef401b4060 > common/src/java/org/apache/hadoop/hive/common/ndv/fm/FMSketchUtils.java > b6f7fdda0c9363516f610d5280c3c4e0821cba09 > common/src/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java > 182560afbe6a8b9e2ac10591a520603bac0eb8d5 > > common/src/test/org/apache/hadoop/hive/common/ndv/fm/TestFMSketchSerialization.java > 74fdf58d2d3640a9e923e3dbd96f0704ba3b5f35 > metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java > b3274ca6ddd5377187190c896cb058e8686f74c2 > > metastore/src/java/org/apache/hadoop/hive/metastore/StatObjectConverter.java > d53ea4c5b2f294bbd4d530ce06a9a3fd1e640bde > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/ColumnStatsAggregatorFactory.java > 173e06fe8e6b3808dfe76c8e7266472ff822e186 > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java > 04a1eb5d8933eb6644f992c5b742b94a9fbaa22b > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DecimalColumnStatsAggregator.java > d220e7fc86e28cb6e10ddb3679be3264470f7ebe > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DoubleColumnStatsAggregator.java > 1b44dd946f99daf65b91c824224d87f9a24de384 > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/LongColumnStatsAggregator.java > 802ad1a1b185fb619aa07c818107a057dc96ba2c > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java > e1a781fde977fcd9e5bba570468a48cda4c69611 > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/cache/DateColumnStatsDataInspector.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/cache/DecimalColumnStatsDataInspector.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/cache/DoubleColumnStatsDataInspector.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/cache/LongColumnStatsDataInspector.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/cache/StringColumnStatsDataInspector.java > PRE-CREATION > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/ColumnStatsMergerFactory.java > 0ce1847d1cbc1007ebd0bb2bf665659479e9e626 > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DateColumnStatsMerger.java > 2542a00d361b6b2be532b9288b5958c680fa40fa > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DecimalColumnStatsMerger.java > 4e8e1297585c18edc40fa71a3e33c0310b99750a > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/DoubleColumnStatsMerger.java > 4ef5c39d1c107a329023f38769b60045b7499fb4 > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/LongColumnStatsMerger.java > acf7f03c72adb64696f5d7c3e0cd1bb64e34f7fd > > metastore/src/java/org/apache/hadoop/hive/metastore/columnstats/merge/StringColumnStatsMerger.java > b3cd33c671ec8b478e233124671627238288a4e3 > metastore/src/test/org/apache/hadoop/hive/metastore/TestOldSchema.java > 54828f2289657470e1da02cdb938dd14ef6488af > ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsTask.java > d96f432fee9f4dde4ba76a21a28005299d3f9f7a > ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnStatsUpdateTask.java > 2acc777eacc5e482318807cf171a2f31eea52504 > > ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java > 23800734f7f7a00e363b367db2e197bd3b4f640a > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFComputeStats.java > 8ee41bfab2b44e6c8b345fe5d9e3f6b30d2a328c > ql/src/test/results/clientpositive/autoColumnStats_4.q.out > e84499995b2c5cae1d84ef4f37532423dd0eac9b > ql/src/test/results/clientpositive/autoColumnStats_5.q.out > 29963975d38aa89a96744645a097278b7ce64f1f > ql/src/test/results/clientpositive/autoColumnStats_6.q.out > 1b125701d7c92fea66d1e6234e9a86a7b1922b17 > ql/src/test/results/clientpositive/autoColumnStats_7.q.out > 9e2121e0deae1deb37b7f1222c22aea8c6839f48 > ql/src/test/results/clientpositive/autoColumnStats_8.q.out > cdf2082d53356d18ed445b0d04b6673133878b2d > ql/src/test/results/clientpositive/autoColumnStats_9.q.out > e32c884c7d77b739f9ec114ae9eeb08da6aabb92 > ql/src/test/results/clientpositive/char_udf1.q.out > e701d64357aadca542d82caba9a19f0d5392c328 > ql/src/test/results/clientpositive/column_pruner_multiple_children.q.out > 00e53dc3e9a9dd375f341f0beb0a794c3f201166 > ql/src/test/results/clientpositive/columnstats_partlvl.q.out > c0f007159d3c4d67eb61077cfcde670e8abad77d > ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out > 0cb4863a17f566a3f336fcf414428817281d54a9 > ql/src/test/results/clientpositive/columnstats_quoting.q.out > 7e080fec9bb21aeede72b17141e7c4ce0c1c11cd > ql/src/test/results/clientpositive/columnstats_tbllvl.q.out > b85c1ff721d61be540e951ff5d60c1751da17b09 > ql/src/test/results/clientpositive/compute_stats_date.q.out > 78d04f9dfcb85c59c443158de17e3773a5c92564 > ql/src/test/results/clientpositive/compute_stats_decimal.q.out > e18b9890623d8cafd89c7eea59c0ace762f8e3b2 > ql/src/test/results/clientpositive/compute_stats_double.q.out > d937c3a00292fc850c8dff54721fffec82a9bc00 > ql/src/test/results/clientpositive/compute_stats_empty_table.q.out > 05042c9e42f492b30e04adb853dc8565d80595bb > ql/src/test/results/clientpositive/compute_stats_long.q.out > 3451072a1bb81c76f7edaf56f2da5b3aaa60a3f5 > ql/src/test/results/clientpositive/compute_stats_string.q.out > bbb236150e2de488411b1cb518239ee82892e04b > ql/src/test/results/clientpositive/constant_prop_2.q.out > 93050417c62718093595b73282214aab835532d9 > ql/src/test/results/clientpositive/display_colstats_tbllvl.q.out > 7cb62a8a94ca3fa001bc19a77d501e5692a7b11b > ql/src/test/results/clientpositive/exec_parallel_column_stats.q.out > f6c4237ca7d8c4dc6d01d0b2ead668b9eaf43e0d > ql/src/test/results/clientpositive/fm-sketch.q.out > 2bd218b4c82c2964d2c3ede12942f3b774d44e4c > ql/src/test/results/clientpositive/hll.q.out > 13da13087e00f5fbad84505f1b7a818c27de9628 > ql/src/test/results/clientpositive/llap/column_table_stats.q.out > c7726fec3017bd780977834a1c526a23912cd0f7 > ql/src/test/results/clientpositive/llap/column_table_stats_orc.q.out > 6dff50f9f19f643decfbce6a761ee0878337c65a > ql/src/test/results/clientpositive/llap/llap_stats.q.out > fda614f7b5d8c18639ed0b418dd7b49d9ad89e90 > ql/src/test/results/clientpositive/llap/parallel_colstats.q.out > 57498a6d8f5367bda343fa663f69e95e177928f7 > ql/src/test/results/clientpositive/llap/varchar_udf1.q.out > 023d51ccb8e66498c365cda6a06b70a9926edf8c > ql/src/test/results/clientpositive/llap/vector_udf1.q.out > b8d19c56374ba4518517ed27417db646d7816153 > ql/src/test/results/clientpositive/parallel_colstats.q.out > d5bce1ec77ab6bd2417a6be1312d28c8a3f7282b > ql/src/test/results/clientpositive/partial_column_stats.q.out > 452d4b688c253516e233e5364458fe202b539c5c > > ql/src/test/results/clientpositive/reduceSinkDeDuplication_pRS_key_empty.q.out > 124a4b489a1decc93c38d65ca1af8cacd2cc4f6f > ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out > 8d94ac6b07f8cf12ae7184de899e522cdc418cfe > ql/src/test/results/clientpositive/tez/explainanalyze_3.q.out > 9d47066297c9f73c7432e8bed578c397059bf43d > ql/src/test/results/clientpositive/tez/explainanalyze_5.q.out > 1764164a918d7af79bb5056e68557419e71b601d > standalone-metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp > d178f10a21861b776b7a1e7e904f18a51e5b8c4c > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/BinaryColumnStatsData.java > eeb510587a93d8c48224bacba557cf7fcaa61831 > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/BooleanColumnStatsData.java > de39d21abe6a0437264a6dcdc1953c6a387fa4a3 > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DateColumnStatsData.java > edc87a195b8471f5434aebee6cf3c75ba67e3394 > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DecimalColumnStatsData.java > ec363dcbdad5b8f31e2e69b3b09615dd941f3dd0 > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/DoubleColumnStatsData.java > e3340e412d3c22165e5dd199a67d07df0d12f97b > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LongColumnStatsData.java > 440470654fc836b5218b995f29efeca2ef9124cd > > standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StringColumnStatsData.java > c9afe877d63b620291b3d5fe47c8f54465cfdd0c > standalone-metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb > 02c5717c54f2e9bbc39767a3ea3512d9618553e4 > standalone-metastore/src/main/thrift/hive_metastore.thrift > 042a5d8d0b3e1e63ee4d7f832b4d6c0cdacb513f > > > Diff: https://reviews.apache.org/r/61586/diff/2/ > > > Testing > ------- > > > Thanks, > > Jesús Camacho Rodríguez > >