[GitHub] [hive] kgyrtkirk commented on pull request #1187: HIVE-23772. Reallocate calcite-core to prevent NoSuchFiledError.
kgyrtkirk commented on pull request #1187: URL: https://github.com/apache/hive/pull/1187#issuecomment-651580913
IIRC, the point where the switch to the "non-shaded calcite-core" happens is when the JDBC driver is loaded.
[GitHub] [hive] kasakrisz merged pull request #1188: HIVE-19549: Enable TestAcidOnTez#testCtasTezUnion
kasakrisz merged pull request #1188: URL: https://github.com/apache/hive/pull/1188
[GitHub] [hive] jcamachor commented on a change in pull request #1190: [HIVE-23770] Fix Druid filter translation for inverted between
jcamachor commented on a change in pull request #1190: URL: https://github.com/apache/hive/pull/1190#discussion_r447411719

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveDruidPushInvertIntoBetweenRule.java
## @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.optimizer.calcite.rules;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexShuttle;
+import org.apache.calcite.rex.RexUtil;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveBetween;
+
+/**
+ * This rule is the opposite of HiveDruidPullInvertFromBetweenRule:
+ * it pushes the invert back into the Between.
+ */
+public class HiveDruidPushInvertIntoBetweenRule extends RelOptRule {
+
+  protected static final Log LOG = LogFactory.getLog(HiveDruidPushInvertIntoBetweenRule.class);
+
+  public static final HiveDruidPushInvertIntoBetweenRule INSTANCE =
+      new HiveDruidPushInvertIntoBetweenRule();
+
+  private HiveDruidPushInvertIntoBetweenRule() {
+    super(operand(Filter.class, any()));
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final Filter filter = call.rel(0);
+    final RexBuilder rexBuilder = filter.getCluster().getRexBuilder();
+    final RexNode condition = RexUtil.pullFactors(rexBuilder, filter.getCondition());
+
+    RexPullInvertFromBetween t = new RexPullInvertFromBetween(rexBuilder);
+    RexNode newCondition = t.apply(condition);
+
+    // If we could not transform anything, we bail out
+    if (newCondition.toString().equals(condition.toString())) {

Review comment: Same comment as in the other rule.
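The bail-out being discussed compares toString() renderings of the old and new conditions. A cheaper variant (a sketch of the general Calcite idiom, not the PR's actual fix; `pushInvertShuttle` is hypothetical) relies on shuttles returning the identical RexNode instance when nothing changed:

```java
import org.apache.calcite.plan.RelOptRuleCall;
import org.apache.calcite.rel.core.Filter;
import org.apache.calcite.rex.RexNode;
import org.apache.calcite.rex.RexShuttle;

// Sketch of the bail-out pattern under discussion. "pushInvertShuttle" is a
// hypothetical RexShuttle that folds NOT into HiveBetween's invert flag;
// Calcite shuttles conventionally return the identical RexNode instance when
// they change nothing, so an identity check can replace the toString() compare.
final class BailOutSketch {
  static void rewrite(RelOptRuleCall call, RexShuttle pushInvertShuttle) {
    final Filter filter = call.rel(0);
    final RexNode condition = filter.getCondition();
    final RexNode newCondition = condition.accept(pushInvertShuttle);
    if (newCondition == condition) {
      return; // nothing was transformed, bail out
    }
    call.transformTo(filter.copy(filter.getTraitSet(), filter.getInput(), newCondition));
  }
}
```

An identity check is constant-time, while toString() renders both condition trees on every invocation of the rule.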
[GitHub] [hive] nareshpr removed a comment on pull request #1191: HIVE-23779: BasicStatsTask Info is not getting printed in beeline console
nareshpr removed a comment on pull request #1191: URL: https://github.com/apache/hive/pull/1191#issuecomment-651522808 /retest
[GitHub] [hive] nareshpr commented on pull request #1191: HIVE-23779: BasicStatsTask Info is not getting printed in beeline console
nareshpr commented on pull request #1191: URL: https://github.com/apache/hive/pull/1191#issuecomment-651522808 /retest
[GitHub] [hive] github-actions[bot] closed pull request #20: HIVE 2304 : for hive2
github-actions[bot] closed pull request #20: URL: https://github.com/apache/hive/pull/20
[GitHub] [hive] github-actions[bot] closed pull request #18: Branch 0.12
github-actions[bot] closed pull request #18: URL: https://github.com/apache/hive/pull/18
[GitHub] [hive] github-actions[bot] closed pull request #19: Add case sensitivity support to Parquet Serde.
github-actions[bot] closed pull request #19: URL: https://github.com/apache/hive/pull/19
[GitHub] [hive] github-actions[bot] closed pull request #22: Hive 8384
github-actions[bot] closed pull request #22: URL: https://github.com/apache/hive/pull/22
[GitHub] [hive] rbalamohan commented on a change in pull request #1081: HIVE-23597: VectorizedOrcAcidRowBatchReader::ColumnizedDeleteEventReg…
rbalamohan commented on a change in pull request #1081: URL: https://github.com/apache/hive/pull/1081#discussion_r447302965

## File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
## @@ -1605,6 +1618,46 @@ public int compareTo(CompressedOwid other) {
         throw e; // rethrow the exception so that the caller can handle.
       }
     }
+
+    /**
+     * Create delete delta reader. Caching orc tail to avoid FS lookup/reads for repeated scans.
+     *
+     * @param deleteDeltaFile
+     * @param conf
+     * @param fs FileSystem
+     * @return delete file reader
+     * @throws IOException
+     */
+    private Reader getDeleteDeltaReader(Path deleteDeltaFile, JobConf conf, FileSystem fs) throws IOException {
+      OrcTail deleteDeltaTail = deleteDeltaOrcTailCache.getIfPresent(deleteDeltaFile);

Review comment: Yes, it will not change for the file.
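The pattern under review works because the ORC tail of a given file is immutable once written. A standalone sketch of the same getIfPresent/put shape with a Guava cache (`loadTail` and the sizing numbers are illustrative, not Hive's actual code):

```java
import java.util.concurrent.TimeUnit;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

// Standalone sketch of the read-through caching shape used in the diff.
// The ORC tail of a (delete-delta) file never changes once written, so it
// is safe to key the cache by path. loadTail() is a hypothetical stand-in
// for the actual footer read.
public class TailCacheSketch {
  private final Cache<String, byte[]> tailCache = CacheBuilder.newBuilder()
      .maximumSize(1_000)
      .expireAfterAccess(10, TimeUnit.MINUTES)
      .build();

  byte[] getTail(String path) {
    byte[] tail = tailCache.getIfPresent(path);
    if (tail == null) {
      tail = loadTail(path);     // one filesystem read on a miss
      tailCache.put(path, tail); // repeated scans then hit the cache
    }
    return tail;
  }

  private byte[] loadTail(String path) {
    return new byte[0]; // placeholder for the real footer read
  }
}
```

Guava's `Cache.get(key, loader)` would make the miss path atomic under concurrent scans; the explicit getIfPresent/put above simply mirrors the diff.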
[GitHub] [hive] mustafaiman opened a new pull request #1192: HIVE-23780: Fail dropTable if acid cleanup fails
mustafaiman opened a new pull request #1192: URL: https://github.com/apache/hive/pull/1192 Change-Id: Ica7666afe40cb0f0128266c9c3f6ebc560b24c0e
[GitHub] [hive] ayushtkn edited a comment on pull request #1187: HIVE-23772. Reallocate calcite-core to prevent NoSuchFiledError.
ayushtkn edited a comment on pull request #1187: URL: https://github.com/apache/hive/pull/1187#issuecomment-651363049
The values passed are from https://github.com/apache/calcite/blob/branch-1.21/core/src/main/java/org/apache/calcite/jdbc/CalciteConnectionImpl.java#L127. The type factory must be null, since it is the else branch. The trace points at that line - `at org.apache.calcite.jdbc.CalciteConnectionImpl.<init>(CalciteConnectionImpl.java:127)` - and both arguments should ideally come from `calcite-core`, hence should have been relocated; they are only tweaked in `calcite-avatica`. Maybe some dependency that uses calcite-core is being called. To find out I need to clone the repo and get the dependency tree, or debug where the signature got changed from shaded to non-shaded. I will find time to dig in more, maybe this or next weekend. I also need to understand how Calcite works, which I currently don't. Do let me know if you have any further pointers. :-)
[GitHub] [hive] ayushtkn commented on pull request #1187: HIVE-23772. Reallocate calcite-core to prevent NoSuchFiledError.
ayushtkn commented on pull request #1187: URL: https://github.com/apache/hive/pull/1187#issuecomment-651363049
The values passed are from https://github.com/apache/calcite/blob/branch-1.21/core/src/main/java/org/apache/calcite/jdbc/CalciteConnectionImpl.java#L127. The type factory must be null, since it is the else branch. The trace points at that line - `at org.apache.calcite.jdbc.CalciteConnectionImpl.<init>(CalciteConnectionImpl.java:127)` - and both arguments should ideally come from `calcite-core`, hence should have been relocated; they are only tweaked in `calcite-avatica`. Maybe some dependency that uses calcite-core is being called. To find out I need to clone the repo and get the dependency tree, or debug where the signature got changed from shaded to non-shaded. I will find time to dig in more, maybe this or next weekend. Do let me know if you have any further pointers.
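A quick way to run the "debug where the signature got changed" step is to ask the JVM which jar actually supplied the suspect class, before diving into the full dependency tree. A throwaway probe (the class name is taken from the stack trace above; everything else is illustrative):

```java
// Throwaway probe: prints the jar that supplied the class named on the
// command line (defaulting to the CalciteConnectionImpl from the trace).
// Run it with Hive's classpath; a vanilla calcite-core path here, where a
// shaded one was expected, confirms the relocation did not take effect.
public class WhichJar {
  public static void main(String[] args) throws Exception {
    String name = args.length > 0 ? args[0]
        : "org.apache.calcite.jdbc.CalciteConnectionImpl";
    Class<?> c = Class.forName(name);
    System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
  }
}
```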
[GitHub] [hive] kgyrtkirk commented on pull request #1188: HIVE-19549: Enable TestAcidOnTez#testCtasTezUnion
kgyrtkirk commented on pull request #1188: URL: https://github.com/apache/hive/pull/1188#issuecomment-651326251
Before re-enabling flaky tests, please run the checker; without that there is no evidence that the test is actually stable. I've started it for this one:
http://ci.hive.apache.org/job/hive-flaky-check/54/
http://ci.hive.apache.org/job/hive-flaky-check/55/
[GitHub] [hive] kgyrtkirk commented on pull request #1187: HIVE-23772. Reallocate calcite-core to prevent NoSuchFiledError.
kgyrtkirk commented on pull request #1187: URL: https://github.com/apache/hive/pull/1187#issuecomment-651325059
yeah; now I remember - I've also run into these avatica exceptions; the story behind them is:
* the jdbc driver is used to open the connection; it passes some things as arguments to the jdbc driver
* the type factory is "org.calcite..."
* the expected type factory is "shaded.org.calcite..."
so it fails with an exception...
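To illustrate the mismatch kgyrtkirk describes: after relocation, the two type-factory types are unrelated classes as far as the JVM is concerned, so the cast inside the shaded driver fails at runtime. A self-contained toy version with stand-in interfaces (not the real Calcite types):

```java
// Stand-ins for the vanilla and relocated Calcite interfaces; after shading,
// the JVM treats them as two unrelated types even though the source is identical.
interface JavaTypeFactoryVanilla {}  // stands in for org.apache.calcite...
interface JavaTypeFactoryShaded {}   // stands in for shaded.org.apache.calcite...

public class RelocationMismatch {
  public static void main(String[] args) {
    Object passedIn = new JavaTypeFactoryVanilla() {}; // built against the vanilla jar
    // The shaded driver casts to its relocated type; the cast compiles
    // (both are interfaces) but throws ClassCastException at runtime.
    JavaTypeFactoryShaded f = (JavaTypeFactoryShaded) passedIn;
    System.out.println(f);
  }
}
```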
[GitHub] [hive] nareshpr opened a new pull request #1191: HIVE-23779: BasicStatsTask Info is not getting printed in beeline console
nareshpr opened a new pull request #1191: URL: https://github.com/apache/hive/pull/1191
[GitHub] [hive] pkumarsinha commented on a change in pull request #1120: HIVE-23611 : Mandate fully qualified absolute path for external table…
pkumarsinha commented on a change in pull request #1120: URL: https://github.com/apache/hive/pull/1120#discussion_r447178750

## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
## @@ -517,7 +508,7 @@ public void externalTableIncrementalReplication() throws Throwable {
            + "'")
        .run("alter table t1 add partition(country='india')")
        .run("alter table t1 add partition(country='us')")
-        .dump(primaryDbName, withClause);

Review comment: Not needed anymore.
[GitHub] [hive] pkumarsinha commented on a change in pull request #1120: HIVE-23611 : Mandate fully qualified absolute path for external table…
pkumarsinha commented on a change in pull request #1120: URL: https://github.com/apache/hive/pull/1120#discussion_r447178408

## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
## @@ -159,14 +154,14 @@ public void externalTableReplicationWithDefaultPaths() throws Throwable {
        .run("insert into table t2 partition(country='india') values ('bangalore')")
        .run("insert into table t2 partition(country='us') values ('austin')")
        .run("insert into table t2 partition(country='france') values ('paris')")
-        .dump(primaryDbName, withClauseOptions);
+        .dump(primaryDbName);

Review comment: Not needed anymore.

## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
## @@ -503,8 +495,7 @@ public void externalTableIncrementalCheckpointing() throws Throwable {
   @Test
   public void externalTableIncrementalReplication() throws Throwable {
-    List<String> withClause = externalTableBasePathWithClause();
-    WarehouseInstance.Tuple tuple = primary.dump(primaryDbName, withClause);
+    WarehouseInstance.Tuple tuple = primary.dump(primaryDbName);

Review comment: Not needed anymore.

## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java
## @@ -623,7 +612,7 @@ public void bootstrapExternalTablesDuringIncrementalPhase() throws Throwable {
     assertFalse(primary.miniDFSCluster.getFileSystem()
        .exists(new Path(metadataPath + relativeExtInfoPath(null))));

-    replica.load(replicatedDbName, primaryDbName, loadWithClause)
+    replica.load(replicatedDbName, primaryDbName)

Review comment: Not needed anymore.
[GitHub] [hive] pkumarsinha commented on a change in pull request #1120: HIVE-23611 : Mandate fully qualified absolute path for external table…
pkumarsinha commented on a change in pull request #1120: URL: https://github.com/apache/hive/pull/1120#discussion_r447174957

## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/BaseReplicationAcrossInstances.java
## @@ -103,6 +114,12 @@ public static void classLevelTearDown() throws IOException {
     replica.close();
   }

+  private static void setReplicaExternalBase(FileSystem fs, Map<String, String> confMap) throws IOException {
+    fs.mkdirs(REPLICA_EXTERNAL_BASE);
+    fullyQualifiedReplicaExternalBase = fs.getFileStatus(REPLICA_EXTERNAL_BASE).getPath().toString();
+    confMap.put(HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname, fullyQualifiedReplicaExternalBase);

Review comment: Yes, it is needed in both places; they are different cases.
[GitHub] [hive] vihangk1 commented on pull request #1095: Make all DDLs transactional and advance commit write ID, so that subsequent read will invalidate the cache entry.
vihangk1 commented on pull request #1095: URL: https://github.com/apache/hive/pull/1095#issuecomment-651247834
The patch has been merged into the master branch.
[GitHub] [hive] szlta merged pull request #1159: HIVE-23741: Store CacheTags in the file cache level
szlta merged pull request #1159: URL: https://github.com/apache/hive/pull/1159
[GitHub] [hive] kgyrtkirk commented on a change in pull request #1126: HIVE-23598: Add option to rewrite NTILE and RANK to sketch functions
kgyrtkirk commented on a change in pull request #1126: URL: https://github.com/apache/hive/pull/1126#discussion_r447097434

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRewriteToDataSketchesRules.java
## @@ -483,46 +489,44 @@ protected final SqlOperator getSqlOperator(String fnName) {
   }

   /**
-   * Rewrites {@code cume_dist() over (order by id)}.
+   * Provides a generic way to rewrite function into using an estimation based on CDF.
+   *
+   * There are a few methods which could be supported this way: NTILE, CUME_DIST, RANK
    *
+   * For example:
    *
    * SELECT id, CUME_DIST() OVER (ORDER BY id) FROM sketch_input;
-   *   ⇒ SELECT id, 1.0-ds_kll_cdf(ds, CAST(-id AS FLOAT) )[0]
+   *   ⇒ SELECT id, ds_kll_cdf(ds, CAST(id AS FLOAT) )[0]
    *     FROM sketch_input JOIN (
-   *       SELECT ds_kll_sketch(CAST(-id AS FLOAT)) AS ds FROM sketch_input
+   *       SELECT ds_kll_sketch(CAST(id AS FLOAT)) AS ds FROM sketch_input
    *     ) q;
    */
-  public static class CumeDistRewrite extends WindowingToProjectAggregateJoinProject {
+  public static abstract class AbstractRankBasedRewriteRule extends WindowingToProjectAggregateJoinProject {

-    public CumeDistRewrite(String sketchType) {
+    public AbstractRankBasedRewriteRule(String sketchType) {
       super(sketchType);
     }

-    @Override
-    protected VbuilderPAP buildProcessor(RelOptRuleCall call) {
-      return new VB(sketchType, call.builder());
-    }
+    protected static abstract class AbstractRankBasedRewriteBuilder extends VbuilderPAP {

-    private static class VB extends VbuilderPAP {
-
-      protected VB(String sketchClass, RelBuilder relBuilder) {
+      protected AbstractRankBasedRewriteBuilder(String sketchClass, RelBuilder relBuilder) {
         super(sketchClass, relBuilder);
       }

       @Override
-      boolean isApplicable(RexOver over) {
-        SqlAggFunction aggOp = over.getAggOperator();
+      final boolean isApplicable(RexOver over) {
         RexWindow window = over.getWindow();
-        if (aggOp.getName().equalsIgnoreCase("cume_dist") && window.orderKeys.size() == 1
-            && window.getLowerBound().isUnbounded() && window.getUpperBound().isUnbounded()) {
+        if (window.orderKeys.size() == 1
+            && window.getLowerBound().isUnbounded() && window.getUpperBound().isUnbounded()

Review comment: the logic to handle
```
Order by spec -> range, unbounded preceding, current row
This also aligns with most RDBMSs implementation
```
I think at the time this rule fires it will see unbounded/unbounded... but it's very weird; I'll open a separate ticket to investigate that. Opened: HIVE-23775
[GitHub] [hive] kgyrtkirk commented on a change in pull request #1126: HIVE-23598: Add option to rewrite NTILE and RANK to sketch functions
kgyrtkirk commented on a change in pull request #1126: URL: https://github.com/apache/hive/pull/1126#discussion_r447081115

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRewriteToDataSketchesRules.java - same isApplicable() hunk as quoted above

Review comment: interesting; mostly the second :D For the current functions (ntile, cume_dist) it doesn't really make sense to set the window to anything other than unbounded (or at least I don't see a use case for it). I've tried this out with the query below:
```
select id, ntile(id) over (order by id rows between 1 preceding and 1 following) from sketch_input order by id nulls last;
```
* mysql: rejects it with an error that `ntile` doesn't support it
* psql: accepts and executes it without interpreting the preceding/following frame correctly
* hive: stops with a SemanticException
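For reference, the applicability check this thread keeps returning to reduces to "one ORDER BY key over a whole-partition frame". Condensed from the quoted diff (the wrapper class and method name here are made up):

```java
import org.apache.calcite.rex.RexOver;
import org.apache.calcite.rex.RexWindow;

// Condensed from the isApplicable() check in the quoted diff: the rewrite
// only fires for a single ORDER BY key over an UNBOUNDED PRECEDING ..
// UNBOUNDED FOLLOWING frame, which is what the rule sees when it fires.
final class FrameCheck {
  static boolean isWholePartitionFrame(RexOver over) {
    RexWindow window = over.getWindow();
    return window.orderKeys.size() == 1
        && window.getLowerBound().isUnbounded()
        && window.getUpperBound().isUnbounded();
  }
}
```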
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447053621

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java
## @@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) {
             throw new MetastoreException(e);
           }
         }
+        if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) {
+          if (result.getMaxWriteId() > 0) {
+            if (txnId < 0) {
+              // We need the txnId to check against even if we didn't do the locking
+              txnId = getMsc().openTxn(getUserName());
+            }
+
+            validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), result.getMaxTxnId(),
+                table.getDbName(), table.getTableName(), txnId);
+          }
+        }
       }
       success = true;
     } catch (Exception e) {
       LOG.warn("Failed to run metacheck: ", e);
       success = false;
-      ret = 1;
     } finally {
-      if (msckInfo.getResFile() != null) {
-        BufferedWriter resultOut = null;
-        try {
-          Path resFile = new Path(msckInfo.getResFile());
-          FileSystem fs = resFile.getFileSystem(getConf());
-          resultOut = new BufferedWriter(new OutputStreamWriter(fs.create(resFile)));
-
-          boolean firstWritten = false;
-          firstWritten |= writeMsckResult(result.getTablesNotInMs(),
-              "Tables not in metastore:", resultOut, firstWritten);
-          firstWritten |= writeMsckResult(result.getTablesNotOnFs(),
-              "Tables missing on filesystem:", resultOut, firstWritten);
-          firstWritten |= writeMsckResult(result.getPartitionsNotInMs(),
-              "Partitions not in metastore:", resultOut, firstWritten);
-          firstWritten |= writeMsckResult(result.getPartitionsNotOnFs(),
-              "Partitions missing from filesystem:", resultOut, firstWritten);
-          firstWritten |= writeMsckResult(result.getExpiredPartitions(),
-              "Expired partitions (retention period: " + partitionExpirySeconds + "s) :", resultOut, firstWritten);
-          // sorting to stabilize qfile output (msck_repair_drop.q)
-          Collections.sort(repairOutput);
-          for (String rout : repairOutput) {
-            if (firstWritten) {
-              resultOut.write(terminator);
-            } else {
-              firstWritten = true;
-            }
-            resultOut.write(rout);
-          }
-        } catch (IOException e) {
-          LOG.warn("Failed to save metacheck output: ", e);
-          ret = 1;
-        } finally {
-          if (resultOut != null) {
-            try {
-              resultOut.close();
-            } catch (IOException e) {
-              LOG.warn("Failed to close output file: ", e);
-              ret = 1;
-            }
-          }
+      if (result != null) {
+        logResult(result);
+        if (msckInfo.getResFile() != null) {
+          success = writeResultToFile(msckInfo, result, repairOutput, partitionExpirySeconds) && success;
         }
       }
-      LOG.info("Tables not in metastore: {}", result.getTablesNotInMs());
-      LOG.info("Tables missing on filesystem: {}", result.getTablesNotOnFs());
-      LOG.info("Partitions not in metastore: {}", result.getPartitionsNotInMs());
-      LOG.info("Partitions missing from filesystem: {}", result.getPartitionsNotOnFs());
-      LOG.info("Expired partitions: {}", result.getExpiredPartitions());
-      if (acquireLock && txnId > 0) {
-        if (success) {
-          try {
-            LOG.info("txnId: {} succeeded. Committing..", txnId);
-            getMsc().commitTxn(txnId);
-          } catch (Exception e) {
-            LOG.warn("Error while committing txnId: {} for table: {}", txnId, qualifiedTableName, e);
-            ret = 1;
-          }
-        } else {
-          try {
-            LOG.info("txnId: {} failed. Aborting..", txnId);
-            getMsc().abortTxns(Lists.newArrayList(txnId));
-          } catch (Exception e) {
-            LOG.warn("Error while aborting txnId: {} for table: {}", txnId, qualifiedTableName, e);
-            ret = 1;
-          }
-        }
+      if (txnId > 0) {
+        success = closeTxn(qualifiedTableName, success, txnId) && success;
       }
       if (getMsc() != null) {
        getMsc().close();
        msc = null;
      }
    }
+    return success ? 0 : 1;
   }

+  private boolean closeTxn(String qualifiedTableName, boolean success, long txnId) {
+    boolean ret = true;
+    if (success) {
+      try {
+        LOG.info("txnId: {} succeeded. Committing..", txnId);
+        getMsc().commitTxn(txnId);
+      } catch (Exception e) {
+        LOG.warn("Error while committing txnId: {} for table: {}", txnId, qualifiedTableName, e);
+        ret = false;
+      }
+    } else {
+      try {
+        LOG.info("txnId: {} failed. Aborting..
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447052663

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java - same repair()/closeTxn() hunk as quoted above
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447050970

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java - same repair()/closeTxn() hunk as quoted above
[GitHub] [hive] nishantmonu51 opened a new pull request #1190: [HIVE-23770] Fix Druid filter translation for inverted between
nishantmonu51 opened a new pull request #1190: URL: https://github.com/apache/hive/pull/1190
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447046016

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java - same repair()/closeTxn() hunk as quoted above
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447044667

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java - same repair()/closeTxn() hunk as quoted above; the comment is on the LOG.warn in closeTxn's commit path

Review comment: should we use error level here?
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447044866

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java - same repair()/closeTxn() hunk as quoted above
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447042125 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java ## @@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) { throw new MetastoreException(e); } } +if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) { + if (result.getMaxWriteId() > 0) { +if (txnId < 0) { + // We need the txnId to check against even if we didn't do the locking + txnId = getMsc().openTxn(getUserName()); +} + +validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), result.getMaxTxnId(), +table.getDbName(), table.getTableName(), txnId); + } +} } success = true; } catch (Exception e) { LOG.warn("Failed to run metacheck: ", e); success = false; - ret = 1; } finally { - if (msckInfo.getResFile() != null) { -BufferedWriter resultOut = null; -try { - Path resFile = new Path(msckInfo.getResFile()); - FileSystem fs = resFile.getFileSystem(getConf()); - resultOut = new BufferedWriter(new OutputStreamWriter(fs.create(resFile))); - - boolean firstWritten = false; - firstWritten |= writeMsckResult(result.getTablesNotInMs(), -"Tables not in metastore:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getTablesNotOnFs(), -"Tables missing on filesystem:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getPartitionsNotInMs(), -"Partitions not in metastore:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getPartitionsNotOnFs(), -"Partitions missing from filesystem:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getExpiredPartitions(), -"Expired partitions (retention period: " + partitionExpirySeconds + "s) :", resultOut, firstWritten); - // sorting to stabilize qfile output (msck_repair_drop.q) - Collections.sort(repairOutput); - for (String rout : repairOutput) { -if (firstWritten) { - resultOut.write(terminator); -} else { - firstWritten = true; -} -resultOut.write(rout); - } -} catch (IOException e) { - LOG.warn("Failed to save metacheck output: ", e); - ret = 1; -} finally { - if (resultOut != null) { -try { - resultOut.close(); -} catch (IOException e) { - LOG.warn("Failed to close output file: ", e); - ret = 1; -} - } + if (result!=null) { +logResult(result); +if (msckInfo.getResFile() != null) { + success = writeResultToFile(msckInfo, result, repairOutput, partitionExpirySeconds) && success; } } - LOG.info("Tables not in metastore: {}", result.getTablesNotInMs()); - LOG.info("Tables missing on filesystem: {}", result.getTablesNotOnFs()); - LOG.info("Partitions not in metastore: {}", result.getPartitionsNotInMs()); - LOG.info("Partitions missing from filesystem: {}", result.getPartitionsNotOnFs()); - LOG.info("Expired partitions: {}", result.getExpiredPartitions()); - if (acquireLock && txnId > 0) { - if (success) { -try { - LOG.info("txnId: {} succeeded. Committing..", txnId); - getMsc().commitTxn(txnId); -} catch (Exception e) { - LOG.warn("Error while committing txnId: {} for table: {}", txnId, qualifiedTableName, e); - ret = 1; -} - } else { -try { - LOG.info("txnId: {} failed. 
Aborting..", txnId); - getMsc().abortTxns(Lists.newArrayList(txnId)); -} catch (Exception e) { - LOG.warn("Error while aborting txnId: {} for table: {}", txnId, qualifiedTableName, e); - ret = 1; -} - } + if (txnId > 0) { +success = closeTxn(qualifiedTableName, success, txnId) && success; Review comment: same success &= closeTxn(qualifiedTableName, success, txnId) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447041559 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java ## @@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) { throw new MetastoreException(e); } } +if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) { + if (result.getMaxWriteId() > 0) { +if (txnId < 0) { + // We need the txnId to check against even if we didn't do the locking + txnId = getMsc().openTxn(getUserName()); +} + +validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), result.getMaxTxnId(), +table.getDbName(), table.getTableName(), txnId); + } +} } success = true; } catch (Exception e) { LOG.warn("Failed to run metacheck: ", e); success = false; - ret = 1; } finally { - if (msckInfo.getResFile() != null) { -BufferedWriter resultOut = null; -try { - Path resFile = new Path(msckInfo.getResFile()); - FileSystem fs = resFile.getFileSystem(getConf()); - resultOut = new BufferedWriter(new OutputStreamWriter(fs.create(resFile))); - - boolean firstWritten = false; - firstWritten |= writeMsckResult(result.getTablesNotInMs(), -"Tables not in metastore:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getTablesNotOnFs(), -"Tables missing on filesystem:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getPartitionsNotInMs(), -"Partitions not in metastore:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getPartitionsNotOnFs(), -"Partitions missing from filesystem:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getExpiredPartitions(), -"Expired partitions (retention period: " + partitionExpirySeconds + "s) :", resultOut, firstWritten); - // sorting to stabilize qfile output (msck_repair_drop.q) - Collections.sort(repairOutput); - for (String rout : repairOutput) { -if (firstWritten) { - resultOut.write(terminator); -} else { - firstWritten = true; -} -resultOut.write(rout); - } -} catch (IOException e) { - LOG.warn("Failed to save metacheck output: ", e); - ret = 1; -} finally { - if (resultOut != null) { -try { - resultOut.close(); -} catch (IOException e) { - LOG.warn("Failed to close output file: ", e); - ret = 1; -} - } + if (result!=null) { +logResult(result); +if (msckInfo.getResFile() != null) { + success = writeResultToFile(msckInfo, result, repairOutput, partitionExpirySeconds) && success; Review comment: you can do success &= writeResultToFile(msckInfo, result, repairOutput, partitionExpirySeconds) for readability This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447037968 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java ## @@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) { throw new MetastoreException(e); } } +if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) { + if (result.getMaxWriteId() > 0) { +if (txnId < 0) { + // We need the txnId to check against even if we didn't do the locking + txnId = getMsc().openTxn(getUserName()); +} + +validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), result.getMaxTxnId(), +table.getDbName(), table.getTableName(), txnId); + } +} } success = true; } catch (Exception e) { LOG.warn("Failed to run metacheck: ", e); success = false; - ret = 1; } finally { - if (msckInfo.getResFile() != null) { -BufferedWriter resultOut = null; -try { - Path resFile = new Path(msckInfo.getResFile()); - FileSystem fs = resFile.getFileSystem(getConf()); - resultOut = new BufferedWriter(new OutputStreamWriter(fs.create(resFile))); - - boolean firstWritten = false; - firstWritten |= writeMsckResult(result.getTablesNotInMs(), -"Tables not in metastore:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getTablesNotOnFs(), -"Tables missing on filesystem:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getPartitionsNotInMs(), -"Partitions not in metastore:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getPartitionsNotOnFs(), -"Partitions missing from filesystem:", resultOut, firstWritten); - firstWritten |= writeMsckResult(result.getExpiredPartitions(), -"Expired partitions (retention period: " + partitionExpirySeconds + "s) :", resultOut, firstWritten); - // sorting to stabilize qfile output (msck_repair_drop.q) - Collections.sort(repairOutput); - for (String rout : repairOutput) { -if (firstWritten) { - resultOut.write(terminator); -} else { - firstWritten = true; -} -resultOut.write(rout); - } -} catch (IOException e) { - LOG.warn("Failed to save metacheck output: ", e); - ret = 1; -} finally { - if (resultOut != null) { -try { - resultOut.close(); -} catch (IOException e) { - LOG.warn("Failed to close output file: ", e); - ret = 1; -} - } + if (result!=null) { Review comment: not formatted This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447037520 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java ## @@ -229,102 +239,168 @@ public int repair(MsckInfo msckInfo) { throw new MetastoreException(e); } } +if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)) { + if (result.getMaxWriteId() > 0) { Review comment: you can remove 1 nesting level This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
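For illustration, the flattened form the reviewer means, built only from identifiers in the quoted hunk; merging the two guards into one condition drops a nesting level without changing behavior:

```java
// One combined guard instead of two nested ifs:
if (transactionalTable && !MetaStoreServerUtils.isPartitioned(table)
    && result.getMaxWriteId() > 0) {
  if (txnId < 0) {
    // We need the txnId to check against even if we didn't do the locking
    txnId = getMsc().openTxn(getUserName());
  }
  validateAndAddMaxTxnIdAndWriteId(result.getMaxWriteId(), result.getMaxTxnId(),
      table.getDbName(), table.getTableName(), txnId);
}
```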
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447033768 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java ## @@ -429,6 +451,75 @@ void findUnknownPartitions(Table table, Set partPaths, LOG.debug("Number of partitions not in metastore : " + result.getPartitionsNotInMs().size()); } + /** + * Calculate the maximum seen writeId from the acid directory structure + * @param partPath Path of the partition directory + * @param res Partition result to write the max ids + * @throws IOException ex + */ + private void setMaxTxnAndWriteIdFromPartition(Path partPath, CheckResult.PartitionResult res) throws IOException { +FileSystem fs = partPath.getFileSystem(conf); +FileStatus[] deltaOrBaseFiles = fs.listStatus(partPath, HIDDEN_FILES_PATH_FILTER); + +// Read the writeIds from every base and delta directory and find the max +long maxWriteId = 0L; +long maxVisibilityId = 0L; +for(FileStatus fileStatus : deltaOrBaseFiles) { + if (!fileStatus.isDirectory()) { +continue; + } + long writeId = 0L; + long visibilityId = 0L; + String folder = fileStatus.getPath().getName(); + if (folder.startsWith(BASE_PREFIX)) { +visibilityId = getVisibilityTxnId(folder); +if (visibilityId > 0) { + folder = removeVisibilityTxnId(folder); +} +writeId = Long.parseLong(folder.substring(BASE_PREFIX.length())); + } else if (folder.startsWith(DELTA_PREFIX) || folder.startsWith(DELETE_DELTA_PREFIX)) { +// See AcidUtils.parseDelta +visibilityId = getVisibilityTxnId(folder); +if (visibilityId > 0) { + folder = removeVisibilityTxnId(folder); +} +boolean isDeleteDelta = folder.startsWith(DELETE_DELTA_PREFIX); +String rest = folder.substring((isDeleteDelta ? DELETE_DELTA_PREFIX : DELTA_PREFIX).length()); +int split = rest.indexOf('_'); +//split2 may be -1 if no statementId +int split2 = rest.indexOf('_', split + 1); +// We always want the second part (it is either the same or greater if it is a compacted delta) +writeId = split2 == -1 ? Long.parseLong(rest.substring(split + 1)) : Long +.parseLong(rest.substring(split + 1, split2)); + } + if (writeId > maxWriteId) { +maxWriteId = writeId; + } + if (visibilityId > maxVisibilityId) { +maxVisibilityId = visibilityId; + } +} +LOG.debug("Max writeId {}, max txnId {} found in partition {}", maxWriteId, maxVisibilityId, +partPath.toUri().toString()); +res.setMaxWriteId(maxWriteId); +res.setMaxTxnId(maxVisibilityId); + } + private long getVisibilityTxnId(String folder) { +int idxOfVis = folder.indexOf(VISIBILITY_PREFIX); Review comment: why not use regex? removeVisibilityTxnId probably wouldn't even be needed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
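A minimal, self-contained sketch of the regex alternative, assuming the visibility suffix follows the `_v<txnId>` convention for base/delta directory names (e.g. `base_0000005_v0000009`); one matcher pass yields both the visibility txnId and the stripped folder name, so a separate removeVisibilityTxnId() would indeed not be needed:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VisibilitySuffixDemo {
  // Assumption: VISIBILITY_PREFIX is the "_v" suffix appended to acid dir names.
  private static final Pattern VISIBILITY_SUFFIX = Pattern.compile("^(.*)_v(\\d+)$");

  public static void main(String[] args) {
    String folder = "base_0000005_v0000009";
    long visibilityId = 0L;
    Matcher m = VISIBILITY_SUFFIX.matcher(folder);
    if (m.matches()) {
      visibilityId = Long.parseLong(m.group(2)); // txnId that made the dir visible
      folder = m.group(1);                       // dir name with the suffix stripped
    }
    // Prints: base_0000005 -> visibilityId=9
    System.out.println(folder + " -> visibilityId=" + visibilityId);
  }
}
```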
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447028756 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java ## @@ -429,6 +451,75 @@ void findUnknownPartitions(Table table, Set partPaths, LOG.debug("Number of partitions not in metastore : " + result.getPartitionsNotInMs().size()); } + /** + * Calculate the maximum seen writeId from the acid directory structure + * @param partPath Path of the partition directory + * @param res Partition result to write the max ids + * @throws IOException ex + */ + private void setMaxTxnAndWriteIdFromPartition(Path partPath, CheckResult.PartitionResult res) throws IOException { +FileSystem fs = partPath.getFileSystem(conf); +FileStatus[] deltaOrBaseFiles = fs.listStatus(partPath, HIDDEN_FILES_PATH_FILTER); + +// Read the writeIds from every base and delta directory and find the max +long maxWriteId = 0L; +long maxVisibilityId = 0L; +for(FileStatus fileStatus : deltaOrBaseFiles) { + if (!fileStatus.isDirectory()) { +continue; + } + long writeId = 0L; + long visibilityId = 0L; + String folder = fileStatus.getPath().getName(); + if (folder.startsWith(BASE_PREFIX)) { +visibilityId = getVisibilityTxnId(folder); +if (visibilityId > 0) { + folder = removeVisibilityTxnId(folder); +} +writeId = Long.parseLong(folder.substring(BASE_PREFIX.length())); + } else if (folder.startsWith(DELTA_PREFIX) || folder.startsWith(DELETE_DELTA_PREFIX)) { +// See AcidUtils.parseDelta +visibilityId = getVisibilityTxnId(folder); +if (visibilityId > 0) { + folder = removeVisibilityTxnId(folder); +} +boolean isDeleteDelta = folder.startsWith(DELETE_DELTA_PREFIX); +String rest = folder.substring((isDeleteDelta ? DELETE_DELTA_PREFIX : DELTA_PREFIX).length()); +int split = rest.indexOf('_'); Review comment: why not use rest.split('_')? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
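For comparison, a self-contained sketch of the split-based parse (note that String#split takes a String regex, so it would be rest.split("_")); the prefixes and the min/max/statementId layout are taken from the quoted code, and the visibility suffix is assumed to be stripped beforehand:

```java
public class DeltaNameDemo {
  // Prefix constants as used in the quoted code (assumed values).
  private static final String DELTA_PREFIX = "delta_";
  private static final String DELETE_DELTA_PREFIX = "delete_delta_";

  // Parses delta_<min>_<max>[_<stmtId>]; we always want the second part
  // (it is the same or greater if it is a compacted delta).
  static long parseMaxWriteId(String folder) {
    boolean isDeleteDelta = folder.startsWith(DELETE_DELTA_PREFIX);
    String rest = folder.substring((isDeleteDelta ? DELETE_DELTA_PREFIX : DELTA_PREFIX).length());
    String[] parts = rest.split("_"); // [minWriteId, maxWriteId, optional statementId]
    return Long.parseLong(parts[1]);
  }

  public static void main(String[] args) {
    System.out.println(parseMaxWriteId("delta_0000001_0000005"));             // 5
    System.out.println(parseMaxWriteId("delete_delta_0000001_0000005_0007")); // 5
  }
}
```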
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447003871 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java ## @@ -111,24 +120,24 @@ public IMetaStoreClient getMsc() { * @param partitions * List of partition name value pairs, if null or empty check all * partitions - * @param table - * @param result - * Fill this with the results of the check + * @param table Table we want to run the check for. + * @return Results of the check * @throws MetastoreException * Failed to get required information from the metastore. * @throws IOException * Most likely filesystem related */ - public void checkMetastore(String catName, String dbName, String tableName, - List> partitions, Table table, CheckResult result) + public CheckResult checkMetastore(String catName, String dbName, String tableName, + List> partitions, Table table) throws MetastoreException, IOException { - +CheckResult result = new CheckResult(); if (dbName == null || "".equalsIgnoreCase(dbName)) { dbName = Warehouse.DEFAULT_DATABASE_NAME; } try { if (tableName == null || "".equals(tableName)) { +// TODO: I do not think this is used by anything other than tests Review comment: should we answer this question in the current patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447001420 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ## @@ -8322,6 +8322,22 @@ public AllocateTableWriteIdsResponse allocate_table_write_ids( return response; } +@Override +public MaxAllocatedTableWriteIdResponse get_max_allocated_table_write_id(MaxAllocatedTableWriteIdRequest rqst) Review comment: weird mix of Camel case and underscore? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] deniskuzZ commented on a change in pull request #1087: HIVE-23671: MSCK repair should handle transactional tables
deniskuzZ commented on a change in pull request #1087: URL: https://github.com/apache/hive/pull/1087#discussion_r447001420 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java ## @@ -8322,6 +8322,22 @@ public AllocateTableWriteIdsResponse allocate_table_write_ids( return response; } +@Override +public MaxAllocatedTableWriteIdResponse get_max_allocated_table_write_id(MaxAllocatedTableWriteIdRequest rqst) Review comment: why not Camel case? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
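For context on the naming question: this handler implements the Thrift-generated ThriftHiveMetastore.Iface, whose method names are snake_case (see allocate_table_write_ids in the same hunk), so the handler signature presumably has to match what the Thrift compiler generates. A camelCase name would instead belong in a client-side wrapper; a purely hypothetical sketch (method and request shapes are assumptions, not the actual patch):

```java
// Hypothetical camelCase convenience wrapper delegating to the generated
// snake_case Thrift call; 'client' is assumed to be the generated client stub.
public long getMaxAllocatedTableWriteId(String dbName, String tableName) throws TException {
  MaxAllocatedTableWriteIdRequest rqst = new MaxAllocatedTableWriteIdRequest(dbName, tableName);
  return client.get_max_allocated_table_write_id(rqst).getMaxWriteId();
}
```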
[GitHub] [hive] belugabehr closed pull request #1090: HIVE-22676: Replace Base64 in hive-service Package
belugabehr closed pull request #1090: URL: https://github.com/apache/hive/pull/1090 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] belugabehr commented on a change in pull request #1161: HIVE-23638: Spotbugs issues in hive-common
belugabehr commented on a change in pull request #1161: URL: https://github.com/apache/hive/pull/1161#discussion_r446983489 ## File path: common/src/java/org/apache/hadoop/hive/common/FileUtils.java ## @@ -483,12 +483,6 @@ public static boolean isActionPermittedForFileHierarchy(FileSystem fs, FileStatu String userName, FsAction action, boolean recurse) throws Exception { boolean isDir = fileStatus.isDir(); -FsAction dirActionNeeded = action; -if (isDir) { - // for dirs user needs execute privileges as well - dirActionNeeded.and(FsAction.EXECUTE); -} - Review comment: So, I understand why this would be removed from a find-bugs perspective (this is a no-op), but this is actually an all-around bug. This should be: ``` // for dirs user needs execute privileges as well FsAction dirActionNeeded = (isDir) ? action.and(FsAction.EXECUTE) : action; ``` ## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -6475,17 +6477,17 @@ private static boolean isAllowed(Configuration conf, ConfVars setting) { } public static String getNonMrEngines() { -String result = StringUtils.EMPTY; +StringBuffer result = new StringBuffer(); for (String s : ConfVars.HIVE_EXECUTION_ENGINE.getValidStringValues()) { if ("mr".equals(s)) { continue; } - if (!result.isEmpty()) { -result += ", "; + if (result.length() != 0) { +result.append(", "); } - result += s; + result.append(s); } -return result; +return result.toString(); Review comment: Please change this to just use String#join; it is more human friendly: ``` Set<String> engines = new HashSet<>(ConfVars.HIVE_EXECUTION_ENGINE.getValidStringValues()); boolean removedMR = engines.remove("mr"); LOG.debug("Found and removed MapReduce engine from list of valid execution engines: {}", removedMR); return String.join(", ", engines); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
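To make the no-op concrete: FsAction is an immutable enum, so and() returns a new value instead of modifying the receiver, which is why the deleted statement had no effect. A tiny self-contained demo against the plain Hadoop API (not part of the patch):

```java
import org.apache.hadoop.fs.permission.FsAction;

public class FsActionNoOpDemo {
  public static void main(String[] args) {
    FsAction action = FsAction.READ_EXECUTE;
    action.and(FsAction.EXECUTE);                   // result discarded: no-op
    System.out.println(action);                     // still READ_EXECUTE
    FsAction needed = action.and(FsAction.EXECUTE); // and() is a set intersection
    System.out.println(needed);                     // EXECUTE
  }
}
```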
[GitHub] [hive] aasha commented on a change in pull request #1120: HIVE-23611 : Mandate fully qualified absolute path for external table…
aasha commented on a change in pull request #1120: URL: https://github.com/apache/hive/pull/1120#discussion_r444723985 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java ## @@ -517,7 +508,7 @@ public void externalTableIncrementalReplication() throws Throwable { + "'") .run("alter table t1 add partition(country='india')") .run("alter table t1 add partition(country='us')") -.dump(primaryDbName, withClause); Review comment: why is the withclause removed from here? ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/BaseReplicationAcrossInstances.java ## @@ -103,6 +114,12 @@ public static void classLevelTearDown() throws IOException { replica.close(); } + private static void setReplicaExternalBase(FileSystem fs, Map confMap) throws IOException { +fs.mkdirs(REPLICA_EXTERNAL_BASE); +fullyQualifiedReplicaExternalBase = fs.getFileStatus(REPLICA_EXTERNAL_BASE).getPath().toString(); +confMap.put(HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname, fullyQualifiedReplicaExternalBase); Review comment: this is set at 104 line also ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/ReplicationTestUtils.java ## @@ -502,19 +502,11 @@ public static void insertForMerge(WarehouseInstance primary, String primaryDbNam "creation", "creation", "merge_update", "merge_insert", "merge_insert"}); } - public static List externalTableBasePathWithClause(String replExternalBase, WarehouseInstance replica) - throws IOException, SemanticException { -Path externalTableLocation = new Path(replExternalBase); -DistributedFileSystem fileSystem = replica.miniDFSCluster.getFileSystem(); -externalTableLocation = PathBuilder.fullyQualifiedHDFSUri(externalTableLocation, fileSystem); -fileSystem.mkdirs(externalTableLocation); - -// this is required since the same filesystem is used in both source and target -return Arrays.asList( -"'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='" -+ externalTableLocation.toString() + "'", -"'distcp.options.pugpb'=''" -); + public static List externalTableClause(boolean enable) { Review comment: should this be include external table clause ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplExternalTables.java ## @@ -66,6 +67,9 @@ private ReplExternalTables(){} public static String externalTableLocation(HiveConf hiveConf, String location) throws SemanticException { String baseDir = hiveConf.get(HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname); +if (StringUtils.isEmpty(baseDir)) { Review comment: At the time of load the REPL_EXTERNAL_TABLE_BASE_DIR fully qualified path is not needed? ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java ## @@ -503,8 +495,7 @@ public void externalTableIncrementalCheckpointing() throws Throwable { @Test public void externalTableIncrementalReplication() throws Throwable { -List withClause = externalTableBasePathWithClause(); -WarehouseInstance.Tuple tuple = primary.dump(primaryDbName, withClause); +WarehouseInstance.Tuple tuple = primary.dump(primaryDbName); Review comment: with clause removed? 
## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java ## @@ -159,14 +154,14 @@ public void externalTableReplicationWithDefaultPaths() throws Throwable { .run("insert into table t2 partition(country='india') values ('bangalore')") .run("insert into table t2 partition(country='us') values ('austin')") .run("insert into table t2 partition(country='france') values ('paris')") -.dump(primaryDbName, withClauseOptions); +.dump(primaryDbName); Review comment: why is the with clause removed? ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTables.java ## @@ -623,7 +612,7 @@ public void bootstrapExternalTablesDuringIncrementalPhase() throws Throwable { assertFalse(primary.miniDFSCluster.getFileSystem() .exists(new Path(metadataPath + relativeExtInfoPath(null; -replica.load(replicatedDbName, primaryDbName, loadWithClause) +replica.load(replicatedDbName, primaryDbName) Review comment: with clause removed? ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -750,10 +752,19 @@ private void dumpTableListToDumpLocation(List tableList, Path dbRoot, St private List dirLocationsToCopy(List sourceLocations) throws HiveException
[GitHub] [hive] bmaidics opened a new pull request #1189: HIVE-23774: Reduce log level at aggrColStatsForPartitions in MetaStoreDirectSql.java
bmaidics opened a new pull request #1189: URL: https://github.com/apache/hive/pull/1189 This log is not needed at INFO log level. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kasakrisz opened a new pull request #1188: HIVE-19549: Enable TestAcidOnTez#testCtasTezUnion
kasakrisz opened a new pull request #1188: URL: https://github.com/apache/hive/pull/1188 Testing done: ``` mvn test -Dtest=TestAcidOnTez#testCtasTezUnion -pl itests/hive-unit -Pitests ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] adesh-rao commented on a change in pull request #1109: HIVE-22015: Add table constraints in CachedStore
adesh-rao commented on a change in pull request #1109: URL: https://github.com/apache/hive/pull/1109#discussion_r446900743 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java ## @@ -2497,26 +2599,82 @@ long getPartsFound() { @Override public List getPrimaryKeys(String catName, String dbName, String tblName) throws MetaException { -// TODO constraintCache -return rawStore.getPrimaryKeys(catName, dbName, tblName); +catName = normalizeIdentifier(catName); +dbName = StringUtils.normalizeIdentifier(dbName); +tblName = StringUtils.normalizeIdentifier(tblName); +if (!shouldCacheTable(catName, dbName, tblName) || (canUseEvents && rawStore.isActiveTransaction())) { + return rawStore.getPrimaryKeys(catName, dbName, tblName); +} + +Table tbl = sharedCache.getTableFromCache(catName, dbName, tblName); +if (tbl == null) { + // The table containing the primary keys is not yet loaded in cache + return rawStore.getPrimaryKeys(catName, dbName, tblName); +} +List keys = sharedCache.listCachedPrimaryKeys(catName, dbName, tblName); Review comment: Yes. While updating the cache, there is a possibility that the table got updated but the constraints didn't (they are yet to be updated). But this is similar to partition/columnStats caching. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] ayushtkn commented on pull request #1187: HIVE-23772. Reallocate calcite-core to prevent NoSuchFiledError.
ayushtkn commented on pull request #1187: URL: https://github.com/apache/hive/pull/1187#issuecomment-651055461 I have updated itests as well, since the change caused their compilation to fail. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] adesh-rao commented on a change in pull request #1109: HIVE-22015: Add table constraints in CachedStore
adesh-rao commented on a change in pull request #1109: URL: https://github.com/apache/hive/pull/1109#discussion_r446898404 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java ## @@ -543,10 +556,24 @@ static void prewarm(RawStore rawStore) { tableColStats = rawStore.getTableColumnStatistics(catName, dbName, tblName, colNames, CacheUtils.HIVE_ENGINE); Deadline.stopTimer(); } + Deadline.startTimer("getPrimaryKeys"); + rawStore.getPrimaryKeys(catName, dbName, tblName); + Deadline.stopTimer(); + Deadline.startTimer("getForeignKeys"); + rawStore.getForeignKeys(catName, null, null, dbName, tblName); + Deadline.stopTimer(); + Deadline.startTimer("getUniqueConstraints"); + rawStore.getUniqueConstraints(catName, dbName, tblName); + Deadline.stopTimer(); + Deadline.startTimer("getNotNullConstraints"); + rawStore.getNotNullConstraints(catName, dbName, tblName); + Deadline.stopTimer(); + // If the table could not cached due to memory limit, stop prewarm boolean isSuccess = sharedCache .populateTableInCache(table, tableColStats, partitions, partitionColStats, aggrStatsAllPartitions, Review comment: Done. Though the new class just contains constraint objects for now, we can have a separate refactoring JIRA for partition/column stats that also refactors the array created to store the size/dirtyCache variables. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
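A hypothetical sketch of the aggregate mentioned here (class and member names are assumptions, not the actual patch): bundling the four constraint lists into one holder lets populateTableInCache() take a single extra parameter instead of four.

```java
import java.util.List;
import org.apache.hadoop.hive.metastore.api.SQLForeignKey;
import org.apache.hadoop.hive.metastore.api.SQLNotNullConstraint;
import org.apache.hadoop.hive.metastore.api.SQLPrimaryKey;
import org.apache.hadoop.hive.metastore.api.SQLUniqueConstraint;

// Hypothetical holder; the real patch may structure or name this differently.
public class TableConstraints {
  private final List<SQLPrimaryKey> primaryKeys;
  private final List<SQLForeignKey> foreignKeys;
  private final List<SQLUniqueConstraint> uniqueConstraints;
  private final List<SQLNotNullConstraint> notNullConstraints;

  public TableConstraints(List<SQLPrimaryKey> primaryKeys, List<SQLForeignKey> foreignKeys,
      List<SQLUniqueConstraint> uniqueConstraints, List<SQLNotNullConstraint> notNullConstraints) {
    this.primaryKeys = primaryKeys;
    this.foreignKeys = foreignKeys;
    this.uniqueConstraints = uniqueConstraints;
    this.notNullConstraints = notNullConstraints;
  }

  public List<SQLPrimaryKey> getPrimaryKeys() { return primaryKeys; }
  public List<SQLForeignKey> getForeignKeys() { return foreignKeys; }
  public List<SQLUniqueConstraint> getUniqueConstraints() { return uniqueConstraints; }
  public List<SQLNotNullConstraint> getNotNullConstraints() { return notNullConstraints; }
}
```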
[GitHub] [hive] ayushtkn opened a new pull request #1187: HIVE-23772. Reallocate calcite-core to prevent NoSuchFiledError.
ayushtkn opened a new pull request #1187: URL: https://github.com/apache/hive/pull/1187 https://issues.apache.org/jira/browse/HIVE-23772 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary merged pull request #1168: HIVE-23738: DBLockManager::lock() : Move lock request to debug level
pvary merged pull request #1168: URL: https://github.com/apache/hive/pull/1168 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kishendas commented on a change in pull request #1095: Make all DDLs transactional and advance commit write ID, so that subsequent read will invalidate the cache entry.
kishendas commented on a change in pull request #1095: URL: https://github.com/apache/hive/pull/1095#discussion_r446851340 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java ## @@ -2351,7 +2351,6 @@ public static TableSnapshot getTableSnapshot(Configuration conf, } } - Review comment: Reverting the entire file. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kishendas commented on a change in pull request #1095: Make all DDLs transactional and advance commit write ID, so that subsequent read will invalidate the cache entry.
kishendas commented on a change in pull request #1095: URL: https://github.com/apache/hive/pull/1095#discussion_r446849934 ## File path: ql/src/java/org/apache/hadoop/hive/ql/Driver.java ## @@ -366,13 +368,16 @@ private void acquireLocks() throws CommandProcessorException { driverContext.getTxnManager().getTableWriteId(t.getDbName(), t.getTableName()); } - Review comment: Ok, I will revert the entire file. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kishendas commented on a change in pull request #1095: Make all DDLs transactional and advance commit write ID, so that subsequent read will invalidate the cache entry.
kishendas commented on a change in pull request #1095: URL: https://github.com/apache/hive/pull/1095#discussion_r446847170 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/partition/exchange/AlterTableExchangePartitionsDesc.java ## @@ -53,4 +54,20 @@ public Table getDestinationTable() { public Map getPartitionSpecs() { return partitionSpecs; } + + @Override + public void setWriteId(long writeId) { +this.writeId = writeId; + } + + @Override + public String getFullTableName() { +return null; Review comment: I was not sure what to return for the table name, so I disabled advancing the write ID, as this DDL involves two different tables. I will just revert the entire file until I figure this out. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1095: Make all DDLs transactional and advance commit write ID, so that subsequent read will invalidate the cache entry.
pvary commented on a change in pull request #1095: URL: https://github.com/apache/hive/pull/1095#discussion_r446845384 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java ## @@ -2351,7 +2351,6 @@ public static TableSnapshot getTableSnapshot(Configuration conf, } } - Review comment: There is one more extra line here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1095: Make all DDLs transactional and advance commit write ID, so that subsequent read will invalidate the cache entry.
pvary commented on a change in pull request #1095: URL: https://github.com/apache/hive/pull/1095#discussion_r446842444 ## File path: ql/src/java/org/apache/hadoop/hive/ql/Driver.java ## @@ -435,8 +439,14 @@ public void releaseLocksAndCommitOrRollback(boolean commit, HiveTxnManager txnMa } // If we've opened a transaction we need to commit or rollback rather than explicitly // releasing the locks. -driverContext.getConf().unset(ValidTxnList.VALID_TXNS_KEY); - driverContext.getConf().unset(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY); +// Unset all the keys +for (String key : new String[] { ValidTxnList.VALID_TXNS_KEY, ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY, +ValidTxnList.COMPACTOR_VALID_TXNS_ID_KEY, ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY }) { Review comment: Why do we need the COMPACTOR_VALID_TABLES_WRITEIDS_KEY in the conf? I would remove it if it is not strictly necessary. ## File path: ql/src/java/org/apache/hadoop/hive/ql/Driver.java ## @@ -366,13 +368,16 @@ private void acquireLocks() throws CommandProcessorException { driverContext.getTxnManager().getTableWriteId(t.getDbName(), t.getTableName()); } - Review comment: There is still one extra line here... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1095: Make all DDLs transactional and advance commit write ID, so that subsequent read will invalidate the cache entry.
pvary commented on a change in pull request #1095: URL: https://github.com/apache/hive/pull/1095#discussion_r446842752 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/RewriteSemanticAnalyzer.java ## @@ -405,11 +405,9 @@ protected void updateOutputs(Table targetTable) { */ private boolean isTargetTable(Entity entity, Table targetTable) { //todo: https://issues.apache.org/jira/browse/HIVE-15048 -/** - * is this the right way to compare? Should it just compare paths? - * equals() impl looks heavy weight - */ -return targetTable.equals(entity.getTable()); +// Since any DDL now advances the write id, we should ignore the write Id, +// while comparing two tables +return targetTable.equalsWithIgnoreWriteId(entity.getTable()); Review comment: Ok. Let's shelve this This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1095: Make all DDLs transactional and advance commit write ID, so that subsequent read will invalidate the cache entry.
pvary commented on a change in pull request #1095: URL: https://github.com/apache/hive/pull/1095#discussion_r446842042 ## File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/partition/exchange/AlterTableExchangePartitionsDesc.java ## @@ -53,4 +54,20 @@ public Table getDestinationTable() { public Map getPartitionSpecs() { return partitionSpecs; } + + @Override + public void setWriteId(long writeId) { +this.writeId = writeId; + } + + @Override + public String getFullTableName() { +return null; Review comment: Could we add a comment for this then, explaining why it is needed? This seems unintuitive. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org