kasakrisz commented on code in PR #4344: URL: https://github.com/apache/hive/pull/4344#discussion_r1219040515
########## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/parser/PartitionFilter.g4: ########## @@ -0,0 +1,149 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +grammar PartitionFilter; + +filter + : filterExpression + ; + +filterExpression + : conditionExpression #singleCondition + | LPAREN filterExpression RPAREN #parenedFilter + | left=filterExpression logicOperator right=filterExpression #binaryFilter Review Comment: 1. IIUC the shape of the tree defines the execution order of the operators. Is it valid to reshape it? 2. I haven't dig deeply into this area but I think It would be better to keep the visitor's scope just to build the initial tree and do optimizations in a different module/class as a second step. Like what Calcite and Hive Optimizer does. Also the expression comes from hs2 already went through optimizations. 3. I think one of the benefits of switching to antlr 4 is the simplified grammar. I like the grammar you proposed in this patch because it is already much simpler than the one in `Filter.g`. I think it worth exploit this advantage further. IMHO we can end up with a simpler code by enforcing the precedence via grammar rules overall. These rules doesn't seems to be complex nor the corresponding Visitor method bodies. 4. The grammar is easily extendable in the future if we want to support other operators. 5. Finally we can always add optimizations in follow-up patches if this part identified as a bottleneck. ########## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/parser/PartFilterParser.java: ########## @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.metastore.parser; + +import java.util.function.Function; + +import org.antlr.v4.runtime.*; +import org.antlr.v4.runtime.misc.Interval; +import org.antlr.v4.runtime.misc.ParseCancellationException; +import org.apache.hadoop.hive.metastore.api.MetaException; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class PartFilterParser { + private static final Logger LOG = LoggerFactory.getLogger(PartFilterParser.class); + + public static ExpressionTree parseFilter(String filter) throws MetaException { + return parse(filter, parser -> new PartFilterVisitor().visitFilter(parser.filter())); + } + + private static <T> T parse(String filter, Function<PartitionFilterParser, T> toResult) + throws MetaException { + LOG.debug("Parsing filter: " + filter); + CharStream upperCaseCharStream = new UpperCaseCharStream(CharStreams.fromString(filter)); + PartitionFilterLexer lexer = new PartitionFilterLexer(upperCaseCharStream); + lexer.removeErrorListeners(); + lexer.addErrorListener(ParseErrorListener.INSTANCE); + + CommonTokenStream tokenStream = new CommonTokenStream(lexer); + PartitionFilterParser parser = new PartitionFilterParser(tokenStream); + parser.removeErrorListeners(); + parser.addErrorListener(ParseErrorListener.INSTANCE); + + try { + return toResult.apply(parser); Review Comment: Is this `parse` method called other than the `parseFilter` method? If not what is the reason of passing `new PartFilterVisitor().visitFilter` as the `toResult` parameter instead of inline it to here? If we inline this call here the `parse` method could be public and the `parseFilter` could be removed. Sorry to bother with these questions but the code would be simpler with this change. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
