zratkai commented on code in PR #45:
URL: https://github.com/apache/hive-site/pull/45#discussion_r2018313540

##########
content/Development/qtest.md:
##########
@@ -0,0 +1,224 @@
+---
+title: "QFile Query Unit Test"
+date: 2025-03-28
+draft: false
+aliases: [/qtest.html]
+---
+
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements. See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership. The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied. See the License for the
+  specific language governing permissions and limitations
+  under the License. -->
+
+# QFile Query Unit Test
+
+QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output.
+
+{{< toc >}}
+
+## Tutorial: How to add a new test case
+
+### Preparation
+
+You have to compile Hive's source codes ahead of time.
+
+```sh
+$ cd /path/to/hive
+$ mvn clean install -DskipTests
+$ cd itests
+$ mvn clean install -DskipTests
+```
+
+### Add a QFile
+
+A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`.
+
+Let's say you created the following file.
+
+```sh
+$ cat ql/src/test/queries/clientpositive/aaa.q
+SELECT 1;
+```
+
+### Generate a result file
+
+You can generate the expected output using JUnit.
+
+```sh
+$ cd /path/to/hive/itests/qtest
+$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q
+...
+$ cat ql/src/test/results/clientpositive/llap/aaa.q.out
+PREHOOK: query: SELECT 1
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+#### A masked pattern was here ####
+POSTHOOK: query: SELECT 1
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+#### A masked pattern was here ####
+1
+```
+
+### Verify the result file
+
+You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`.
+
+```sh
+$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q
+```
+
+## Helpful magic comments
+
+### Using test data
+
+Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets).
+
+```sql
+--! qt:dataset:src
+SELECT * FROM src;
+```
+
+### Mask non-deterministic outputs
+
+Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`.
+
+For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable.
+
+```sql
+--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/
+SELECT 'non-deterministic-output', CURRENT_DATE();
+```
+
+## Commandline options
+
+### Run multiple test cases
+
+We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression.
+
+For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command.
+
+```sh
+$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9]
+```
+
+### Test Iceberg, Accumulo, or Kudu
+
+Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers.
+
+| Driver | Directory |
+|-|-|
+| TestAccumuloCliDriver | itest/qtest-accumulo |
+| TestIcebergCliDriver | itest/qtest-iceberg |
+| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg |
+| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg |
+| TestIcebergNegativeCliDriver | itest/qtest-iceberg |
+| TestKuduCliDriver | itest/qtest-kudu |
+| TestKuduNegativeCliDriver | itest/qtest-kudu |
+
+If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`.
+
+```sh
+$ cd itest/qtest-iceberg
+$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8
+```
+
+### How to let Jenkins run specific drivers
+
+[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default.

Review Comment:
   I agree with @zabetak: `CliConfigs` is not mentioned here, and I also recommend adding it.
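
As a side illustration of the `--! qt:replace:` directive quoted in the diff, the substitution it describes can be approximated outside Hive with `sed`. This is only a rough sketch under stated assumptions: `sed` stands in for whatever replacement logic the test framework actually uses (the diff does not show it), the function name `mask_date` is hypothetical, and `[[:space:]]` and `\1` are used in place of `\s` and `$1` so the pattern works with POSIX extended regular expressions.

```sh
#!/bin/sh
# Hypothetical helper: masks a yyyy-mm-dd date that follows the literal
# marker "non-deterministic-output", mimicking the qt:replace pattern
# from the quoted page. This is not Hive's own implementation.
mask_date() {
  sed -E 's/(non-deterministic-output[[:space:]])[0-9]{4}-[0-9]{2}-[0-9]{2}/\1#Masked#/'
}

# Sample line standing in for one row of query output.
printf 'non-deterministic-output 2025-03-28\n' | mask_date
# prints: non-deterministic-output #Masked#
```

Lines that do not contain the marker followed by a date pass through unchanged, which is why the rest of a result file would stay comparable between runs.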