zabetak commented on code in PR #45: URL: https://github.com/apache/hive-site/pull/45#discussion_r2018106251
########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test Review Comment: Consider renaming this section as "Query File Tests". These are not really unit tests. These are integration end-to-end tests. ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. 
See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments Review Comment: This should be part of the how to add a new test case section since they are relevant only for those adding/modifying tests. 
########## content/docs/latest/developerguide_27362074.md: ########## @@ -288,95 +251,6 @@ Then you can run '`build/dist/bin/hive`' and it will work against your local fil Hive uses [JUnit](http://junit.org/) for unit tests. Each of the 3 main components of Hive have their unit test implementations in the corresponding src/test directory e.g. trunk/metastore/src/test has all the unit tests for metastore, trunk/serde/src/test has all the unit tests for serde and trunk/ql/src/test has all the unit tests for the query processor. The metastore and serde unit tests provide the TestCase implementations for JUnit. The query processor tests on the other hand are generated using Velocity. The main directories under trunk/ql/src/test that contain these tests and the corresponding results are as follows: -* Test Queries: - + queries/clientnegative - This directory contains the query files (.q files) for the negative test cases. These are run through the CLI classes and therefore test the entire query processor stack. - + queries/clientpositive - This directory contains the query files (.q files) for the positive test cases. Thesre are run through the CLI classes and therefore test the entire query processor stack. - + qureies/positive (Will be deprecated) - This directory contains the query files (.q files) for the positive test cases for the compiler. These only test the compiler and do not run the execution code. - + queries/negative (Will be deprecated) - This directory contains the query files (.q files) for the negative test cases for the compiler. These only test the compiler and do not run the execution code. -* Test Results: - + results/clientnegative - The expected results from the queries in queries/clientnegative. - + results/clientpositive - The expected results from the queries in queries/clientpositive. - + results/compiler/errors - The expected results from the queries in queries/negative. 
- + results/compiler/parse - The expected Abstract Syntax Tree output for the queries in queries/positive. - + results/compiler/plan - The expected query plans for the queries in queries/positive. -* Velocity Templates to Generate the Tests: - + templates/TestCliDriver.vm - Generates the tests from queries/clientpositive. - + templates/TestNegativeCliDriver.vm - Generates the tests from queries/clientnegative. - + templates/TestParse.vm - Generates the tests from queries/positive. - + templates/TestParseNegative.vm - Generates the tests from queries/negative. - -### Running unit tests - -Ant to Maven - -As of version [0.13](https://issues.apache.org/jira/browse/HIVE-5107) Hive uses Maven instead of Ant for its build. The following instructions are not up to date. - -See the [Hive Developer FAQ]({{< ref "#hive-developer-faq" >}}) for updated instructions. - -Run all tests: - -``` -ant package test - -``` - -Run all positive test queries: - -``` -ant test -Dtestcase=TestCliDriver - -``` - -Run a specific positive test query: - -``` -ant test -Dtestcase=TestCliDriver -Dqfile=groupby1.q - -``` - -The above test produces the following files: - -* `build/ql/test/TEST-org.apache.hadoop.hive.cli.TestCliDriver.txt` - Log output for the test. This can be helpful when examining test failures. -* `build/ql/test/logs/groupby1.q.out` - Actual query result for the test. This result is compared to the expected result as part of the test. - -Run the set of unit tests matching a regex, e.g. partition_wise_fileformat tests 10-16: - -``` -ant test -Dtestcase=TestCliDriver -Dqfile_regex=partition_wise_fileformat1[0-6] - -``` - -Note that this option matches against the basename of the test without the .q suffix. - -Apparently the Hive tests do not run successfully after a clean unless you run `ant package` first. Not sure why build.xml doesn't encode this dependency. 
- -### Adding new unit tests - -Ant to Maven - -As of version [0.13](https://issues.apache.org/jira/browse/HIVE-5107) Hive uses Maven instead of Ant for its build. The following instructions are not up to date. - -See the [Hive Developer FAQ]({{< ref "#hive-developer-faq" >}}) for updated instructions. See also [Tips for Adding New Tests in Hive]({{< ref "tipsforaddingnewtests_27362060" >}}) and [How to Contribute: Add a Unit Test]({{< ref "#how-to-contribute:-add-a-unit-test" >}}). - -First, write a new myname.q in ql/src/test/queries/clientpositive. - -Then, run the test with the query and overwrite the result (useful when you add a new test). - -``` -ant test -Dtestcase=TestCliDriver -Dqfile=myname.q -Doverwrite=true - -``` - -Then we can create a patch by: - -``` -svn add ql/src/test/queries/clientpositive/myname.q ql/src/test/results/clientpositive/myname.q.out -svn diff > patch.txt - -``` - -Similarly, to add negative client tests, write a new query input file in ql/src/test/queries/clientnegative and run the same command, this time specifying the testcase name as TestNegativeCliDriver instead of TestCliDriver. Note that for negative client tests, the output file if created using the overwrite flag can be be found in the directory ql/src/test/results/clientnegative. - Review Comment: Consider keeping somewhere a brief mention about the negative tests. ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! 
qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers. + +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. 
+ +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 +``` + +### How to let Jenkins run specific drivers + +[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default. Review Comment: The mention of Jenkins and the precommit job here is irrelevant. The `testconfiguration.properties` file defines the default tests that are run for each driver, wherever the tests are run, unless `qfile`, `qfile_regex`, or another option is specified. ########## content/docs/latest/developerguide_27362074.md: ########## @@ -288,95 +251,6 @@ Then you can run '`build/dist/bin/hive`' and it will work against your local fil Hive uses [JUnit](http://junit.org/) for unit tests. Each of the 3 main components of Hive have their unit test implementations in the corresponding src/test directory e.g. trunk/metastore/src/test has all the unit tests for metastore, trunk/serde/src/test has all the unit tests for serde and trunk/ql/src/test has all the unit tests for the query processor. The metastore and serde unit tests provide the TestCase implementations for JUnit. The query processor tests on the other hand are generated using Velocity. The main directories under trunk/ql/src/test that contain these tests and the corresponding results are as follows: -* Test Queries: Review Comment: I think the following sentences are also obsolete and should be removed: ``` The query processor tests on the other hand are generated using Velocity. 
The main directories under trunk/ql/src/test that contain these tests and the corresponding results are as follows: ``` ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. 
+ +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. 
+ +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers. + +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. + +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 +``` + +### How to let Jenkins run specific drivers + +[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default. 
+ +| Driver | Query File's Location | +|-|-| +| TestMiniLlapLocalCliDriver | ql/src/test/queries/clientpositive | +| TestNegativeLlapLocalCliDriver | ql/src/test/queries/clientnegative | +| TestTezTPCDS30TBPerfCliDriver | ql/src/test/queries/clientpositive/perf | +| TestParseNegativeDriver | ql/src/test/queries/negative | +| TestAccumuloCliDriver | accumulo-handler/src/test/queries/positive | +| TestContribCliDriver | contrib/src/test/queries/clientpositive | +| TestContribNegativeCliDriver | contrib/src/test/queries/clientnegative | +| TestHBaseCliDriver | hbase-handler/src/test/queries/positive | +| TestHBaseNegativeCliDriver | hbase-handler/src/test/queries/negative | +| TestIcebergCliDriver | iceberg/iceberg-handler/src/test/queries/positive | +| TestIcebergNegativeCliDriver | iceberg/iceberg-handler/src/test/queries/negative | +| TestBlobstoreCliDriver | itests/hive-blobstore/src/test/queries/clientpositive | +| TestBlobstoreNegativeCliDriver | itests/hive-blobstore/src/test/queries/clientnegative | +| TestKuduCliDriver | kudu-handler/src/test/queries/positive | +| TestKuduNegativeCliDriver | kudu-handler/src/test/queries/negative | + +You can override the mapping through [itests/src/test/resources/testconfiguration.properties](https://github.com/apache/hive/blob/master/itests/src/test/resources/testconfiguration.properties). For example, if you want to test `ql/src/test/queries/clientpositive/aaa.q` not by LLAP but by Tez, you have to include the file name in `minitez.query.files` and generate the result file with `-Dtest=TestMiniLlapLocalCliDriver`. 
+ +| Driver | Query File's Location | Property | +|-|-|-| +| TestCliDriver | ql/src/test/queries/clientpositive | mr.query.files | +| TestMinimrCliDriver | ql/src/test/queries/clientpositive | minimr.query.files | +| TestMiniTezCliDriver | ql/src/test/queries/clientpositive | minitez.query.files, minitez.query.files.shared | +| TestMiniLlapCliDriver | ql/src/test/queries/clientpositive | minillap.query.files | +| TestMiniDruidCliDriver | ql/src/test/queries/clientpositive | druid.query.files | +| TestMiniDruidKafkaCliDriver | ql/src/test/queries/clientpositive | druid.kafka.query.files | +| TestMiniHiveKafkaCliDriver | ql/src/test/queries/clientpositive | hive.kafka.query.files | +| TestMiniLlapLocalCompactorCliDriver | ql/src/test/queries/clientpositive | compaction.query.files | +| TestEncryptedHDFSCliDriver | ql/src/test/queries/clientpositive | encrypted.query.files | +| TestBeeLineDriver | ql/src/test/queries/clientpositive | beeline.positive.include, beeline.query.files.shared | +| TestErasureCodingHDFSCliDriver | ql/src/test/queries/clientpositive | erasurecoding.only.query.files | +| MiniDruidLlapLocalCliDriver | ql/src/test/queries/clientpositive | druid.llap.local.query.files | +| TestNegativeLlapCliDriver | ql/src/test/queries/clientnegative | llap.query.negative.files | +| TestIcebergLlapLocalCliDriver | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.files | +| IcebergLlapLocalCompactorCliConfig | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.compactor.files | + +## Advanced + +### Locations of log files + +The Query Unit Test framework outputs log files in the following paths. + +- `itests/{qtest, qtest-accumulo, qtest-iceberg, qtest-kudu}/target/surefire-reports` +- From the root of the source tree: `find . -name hive.log` + + +### How do I run with Postgre/MySQL/Oracle? 
+ +To run a test with a specified DB, it is possible by adding the "-Dtest.metastore.db" parameter like in the following commands: + +```sh +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=postgres +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mssql +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mysql +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=oracle -Ditest.jdbc.jars=/path/to/your/god/damn/oracle/jdbc/driver/ojdbc6.jar +``` + +### Remote debug + +Remote debugging with Query Unit Test is a potent tool for debugging Hive. + +```sh +$ mvn -Dmaven.surefire.debug="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=localhost:5005" test -Dtest=TestCliDriver -Dqfile=<test>.q Review Comment: In most cases it suffices to just put `-Dmaven.surefire.debug` without a specific value. The default configuration works fine most of the time. ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. 
See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments Review Comment: Instead of saying "magic" I would rather use something like "Query file pre/post-processor commands". In addition, it would be nice to have a high-level description of what they do and a pointer to `QTestOptionHandler` interface for people interested in the full list. 
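As a rough illustration of the kind of high-level description I have in mind: the `qt:replace` directive quoted above appears to act like a regular-expression substitution on the query output, so that the masked text is what ends up in the `.q.out` file. The sketch below mimics that effect with `sed`. The `mask_date` function is a hypothetical name, and this is only an assumption about the observable behaviour, not how the framework implements it.

```sh
# Illustration only: approximates the effect of
#   --! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/
# mask_date is a hypothetical helper, not part of Hive.
mask_date() {
  # Keep the marker and the whitespace after it (group 1); replace the date.
  sed -E 's/(non-deterministic-output[[:space:]])[0-9]{4}-[0-9]{2}-[0-9]{2}/\1#Masked#/'
}

# Simulated query output: marker column, a tab, then a date that changes daily.
printf 'non-deterministic-output\t2025-03-28\n' | mask_date
```

Lines that do not contain the marker pass through unchanged, which is why the example query selects the literal `'non-deterministic-output'` next to `CURRENT_DATE()`.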
########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests Review Comment: Usually, I compile everything using a single command `mvn clean install -DskipTests -Pitests` ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. 
The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case Review Comment: I would first put a section on how to run the tests and then focus on how to add new ones. ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
--> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! 
qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers. + +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. + +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 +``` + +### How to let Jenkins run specific drivers + +[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default. 
+ +| Driver | Query File's Location | +|-|-| +| TestMiniLlapLocalCliDriver | ql/src/test/queries/clientpositive | +| TestNegativeLlapLocalCliDriver | ql/src/test/queries/clientnegative | +| TestTezTPCDS30TBPerfCliDriver | ql/src/test/queries/clientpositive/perf | +| TestParseNegativeDriver | ql/src/test/queries/negative | +| TestAccumuloCliDriver | accumulo-handler/src/test/queries/positive | +| TestContribCliDriver | contrib/src/test/queries/clientpositive | +| TestContribNegativeCliDriver | contrib/src/test/queries/clientnegative | +| TestHBaseCliDriver | hbase-handler/src/test/queries/positive | +| TestHBaseNegativeCliDriver | hbase-handler/src/test/queries/negative | +| TestIcebergCliDriver | iceberg/iceberg-handler/src/test/queries/positive | +| TestIcebergNegativeCliDriver | iceberg/iceberg-handler/src/test/queries/negative | +| TestBlobstoreCliDriver | itests/hive-blobstore/src/test/queries/clientpositive | +| TestBlobstoreNegativeCliDriver | itests/hive-blobstore/src/test/queries/clientnegative | +| TestKuduCliDriver | kudu-handler/src/test/queries/positive | +| TestKuduNegativeCliDriver | kudu-handler/src/test/queries/negative | + +You can override the mapping through [itests/src/test/resources/testconfiguration.properties](https://github.com/apache/hive/blob/master/itests/src/test/resources/testconfiguration.properties). For example, if you want to test `ql/src/test/queries/clientpositive/aaa.q` not by LLAP but by Tez, you have to include the file name in `minitez.query.files` and generate the result file with `-Dtest=TestMiniLlapLocalCliDriver`. 
+ +| Driver | Query File's Location | Property | +|-|-|-| +| TestCliDriver | ql/src/test/queries/clientpositive | mr.query.files | +| TestMinimrCliDriver | ql/src/test/queries/clientpositive | minimr.query.files | +| TestMiniTezCliDriver | ql/src/test/queries/clientpositive | minitez.query.files, minitez.query.files.shared | +| TestMiniLlapCliDriver | ql/src/test/queries/clientpositive | minillap.query.files | +| TestMiniDruidCliDriver | ql/src/test/queries/clientpositive | druid.query.files | +| TestMiniDruidKafkaCliDriver | ql/src/test/queries/clientpositive | druid.kafka.query.files | +| TestMiniHiveKafkaCliDriver | ql/src/test/queries/clientpositive | hive.kafka.query.files | +| TestMiniLlapLocalCompactorCliDriver | ql/src/test/queries/clientpositive | compaction.query.files | +| TestEncryptedHDFSCliDriver | ql/src/test/queries/clientpositive | encrypted.query.files | +| TestBeeLineDriver | ql/src/test/queries/clientpositive | beeline.positive.include, beeline.query.files.shared | +| TestErasureCodingHDFSCliDriver | ql/src/test/queries/clientpositive | erasurecoding.only.query.files | +| MiniDruidLlapLocalCliDriver | ql/src/test/queries/clientpositive | druid.llap.local.query.files | +| TestNegativeLlapCliDriver | ql/src/test/queries/clientnegative | llap.query.negative.files | +| TestIcebergLlapLocalCliDriver | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.files | +| IcebergLlapLocalCompactorCliConfig | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.compactor.files | + +## Advanced + +### Locations of log files + +The Query Unit Test framework outputs log files in the following paths. + +- `itests/{qtest, qtest-accumulo, qtest-iceberg, qtest-kudu}/target/surefire-reports` Review Comment: Worth mentioning that the `hive.log` and `surefire-reports` usually have overlapping content. 
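To make the overlap easy to check, the page could show how to list both locations in one go. A rough sketch, assuming a POSIX shell and that the tests have already been run; the helper name `list_qtest_logs` is hypothetical and not part of Hive, and it only wraps the `find` lookups the page already describes:

```sh
# Hypothetical helper, not part of Hive: list qtest log locations
# under a Hive source tree (defaults to the current directory).
list_qtest_logs() {
  root="${1:-.}"
  # Surefire report directories written by the itests/qtest* modules
  find "$root/itests" -type d -name surefire-reports 2>/dev/null
  # hive.log files anywhere in the source tree
  find "$root" -type f -name hive.log 2>/dev/null
}
```

Calling `list_qtest_logs /path/to/hive` after a test run should print the report directories first and the `hive.log` files second, which makes it simple to open both and compare their content.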
########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... 
+$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options
Review Comment: The command line options should be part of the section on how to run tests, not the other way around. The most notable ones to mention are:
* qfile
* qfile_regex
* test.output.overwrite
* test.metastore.db

########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. 
The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. 
+ +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers. 
+ +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. + +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 +``` + +### How to let Jenkins run specific drivers + +[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default. + +| Driver | Query File's Location | +|-|-| +| TestMiniLlapLocalCliDriver | ql/src/test/queries/clientpositive | +| TestNegativeLlapLocalCliDriver | ql/src/test/queries/clientnegative | +| TestTezTPCDS30TBPerfCliDriver | ql/src/test/queries/clientpositive/perf | +| TestParseNegativeDriver | ql/src/test/queries/negative | +| TestAccumuloCliDriver | accumulo-handler/src/test/queries/positive | +| TestContribCliDriver | contrib/src/test/queries/clientpositive | +| TestContribNegativeCliDriver | contrib/src/test/queries/clientnegative | +| TestHBaseCliDriver | hbase-handler/src/test/queries/positive | +| TestHBaseNegativeCliDriver | hbase-handler/src/test/queries/negative | +| TestIcebergCliDriver | iceberg/iceberg-handler/src/test/queries/positive | +| TestIcebergNegativeCliDriver | iceberg/iceberg-handler/src/test/queries/negative | +| TestBlobstoreCliDriver | itests/hive-blobstore/src/test/queries/clientpositive | +| TestBlobstoreNegativeCliDriver | itests/hive-blobstore/src/test/queries/clientnegative | +| TestKuduCliDriver | kudu-handler/src/test/queries/positive | +| TestKuduNegativeCliDriver | 
kudu-handler/src/test/queries/negative | + +You can override the mapping through [itests/src/test/resources/testconfiguration.properties](https://github.com/apache/hive/blob/master/itests/src/test/resources/testconfiguration.properties). For example, if you want to test `ql/src/test/queries/clientpositive/aaa.q` not by LLAP but by Tez, you have to include the file name in `minitez.query.files` and generate the result file with `-Dtest=TestMiniLlapLocalCliDriver`. + +| Driver | Query File's Location | Property | +|-|-|-| +| TestCliDriver | ql/src/test/queries/clientpositive | mr.query.files | +| TestMinimrCliDriver | ql/src/test/queries/clientpositive | minimr.query.files | +| TestMiniTezCliDriver | ql/src/test/queries/clientpositive | minitez.query.files, minitez.query.files.shared | +| TestMiniLlapCliDriver | ql/src/test/queries/clientpositive | minillap.query.files | +| TestMiniDruidCliDriver | ql/src/test/queries/clientpositive | druid.query.files | +| TestMiniDruidKafkaCliDriver | ql/src/test/queries/clientpositive | druid.kafka.query.files | +| TestMiniHiveKafkaCliDriver | ql/src/test/queries/clientpositive | hive.kafka.query.files | +| TestMiniLlapLocalCompactorCliDriver | ql/src/test/queries/clientpositive | compaction.query.files | +| TestEncryptedHDFSCliDriver | ql/src/test/queries/clientpositive | encrypted.query.files | +| TestBeeLineDriver | ql/src/test/queries/clientpositive | beeline.positive.include, beeline.query.files.shared | +| TestErasureCodingHDFSCliDriver | ql/src/test/queries/clientpositive | erasurecoding.only.query.files | +| MiniDruidLlapLocalCliDriver | ql/src/test/queries/clientpositive | druid.llap.local.query.files | +| TestNegativeLlapCliDriver | ql/src/test/queries/clientnegative | llap.query.negative.files | +| TestIcebergLlapLocalCliDriver | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.files | +| IcebergLlapLocalCompactorCliConfig | iceberg/iceberg-handler/src/test/queries/positive | 
iceberg.llap.query.compactor.files | + +## Advanced + +### Locations of log files + +The Query Unit Test framework outputs log files in the following paths. + +- `itests/{qtest, qtest-accumulo, qtest-iceberg, qtest-kudu}/target/surefire-reports` +- From the root of the source tree: `find . -name hive.log` + + +### How do I run with Postgre/MySQL/Oracle? + +To run a test with a specified DB, it is possible by adding the "-Dtest.metastore.db" parameter like in the following commands: + +```sh +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=postgres +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mssql +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mysql +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=oracle -Ditest.jdbc.jars=/path/to/your/god/damn/oracle/jdbc/driver/ojdbc6.jar +``` + +### Remote debug + +Remote debugging with Query Unit Test is a potent tool for debugging Hive. + +```sh +$ mvn -Dmaven.surefire.debug="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=localhost:5005" test -Dtest=TestCliDriver -Dqfile=<test>.q +``` + +### How do I modify the init script when testing? + +Hive 2.0 added the option to skip the init script or supply a custom init script (see HIVE-11538). + +To skip initialization: + +```sh +mvn test -Dtest=TestCliDriver -Phadoop-2 -Dqfile=test_to_run.q -DinitScript= +``` + +To supply a custom script: + +```sh +mvn test -Dtest=TestCliDriver -Phadoop-2 -Dtest.output.overwrite=true -Dqfile=test_to_run.q -DinitScript=custom_script.sql +``` Review Comment: I never used this option so I wouldn't mind dropping this section unless you find it useful. If we keep it should go together with the other command line options. 
########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... 
+$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. 
However, you must be in a different directory when using the following drivers. + +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. + +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 +``` + +### How to let Jenkins run specific drivers + +[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default. + +| Driver | Query File's Location | +|-|-| +| TestMiniLlapLocalCliDriver | ql/src/test/queries/clientpositive | +| TestNegativeLlapLocalCliDriver | ql/src/test/queries/clientnegative | +| TestTezTPCDS30TBPerfCliDriver | ql/src/test/queries/clientpositive/perf | +| TestParseNegativeDriver | ql/src/test/queries/negative | +| TestAccumuloCliDriver | accumulo-handler/src/test/queries/positive | +| TestContribCliDriver | contrib/src/test/queries/clientpositive | +| TestContribNegativeCliDriver | contrib/src/test/queries/clientnegative | +| TestHBaseCliDriver | hbase-handler/src/test/queries/positive | +| TestHBaseNegativeCliDriver | hbase-handler/src/test/queries/negative | +| TestIcebergCliDriver | iceberg/iceberg-handler/src/test/queries/positive | +| TestIcebergNegativeCliDriver | iceberg/iceberg-handler/src/test/queries/negative | +| TestBlobstoreCliDriver | itests/hive-blobstore/src/test/queries/clientpositive | +| TestBlobstoreNegativeCliDriver | itests/hive-blobstore/src/test/queries/clientnegative | +| TestKuduCliDriver | 
kudu-handler/src/test/queries/positive | +| TestKuduNegativeCliDriver | kudu-handler/src/test/queries/negative | + +You can override the mapping through [itests/src/test/resources/testconfiguration.properties](https://github.com/apache/hive/blob/master/itests/src/test/resources/testconfiguration.properties). For example, if you want to test `ql/src/test/queries/clientpositive/aaa.q` not by LLAP but by Tez, you have to include the file name in `minitez.query.files` and generate the result file with `-Dtest=TestMiniLlapLocalCliDriver`. + +| Driver | Query File's Location | Property | +|-|-|-| +| TestCliDriver | ql/src/test/queries/clientpositive | mr.query.files | +| TestMinimrCliDriver | ql/src/test/queries/clientpositive | minimr.query.files | +| TestMiniTezCliDriver | ql/src/test/queries/clientpositive | minitez.query.files, minitez.query.files.shared | +| TestMiniLlapCliDriver | ql/src/test/queries/clientpositive | minillap.query.files | +| TestMiniDruidCliDriver | ql/src/test/queries/clientpositive | druid.query.files | +| TestMiniDruidKafkaCliDriver | ql/src/test/queries/clientpositive | druid.kafka.query.files | +| TestMiniHiveKafkaCliDriver | ql/src/test/queries/clientpositive | hive.kafka.query.files | +| TestMiniLlapLocalCompactorCliDriver | ql/src/test/queries/clientpositive | compaction.query.files | +| TestEncryptedHDFSCliDriver | ql/src/test/queries/clientpositive | encrypted.query.files | +| TestBeeLineDriver | ql/src/test/queries/clientpositive | beeline.positive.include, beeline.query.files.shared | +| TestErasureCodingHDFSCliDriver | ql/src/test/queries/clientpositive | erasurecoding.only.query.files | +| MiniDruidLlapLocalCliDriver | ql/src/test/queries/clientpositive | druid.llap.local.query.files | +| TestNegativeLlapCliDriver | ql/src/test/queries/clientnegative | llap.query.negative.files | +| TestIcebergLlapLocalCliDriver | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.files | +| IcebergLlapLocalCompactorCliConfig | 
iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.compactor.files | + +## Advanced + +### Locations of log files + +The Query Unit Test framework outputs log files in the following paths. + +- `itests/{qtest, qtest-accumulo, qtest-iceberg, qtest-kudu}/target/surefire-reports` +- From the root of the source tree: `find . -name hive.log` + + +### How do I run with Postgre/MySQL/Oracle? + +To run a test with a specified DB, it is possible by adding the "-Dtest.metastore.db" parameter like in the following commands:
Review Comment: It's important to highlight here that we are talking about selecting the database backend for the **Hive Metastore**. We have other kinds of tests that use external databases, namely when external JDBC tables are involved, but those are a separate category of tests.
########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. 
+ +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! 
qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers. + +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. + +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 +``` + +### How to let Jenkins run specific drivers + +[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default. 
+ +| Driver | Query File's Location | +|-|-| +| TestMiniLlapLocalCliDriver | ql/src/test/queries/clientpositive | +| TestNegativeLlapLocalCliDriver | ql/src/test/queries/clientnegative | +| TestTezTPCDS30TBPerfCliDriver | ql/src/test/queries/clientpositive/perf | +| TestParseNegativeDriver | ql/src/test/queries/negative | +| TestAccumuloCliDriver | accumulo-handler/src/test/queries/positive | +| TestContribCliDriver | contrib/src/test/queries/clientpositive | +| TestContribNegativeCliDriver | contrib/src/test/queries/clientnegative | +| TestHBaseCliDriver | hbase-handler/src/test/queries/positive | +| TestHBaseNegativeCliDriver | hbase-handler/src/test/queries/negative | +| TestIcebergCliDriver | iceberg/iceberg-handler/src/test/queries/positive | +| TestIcebergNegativeCliDriver | iceberg/iceberg-handler/src/test/queries/negative | +| TestBlobstoreCliDriver | itests/hive-blobstore/src/test/queries/clientpositive | +| TestBlobstoreNegativeCliDriver | itests/hive-blobstore/src/test/queries/clientnegative | +| TestKuduCliDriver | kudu-handler/src/test/queries/positive | +| TestKuduNegativeCliDriver | kudu-handler/src/test/queries/negative | + +You can override the mapping through [itests/src/test/resources/testconfiguration.properties](https://github.com/apache/hive/blob/master/itests/src/test/resources/testconfiguration.properties). For example, if you want to test `ql/src/test/queries/clientpositive/aaa.q` not by LLAP but by Tez, you have to include the file name in `minitez.query.files` and generate the result file with `-Dtest=TestMiniLlapLocalCliDriver`. 
+ +| Driver | Query File's Location | Property | +|-|-|-| +| TestCliDriver | ql/src/test/queries/clientpositive | mr.query.files | +| TestMinimrCliDriver | ql/src/test/queries/clientpositive | minimr.query.files | +| TestMiniTezCliDriver | ql/src/test/queries/clientpositive | minitez.query.files, minitez.query.files.shared | +| TestMiniLlapCliDriver | ql/src/test/queries/clientpositive | minillap.query.files | +| TestMiniDruidCliDriver | ql/src/test/queries/clientpositive | druid.query.files | +| TestMiniDruidKafkaCliDriver | ql/src/test/queries/clientpositive | druid.kafka.query.files | +| TestMiniHiveKafkaCliDriver | ql/src/test/queries/clientpositive | hive.kafka.query.files | +| TestMiniLlapLocalCompactorCliDriver | ql/src/test/queries/clientpositive | compaction.query.files | +| TestEncryptedHDFSCliDriver | ql/src/test/queries/clientpositive | encrypted.query.files | +| TestBeeLineDriver | ql/src/test/queries/clientpositive | beeline.positive.include, beeline.query.files.shared | +| TestErasureCodingHDFSCliDriver | ql/src/test/queries/clientpositive | erasurecoding.only.query.files | +| MiniDruidLlapLocalCliDriver | ql/src/test/queries/clientpositive | druid.llap.local.query.files | +| TestNegativeLlapCliDriver | ql/src/test/queries/clientnegative | llap.query.negative.files | +| TestIcebergLlapLocalCliDriver | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.files | +| IcebergLlapLocalCompactorCliConfig | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.compactor.files | + +## Advanced + +### Locations of log files + +The Query Unit Test framework outputs log files in the following paths. + +- `itests/{qtest, qtest-accumulo, qtest-iceberg, qtest-kudu}/target/surefire-reports` +- From the root of the source tree: `find . -name hive.log` + + +### How do I run with Postgre/MySQL/Oracle? 
+ +To run a test with a specified DB, it is possible by adding the "-Dtest.metastore.db" parameter like in the following commands: + +```sh +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=postgres +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mssql +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mysql +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=oracle -Ditest.jdbc.jars=/path/to/your/god/damn/oracle/jdbc/driver/ojdbc6.jar Review Comment: In other parts of this doc we are not using the module selection feature (`-pl`) but tell users to navigate to the folder. To keep things more consistent I would pick one or the other and keep the same for the whole page. ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. 
Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. 
Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers. + +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. + +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 Review Comment: I wouldn't put `-Dtest.output.overwrite=true` everywhere assuming that we just want to run the test here and not overwrite its output. ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. 
See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... 
+$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. 
However, you must be in a different directory when using the following drivers. + +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. + +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 +``` + +### How to let Jenkins run specific drivers + +[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default. + +| Driver | Query File's Location | +|-|-| +| TestMiniLlapLocalCliDriver | ql/src/test/queries/clientpositive | +| TestNegativeLlapLocalCliDriver | ql/src/test/queries/clientnegative | +| TestTezTPCDS30TBPerfCliDriver | ql/src/test/queries/clientpositive/perf | +| TestParseNegativeDriver | ql/src/test/queries/negative | +| TestAccumuloCliDriver | accumulo-handler/src/test/queries/positive | +| TestContribCliDriver | contrib/src/test/queries/clientpositive | +| TestContribNegativeCliDriver | contrib/src/test/queries/clientnegative | +| TestHBaseCliDriver | hbase-handler/src/test/queries/positive | +| TestHBaseNegativeCliDriver | hbase-handler/src/test/queries/negative | +| TestIcebergCliDriver | iceberg/iceberg-handler/src/test/queries/positive | +| TestIcebergNegativeCliDriver | iceberg/iceberg-handler/src/test/queries/negative | +| TestBlobstoreCliDriver | itests/hive-blobstore/src/test/queries/clientpositive | +| TestBlobstoreNegativeCliDriver | itests/hive-blobstore/src/test/queries/clientnegative | +| TestKuduCliDriver | 
kudu-handler/src/test/queries/positive | +| TestKuduNegativeCliDriver | kudu-handler/src/test/queries/negative | + +You can override the mapping through [itests/src/test/resources/testconfiguration.properties](https://github.com/apache/hive/blob/master/itests/src/test/resources/testconfiguration.properties). For example, if you want to test `ql/src/test/queries/clientpositive/aaa.q` not by LLAP but by Tez, you have to include the file name in `minitez.query.files` and generate the result file with `-Dtest=TestMiniLlapLocalCliDriver`. + +| Driver | Query File's Location | Property | +|-|-|-| +| TestCliDriver | ql/src/test/queries/clientpositive | mr.query.files | +| TestMinimrCliDriver | ql/src/test/queries/clientpositive | minimr.query.files | +| TestMiniTezCliDriver | ql/src/test/queries/clientpositive | minitez.query.files, minitez.query.files.shared | +| TestMiniLlapCliDriver | ql/src/test/queries/clientpositive | minillap.query.files | +| TestMiniDruidCliDriver | ql/src/test/queries/clientpositive | druid.query.files | +| TestMiniDruidKafkaCliDriver | ql/src/test/queries/clientpositive | druid.kafka.query.files | +| TestMiniHiveKafkaCliDriver | ql/src/test/queries/clientpositive | hive.kafka.query.files | +| TestMiniLlapLocalCompactorCliDriver | ql/src/test/queries/clientpositive | compaction.query.files | +| TestEncryptedHDFSCliDriver | ql/src/test/queries/clientpositive | encrypted.query.files | +| TestBeeLineDriver | ql/src/test/queries/clientpositive | beeline.positive.include, beeline.query.files.shared | +| TestErasureCodingHDFSCliDriver | ql/src/test/queries/clientpositive | erasurecoding.only.query.files | +| MiniDruidLlapLocalCliDriver | ql/src/test/queries/clientpositive | druid.llap.local.query.files | +| TestNegativeLlapCliDriver | ql/src/test/queries/clientnegative | llap.query.negative.files | +| TestIcebergLlapLocalCliDriver | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.files | +| IcebergLlapLocalCompactorCliConfig | 
iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.compactor.files | + +## Advanced + +### Locations of log files + +The Query Unit Test framework outputs log files in the following paths. + +- `itests/{qtest, qtest-accumulo, qtest-iceberg, qtest-kudu}/target/surefire-reports` +- From the root of the source tree: `find . -name hive.log` + + +### How do I run with Postgre/MySQL/Oracle? + +To run a test with a specified DB, it is possible by adding the "-Dtest.metastore.db" parameter like in the following commands: + +```sh +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=postgres +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mssql +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=mysql +mvn test -Pitests -pl itests/qtest -Dtest=TestCliDriver -Dqfile=partition_params_postgres.q -Dtest.metastore.db=oracle -Ditest.jdbc.jars=/path/to/your/god/damn/oracle/jdbc/driver/ojdbc6.jar +``` + +### Remote debug + +Remote debugging with Query Unit Test is a potent tool for debugging Hive. + +```sh +$ mvn -Dmaven.surefire.debug="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=localhost:5005" test -Dtest=TestCliDriver -Dqfile=<test>.q +``` + +### How do I modify the init script when testing? + +Hive 2.0 added the option to skip the init script or supply a custom init script (see HIVE-11538). + +To skip initialization: + +```sh +mvn test -Dtest=TestCliDriver -Phadoop-2 -Dqfile=test_to_run.q -DinitScript= Review Comment: The `hadoop-2` profile is not used anymore so please remove it. Also consider using `TestMiniLlapLocalCliDriver` which is the default where most tests are run. 
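For illustration only, the revised command might look like the sketch below. It assumes `-DinitScript=` is honored the same way by `TestMiniLlapLocalCliDriver`, which is not verified here; `test_to_run.q` is the placeholder file name from the quoted doc, and the `driver`, `qfile`, and `cmd` variables are hypothetical names used only for this example. The snippet prints the command as a dry run instead of invoking Maven.

```sh
# Dry run: build the suggested command and print it rather than running Maven.
# Assumption: -DinitScript= (empty value) still skips the init script with this driver.
driver=TestMiniLlapLocalCliDriver   # default driver, per the comment above
qfile=test_to_run.q                 # placeholder query file name from the quoted doc
cmd="mvn test -Dtest=$driver -Dqfile=$qfile -DinitScript="   # no -Phadoop-2 profile
echo "$cmd"
```

Running the printed line from `itests/qtest` would match how the rest of the page invokes the drivers.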
########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... 
+$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] Review Comment: Worth highlighting somewhere that `TestMiniLlapLocalCliDriver` is the default driver and is used for most tests. 
########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} Review Comment: I really like the toc short code. I think we should apply it to all pages but this is out of scope for this PR. ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. --> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! 
qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases Review Comment: The most basic thing which is to run a single query file seems to be missing from this page. I think we should begin this doc from there. It's kinda present in `Verify the result file` section but it should be more explicit. ########## content/Development/qtest.md: ########## @@ -0,0 +1,224 @@ +--- +title: "QFile Query Unit Test" +date: 2025-03-28 +draft: false +aliases: [/qtest.html] +--- + +<!--- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. 
--> + +# QFile Query Unit Test + +QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output. + +{{< toc >}} + +## Tutorial: How to add a new test case + +### Preparation + +You have to compile Hive's source codes ahead of time. + +```sh +$ cd /path/to/hive +$ mvn clean install -DskipTests +$ cd itests +$ mvn clean install -DskipTests +``` + +### Add a QFile + +A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`. + +Let's say you created the following file. + +```sh +$ cat ql/src/test/queries/clientpositive/aaa.q +SELECT 1; +``` + +### Generate a result file + +You can generate the expected output using JUnit. + +```sh +$ cd /path/to/hive/itests/qtest +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q +... +$ cat ql/src/test/results/clientpositive/llap/aaa.q.out +PREHOOK: query: SELECT 1 +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +POSTHOOK: query: SELECT 1 +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +#### A masked pattern was here #### +1 +``` + +### Verify the result file + +You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q +``` + +## Helpful magic comments + +### Using test data + +Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets). + +```sql +--! qt:dataset:src +SELECT * FROM src; +``` + +### Mask non-deterministic outputs + +Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! 
qt:replace:`. + +For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable. + +```sql +--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/ +SELECT 'non-deterministic-output', CURRENT_DATE(); +``` + +## Commandline options + +### Run multiple test cases + +We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression. + +For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command. + +```sh +$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9] +``` + +### Test Iceberg, Accumulo, or Kudu + +Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers. + +| Driver | Directory | +|-|-| +| TestAccumuloCliDriver | itest/qtest-accumulo | +| TestIcebergCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg | +| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg | +| TestIcebergNegativeCliDriver | itest/qtest-iceberg | +| TestKuduCliDriver | itest/qtest-kudu | +| TestKuduNegativeCliDriver | itest/qtest-kudu | + +If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`. + +```sh +$ cd itest/qtest-iceberg +$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8 +``` + +### How to let Jenkins run specific drivers + +[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default. 

Review Comment:
Additionally, instead of putting an exhaustive list of the drivers and the config names here, I would rather put a pointer to the `CliConfigs` file and briefly explain where the mapping is defined.

##########
content/Development/qtest.md:
##########
@@ -0,0 +1,224 @@
+---
+title: "QFile Query Unit Test"
+date: 2025-03-28
+draft: false
+aliases: [/qtest.html]
+---
+
+<!---
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License. -->
+
+# QFile Query Unit Test
+
+QFile Query Unit Test is a JUnit-based integration test suite for Apache Hive. Developers write any SQL; the testing framework runs it and verifies the result and output.
+
+{{< toc >}}
+
+## Tutorial: How to add a new test case
+
+### Preparation
+
+You have to compile Hive's source codes ahead of time.
+
+```sh
+$ cd /path/to/hive
+$ mvn clean install -DskipTests
+$ cd itests
+$ mvn clean install -DskipTests
+```
+
+### Add a QFile
+
+A QFile includes a set of SQL statements that you want to test. Typically, we should put a new file in `ql/src/test/queries/clientpositive`.
+
+Let's say you created the following file.
+
+```sh
+$ cat ql/src/test/queries/clientpositive/aaa.q
+SELECT 1;
+```
+
+### Generate a result file
+
+You can generate the expected output using JUnit.
+
+```sh
+$ cd /path/to/hive/itests/qtest
+$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile=aaa.q
+...
+$ cat ql/src/test/results/clientpositive/llap/aaa.q.out
+PREHOOK: query: SELECT 1
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+#### A masked pattern was here ####
+POSTHOOK: query: SELECT 1
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+#### A masked pattern was here ####
+1
+```
+
+### Verify the result file
+
+You can ensure the generated result file is correct by rerunning the test case without `test.output.overwrite=true`.
+
+```sh
+$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=aaa.q
+```
+
+## Helpful magic comments
+
+### Using test data
+
+Adding `--! qt:dataset:{table name}`, Query Unit Test automatically sets up a test table. [You can find the table definitions here](https://github.com/apache/hive/tree/master/data/files/datasets).
+
+```sql
+--! qt:dataset:src
+SELECT * FROM src;
+```
+
+### Mask non-deterministic outputs
+
+Some test cases might generate non-deterministic results. You can mask the non-deterministic part using a special comment prefixed with `--! qt:replace:`.
+
+For example, the result of `CURRENT_DATE` changes every day. Using the comment, the output will be `non-deterministic-output #Masked#` , which is stable.
+
+```sql
+--! qt:replace:/(non-deterministic-output\s)[0-9]{4}-[0-9]{2}-[0-9]{2}/$1#Masked#/
+SELECT 'non-deterministic-output', CURRENT_DATE();
+```
+
+## Commandline options
+
+### Run multiple test cases
+
+We sometimes want to run multiple test cases in parallel. The `qfile_regex` option helps query relevant test cases using a regular expression.
+
+For example, if you wanted to regenerate the result files of `alter1.q`, `alter2.q`, and so on, you would trigger the following command.
+
+```sh
+$ mvn test -Dtest=TestMiniLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=alter[0-9]
+```
+
+### Test Iceberg, Accumulo, or Kudu
+
+Most test drivers are available in the `itest/qtest` directory. However, you must be in a different directory when using the following drivers.
+
+| Driver | Directory |
+|-|-|
+| TestAccumuloCliDriver | itest/qtest-accumulo |
+| TestIcebergCliDriver | itest/qtest-iceberg |
+| TestIcebergLlapLocalCliDriver | itest/qtest-iceberg |
+| TestIcebergLlapLocalCompactorCliDriver | itest/qtest-iceberg |
+| TestIcebergNegativeCliDriver | itest/qtest-iceberg |
+| TestKuduCliDriver | itest/qtest-kudu |
+| TestKuduNegativeCliDriver | itest/qtest-kudu |
+
+If you use TestIcebergLlapLocalCliDriver, you have to go to `itest/qtest-iceberg`.
+
+```sh
+$ cd itest/qtest-iceberg
+$ mvn test -Dtest=TestIcebergLlapLocalCliDriver -Dtest.output.overwrite=true -Dqfile_regex=iceberg_bucket_map_join_8
+```
+
+### How to let Jenkins run specific drivers
+
+[The hive-precommit Jenkins job](https://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/activity) uses the following drivers for each directory by default.
+
+| Driver | Query File's Location |
+|-|-|
+| TestMiniLlapLocalCliDriver | ql/src/test/queries/clientpositive |
+| TestNegativeLlapLocalCliDriver | ql/src/test/queries/clientnegative |
+| TestTezTPCDS30TBPerfCliDriver | ql/src/test/queries/clientpositive/perf |
+| TestParseNegativeDriver | ql/src/test/queries/negative |
+| TestAccumuloCliDriver | accumulo-handler/src/test/queries/positive |
+| TestContribCliDriver | contrib/src/test/queries/clientpositive |
+| TestContribNegativeCliDriver | contrib/src/test/queries/clientnegative |
+| TestHBaseCliDriver | hbase-handler/src/test/queries/positive |
+| TestHBaseNegativeCliDriver | hbase-handler/src/test/queries/negative |
+| TestIcebergCliDriver | iceberg/iceberg-handler/src/test/queries/positive |
+| TestIcebergNegativeCliDriver | iceberg/iceberg-handler/src/test/queries/negative |
+| TestBlobstoreCliDriver | itests/hive-blobstore/src/test/queries/clientpositive |
+| TestBlobstoreNegativeCliDriver | itests/hive-blobstore/src/test/queries/clientnegative |
+| TestKuduCliDriver | kudu-handler/src/test/queries/positive |
+| TestKuduNegativeCliDriver | kudu-handler/src/test/queries/negative |
+
+You can override the mapping through [itests/src/test/resources/testconfiguration.properties](https://github.com/apache/hive/blob/master/itests/src/test/resources/testconfiguration.properties). For example, if you want to test `ql/src/test/queries/clientpositive/aaa.q` not by LLAP but by Tez, you have to include the file name in `minitez.query.files` and generate the result file with `-Dtest=TestMiniLlapLocalCliDriver`.
+
+| Driver | Query File's Location | Property |
+|-|-|-|
+| TestCliDriver | ql/src/test/queries/clientpositive | mr.query.files |
+| TestMinimrCliDriver | ql/src/test/queries/clientpositive | minimr.query.files |
+| TestMiniTezCliDriver | ql/src/test/queries/clientpositive | minitez.query.files, minitez.query.files.shared |
+| TestMiniLlapCliDriver | ql/src/test/queries/clientpositive | minillap.query.files |
+| TestMiniDruidCliDriver | ql/src/test/queries/clientpositive | druid.query.files |
+| TestMiniDruidKafkaCliDriver | ql/src/test/queries/clientpositive | druid.kafka.query.files |
+| TestMiniHiveKafkaCliDriver | ql/src/test/queries/clientpositive | hive.kafka.query.files |
+| TestMiniLlapLocalCompactorCliDriver | ql/src/test/queries/clientpositive | compaction.query.files |
+| TestEncryptedHDFSCliDriver | ql/src/test/queries/clientpositive | encrypted.query.files |
+| TestBeeLineDriver | ql/src/test/queries/clientpositive | beeline.positive.include, beeline.query.files.shared |
+| TestErasureCodingHDFSCliDriver | ql/src/test/queries/clientpositive | erasurecoding.only.query.files |
+| MiniDruidLlapLocalCliDriver | ql/src/test/queries/clientpositive | druid.llap.local.query.files |
+| TestNegativeLlapCliDriver | ql/src/test/queries/clientnegative | llap.query.negative.files |
+| TestIcebergLlapLocalCliDriver | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.files |
+| IcebergLlapLocalCompactorCliConfig | iceberg/iceberg-handler/src/test/queries/positive | iceberg.llap.query.compactor.files |
+
+## Advanced
+
+### Locations of log files
+
+The Query Unit Test framework outputs log files in the following paths.
+
+- `itests/{qtest, qtest-accumulo, qtest-iceberg, qtest-kudu}/target/surefire-reports`
+- From the root of the source tree: `find . -name hive.log`
+
+
+### How do I run with Postgre/MySQL/Oracle?
+
+To run a test with a specified DB, it is possible by adding the "-Dtest.metastore.db" parameter like in the following commands:

Review Comment:
Postgre -> PostgreSQL