If you'd like to contribute a patch to Impala, but aren't sure what
you want to work on, you can look at Impala's newbie issues:
https://issues.apache.org/jira/issues/?filter=12341668. You can find
detailed instructions on submitting patches at
https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala.
This is a walkthrough of a ticket a new contributor could take on,
with enough detail, hopefully, to get you going but not so much that
it takes away the fun.

How can we fix https://issues.apache.org/jira/browse/IMPALA-5754,
"rand() algorithm is very non-random"? This is a partial walk-through
of how to get started.

Set up your development environment. Then, look for where we might
first write a failing test. The test case given in the ticket is
"select count(distinct(rand(867-5309))), count(*) from alltypes a,
alltypes b;". Tests that run a full query are considered "end-to-end
tests".
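
The idea behind the ticket's query can be tried outside Impala: draw N
values and compare the number of distinct values to N. The Python
sketch below is only an illustration and is not Impala's rand()
implementation. weak_rand() is a hypothetical generator with a
deliberately tiny state, included to show what a bad ratio looks like;
the seed -4442 is just 867-5309 evaluated.

```python
import random


def weak_rand(seed, n, modulus=1024):
    """Hypothetical poor generator: an LCG with only `modulus` states."""
    state = seed % modulus
    values = []
    for _ in range(n):
        # With so few possible states, the sequence repeats quickly.
        state = (state * 5 + 3) % modulus
        values.append(state / modulus)
    return values


def distinct_ratio(values):
    """Fraction of values that are distinct; near 1.0 is what we hope for."""
    return len(set(values)) / len(values)


n = 100_000
weak_ratio = distinct_ratio(weak_rand(-4442, n))

reference = random.Random(-4442)
reference_ratio = distinct_ratio([reference.random() for _ in range(n)])

print("weak generator:     ", weak_ratio)
print("reference generator:", reference_ratio)
```

A generator whose distinct ratio falls far below 1 is the kind of
behavior the ticket's query is meant to expose.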

End-to-end tests are defined in two kinds of files: .test files and .py files.

.test files contain queries and their expected results. For example:

====
---- QUERY
# Regression test for IMPALA-938
select smallint_col, int_col, (cast("1970-01-01" as timestamp) +
interval smallint_col days)
from functional.alltypes where smallint_col = 1 limit 1
---- RESULTS
1,1,1970-01-02 00:00:00
---- TYPES
smallint, int, timestamp
====
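
To make the layout concrete, here is a minimal Python sketch that
splits one test case into its named sections. It is not Impala's
actual parser; parse_test_case() is a hypothetical helper, and it
assumes only what the example above shows: "====" delimits a test case
and "---- NAME" starts a section.

```python
def parse_test_case(text):
    """Split one .test case into {section name: section body}."""
    sections = {}
    name = None
    for line in text.splitlines():
        if line.startswith("===="):
            # Test case delimiter; carries no content of its own.
            continue
        if line.startswith("---- "):
            name = line[len("---- "):].strip()
            sections[name] = []
        elif name is not None:
            sections[name].append(line)
    return {key: "\n".join(body) for key, body in sections.items()}


example = """====
---- QUERY
# Regression test for IMPALA-938
select smallint_col, int_col, (cast("1970-01-01" as timestamp) +
interval smallint_col days)
from functional.alltypes where smallint_col = 1 limit 1
---- RESULTS
1,1,1970-01-02 00:00:00
---- TYPES
smallint, int, timestamp
===="""

case = parse_test_case(example)
for section, body in case.items():
    print(section, "->", repr(body))
```

The QUERY section holds the SQL (including its comment line), RESULTS
holds the expected rows, and TYPES holds the expected column types.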

That is taken from
testdata/workloads/functional-query/queries/QueryTest/exprs.test.
That's a good test file to add a test case to, since it tests
"exprs", and the bug is in MathFunctions::Rand, which is defined in
be/src/exprs.

First, let's run all of the exprs tests to see that they pass. You can
see them called in tests/query_test/test_exprs.py. The Python scripts
in tests/ can run these .test files by calling ImpalaTestSuite's
run_test_case() method with an abbreviated name of the .test file. In
test_exprs.py, this looks like

self.run_test_case('QueryTest/exprs', vector)

That call is in the method TestExprs.test_exprs(); you can invoke it with:

./bin/impala-py.test tests/query_test/test_exprs.py::TestExprs::test_exprs --sanity

This should take about 40 seconds and should pass, indicated by an
exit status of 0 and a green line printed to the terminal reading:

...====== 1 passed in 39.85 seconds ======...

Now add a test case, following the example from the ticket and the
format in exprs.test. Run the test again; it should fail.

Fix the bug and run the test again. Once the test is passing, follow
the instructions on the wiki to send your patch for code review:
https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala
