[jira] [Created] (IMPALA-8023) Fix PlannerTest to handle error lines consistently
Paul Rogers created IMPALA-8023:
---
Summary: Fix PlannerTest to handle error lines consistently
Key: IMPALA-8023
URL: https://issues.apache.org/jira/browse/IMPALA-8023
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers

{{PlannerTest}} works by running a query from a .test file, generating a plan, and comparing that plan to a "golden" expected result. It works well for most cases. We can use Eclipse's diff tools to compare the actual file with the expected file, and to copy across any expected changes that result from changes to the planner code.

One case that does *not* work is exceptions. When PlannerTest encounters a failure, it emits a line such as the following to the actual results file:

{noformat}
org.apache.impala.common.NotImplementedException: Scan of table 't' in format 'RC_FILE' is not supported because the table has a column 's' with a complex type 'STRUCT'.
{noformat}

Yet, in order for the comparison to pass, the golden file must contain the error in the following form:

{noformat}
NotImplementedException: Scan of table 'functional.complextypes_fileformat' in format 'TEXT' is not supported because the table has a column 's' with a complex type 'STRUCT'.
{noformat}

Note that while the actual output includes the package prefix, the expected error must *not* include that prefix. The result is that:
* When comparing files, one must learn to ignore the differences between these lines: the differences are *not* the reason why a test might fail, and
* When "rebasing" a file, one must copy all expected changes *except* the error lines.

In short, this is a real nuisance. Use a filter mechanism to fix this once and for all.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
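One possible shape for such a filter, as a minimal sketch (the class and method names are hypothetical, not PlannerTest's actual API): strip the Java package prefix from exception lines before comparing actual output against the golden file.

```java
public class ErrorLineFilter {
    // Strip a leading Java package prefix (e.g. "org.apache.impala.common.")
    // from an exception line so the actual output matches the golden form,
    // which omits the prefix. Non-exception lines pass through unchanged.
    static String normalize(String line) {
        return line.replaceAll("^([a-z][\\w$]*\\.)+(?=[A-Z]\\w*Exception\\b)", "");
    }

    public static void main(String[] args) {
        System.out.println(normalize(
            "org.apache.impala.common.NotImplementedException: Scan of table 't' is not supported"));
        // NotImplementedException: Scan of table 't' is not supported
    }
}
```

Applying such a normalization to both files before diffing would make error lines compare cleanly and copy across during a rebase like any other line.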
[jira] [Created] (IMPALA-8024) HBase table cardinality estimates are wrong
Paul Rogers created IMPALA-8024:
---
Summary: HBase table cardinality estimates are wrong
Key: IMPALA-8024
URL: https://issues.apache.org/jira/browse/IMPALA-8024
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers

IMPALA-8021 added cardinality estimates to EXPLAIN plan output. Running some of our {{PlannerTest}} files revealed that our HBase cardinality estimates are very poor, even for our simple test tables. For example, for {{functional_hbase.alltypessmall}}, {{count\(*)}} tells us that there are 100 rows:

{noformat}
select count(*) from functional_hbase.alltypessmall
+----------+
| count(*) |
+----------+
| 100      |
+----------+
{noformat}

Table stats claim that there are only 60 rows:

{noformat}
show table stats functional_hbase.alltypessmall;
+-----------------+--------------+------------+------+
| Region Location | Start RowKey | Est. #Rows | Size |
+-----------------+--------------+------------+------+
| localhost       |              | 10         | 0B   |
| localhost       | 1            | 10         | 0B   |
| localhost       | 3            | 10         | 0B   |
| localhost       | 5            | 10         | 0B   |
| localhost       | 7            | 10         | 0B   |
| localhost       | 9            | 10         | 0B   |
| Total           |              | 60         | 0B   |
+-----------------+--------------+------------+------+
{noformat}

The NDV stats show that there must be at least 100 rows:

{noformat}
show column stats functional_hbase.alltypessmall
+---------------+-----------+------------------+--------+----------+----------+
| Column        | Type      | #Distinct Values | #Nulls | Max Size | Avg Size |
+---------------+-----------+------------------+--------+----------+----------+
| id            | INT       | 99               | 0      | 4        | 4        |
...
| timestamp_col | TIMESTAMP | 100              | 0      | 16       | 16       |
...
+---------------+-----------+------------------+--------+----------+----------+
{noformat}

Worse, the planner, the most critical consumer of these stats, thinks there are only 50 rows:

{noformat}
select * from functional.alltypesagg join functional_hbase.alltypessmall using (id, int_col)

|--01:SCAN HBASE [functional_hbase.alltypessmall]
|     row-size=89B cardinality=50
{noformat}

We need a more reliable estimate.
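The reasoning above, that a column's NDV can never exceed the row count, gives a cheap sanity bound. A sketch of that idea (the method and its inputs are hypothetical, not Impala's actual planner API):

```java
import java.util.Map;

public class CardinalityBound {
    // The number of distinct values in any column cannot exceed the number of
    // rows, so the largest per-column NDV is a lower bound on table cardinality.
    static long ndvLowerBound(Map<String, Long> ndvByColumn) {
        return ndvByColumn.values().stream()
            .mapToLong(Long::longValue)
            .max()
            .orElse(0);
    }

    public static void main(String[] args) {
        // Stats from functional_hbase.alltypessmall above:
        // id has NDV 99, timestamp_col has NDV 100.
        long bound = ndvLowerBound(Map.of("id", 99L, "timestamp_col", 100L));
        System.out.println(bound); // 100
    }
}
```

Clamping the region-based estimate (60) or the planner's estimate (50) to this bound would at least prevent estimates below what the column stats prove is present.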
[jira] [Created] (IMPALA-8026) Actual row counts for nested loop join are meaningless
Paul Rogers created IMPALA-8026:
---
Summary: Actual row counts for nested loop join are meaningless
Key: IMPALA-8026
URL: https://issues.apache.org/jira/browse/IMPALA-8026
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers

Consider this extract from a query plan:

{noformat}
Operator               #Rows  Est. #Rows
----------------------------------------
...
| 10:HASH JOIN          9.53M     18.14K
| |--19:EXCHANGE            1          1
| |  00:SCAN HDFS           1          1
| 06:NESTED LOOP JOIN   4.88B    863.84K
| |--18:EXCHANGE            1          1
| |  04:SCAN HDFS           1          1
| 05:HASH JOIN          9.53M    863.84K
{noformat}

If the above is to be believed, the 06 nested loop join produced nearly 5 billion rows. But that number is far too large: joining 1 row with 10 million rows cannot produce 500 times that many rows. It appears that the nested loop join actually processed and returned 9.5 million rows, since that is the number produced by the 10 hash join, which joins a single row with the output of the nested loop join. Because this same bogus result appears across multiple plans, it is likely that the reported actual count is simply wrong and bears no relation to the number of rows actually returned.
[jira] [Resolved] (IMPALA-7896) Literals should not need explicit analyze step
[ https://issues.apache.org/jira/browse/IMPALA-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers resolved IMPALA-7896.
---
Resolution: Fixed

> Literals should not need explicit analyze step
> ---
>
> Key: IMPALA-7896
> URL: https://issues.apache.org/jira/browse/IMPALA-7896
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> The Impala FE has the concept of a _literal_ (string, boolean, null, number).
> Originally, literals could only be created as part of the AST. Hence, all
> literals are subclasses of {{LiteralExpr}}, which are {{ExprNodes}}. The
> analysis step is used to set the type of literal numbers when it is not known
> at create time. If literals were used only in the AST, this would be fine:
> they could be analyzed with an analyzer.
> In fact, as the code has evolved, {{LiteralExpr}} nodes are also created via
> the catalog, which has no analyzer. To fudge the issue, the
> {{LiteralExpr.create()}} function does analysis with a null analyzer. This,
> in turn, means that the {{analyze()}} code must special-case a null
> analyzer, which leads to brittle, error-prone code.
> Since literals are immutable (except, sadly, for type), it is better that
> they start out analyzed. Since the only attribute which must be set is the
> type, and the type can be known at create time, we can make {{analyze()}} an
> optional no-op, leading to cleaner semantics.
[jira] [Resolved] (IMPALA-7888) Incorrect NumericLiteral overflow checks for FLOAT, DOUBLE
[ https://issues.apache.org/jira/browse/IMPALA-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers resolved IMPALA-7888.
---
Resolution: Fixed

> Incorrect NumericLiteral overflow checks for FLOAT, DOUBLE
> ---
>
> Key: IMPALA-7888
> URL: https://issues.apache.org/jira/browse/IMPALA-7888
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> Consider the following (new) unit test:
> {code:java}
> assertFalse(NumericLiteral.isOverflow(BigDecimal.ZERO, Type.FLOAT));
> {code}
> This test fails; that is, the method claims that the value zero overflows a
> FLOAT.
> The reason is a misunderstanding of the meaning of {{MIN_VALUE}} for Float:
> {code:java}
> case FLOAT:
>   return (value.compareTo(BigDecimal.valueOf(Float.MAX_VALUE)) > 0 ||
>       value.compareTo(BigDecimal.valueOf(Float.MIN_VALUE)) < 0);
> {code}
> For Float, {{MIN_VALUE}} is the smallest positive number that Float can
> represent:
> {code:java}
> public static final float MIN_VALUE = 0x0.000002P-126f; // 1.4e-45f
> {code}
> The value that the Impala code wants to check is {{-Float.MAX_VALUE}}.
> The only reason that this is not marked as more serious is that the method
> appears to be used in only one place, and that place does not use {{FLOAT}}
> values.
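A corrected version of the check, as a standalone sketch (the method name is paraphrased from the snippet above, not copied from Impala's source):

```java
import java.math.BigDecimal;

public class FloatOverflow {
    // A value overflows FLOAT only when it lies outside
    // [-Float.MAX_VALUE, Float.MAX_VALUE]. Float.MIN_VALUE is the smallest
    // positive float, not the lower bound, so the original comparison
    // wrongly flagged zero (and every small value) as overflow.
    static boolean isFloatOverflow(BigDecimal value) {
        return value.compareTo(BigDecimal.valueOf(Float.MAX_VALUE)) > 0
            || value.compareTo(BigDecimal.valueOf(-Float.MAX_VALUE)) < 0;
    }

    public static void main(String[] args) {
        System.out.println(isFloatOverflow(BigDecimal.ZERO));        // false
        System.out.println(isFloatOverflow(new BigDecimal("1e40"))); // true
    }
}
```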
[jira] [Resolved] (IMPALA-7887) NumericLiteral fails to detect numeric overflow
[ https://issues.apache.org/jira/browse/IMPALA-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers resolved IMPALA-7887.
---
Resolution: Fixed

> NumericLiteral fails to detect numeric overflow
> ---
>
> Key: IMPALA-7887
> URL: https://issues.apache.org/jira/browse/IMPALA-7887
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> The {{NumericLiteral}} constructor takes a value and a type. The code does
> not check that the value is within range of the type, allowing nonsensical
> values:
> {code:java}
> NumericLiteral n = new NumericLiteral(new BigDecimal("123.45"),
>     ScalarType.createDecimalType(3, 1));
> System.out.println(n.getValue().toString());
> n = new NumericLiteral(new BigDecimal(Integer.MAX_VALUE),
>     Type.TINYINT);
> System.out.println(n.getValue().toString());
> {code}
> Prints:
> {noformat}
> 123.45
> 2147483647
> {noformat}
> The value 123.45 is not valid for DECIMAL(3,1), nor is 2^31 - 1 valid for
> TINYINT.
> According to the SQL-2016 standard, section 4.4:
> bq. If an assignment of some number would result in a loss of its most significant digit, an exception condition is raised.
> The purpose of the constructor appears to be for "friendly" use where the
> caller promises not to create incorrect literals. Better would be to enforce
> this rule.
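A range check of the kind proposed could look like the following sketch, shown for TINYINT only; the method and its placement are hypothetical, not Impala's actual code:

```java
import java.math.BigDecimal;

public class RangeCheck {
    // Reject values outside the type's range instead of silently accepting
    // them, per SQL-2016 section 4.4. TINYINT is an 8-bit signed integer,
    // so its range matches Java's byte: [-128, 127].
    static void checkTinyInt(BigDecimal value) {
        if (value.compareTo(BigDecimal.valueOf(Byte.MAX_VALUE)) > 0
            || value.compareTo(BigDecimal.valueOf(Byte.MIN_VALUE)) < 0) {
            throw new IllegalArgumentException(
                "Value " + value + " out of range for TINYINT");
        }
    }

    public static void main(String[] args) {
        checkTinyInt(BigDecimal.valueOf(127)); // in range, no exception
        try {
            checkTinyInt(BigDecimal.valueOf(Integer.MAX_VALUE));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Calling such a check from the constructor would turn the nonsensical literals above into immediate errors.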
[jira] [Resolved] (IMPALA-7886) NumericLiteral constructor fails to round values to Decimal type
[ https://issues.apache.org/jira/browse/IMPALA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers resolved IMPALA-7886.
---
Resolution: Fixed

> NumericLiteral constructor fails to round values to Decimal type
> ---
>
> Key: IMPALA-7886
> URL: https://issues.apache.org/jira/browse/IMPALA-7886
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> According to the SQL spec, section 4.4, regarding numeric assignment:
> bq. If least significant digits are lost, implementation defined rounding or truncating occurs, with no exception condition being raised
> When creating a {{NumericLiteral}} via the constructor, no rounding occurs.
> The value has more precision than specified by the type.
> Create a NumericLiteral and test it as follows:
> {code:java}
> NumericLiteral n = new NumericLiteral(new BigDecimal("1.567"),
>     ScalarType.createDecimalType(2, 1));
> assertEquals(ScalarType.createDecimalType(2, 1), n.getType());
> assertEquals("1.6", n.getValue().toString());
> {code}
> The above test fails because the value in the literal is "1.567"; it has not
> been rounded to fit the type of the literal as required by the SQL standard.
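The rounding the test expects is exactly what {{BigDecimal.setScale}} with a rounding mode provides; a minimal sketch of applying it (the helper is hypothetical, not the actual constructor code):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalRound {
    // Round a literal's value to the scale of its DECIMAL(precision, scale)
    // type, as the SQL spec permits when least-significant digits are lost.
    static BigDecimal roundToScale(BigDecimal value, int scale) {
        return value.setScale(scale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // For DECIMAL(2,1), "1.567" should round to "1.6".
        System.out.println(roundToScale(new BigDecimal("1.567"), 1)); // 1.6
    }
}
```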
[jira] [Resolved] (IMPALA-7891) Analyzer does not detect numeric overflow in CAST
[ https://issues.apache.org/jira/browse/IMPALA-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers resolved IMPALA-7891.
---
Resolution: Fixed

> Analyzer does not detect numeric overflow in CAST
> ---
>
> Key: IMPALA-7891
> URL: https://issues.apache.org/jira/browse/IMPALA-7891
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> Consider the following SQL:
> {code:sql}
> SELECT CAST(257 AS TINYINT) AS c FROM functional.alltypestiny
> {code}
> Running this in the shell produces:
> {noformat}
> +----------------------+
> | cast(257 as tinyint) |
> +----------------------+
> | 1                    |
> +----------------------+
> {noformat}
> The SQL-2016 standard, section 4.4, states:
> bq. If an assignment of some number would result in a loss of its most significant digit, an exception condition is raised.
> An error was expected rather than a wrong result.
> This is not as simple as it appears. The BE is written in C++, which does not
> detect integer overflow. So, one could argue that the behavior is correct:
> Impala makes no guarantees about integer overflow.
> On the other hand, the math above is actually done in the planner; the
> serialized plan contains an incorrect value. One could argue that the planner
> should be more strict than the runtime, so that this is an error.
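The wrong result comes from two's-complement truncation when the constant is narrowed to 8 bits; the same effect, and the strict check the standard calls for, can be reproduced in plain Java:

```java
public class CastWrap {
    public static void main(String[] args) {
        // 257 = 0x101; narrowing to 8 bits keeps only the low byte, 0x01.
        byte b = (byte) 257;
        System.out.println(b); // 1

        // A strict planner would raise an error instead of wrapping:
        int v = 257;
        if (v > Byte.MAX_VALUE || v < Byte.MIN_VALUE) {
            System.out.println("out of range for TINYINT");
        }
    }
}
```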
[jira] [Resolved] (IMPALA-7894) Parser does not catch double overflow
[ https://issues.apache.org/jira/browse/IMPALA-7894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers resolved IMPALA-7894.
---
Resolution: Fixed

> Parser does not catch double overflow
> ---
>
> Key: IMPALA-7894
> URL: https://issues.apache.org/jira/browse/IMPALA-7894
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
>
> The test case {{ParserTest.TestNumericLiteralMinMaxValues()}} incorrectly
> expects success on {{DOUBLE}} constant overflow and underflow:
> {code:java}
> ParsesOk(String.format("select %s1", Double.toString(Double.MIN_VALUE)));
> ParsesOk(String.format("select %s1", Double.toString(Double.MAX_VALUE)));
> {code}
> These values are actually out of range as tested by
> {{NumericLiteral.analyzeImpl()}}. However, the parser uses a code path that
> bypasses this check.
> Expected that the above tests will fail, not succeed.
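The overflow case is easy to detect after parsing, because Java's parser does not throw on overflow but silently returns infinity. A sketch of such a check (not the actual {{NumericLiteral.analyzeImpl()}} logic):

```java
public class DoubleOverflow {
    // Double.parseDouble() returns +/-Infinity on overflow rather than
    // throwing; an explicit isInfinite() check turns that into an error.
    static boolean overflowsDouble(String literal) {
        return Double.isInfinite(Double.parseDouble(literal));
    }

    public static void main(String[] args) {
        // Appending a digit to Double.MAX_VALUE's string, as the parser test
        // does with "%s1", multiplies the exponent text and overflows.
        System.out.println(overflowsDouble(Double.toString(Double.MAX_VALUE) + "1")); // true
        System.out.println(overflowsDouble("1.0e308")); // false
    }
}
```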
[jira] [Created] (IMPALA-8027) KRPC datastream timing out on both the receiver and sender side even in a minicluster
Bikramjeet Vig created IMPALA-8027:
---
Summary: KRPC datastream timing out on both the receiver and sender side even in a minicluster
Key: IMPALA-8027
URL: https://issues.apache.org/jira/browse/IMPALA-8027
Project: IMPALA
Issue Type: Bug
Components: Distributed Exec
Affects Versions: Impala 3.2.0
Reporter: Bikramjeet Vig
Assignee: Michael Ho

KRPC datastreams seem to time out at the same time on both the sender and receiver side, causing two running queries to fail. This happened while running core tests on S3.

Logs from coordinator:
{noformat}
I1228 05:18:56.202587 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2
I1228 05:18:56.203061 13396 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:53118 (request call id 11274) took 120782ms. Request Metrics: {}
I1228 05:18:56.203114 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2
I1228 05:18:56.203136 13396 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:53110 (request call id 8637) took 123811ms. Request Metrics: {}
I1228 05:18:56.203155 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2
I1228 05:18:56.203167 13396 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:53118 (request call id 11273) took 123776ms. Request Metrics: {}
I1228 05:18:56.203181 13396 krpc-data-stream-mgr.cc:408] Reduced stream ID cache from 413 items, to 410, eviction took: 1ms
I1228 05:18:56.204746 13377 coordinator.cc:707] Backend completed: host=impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22001 remaining=2 query_id=8f46b2518734bef1:6ef2d404
I1228 05:18:56.204756 13377 coordinator-backend-state.cc:262] query_id=8f46b2518734bef1:6ef2d404: first in-progress backend: impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22000
I1228 05:18:56.204769 13377 coordinator.cc:522] ExecState: query id=8f46b2518734bef1:6ef2d404 finstance=8f46b2518734bef1:6ef2d4040001 on host=impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22001 (EXECUTING -> ERROR) status=Sender 127.0.0.1 timed out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2
{noformat}

Logs from executor:
{noformat}
E1228 05:18:56.203181 26715 krpc-data-stream-sender.cc:343] channel send to 127.0.0.1:27000 failed: (fragment_instance_id=8f46b2518734bef1:6ef2d404): Sender 127.0.0.1 timed out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2
E1228 05:18:56.203256 26682 krpc-data-stream-sender.cc:343] channel send to 127.0.0.1:27000 failed: (fragment_instance_id=194f5b70907ac97c:84a116d6): Sender 127.0.0.1 timed out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2
I1228 05:18:56.203451 26715 query-state.cc:576] Instance completed. instance_id=8f46b2518734bef1:6ef2d4040001 #in-flight=3 status=DATASTREAM_SENDER_TIMEOUT: Sender 127.0.0.1 timed out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2
I1228 05:18:56.203485 26713 query-state.cc:249] UpdateBackendExecState(): last report for 8f46b2518734bef1:6ef2d404
I1228 05:18:56.203514 26682 query-state.cc:576] Instance completed. instance_id=194f5b70907ac97c:84a116d60003 #in-flight=2 status=DATASTREAM_SENDER_TIMEOUT: Sender 127.0.0.1 timed out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2
I1228 05:18:56.203536 26680 query-state.cc:249] UpdateBackendExecState(): last report for 194f5b70907ac97c:84a116d6
{noformat}
[jira] [Created] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate
Greg Rahn created IMPALA-8028:
---
Summary: BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate
Key: IMPALA-8028
URL: https://issues.apache.org/jira/browse/IMPALA-8028
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers

There are BETWEEN predicates that Impala fails to handle. These result in the error:

{noformat}
IllegalStateException: BetweenPredicate needs to be rewritten into a CompoundPredicate.
{noformat}

Test cases are attached in a file.

Tested on:
{noformat}
impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 7effb62de5add60eb071ae5331e80a42cf7b0dc1)
{noformat}
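The rewrite the error message refers to is the standard expansion of BETWEEN into a conjunction. A minimal string-level sketch of that expansion (Impala's actual rewrite operates on the expression tree, not on strings, and these names are hypothetical):

```java
public class BetweenRewrite {
    // a BETWEEN lo AND hi      =>  a >= lo AND a <= hi
    // a NOT BETWEEN lo AND hi  =>  a < lo OR a > hi
    static String rewrite(String expr, String lo, String hi, boolean negated) {
        return negated
            ? "(" + expr + " < " + lo + " OR " + expr + " > " + hi + ")"
            : "(" + expr + " >= " + lo + " AND " + expr + " <= " + hi + ")";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("col0", "1", "10", false));
        // (col0 >= 1 AND col0 <= 10)
    }
}
```

The error indicates that some expression-rewrite path reaches plan generation without this expansion having been applied.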
[jira] [Created] (IMPALA-8029) Support DISTINCT with aggregates
Greg Rahn created IMPALA-8029:
---
Summary: Support DISTINCT with aggregates
Key: IMPALA-8029
URL: https://issues.apache.org/jira/browse/IMPALA-8029
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers

The following is valid syntax, but throws an error in Impala 3.2.0:

{noformat}
sql> select distinct sum(col0) from tab0;
ERROR: AnalysisException: cannot combine SELECT DISTINCT with aggregate functions or GROUP BY
{noformat}
[jira] [Commented] (IMPALA-8025) End-to-end tests sometimes unhelpfully truncate error output
[ https://issues.apache.org/jira/browse/IMPALA-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730545#comment-16730545 ]

Philip Zeyliger commented on IMPALA-8025:
---

{{run-tests.py}} is a wrapper around {{py.test}}. It's likely you can run the test directly with {{impala-py.test tests/metadata/test_explain.py -vv -s}} and get a bit more of what you want.

If we want to always pass {{-vv}} to py.test, the invocation is at and around https://github.com/apache/impala/blob/master/tests/run-tests.py#L294 .

> End-to-end tests sometimes unhelpfully truncate error output
> ---
>
> Key: IMPALA-8025
> URL: https://issues.apache.org/jira/browse/IMPALA-8025
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.1.0
> Reporter: Paul Rogers
> Priority: Minor
>
> Made a change to the DESCRIBE output per IMPALA-8021. This required
> adjusting {{metadata/test_explain.py}} to account for the change. The
> test encodes a "golden" version in a .test file using a specialized syntax.
> But, when running the test, the output shows the first few lines (which do
> match), then elides the rest:
> {noformat}
> E row_regex:.*mem-estimate=[0-9.]*[A-Z]*B mem-reservation=[0-9.]*[A-Z]*B thread-reservation=0 == '| mem-estimate=0B mem-reservation=0B thread-reservation=0'
> E Detailed information truncated (45 more lines), use "-vv" to show
> {noformat}
> As it turns out, passing "-vv" to {{tests/run-tests.py}} does not seem to
> pass it through to the test program, so that did not work.
> The .xml file for the test contains the same message: the output is
> truncated. Same in the .log file.
> So, the question is: how is a developer to figure out the issue if we can't
> see the actual error lines? This is the kind of thing that converts a simple
> task into a multi-hour ordeal.
> Right now, the only solution is to rerun the tests with the
> {{--update_results}} flag to {{run-tests.py}}, then hunt down the generated
> output file.
> Better would be to output the n lines before the error, rather than the
> first n lines.

- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8026) Actual row counts for nested loop join are meaningless
Paul Rogers created IMPALA-8026: --- Summary: Actual row counts for nested loop join are meaningless Key: IMPALA-8026 URL: https://issues.apache.org/jira/browse/IMPALA-8026 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: Paul Rogers Consider this extract from a query plan: {noformat} Operator #Rows Est. #Rows -- … | 10:HASH JOIN 9.53M 18.14K | |--19:EXCHANGE 1 1 | | 00:SCAN HDFS1 1 | 06:NESTED LOOP JOIN4.88B 863.84K | |--18:EXCHANGE 1 1 | | 04:SCAN HDFS1 1 | 05:HASH JOIN 9.53M 863.84K {noformat} If the above is to be believed, the 06 nested loop join produced 5 billion rows. But, the actual number is far too huge for that: joining 1 row with 10 million rows cannot produce 500 times that number of rows. It appears that the nested loop join actually processed and returned the 9.5 million rows, since that is the same number produced by the 10 hash join which joins a single row with the output of the nested loop join. Because this same bogus result appears across multiple plans, it is likely that the actual number is completely wrong and bears no relation to the number of rows actually returned. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-7887) NumericLiteral fails to detect numeric overflow
[ https://issues.apache.org/jira/browse/IMPALA-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved IMPALA-7887. - Resolution: Fixed > NumericLiteral fails to detect numeric overflow > --- > > Key: IMPALA-7887 > URL: https://issues.apache.org/jira/browse/IMPALA-7887 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > The {{NumericLiteral}} constructor takes a value and a type. The code does > not check that the value is within range of the type, allowing nonsensical > values: > {code:java} > NumericLiteral n = new NumericLiteral(new BigDecimal("123.45"), > ScalarType.createDecimalType(3, 1)); > System.out.println(n.getValue().toString()); > n = new NumericLiteral(new BigDecimal(Integer.MAX_VALUE), > Type.TINYINT); > System.out.println(n.getValue().toString()); > {code} > Prints: > {noformat} > 123.45 > 2147483647 > {noformat} > The value 123.45 is not valid for DECIMAL(3,1), nor is 2^31 valid for TINYINT. > According to the SQL-2016 standard, section 4.4: > bq. If an assignment of some number would result in a loss of its most > significant digit, an exception condition is raised. > The purpose of the constructor appears to be for "friendly" use where the > caller promises not to create incorrect literals. Better would be to enforce > this rule. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-7886) NumericLiteral constructor fails to round values to Decimal type
[ https://issues.apache.org/jira/browse/IMPALA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved IMPALA-7886. - Resolution: Fixed > NumericLiteral constructor fails to round values to Decimal type > > > Key: IMPALA-7886 > URL: https://issues.apache.org/jira/browse/IMPALA-7886 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > According to the SQL spec, section 4.4, regarding numeric assignment: > bq. If least significant digits are lost, implementation defined rounding or > truncating occurs, with no exception condition being raised > When creating a {{NumericLiteral}} via the constructor, no rounding occurs. > The value has more precision than specified by the type. > Create a NumericLiteral and test it as follows: > {code:java} > NumericLiteral n = new NumericLiteral(new BigDecimal("1.567"), > ScalarType.createDecimalType(2, 1)); > assertEquals(ScalarType.createDecimalType(2, 1), n.getType()); > assertEquals("1.6", n.getValue().toString()); > {code} > The above test fails because the value in the literal is “1.567”, it has not > been rounded to fit the type of the literal as required by the SQL standard. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-7896) Literals should not need explicit analyze step
[ https://issues.apache.org/jira/browse/IMPALA-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved IMPALA-7896. - Resolution: Fixed > Literals should not need explicit analyze step > -- > > Key: IMPALA-7896 > URL: https://issues.apache.org/jira/browse/IMPALA-7896 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > The Impala FE has the concept of a _lteral_ (string, boolean, null, number.) > Originally, literals could only be created as part of the AST. Hence, all > literals are subclasses of {{LiteralExpr}} which are {{ExprNodes}}. The > analysis step is used to set the type of the literal numbers, when not known > at create time. If literals were used only in the AST, this would be fine, > they could be analyzed with an analyzer. > In fact, as the code has evolved, {{LiteralExpr}} nodes are created via the > catalog, which has no analyzer. To fudge the issue, the > {{LiteralExpr.create()}} function does analysis with a null analyzer. This, > in turn, means that the {{analyze()}} code needs to special case a null > analyzer. This, in turn, leads to brittle, error prone code. > Since literals are immutable (except, sadly, for type), it is better that > they start analyzed. Since the only attribute which must be set is the type, > and the type can be known at create time, we have the {{analyze()}} be an > optional no-op, leading to cleaner semantics. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-7891) Analyzer does not detect numeric overflow in CAST
[ https://issues.apache.org/jira/browse/IMPALA-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved IMPALA-7891. - Resolution: Fixed > Analyzer does not detect numeric overflow in CAST > - > > Key: IMPALA-7891 > URL: https://issues.apache.org/jira/browse/IMPALA-7891 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > Consider the following SQL: > {code:sql} > SELECT CAST(257 AS TINYINT) AS c FROM functional.alltypestiny > {code} > Run this in the shell: > {noformat} > +--+ > | cast(257 as tinyint) | > +--+ > | 1| > +--+ > {noformat} > The SQL-2016 standard, section 4.4 states: > bq. If an assignment of some number would result in a loss of its most > significant digit, an exception condition is raised. > An error was expected rather than a wrong result. > This is not as simple as it appears. The BE is written in C++, which does not > detect integer overflow. So, one could argue that the behavior is correct: > Impala makes no guarantees about integer overflow. > On the other hand, the math above is actually done in the planner; the > serialized plan contains an incorrect value. One could argue that the planner > should be more strict than the runtime, so that this is an error.
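The wraparound itself is ordinary two's-complement narrowing. A small sketch (not Impala code) showing both the wrapped value and the range check a stricter analyzer might apply:

```java
public class TinyIntOverflow {
    // Java's narrowing conversion keeps only the low 8 bits, which mirrors
    // the wraparound the planner's constant folding produced: 257 -> 1.
    static byte narrowToTinyInt(int v) {
        return (byte) v; // 257 = 0x101; the low byte is 0x01
    }

    // A strict analyzer would reject the cast instead: most-significant
    // digits are lost whenever the value is outside the TINYINT range.
    static boolean losesDigits(int v) {
        return v > Byte.MAX_VALUE || v < Byte.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(narrowToTinyInt(257)); // prints 1
        System.out.println(losesDigits(257));     // prints true
    }
}
```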
[jira] [Resolved] (IMPALA-7894) Parser does not catch double overflow
[ https://issues.apache.org/jira/browse/IMPALA-7894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved IMPALA-7894. - Resolution: Fixed > Parser does not catch double overflow > - > > Key: IMPALA-7894 > URL: https://issues.apache.org/jira/browse/IMPALA-7894 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > The test case {{ParserTest.TestNumericLiteralMinMaxValues()}} incorrectly > expects success on {{DOUBLE}} constant overflow and underflow: > {code:java} > ParsesOk(String.format("select %s1", Double.toString(Double.MIN_VALUE))); > ParsesOk(String.format("select %s1", Double.toString(Double.MAX_VALUE))); > {code} > These values are actually out of range as tested by > {{NumericLiteral.analyzeImpl()}}. However, the parser uses a code path that > bypasses this check. > The above tests are expected to fail, not succeed.
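The range check that the parser path bypasses can be sketched as follows; {{overflowsDouble}} is an illustrative stand-in for the test in {{NumericLiteral.analyzeImpl()}}, whose exact form is not shown here:

```java
import java.math.BigDecimal;

public class DoubleRangeCheck {
    // Illustrative stand-in for the analyzer's DOUBLE range test: a literal
    // overflows when its magnitude exceeds the largest finite double.
    static boolean overflowsDouble(BigDecimal v) {
        return v.abs().compareTo(BigDecimal.valueOf(Double.MAX_VALUE)) > 0;
    }

    public static void main(String[] args) {
        // The parser test formats "select %s1", i.e. it appends a digit to the
        // literal's text: "1.7976931348623157E308" + "1" becomes
        // "1.7976931348623157E3081", far outside the double range.
        BigDecimal tooBig = new BigDecimal(Double.toString(Double.MAX_VALUE) + "1");
        System.out.println(overflowsDouble(tooBig));         // prints true
        System.out.println(overflowsDouble(BigDecimal.ONE)); // prints false
    }
}
```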
[jira] [Resolved] (IMPALA-7888) Incorrect NumericLiteral overflow checks for FLOAT, DOUBLE
[ https://issues.apache.org/jira/browse/IMPALA-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved IMPALA-7888. - Resolution: Fixed > Incorrect NumericLiteral overflow checks for FLOAT, DOUBLE > -- > > Key: IMPALA-7888 > URL: https://issues.apache.org/jira/browse/IMPALA-7888 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > Consider the following (new) unit test: > {code:java} > assertFalse(NumericLiteral.isOverflow(BigDecimal.ZERO, Type.FLOAT)); > {code} > This test fails (that is, the method claims that the value zero overflows a > FLOAT). > The reason is a misunderstanding of the meaning of {{MIN_VALUE}} for Float: > {code:java} > case FLOAT: > return (value.compareTo(BigDecimal.valueOf(Float.MAX_VALUE)) > 0 || > value.compareTo(BigDecimal.valueOf(Float.MIN_VALUE)) < 0); > {code} > For Float, {{MIN_VALUE}} is the smallest positive number that Float can > represent: > {code:java} > public static final float MIN_VALUE = 0x0.000002P-126f; // 1.4e-45f > {code} > The value that the Impala code wants to check is {{- Float.MAX_VALUE}}. > The only reason that this is not marked as more serious is that the method > appears to be used in only one place, and that place does not use {{FLOAT}} > values.
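The asymmetry is easy to demonstrate directly. This sketch (not Impala code) contrasts the buggy lower bound with the intended one:

```java
import java.math.BigDecimal;

public class FloatRange {
    // The buggy lower-bound test from the issue: Float.MIN_VALUE is the
    // smallest POSITIVE float (~1.4e-45), not the most negative one, so
    // zero compares below it.
    static boolean buggyUnderflow(BigDecimal v) {
        return v.compareTo(BigDecimal.valueOf(Float.MIN_VALUE)) < 0;
    }

    // The intended bound: the most negative finite float is -Float.MAX_VALUE.
    static boolean fixedUnderflow(BigDecimal v) {
        return v.compareTo(BigDecimal.valueOf(-Float.MAX_VALUE)) < 0;
    }

    public static void main(String[] args) {
        System.out.println(buggyUnderflow(BigDecimal.ZERO)); // true: zero "overflows"
        System.out.println(fixedUnderflow(BigDecimal.ZERO)); // false: zero is in range
    }
}
```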
[jira] [Created] (IMPALA-8027) KRPC datastream timing out on both the receiver and sender side even in a minicluster
Bikramjeet Vig created IMPALA-8027: -- Summary: KRPC datastream timing out on both the receiver and sender side even in a minicluster Key: IMPALA-8027 URL: https://issues.apache.org/jira/browse/IMPALA-8027 Project: IMPALA Issue Type: Bug Components: Distributed Exec Affects Versions: Impala 3.2.0 Reporter: Bikramjeet Vig Assignee: Michael Ho krpc datastreams seem to time out at the same time at both sender and receiver causing two running queries to fail. This happened while running core tests on s3. Logs from coordinator: {noformat} I1228 05:18:56.202587 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2 I1228 05:18:56.203061 13396 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:53118 (request call id 11274) took 120782ms. Request Metrics: {} I1228 05:18:56.203114 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2 I1228 05:18:56.203136 13396 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:53110 (request call id 8637) took 123811ms. Request Metrics: {} I1228 05:18:56.203155 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2 I1228 05:18:56.203167 13396 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:53118 (request call id 11273) took 123776ms. 
Request Metrics: {} I1228 05:18:56.203181 13396 krpc-data-stream-mgr.cc:408] Reduced stream ID cache from 413 items, to 410, eviction took: 1ms I1228 05:18:56.204746 13377 coordinator.cc:707] Backend completed: host=impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22001 remaining=2 query_id=8f46b2518734bef1:6ef2d404 I1228 05:18:56.204756 13377 coordinator-backend-state.cc:262] query_id=8f46b2518734bef1:6ef2d404: first in-progress backend: impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22000 I1228 05:18:56.204769 13377 coordinator.cc:522] ExecState: query id=8f46b2518734bef1:6ef2d404 finstance=8f46b2518734bef1:6ef2d4040001 on host=impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22001 (EXECUTING -> ERROR) status=Sender 127.0.0.1 timed out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2 {noformat} Logs from executor: {noformat} E1228 05:18:56.203181 26715 krpc-data-stream-sender.cc:343] channel send to 127.0.0.1:27000 failed: (fragment_instance_id=8f46b2518734bef1:6ef2d404): Sender 127.0.0.1 timed out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2 E1228 05:18:56.203256 26682 krpc-data-stream-sender.cc:343] channel send to 127.0.0.1:27000 failed: (fragment_instance_id=194f5b70907ac97c:84a116d6): Sender 127.0.0.1 timed out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2 I1228 05:18:56.203451 26715 query-state.cc:576] Instance completed. instance_id=8f46b2518734bef1:6ef2d4040001 #in-flight=3 status=DATASTREAM_SENDER_TIMEOUT: Sender 127.0.0.1 timed out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2 I1228 05:18:56.203485 26713 query-state.cc:249] UpdateBackendExecState(): last report for 8f46b2518734bef1:6ef2d404 I1228 05:18:56.203514 26682 query-state.cc:576] Instance completed. 
instance_id=194f5b70907ac97c:84a116d60003 #in-flight=2 status=DATASTREAM_SENDER_TIMEOUT: Sender 127.0.0.1 timed out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2 I1228 05:18:56.203536 26680 query-state.cc:249] UpdateBackendExecState(): last report for 194f5b70907ac97c:84a116d6 {noformat}
[jira] [Work started] (IMPALA-8023) Fix PlannerTest to handle error lines consistently
[ https://issues.apache.org/jira/browse/IMPALA-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8023 started by Paul Rogers. --- > Fix PlannerTest to handle error lines consistently > -- > > Key: IMPALA-8023 > URL: https://issues.apache.org/jira/browse/IMPALA-8023 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > {{PlannerTest}} works by running a query from a .test file, generating a > plan, and comparing that plan to a "golden" expected result. It works well for > most cases. We can use Eclipse's diff tools to compare the actual with > expected files, and to copy across any expected changes that result from > changes to the planner code. > One case that does *not* work is exceptions. When PlannerTest > encounters a failure, it emits a line such as the following to the actual > results file: > {noformat} > org.apache.impala.common.NotImplementedException: Scan of table 't' in format > 'RC_FILE' is not supported because the table has a column 's' with a complex > type 'STRUCT'. > {noformat} > Yet, in order for the comparison to pass, the golden file must contain the > error in the following form: > {noformat} > NotImplementedException: Scan of table 'functional.complextypes_fileformat' > in format 'TEXT' is not supported because the table has a column 's' with a > complex type 'STRUCT'. > {noformat} > Note that the actual output includes the package prefix; the expected error > must *not* include that prefix. > The result is that: > * When comparing files, one must learn to ignore the differences between > these lines: the differences are *not* the reason why a test might fail, and > * When "rebasing" a file, one must copy all expected changes *except* the > error lines. > In short, this is a real nuisance. Use a filter mechanism to fix this once > and for all. 
[jira] [Updated] (IMPALA-8023) Fix PlannerTest to handle error lines consistently
[ https://issues.apache.org/jira/browse/IMPALA-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated IMPALA-8023: Description: {{PlannerTest}} works by running a query from a .test file, generating a plan, and comparing that plan to a "golden" expected result. It works well for most cases. We can use Eclipse's diff tools to compare the actual with expected files, and to copy across any expected changes that result from changes to the planner code. One case that does *not* work is exceptions. When PlannerTest encounters a failure, it emits a line such as the following to the actual results file: {noformat} org.apache.impala.common.NotImplementedException: Scan of table 't' in format 'RC_FILE' is not supported because the table has a column 's' with a complex type 'STRUCT'. {noformat} Yet, in order for the comparison to pass, the golden file must contain the error in the following form: {noformat} NotImplementedException: Scan of table 'functional.complextypes_fileformat' in format 'TEXT' is not supported because the table has a column 's' with a complex type 'STRUCT'. {noformat} Note that the actual output includes the package prefix; the expected error must *not* include that prefix. The result is that: * When comparing files, one must learn to ignore the differences between these lines: the differences are *not* the reason why a test might fail, and * When "rebasing" a file, one must copy all expected changes *except* the error lines. In short, this is a real nuisance. Use a filter mechanism to fix this once and for all. The problem is that the text appended to the "actual output" is not the same as that used for comparison. A simple two-line fix will eliminate this issue. Current code in {{PlannerTestBase.handleException()}}: {code:java} actualOutput.append(e.toString() + "\n"); ... 
String actualErrorMsg = e.getClass().getSimpleName() + ": " + e.getMessage(); {code} Proposed: {code:java} String actualErrorMsg = e.getClass().getSimpleName() + ": " + e.getMessage(); actualOutput.append(actualErrorMsg).append("\n"); {code} was: {{PlannerTest}} works by running a query from a .test file, generating a plan, and comparing that plan to a "golden" expected result. It works well for most cases. We can use Eclipse's diff tools to compare the actual with expected files, and to copy across any expected changes that result from changes to the planner code. One case that does *not* work is exceptions. When PlannerTest encounters a failure, it emits a line such as the following to the actual results file: {noformat} org.apache.impala.common.NotImplementedException: Scan of table 't' in format 'RC_FILE' is not supported because the table has a column 's' with a complex type 'STRUCT'. {noformat} Yet, in order for the comparison to pass, the golden file must contain the error in the following form: {noformat} NotImplementedException: Scan of table 'functional.complextypes_fileformat' in format 'TEXT' is not supported because the table has a column 's' with a complex type 'STRUCT'. {noformat} Note that the actual output includes the package prefix; the expected error must *not* include that prefix. The result is that: * When comparing files, one must learn to ignore the differences between these lines: the differences are *not* the reason why a test might fail, and * When "rebasing" a file, one must copy all expected changes *except* the error lines. In short, this is a real nuisance. Use a filter mechanism to fix this once and for all. 
> Fix PlannerTest to handle error lines consistently > -- > > Key: IMPALA-8023 > URL: https://issues.apache.org/jira/browse/IMPALA-8023 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Assignee: Paul Rogers >Priority: Minor > > {{PlannerTest}} works by running a query from a .test file, generating a > plan, and comparing that plan to a "golden" expected result. It works well for > most cases. We can use Eclipse's diff tools to compare the actual with > expected files, and to copy across any expected changes that result from > changes to the planner code. > One case that does *not* work is exceptions. When PlannerTest > encounters a failure, it emits a line such as the following to the actual > results file: > {noformat} > org.apache.impala.common.NotImplementedException: Scan of table 't' in format > 'RC_FILE' is not supported because the table has a column 's' with a complex > type 'STRUCT'. > {noformat} > Yet, in order for the comparison to pass, the golden file must contain the > error in the following form: > {noformat} > NotImplementedException: Scan of table
[jira] [Created] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate
Greg Rahn created IMPALA-8028: - Summary: BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate Key: IMPALA-8028 URL: https://issues.apache.org/jira/browse/IMPALA-8028 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 3.2.0 Reporter: Greg Rahn Assignee: Paul Rogers It appears that there are BETWEEN predicates that Impala fails to handle. These result in the error: {noformat} IllegalStateException: BetweenPredicate needs to be rewritten into a CompoundPredicate. {noformat} Attaching test cases in file. Tested on {noformat} impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 7effb62de5add60eb071ae5331e80a42cf7b0dc1) {noformat}
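For reference, the rewrite the error message refers to is the standard expansion of BETWEEN into a conjunction. A toy string-level sketch of that expansion (illustrative only, not Impala's actual rewrite-rule machinery):

```java
public class BetweenRewrite {
    // Toy sketch of the BETWEEN expansion: a BETWEEN lo AND hi is
    // equivalent to a >= lo AND a <= hi; NOT BETWEEN negates both bounds.
    static String rewriteBetween(String val, String lo, String hi, boolean negated) {
        return negated
            ? val + " < " + lo + " OR " + val + " > " + hi
            : val + " >= " + lo + " AND " + val + " <= " + hi;
    }

    public static void main(String[] args) {
        System.out.println(rewriteBetween("col0", "1", "9", false));
        // col0 >= 1 AND col0 <= 9
        System.out.println(rewriteBetween("col0", "1", "9", true));
        // col0 < 1 OR col0 > 9
    }
}
```

The bug report implies some predicates reach planning without this rewrite having been applied; the attached BetweenPredicate.sql presumably contains the triggering shapes.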
[jira] [Updated] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate
[ https://issues.apache.org/jira/browse/IMPALA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Rahn updated IMPALA-8028: -- Labels: ansi-sql (was: ) > BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a > CompoundPredicate > > > Key: IMPALA-8028 > URL: https://issues.apache.org/jira/browse/IMPALA-8028 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.2.0 >Reporter: Greg Rahn >Assignee: Paul Rogers >Priority: Major > Labels: ansi-sql > Attachments: BetweenPredicate.sql > > > It appears that there are BETWEEN predicates that Impala fails to > handle. These result in the error: > {noformat} > IllegalStateException: BetweenPredicate needs to be rewritten into a > CompoundPredicate. > {noformat} > Attaching test cases in file. > Tested on > {noformat} > impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build > 7effb62de5add60eb071ae5331e80a42cf7b0dc1) > {noformat}
[jira] [Updated] (IMPALA-8029) Support DISTINCT with aggregates
[ https://issues.apache.org/jira/browse/IMPALA-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Rahn updated IMPALA-8029: -- Labels: ansi-sql (was: ) > Support DISTINCT with aggregates > > > Key: IMPALA-8029 > URL: https://issues.apache.org/jira/browse/IMPALA-8029 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.2.0 >Reporter: Greg Rahn >Assignee: Paul Rogers >Priority: Major > Labels: ansi-sql > > The following is valid syntax, but throws an error in Impala 3.2.0: > {noformat} > sql> select distinct sum(col0) from tab0; > ERROR: AnalysisException: cannot combine SELECT DISTINCT with aggregate > functions or GROUP BY > {noformat}
[jira] [Created] (IMPALA-8029) Support DISTINCT with aggregates
Greg Rahn created IMPALA-8029: - Summary: Support DISTINCT with aggregates Key: IMPALA-8029 URL: https://issues.apache.org/jira/browse/IMPALA-8029 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 3.2.0 Reporter: Greg Rahn Assignee: Paul Rogers The following is valid syntax, but throws an error in Impala 3.2.0: {noformat} sql> select distinct sum(col0) from tab0; ERROR: AnalysisException: cannot combine SELECT DISTINCT with aggregate functions or GROUP BY {noformat}
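The semantics being requested are: aggregate first, then apply DISTINCT to the result rows. With no GROUP BY there is a single result row, so DISTINCT is a harmless no-op. A toy model of those semantics (illustrative only, not Impala code):

```java
import java.util.List;
import java.util.stream.Collectors;

public class DistinctAggregate {
    // Toy model of "select distinct sum(col0) from tab0": compute the
    // aggregate, then deduplicate the (single-row) result set.
    static List<Integer> distinctSum(List<Integer> col0) {
        int sum = col0.stream().mapToInt(Integer::intValue).sum();
        return List.of(sum).stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctSum(List.of(1, 2, 3))); // [6]
    }
}
```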
[jira] [Updated] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate
[ https://issues.apache.org/jira/browse/IMPALA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Rahn updated IMPALA-8028: -- Attachment: BetweenPredicate.sql > BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a > CompoundPredicate > > > Key: IMPALA-8028 > URL: https://issues.apache.org/jira/browse/IMPALA-8028 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.2.0 >Reporter: Greg Rahn >Assignee: Paul Rogers >Priority: Major > Attachments: BetweenPredicate.sql > > > It appears that there are between predicates that Impala has challenges > handling. These result in the error: > {noformat} > IllegalStateException: BetweenPredicate needs to be rewritten into a > CompoundPredicate. > {noformat} > Attaching test cases in file. > Tested on > {noformat} > impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build > 7effb62de5add60eb071ae5331e80a42cf7b0dc1) > {noformat}