[jira] [Created] (IMPALA-8023) Fix PlannerTest to handle error lines consistently

2018-12-28 Thread Paul Rogers (JIRA)
Paul Rogers created IMPALA-8023:
---

 Summary: Fix PlannerTest to handle error lines consistently
 Key: IMPALA-8023
 URL: https://issues.apache.org/jira/browse/IMPALA-8023
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers


{{PlannerTest}} works by running a query from a .test file, generating a plan, 
and comparing that plan to a "golden" expected result. It works well for most 
cases. We can use Eclipse's diff tools to compare the actual with expected 
files, and to copy across any expected changes that result from changes to the 
planner code.

One case that does *not* work is exceptions. When PlannerTest 
encounters a failure, it emits a line such as the following to the actual results 
file:

{noformat}
org.apache.impala.common.NotImplementedException: Scan of table 't' in format 
'RC_FILE' is not supported because the table has a column 's' with a complex 
type 'STRUCT'.
{noformat}

Yet, in order for the comparison to pass, the golden file must contain the 
error in the following form:

{noformat}
NotImplementedException: Scan of table 'functional.complextypes_fileformat' in 
format 'TEXT' is not supported because the table has a column 's' with a 
complex type 'STRUCT'.
{noformat}

Note that the actual output includes the package prefix, whereas the expected 
error must *not* include that prefix.

The result is that:

* When comparing files, one must learn to ignore the differences between these 
lines: the differences are *not* the reason why a test might fail, and
* When "rebasing" a file, one must copy all expected changes *except* the error 
lines.

In short, this is a real nuisance. Use a filter mechanism to fix this once and 
for all.
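
A minimal sketch of such a filter, assuming it is applied to every line of the actual output before the comparison. The class name {{ErrorLineFilter}} and the regular expression are illustrative assumptions, not existing PlannerTest code:

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Impala): normalizes exception lines. */
final class ErrorLineFilter {
  // A leading run of lower-case package segments, followed by a class name
  // ending in "Exception" or "Error" and a colon. Group 1 is the simple name.
  private static final Pattern QUALIFIED_EXCEPTION = Pattern.compile(
      "^(?:[a-z_][a-z0-9_]*\\.)+([A-Z][A-Za-z0-9_]*(?:Exception|Error)):");

  /** Drops the package prefix from an exception line; other lines pass through. */
  static String normalize(String line) {
    return QUALIFIED_EXCEPTION.matcher(line).replaceFirst("$1:");
  }

  public static void main(String[] args) {
    System.out.println(normalize(
        "org.apache.impala.common.NotImplementedException: Scan of table 't'"));
  }
}
```

With a filter like this on the actual output, both files would carry the short form, so error lines could be diffed and copied like any other line.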




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8024) HBase table cardinality estimates are wrong

2018-12-28 Thread Paul Rogers (JIRA)
Paul Rogers created IMPALA-8024:
---

 Summary: HBase table cardinality estimates are wrong
 Key: IMPALA-8024
 URL: https://issues.apache.org/jira/browse/IMPALA-8024
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers


IMPALA-8021 added cardinality estimates to EXPLAIN plan output. Running some of 
our {{PlannerTest}} files revealed that our HBase cardinality estimates are 
very poor, even for our simple test tables. For example, for 
{{functional_hbase.alltypessmall}}:

{{count\(*)}} tells us that there are 100 rows:

{noformat}
select count(*) from functional_hbase.alltypessmall
+----------+
| count(*) |
+----------+
| 100      |
+----------+
{noformat}

Table stats claim that there are only 60 rows:

{noformat}
show table stats functional_hbase.alltypessmall;
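+-----------------+--------------+------------+------+
| Region Location | Start RowKey | Est. #Rows | Size |
+-----------------+--------------+------------+------+
| localhost       |              | 10         | 0B   |
| localhost       | 1            | 10         | 0B   |
| localhost       | 3            | 10         | 0B   |
| localhost       | 5            | 10         | 0B   |
| localhost       | 7            | 10         | 0B   |
| localhost       | 9            | 10         | 0B   |
| Total           |              | 60         | 0B   |
+-----------------+--------------+------------+------+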
{noformat}

The NDV stats show that there must be at least 100 rows:

{noformat}
show column stats functional_hbase.alltypessmall
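+-----------------+-----------+------------------+--------+----------+----------+
| Column          | Type      | #Distinct Values | #Nulls | Max Size | Avg Size |
+-----------------+-----------+------------------+--------+----------+----------+
| id              | INT       | 99               | 0      | 4        | 4        |
...
| timestamp_col   | TIMESTAMP | 100              | 0      | 16       | 16       |
...
+-----------------+-----------+------------------+--------+----------+----------+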
{noformat}

The planner, where the estimate matters most, thinks there are only 50 rows:

{noformat}
select *
from functional.alltypesagg join functional_hbase.alltypessmall using (id, 
int_col)

|--01:SCAN HBASE [functional_hbase.alltypessmall]
| row-size=89B cardinality=50
{noformat}

We need a more reliable estimate.
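
One cross-check suggested by the numbers above: a table cannot have fewer rows than the largest per-column NDV. The sketch below is an assumption about how such a floor could be applied, not Impala's estimator. The class and method names are hypothetical, and NDV values are themselves estimates, so the result is a heuristic floor rather than an exact bound:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sanity check (not Impala code). */
final class CardinalityCrossCheck {
  /** The largest per-column NDV is a lower bound on the row count. */
  static long lowerBoundFromNdv(Map<String, Long> ndvByColumn) {
    long bound = 0;
    for (long ndv : ndvByColumn.values()) {
      bound = Math.max(bound, ndv);
    }
    return bound;
  }

  /** Raises an estimate that falls below the NDV-derived floor. */
  static long adjustEstimate(long estimate, Map<String, Long> ndvByColumn) {
    return Math.max(estimate, lowerBoundFromNdv(ndvByColumn));
  }

  public static void main(String[] args) {
    Map<String, Long> ndv = new HashMap<>();
    ndv.put("id", 99L);              // values taken from the column stats above
    ndv.put("timestamp_col", 100L);
    System.out.println(adjustEstimate(50L, ndv));
  }
}
```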





[jira] [Created] (IMPALA-8026) Actual row counts for nested loop join are meaningless

2018-12-28 Thread Paul Rogers (JIRA)
Paul Rogers created IMPALA-8026:
---

 Summary: Actual row counts for nested loop join are meaningless
 Key: IMPALA-8026
 URL: https://issues.apache.org/jira/browse/IMPALA-8026
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers


Consider this extract from a query plan:

{noformat}
Operator                  #Rows  Est. #Rows
-------------------------------------------
…
|  10:HASH JOIN           9.53M      18.14K
|  |--19:EXCHANGE             1           1
|  |  00:SCAN HDFS            1           1
|  06:NESTED LOOP JOIN    4.88B     863.84K
|  |--18:EXCHANGE             1           1
|  |  04:SCAN HDFS            1           1
|  05:HASH JOIN           9.53M     863.84K
{noformat}

If the above is to be believed, the 06 nested loop join produced about 5 billion 
rows. That number is far too large: joining 1 row with about 10 million rows 
cannot produce 500 times that many rows.

It appears that the nested loop join actually processed and returned the 9.5 
million rows, since that is the same number produced by the 10 hash join which 
joins a single row with the output of the nested loop join.

Because this same bogus result appears across multiple plans, it is likely that 
the actual number is completely wrong and bears no relation to the number of 
rows actually returned.
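
The arithmetic behind this can be sketched as a plausibility check. The sketch assumes an inner join, whose output cannot exceed the product of its input row counts. The class is hypothetical, and the inputs are the rounded figures shown in the plan:

```java
/** Hypothetical plausibility check (not Impala code). */
final class JoinRowCountCheck {
  /** An inner join emits at most leftRows * rightRows rows. */
  static long maxInnerJoinRows(long leftRows, long rightRows) {
    return Math.multiplyExact(leftRows, rightRows);
  }

  /** True if the reported row count does not exceed the cross product. */
  static boolean isPlausible(long reportedRows, long leftRows, long rightRows) {
    return reportedRows <= maxInnerJoinRows(leftRows, rightRows);
  }

  public static void main(String[] args) {
    long buildRows = 1L;                // 18:EXCHANGE
    long probeRows = 9_530_000L;        // 05:HASH JOIN, shown as 9.53M
    long reportedRows = 4_880_000_000L; // 06:NESTED LOOP JOIN, shown as 4.88B
    System.out.println(isPlausible(reportedRows, buildRows, probeRows));
    // Ratio of the reported count to the largest possible count.
    System.out.println(reportedRows / maxInnerJoinRows(buildRows, probeRows));
  }
}
```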





[jira] [Resolved] (IMPALA-7896) Literals should not need explicit analyze step

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7896.
-
Resolution: Fixed

> Literals should not need explicit analyze step
> --
>
> Key: IMPALA-7896
> URL: https://issues.apache.org/jira/browse/IMPALA-7896
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The Impala FE has the concept of a _literal_ (string, boolean, null, number). 
> Originally, literals could only be created as part of the AST. Hence, all 
> literals are subclasses of {{LiteralExpr}} which are {{ExprNodes}}. The 
> analysis step is used to set the type of the literal numbers, when not known 
> at create time. If literals were used only in the AST, this would be fine, 
> they could be analyzed with an analyzer.
> In fact, as the code has evolved, {{LiteralExpr}} nodes are created via the 
> catalog, which has no analyzer. To fudge the issue, the 
> {{LiteralExpr.create()}} function does analysis with a null analyzer. This, 
> in turn, means that the {{analyze()}} code needs to special case a null 
> analyzer. This, in turn, leads to brittle, error prone code.
> Since literals are immutable (except, sadly, for type), it is better that 
> they start analyzed. Since the only attribute which must be set is the type, 
> and the type can be known at create time, we can make {{analyze()}} an 
> optional no-op, leading to cleaner semantics.
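
The idea of a literal that "starts analyzed" can be illustrated with a stand-alone sketch. Everything here is hypothetical and much simpler than Impala's {{LiteralExpr}} hierarchy: the type is chosen in the constructor, so {{analyze()}} has nothing left to do and no analyzer is needed.

```java
/** Hypothetical stand-in for an integer literal; not Impala's LiteralExpr. */
final class IntLiteralSketch {
  enum SketchType { TINYINT, SMALLINT, INT, BIGINT }

  private final long value;
  private final SketchType type; // fixed at creation, never set later

  IntLiteralSketch(long value) {
    this.value = value;
    this.type = smallestTypeFor(value);
  }

  /** Picks the narrowest integer type that can hold the value. */
  static SketchType smallestTypeFor(long v) {
    if (v >= Byte.MIN_VALUE && v <= Byte.MAX_VALUE) return SketchType.TINYINT;
    if (v >= Short.MIN_VALUE && v <= Short.MAX_VALUE) return SketchType.SMALLINT;
    if (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE) return SketchType.INT;
    return SketchType.BIGINT;
  }

  long value() { return value; }
  SketchType type() { return type; }

  /** No-op: the literal is fully typed from the moment it is constructed. */
  void analyze() { }

  public static void main(String[] args) {
    System.out.println(new IntLiteralSketch(257).type());
  }
}
```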





[jira] [Resolved] (IMPALA-7888) Incorrect NumericLiteral overflow checks for FLOAT, DOUBLE

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7888.
-
Resolution: Fixed

> Incorrect NumericLiteral overflow checks for FLOAT, DOUBLE
> --
>
> Key: IMPALA-7888
> URL: https://issues.apache.org/jira/browse/IMPALA-7888
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Consider the following (new) unit test:
> {code:java}
> assertFalse(NumericLiteral.isOverflow(BigDecimal.ZERO, Type.FLOAT));
> {code}
> This test fails (that is, the value zero, so the method claims, overflows a 
> FLOAT.)
> The reason is a misunderstanding of the meaning of {{MIN_VALUE}} for Float:
> {code:java}
>   case FLOAT:
> return (value.compareTo(BigDecimal.valueOf(Float.MAX_VALUE)) > 0 ||
> value.compareTo(BigDecimal.valueOf(Float.MIN_VALUE)) < 0);
> {code}
> For Float, {{MIN_VALUE}} is the smallest positive number that Float can 
> represent:
> {code:java}
> public static final float MIN_VALUE = 0x0.000002P-126f; // 1.4e-45f
> {code}
> The value that the Impala code wants to check is {{-Float.MAX_VALUE}}.
> The only reason that this is not marked as more serious is that the method 
> appears to be used in only one place, and that place does not use {{FLOAT}} 
> values.
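
A stand-alone sketch of the corrected range check, using only the JDK. The class and method names are hypothetical and this is not the actual Impala fix. It compares against {{-Float.MAX_VALUE}} as the lower bound instead of {{Float.MIN_VALUE}}:

```java
import java.math.BigDecimal;

/** Hypothetical illustration of a FLOAT range check (not Impala code). */
final class FloatOverflowCheck {
  /** True if the value lies outside [-Float.MAX_VALUE, Float.MAX_VALUE]. */
  static boolean isFloatOverflow(BigDecimal value) {
    BigDecimal max = BigDecimal.valueOf(Float.MAX_VALUE);
    return value.compareTo(max) > 0 || value.compareTo(max.negate()) < 0;
  }

  public static void main(String[] args) {
    // Float.MIN_VALUE is the smallest positive float, not the most negative one.
    System.out.println(Float.MIN_VALUE > 0);
    System.out.println(isFloatOverflow(BigDecimal.ZERO));
    System.out.println(isFloatOverflow(new BigDecimal("1e39")));
  }
}
```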





[jira] [Resolved] (IMPALA-7887) NumericLiteral fails to detect numeric overflow

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7887.
-
Resolution: Fixed

> NumericLiteral fails to detect numeric overflow
> ---
>
> Key: IMPALA-7887
> URL: https://issues.apache.org/jira/browse/IMPALA-7887
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The {{NumericLiteral}} constructor takes a value and a type. The code does 
> not check that the value is within range of the type, allowing nonsensical 
> values:
> {code:java}
> NumericLiteral n = new NumericLiteral(new BigDecimal("123.45"),
> ScalarType.createDecimalType(3, 1));
> System.out.println(n.getValue().toString());
> n = new NumericLiteral(new BigDecimal(Integer.MAX_VALUE),
> Type.TINYINT);
> System.out.println(n.getValue().toString());
> {code}
> Prints:
> {noformat}
> 123.45
> 2147483647
> {noformat}
> The value 123.45 is not valid for DECIMAL(3,1), nor is 2^31 - 1 valid for TINYINT.
> According to the SQL-2016 standard, section 4.4:
> bq. If an assignment of some number would result in a loss of its most 
> significant digit, an exception condition is raised.
> The purpose of the constructor appears to be for "friendly" use where the 
> caller promises not to create incorrect literals. Better would be to enforce 
> this rule.
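
A sketch of the kind of range check the constructor could enforce, written against plain {{BigDecimal}} rather than Impala's type classes. The names are hypothetical, and rounding half-up before the check is an assumption: the standard leaves the rounding mode to the implementation.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical range checks (not Impala code). */
final class LiteralRangeCheck {
  /** True if value fits DECIMAL(precision, scale) without losing leading digits. */
  static boolean fitsDecimal(BigDecimal value, int precision, int scale) {
    // Round to the target scale first, because rounding can carry (99.99 -> 100.0).
    BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
    int integerDigits = rounded.precision() - rounded.scale();
    return integerDigits <= precision - scale;
  }

  /** True if value lies in the 8-bit signed range [-128, 127]. */
  static boolean fitsTinyInt(BigDecimal value) {
    return value.compareTo(BigDecimal.valueOf(Byte.MIN_VALUE)) >= 0
        && value.compareTo(BigDecimal.valueOf(Byte.MAX_VALUE)) <= 0;
  }

  public static void main(String[] args) {
    System.out.println(fitsDecimal(new BigDecimal("123.45"), 3, 1));
    System.out.println(fitsTinyInt(new BigDecimal(Integer.MAX_VALUE)));
  }
}
```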





[jira] [Resolved] (IMPALA-7886) NumericLiteral constructor fails to round values to Decimal type

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7886.
-
Resolution: Fixed

> NumericLiteral constructor fails to round values to Decimal type
> 
>
> Key: IMPALA-7886
> URL: https://issues.apache.org/jira/browse/IMPALA-7886
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> According to the SQL spec, section 4.4, regarding numeric assignment:
> bq. If least significant digits are lost, implementation defined rounding or 
> truncating occurs, with no exception condition being raised
> When creating a {{NumericLiteral}} via the constructor, no rounding occurs. 
> The value has more precision than specified by the type.
> Create a NumericLiteral and test it as follows:
> {code:java}
>     NumericLiteral n = new NumericLiteral(new BigDecimal("1.567"),
> ScalarType.createDecimalType(2, 1));
>     assertEquals(ScalarType.createDecimalType(2, 1), n.getType());
>     assertEquals("1.6", n.getValue().toString());
> {code}
> The above test fails because the value in the literal is “1.567”; it has not 
> been rounded to fit the type of the literal as required by the SQL standard.
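
For reference, the rounding that the test expects can be reproduced with {{BigDecimal.setScale()}}. The helper name is hypothetical, and the choice of {{RoundingMode.HALF_UP}} is an assumption, since the standard allows either rounding or truncation:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical helper (not Impala code): rounds a value to a DECIMAL scale. */
final class DecimalRounding {
  static BigDecimal roundToScale(BigDecimal value, int scale) {
    return value.setScale(scale, RoundingMode.HALF_UP);
  }

  public static void main(String[] args) {
    // Scale 1 corresponds to DECIMAL(2, 1) in the test above.
    System.out.println(roundToScale(new BigDecimal("1.567"), 1)); // prints 1.6
  }
}
```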





[jira] [Resolved] (IMPALA-7891) Analyzer does not detect numeric overflow in CAST

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7891.
-
Resolution: Fixed

> Analyzer does not detect numeric overflow in CAST
> -
>
> Key: IMPALA-7891
> URL: https://issues.apache.org/jira/browse/IMPALA-7891
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Consider the following SQL:
> {code:sql}
> SELECT CAST(257 AS TINYINT) AS c FROM functional.alltypestiny
> {code}
> Run this in the shell:
> {noformat}
> +----------------------+
> | cast(257 as tinyint) |
> +----------------------+
> | 1                    |
> +----------------------+
> {noformat}
> The SQL-2016 standard, section 4.4 states:
> bq. If an assignment of some number would result in a loss of its most 
> significant digit, an exception condition is raised.
> Expected: an error rather than a wrong result.
> This is not as simple as it appears. The BE is written in C++, which does not 
> detect integer overflow. So, one could argue that the behavior is correct: 
> Impala makes no guarantees about integer overflow.
> On the other hand, the math above is actually done in the planner; the 
> serialized plan contains an incorrect value. One could argue that the planner 
> should be more strict than the runtime, so that this is an error.
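
The wrap-around itself is easy to reproduce in Java, which is relevant because the planner does this math. The sketch below contrasts a plain narrowing cast with a checked one. The class is hypothetical and shows the stricter behavior argued for above, not Impala's implementation:

```java
/** Hypothetical illustration of TINYINT narrowing (not Impala code). */
final class TinyIntCast {
  /** Plain narrowing cast: keeps only the low 8 bits, so 257 becomes 1. */
  static byte wrappingCast(long value) {
    return (byte) value;
  }

  /** Checked cast: rejects values outside [-128, 127] instead of wrapping. */
  static byte checkedCast(long value) {
    if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
      throw new ArithmeticException("value " + value + " out of range for TINYINT");
    }
    return (byte) value;
  }

  public static void main(String[] args) {
    System.out.println(wrappingCast(257)); // prints 1
    System.out.println(checkedCast(127));  // prints 127
  }
}
```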





[jira] [Resolved] (IMPALA-7894) Parser does not catch double overflow

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7894.
-
Resolution: Fixed

> Parser does not catch double overflow
> -
>
> Key: IMPALA-7894
> URL: https://issues.apache.org/jira/browse/IMPALA-7894
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The test case {{ParserTest.TestNumericLiteralMinMaxValues()}} incorrectly 
> expects success on {{DOUBLE}} constant overflow and underflow:
> {code:java}
> ParsesOk(String.format("select %s1", Double.toString(Double.MIN_VALUE)));
> ParsesOk(String.format("select %s1", Double.toString(Double.MAX_VALUE)));
> {code}
> These values are actually out of range as tested by 
> {{NumericLiteral.analyzeImpl()}}. However, the parser uses a code path that 
> bypasses this check.
> Expected: the above statements should fail to parse, not succeed.
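
To see why these constants are out of range: the {{%s1}} format appends the digit 1 to the printed value, turning the exponent 308 into 3081 and -324 into -3241. A JDK-only sketch of a range check on the literal text follows. The class and method names are hypothetical and this is not the parser's actual code path:

```java
import java.math.BigDecimal;

/** Hypothetical DOUBLE range check on literal text (not Impala code). */
final class DoubleLiteralRange {
  /** True if the text converts to a finite double without underflowing to zero. */
  static boolean fitsDouble(String text) {
    BigDecimal exact = new BigDecimal(text);
    double d = exact.doubleValue();
    if (Double.isInfinite(d)) {
      return false; // overflow
    }
    // A non-zero value that converts to 0.0 has underflowed.
    return d != 0.0 || exact.signum() == 0;
  }

  public static void main(String[] args) {
    String tooBig = Double.toString(Double.MAX_VALUE) + "1";   // exponent 3081
    String tooSmall = Double.toString(Double.MIN_VALUE) + "1"; // exponent -3241
    System.out.println(fitsDouble(tooBig));
    System.out.println(fitsDouble(tooSmall));
  }
}
```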





[jira] [Created] (IMPALA-8027) KRPC datastream timing out on both the receiver and sender side even in a minicluster

2018-12-28 Thread Bikramjeet Vig (JIRA)
Bikramjeet Vig created IMPALA-8027:
--

 Summary: KRPC datastream timing out on both the receiver and 
sender side even in a minicluster
 Key: IMPALA-8027
 URL: https://issues.apache.org/jira/browse/IMPALA-8027
 Project: IMPALA
  Issue Type: Bug
  Components: Distributed Exec
Affects Versions: Impala 3.2.0
Reporter: Bikramjeet Vig
Assignee: Michael Ho


KRPC datastreams seem to time out at the same time at both sender and receiver, 
causing two running queries to fail. This happened while running core tests on 
S3.

Logs from coordinator:
{noformat}
I1228 05:18:56.202587 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed 
out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, 
dest node: 2
I1228 05:18:56.203061 13396 rpcz_store.cc:265] Call 
impala.DataStreamService.TransmitData from 127.0.0.1:53118 (request call id 
11274) took 120782ms. Request Metrics: {}
I1228 05:18:56.203114 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed 
out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, 
dest node: 2
I1228 05:18:56.203136 13396 rpcz_store.cc:265] Call 
impala.DataStreamService.TransmitData from 127.0.0.1:53110 (request call id 
8637) took 123811ms. Request Metrics: {}
I1228 05:18:56.203155 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed 
out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, 
dest node: 2
I1228 05:18:56.203167 13396 rpcz_store.cc:265] Call 
impala.DataStreamService.TransmitData from 127.0.0.1:53118 (request call id 
11273) took 123776ms. Request Metrics: {}
I1228 05:18:56.203181 13396 krpc-data-stream-mgr.cc:408] Reduced stream ID 
cache from 413 items, to 410, eviction took: 1ms
I1228 05:18:56.204746 13377 coordinator.cc:707] Backend completed: 
host=impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22001 
remaining=2 query_id=8f46b2518734bef1:6ef2d404
I1228 05:18:56.204756 13377 coordinator-backend-state.cc:262] 
query_id=8f46b2518734bef1:6ef2d404: first in-progress backend: 
impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22000
I1228 05:18:56.204769 13377 coordinator.cc:522] ExecState: query 
id=8f46b2518734bef1:6ef2d404 
finstance=8f46b2518734bef1:6ef2d4040001 on 
host=impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22001 
(EXECUTING -> ERROR) status=Sender 127.0.0.1 timed out waiting for receiver 
fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2
{noformat}
Logs from executor:
{noformat}
E1228 05:18:56.203181 26715 krpc-data-stream-sender.cc:343] channel send to 
127.0.0.1:27000 failed: 
(fragment_instance_id=8f46b2518734bef1:6ef2d404): Sender 127.0.0.1 
timed out waiting for receiver fragment instance: 
8f46b2518734bef1:6ef2d404, dest node: 2
E1228 05:18:56.203256 26682 krpc-data-stream-sender.cc:343] channel send to 
127.0.0.1:27000 failed: 
(fragment_instance_id=194f5b70907ac97c:84a116d6): Sender 127.0.0.1 
timed out waiting for receiver fragment instance: 
194f5b70907ac97c:84a116d6, dest node: 2
I1228 05:18:56.203451 26715 query-state.cc:576] Instance completed. 
instance_id=8f46b2518734bef1:6ef2d4040001 #in-flight=3 
status=DATASTREAM_SENDER_TIMEOUT: Sender 127.0.0.1 timed out waiting for 
receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2
I1228 05:18:56.203485 26713 query-state.cc:249] UpdateBackendExecState(): last 
report for 8f46b2518734bef1:6ef2d404
I1228 05:18:56.203514 26682 query-state.cc:576] Instance completed. 
instance_id=194f5b70907ac97c:84a116d60003 #in-flight=2 
status=DATASTREAM_SENDER_TIMEOUT: Sender 127.0.0.1 timed out waiting for 
receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2
I1228 05:18:56.203536 26680 query-state.cc:249] UpdateBackendExecState(): last 
report for 194f5b70907ac97c:84a116d6
{noformat}





[jira] [Created] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate

2018-12-28 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8028:
-

 Summary: BETWEEN predicate failures - BetweenPredicate needs to be 
rewritten into a CompoundPredicate
 Key: IMPALA-8028
 URL: https://issues.apache.org/jira/browse/IMPALA-8028
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers


Impala appears to fail on some BETWEEN predicates. These result in the 
following error:
{noformat}
IllegalStateException: BetweenPredicate needs to be rewritten into a 
CompoundPredicate.
{noformat}
Attaching test cases in file.

Tested on
{noformat}
impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 
7effb62de5add60eb071ae5331e80a42cf7b0dc1)
{noformat}





[jira] [Created] (IMPALA-8029) Support DISTINCT with aggregates

2018-12-28 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8029:
-

 Summary: Support DISTINCT with aggregates
 Key: IMPALA-8029
 URL: https://issues.apache.org/jira/browse/IMPALA-8029
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers


The following is valid syntax, but throws an error in Impala 3.2.0:
{noformat}
sql> select distinct sum(col0) from tab0;
ERROR: AnalysisException: cannot combine SELECT DISTINCT with aggregate 
functions or GROUP BY
{noformat}





[jira] [Commented] (IMPALA-8025) End-to-end tests sometimes unhelpfully truncate error output

2018-12-28 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730545#comment-16730545
 ] 

Philip Zeyliger commented on IMPALA-8025:
-

{{run-tests.py}} is a wrapper around {{py.test}}. It's likely you can run the 
test directly with {{impala-py.test tests/metadata/test_explain.py -vv -s}} and 
get a bit more of what you want. 

If we want to always pass {{-vv}} to py.test, the invocation is at and around 
https://github.com/apache/impala/blob/master/tests/run-tests.py#L294 . 

> End-to-end tests sometimes unhelpfully truncate error output
> 
>
> Key: IMPALA-8025
> URL: https://issues.apache.org/jira/browse/IMPALA-8025
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Made a change to the DESCRIBE output per IMPALA-8021. This required 
> adjustment to {{metadata/test_explain.py}} to account for the change. The 
> test encodes a "golden" version in a .test file using a specialized syntax.
> But, when running the test, the output shows the first few lines (which do 
> match), then elides the rest:
> {noformat}
> E row_regex:.*mem-estimate=[0-9.]*[A-Z]*B mem-reservation=[0-9.]*[A-Z]*B 
> thread-reservation=0 == '|  mem-estimate=0B mem-reservation=0B 
> thread-reservation=0'
> E Detailed information truncated (45 more lines), use "-vv" to show
> {noformat}
> As it turns out, passing "-vv" to {{tests/run-tests.py}} does not seem to 
> pass it to the test program, so that did not work.
> The .xml file for the test contains the same message: output is truncated. 
> Same in the .log file.
> So, the question is, how is a developer to figure out the issue if we can't 
> see the actual error lines? This is the kind of thing that converts a simple 
> task into a multi-hour ordeal.
> Right now, the only solution is to rerun the tests with {{--update_results}} 
> flag to {{run-tests.py}}, then hunt down the generated output file.
> Better would be to output the n lines before the error, rather than the first 
> n lines.




-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org





[jira] [Created] (IMPALA-8025) End-to-end tests sometimes unhelpfully truncate error output

2018-12-28 Thread Paul Rogers (JIRA)
Paul Rogers created IMPALA-8025:
---

 Summary: End-to-end tests sometimes unhelpfully truncate error 
output
 Key: IMPALA-8025
 URL: https://issues.apache.org/jira/browse/IMPALA-8025
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers


Made a change to the DESCRIBE output per IMPALA-8021. This required adjustment 
to {{metadata/test_explain.py}} to account for the change. The test encodes a 
"golden" version in a .test file using a specialized syntax.

But, when running the test, the output shows the first few lines (which do 
match), then elides the rest:

{noformat}
E row_regex:.*mem-estimate=[0-9.]*[A-Z]*B mem-reservation=[0-9.]*[A-Z]*B 
thread-reservation=0 == '|  mem-estimate=0B mem-reservation=0B 
thread-reservation=0'
E Detailed information truncated (45 more lines), use "-vv" to show
{noformat}

As it turns out, passing "-vv" to {{tests/run-tests.py}} does not seem to pass 
it to the test program, so that did not work.

The .xml file for the test contains the same message: output is truncated. Same 
in the .log file.

So, the question is, how is a developer to figure out the issue if we can't see 
the actual error lines? This is the kind of thing that converts a simple task 
into a multi-hour ordeal.

Right now, the only solution is to rerun the tests with {{--update_results}} 
flag to {{run-tests.py}}, then hunt down the generated output file.

Better would be to output the n lines before the error, rather than the first n 
lines.






[jira] [Created] (IMPALA-8026) Actual row counts for nested loop join are meaningless

2018-12-28 Thread Paul Rogers (JIRA)
Paul Rogers created IMPALA-8026:
---

 Summary: Actual row counts for nested loop join are meaningless
 Key: IMPALA-8026
 URL: https://issues.apache.org/jira/browse/IMPALA-8026
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers


Consider this extract from a query plan:

{noformat}
Operator  #Rows  Est. #Rows
--
…
|  10:HASH JOIN   9.53M  18.14K 
|  |--19:EXCHANGE 1   1
|  |  00:SCAN HDFS1   1
|  06:NESTED LOOP JOIN4.88B 863.84K 
|  |--18:EXCHANGE 1   1
|  |  04:SCAN HDFS1   1
|  05:HASH JOIN   9.53M 863.84K
{noformat}

If the above is to be believed, the 06 nested loop join produced 5 billion 
rows. But, the actual number is far too huge for that: joining 1 row with 10 
million rows cannot produce 500 times that number of rows.

It appears that the nested loop join actually processed and returned the 9.5 
million rows, since that is the same number produced by the 10 hash join which 
joins a single row with the output of the nested loop join.

Because this same bogus result appears across multiple plans, it is likely that 
the actual number is completely wrong and bears no relation to the number of 
rows actually returned.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7887) NumericLiteral fails to detect numeric overflow

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7887.
-
Resolution: Fixed

> NumericLiteral fails to detect numeric overflow
> ---
>
> Key: IMPALA-7887
> URL: https://issues.apache.org/jira/browse/IMPALA-7887
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The {{NumericLiteral}} constructor takes a value and a type. The code does 
> not check that the value is within range of the type, allowing nonsensical 
> values:
> {code:java}
> NumericLiteral n = new NumericLiteral(new BigDecimal("123.45"),
> ScalarType.createDecimalType(3, 1));
> System.out.println(n.getValue().toString());
> n = new NumericLiteral(new BigDecimal(Integer.MAX_VALUE),
> Type.TINYINT);
> System.out.println(n.getValue().toString());
> {code}
> Prints:
> {noformat}
> 123.45
> 2147483647
> {noformat}
> The value 123.45 is not valid for DECIMAL(3,1), nor is 2^31 valid for TINYINT.
> According to the SQL-2016 standard, section 4.4:
> bq. If an assignment of some number would result in a loss of its most 
> significant digit, an exception condition is raised.
> The purpose of the constructor appears to be for "friendly" use where the 
> caller promises not to create incorrect literals. Better would be to enforce 
> this rule.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7886) NumericLiteral constructor fails to round values to Decimal type

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7886.
-
Resolution: Fixed

> NumericLiteral constructor fails to round values to Decimal type
> 
>
> Key: IMPALA-7886
> URL: https://issues.apache.org/jira/browse/IMPALA-7886
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> According to the SQL spec, section 4.4, regarding numeric assignment:
> bq. If least significant digits are lost, implementation defined rounding or 
> truncating occurs, with no exception condition being raised
> When creating a {{NumericLiteral}} via the constructor, no rounding occurs. 
> The value has more precision than specified by the type.
> Create a NumericLiteral and test it as follows:
> {code:java}
>     NumericLiteral n = new NumericLiteral(new BigDecimal("1.567"),
> ScalarType.createDecimalType(2, 1));
>     assertEquals(ScalarType.createDecimalType(2, 1), n.getType());
>     assertEquals("1.6", n.getValue().toString());
> {code}
> The above test fails because the value in the literal is “1.567”; it has not 
> been rounded to fit the type of the literal as required by the SQL standard.
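
For illustration, the rounding that the test above expects can be reproduced with {{BigDecimal.setScale()}}. This is only a sketch: the class name {{DecimalRounding}} is hypothetical, and {{RoundingMode.HALF_UP}} is an assumption, since the SQL standard leaves the rounding behavior to the implementation.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, not Impala code.
final class DecimalRounding {
  // Round a value to the scale of a DECIMAL(precision, scale) type.
  // HALF_UP is an assumed rounding mode for this sketch.
  static BigDecimal roundToScale(BigDecimal value, int scale) {
    return value.setScale(scale, RoundingMode.HALF_UP);
  }

  public static void main(String[] args) {
    // 1.567 rounded to one fractional digit prints 1.6
    System.out.println(roundToScale(new BigDecimal("1.567"), 1));
  }
}
```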






[jira] [Resolved] (IMPALA-7896) Literals should not need explicit analyze step

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7896.
-
Resolution: Fixed

> Literals should not need explicit analyze step
> --
>
> Key: IMPALA-7896
> URL: https://issues.apache.org/jira/browse/IMPALA-7896
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The Impala FE has the concept of a _literal_ (string, boolean, null, number). 
> Originally, literals could only be created as part of the AST. Hence, all 
> literals are subclasses of {{LiteralExpr}} which are {{ExprNodes}}. The 
> analysis step is used to set the type of the literal numbers, when not known 
> at create time. If literals were used only in the AST, this would be fine: 
> they could be analyzed with an analyzer.
> In fact, as the code has evolved, {{LiteralExpr}} nodes are created via the 
> catalog, which has no analyzer. To fudge the issue, the 
> {{LiteralExpr.create()}} function does analysis with a null analyzer. This, 
> in turn, means that the {{analyze()}} code needs to special-case a null 
> analyzer, which leads to brittle, error-prone code.
> Since literals are immutable (except, sadly, for type), it is better that 
> they start analyzed. Since the only attribute which must be set is the type, 
> and the type can be known at create time, we can make {{analyze()}} an 
> optional no-op, leading to cleaner semantics.
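
The idea of a literal that "starts analyzed" can be sketched as follows. Everything here ({{TypedLiteral}}, {{Kind}}, the factory methods) is hypothetical and much simpler than Impala's {{LiteralExpr}} hierarchy; it only illustrates fixing the type at creation so that {{analyze()}} has nothing left to do.

```java
// Hypothetical, simplified stand-in for a literal node; not Impala code.
final class TypedLiteral {
  enum Kind { BOOLEAN, STRING, BIGINT }

  private final Object value;
  private final Kind type;

  // The type is supplied at construction, so no analyzer is needed later.
  private TypedLiteral(Object value, Kind type) {
    this.value = value;
    this.type = type;
  }

  static TypedLiteral of(boolean v) { return new TypedLiteral(v, Kind.BOOLEAN); }
  static TypedLiteral of(String v) { return new TypedLiteral(v, Kind.STRING); }
  static TypedLiteral of(long v) { return new TypedLiteral(v, Kind.BIGINT); }

  Object value() { return value; }
  Kind type() { return type; }

  // A literal is fully formed as soon as it exists.
  boolean isAnalyzed() { return true; }

  // Kept only for callers that analyze every node; intentionally a no-op.
  void analyze() { }

  public static void main(String[] args) {
    TypedLiteral lit = TypedLiteral.of(42);
    System.out.println(lit.type() + " analyzed=" + lit.isAnalyzed());
  }
}
```

With this shape, code that creates literals outside the AST (for example from a catalog) never needs to pass a null analyzer.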






[jira] [Resolved] (IMPALA-7891) Analyzer does not detect numeric overflow in CAST

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7891.
-
Resolution: Fixed

> Analyzer does not detect numeric overflow in CAST
> -
>
> Key: IMPALA-7891
> URL: https://issues.apache.org/jira/browse/IMPALA-7891
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Consider the following SQL:
> {code:sql}
> SELECT CAST(257 AS TINYINT) AS c FROM functional.alltypestiny
> {code}
> Run this in the shell:
> {noformat}
> +----------------------+
> | cast(257 as tinyint) |
> +----------------------+
> | 1                    |
> +----------------------+
> {noformat}
> The SQL-2016 standard, section 4.4 states:
> bq. If an assignment of some number would result in a loss of its most 
> significant digit, an exception condition is raised.
> Expected: an error rather than a wrong result.
> This is not as simple as it appears. The BE is written in C++, which does not 
> detect integer overflow. So, one could argue that the behavior is correct: 
> Impala makes no guarantees about integer overflow.
> On the other hand, the math above is actually done in the planner; the 
> serialized plan contains an incorrect value. One could argue that the planner 
> should be more strict than the runtime, so that this is an error.
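
The wrap-around shown above, and a stricter alternative, can be sketched in plain Java. The class and method names are hypothetical; this is not the planner's actual constant-folding code, only an illustration of an unchecked versus a checked narrowing to the TINYINT range.

```java
// Hypothetical illustration, not Impala code.
final class CheckedCast {
  // Unchecked narrowing keeps only the low 8 bits, so 257 wraps to 1.
  static byte uncheckedToTinyInt(long value) {
    return (byte) value;
  }

  // Checked narrowing rejects values outside -128..127 instead of wrapping.
  static byte checkedToTinyInt(long value) {
    if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
      throw new ArithmeticException("TINYINT overflow: " + value);
    }
    return (byte) value;
  }

  public static void main(String[] args) {
    System.out.println(uncheckedToTinyInt(257)); // prints 1
    try {
      checkedToTinyInt(257);
    } catch (ArithmeticException e) {
      System.out.println(e.getMessage());
    }
  }
}
```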






[jira] [Resolved] (IMPALA-7894) Parser does not catch double overflow

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7894.
-
Resolution: Fixed

> Parser does not catch double overflow
> -
>
> Key: IMPALA-7894
> URL: https://issues.apache.org/jira/browse/IMPALA-7894
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The test case {{ParserTest.TestNumericLiteralMinMaxValues()}} incorrectly 
> expects success on {{DOUBLE}} constant overflow and underflow:
> {code:java}
> ParsesOk(String.format("select %s1", Double.toString(Double.MIN_VALUE)));
> ParsesOk(String.format("select %s1", Double.toString(Double.MAX_VALUE)));
> {code}
> These values are actually out of range as tested by 
> {{NumericLiteral.analyzeImpl()}}. However, the parser uses a code path that 
> bypasses this check.
> Expected that the above tests will fail, not succeed.
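
To see why the two literals in the test are out of range, note that appending "1" to the printed value extends the exponent (for example, E308 becomes E3081). A minimal sketch of a magnitude check using {{BigDecimal}} follows; {{DoubleLiteralRange}} and {{outOfDoubleRange}} are hypothetical names, and the check deliberately ignores rounding to the nearest representable double.

```java
import java.math.BigDecimal;

// Hypothetical helper, not Impala code.
final class DoubleLiteralRange {
  // True if the magnitude of the decimal text is non-zero and lies outside
  // the range [Double.MIN_VALUE, Double.MAX_VALUE].
  static boolean outOfDoubleRange(String text) {
    BigDecimal abs = new BigDecimal(text).abs();
    if (abs.signum() == 0) {
      return false; // zero is always representable
    }
    return abs.compareTo(new BigDecimal(Double.MAX_VALUE)) > 0
        || abs.compareTo(new BigDecimal(Double.MIN_VALUE)) < 0;
  }

  public static void main(String[] args) {
    // Same construction as the "%s1" format strings in the test case.
    String tooSmall = Double.toString(Double.MIN_VALUE) + "1";
    String tooLarge = Double.toString(Double.MAX_VALUE) + "1";
    System.out.println(tooSmall + " out of range: " + outOfDoubleRange(tooSmall));
    System.out.println(tooLarge + " out of range: " + outOfDoubleRange(tooLarge));
  }
}
```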






[jira] [Resolved] (IMPALA-7888) Incorrect NumericLiteral overflow checks for FLOAT, DOUBLE

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved IMPALA-7888.
-
Resolution: Fixed

> Incorrect NumericLiteral overflow checks for FLOAT, DOUBLE
> --
>
> Key: IMPALA-7888
> URL: https://issues.apache.org/jira/browse/IMPALA-7888
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Consider the following (new) unit test:
> {code:java}
> assertFalse(NumericLiteral.isOverflow(BigDecimal.ZERO, Type.FLOAT));
> {code}
> This test fails (that is, the value zero, so the method claims, overflows a 
> FLOAT.)
> The reason is a misunderstanding of the meaning of {{MIN_VALUE}} for Float:
> {code:java}
>   case FLOAT:
> return (value.compareTo(BigDecimal.valueOf(Float.MAX_VALUE)) > 0 ||
> value.compareTo(BigDecimal.valueOf(Float.MIN_VALUE)) < 0);
> {code}
> For Float, {{MIN_VALUE}} is the smallest positive number that Float can 
> represent:
> {code:java}
> public static final float MIN_VALUE = 0x0.000002P-126f; // 1.4e-45f
> {code}
> The value that the Impala code wants to check is {{-Float.MAX_VALUE}}.
> The only reason that this is not marked as more serious is that the method 
> appears to be used in only one place, and that place does not use {{FLOAT}} 
> values.
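
The difference between the two lower bounds can be shown in a self-contained sketch. The class {{FloatOverflow}} and its two methods are hypothetical stand-ins: {{isOverflowBuggy}} mirrors the FLOAT case quoted above, and {{isOverflowFixed}} uses {{-Float.MAX_VALUE}} as the lower bound.

```java
import java.math.BigDecimal;

// Hypothetical stand-in for the FLOAT branch of the overflow check; not Impala code.
final class FloatOverflow {
  // Mirrors the quoted check: Float.MIN_VALUE is the smallest positive float,
  // so zero and every negative value are wrongly reported as overflow.
  static boolean isOverflowBuggy(BigDecimal value) {
    return value.compareTo(BigDecimal.valueOf(Float.MAX_VALUE)) > 0
        || value.compareTo(BigDecimal.valueOf(Float.MIN_VALUE)) < 0;
  }

  // Uses the most negative finite float, -Float.MAX_VALUE, as the lower bound.
  static boolean isOverflowFixed(BigDecimal value) {
    return value.compareTo(BigDecimal.valueOf(Float.MAX_VALUE)) > 0
        || value.compareTo(BigDecimal.valueOf(-Float.MAX_VALUE)) < 0;
  }

  public static void main(String[] args) {
    System.out.println(isOverflowBuggy(BigDecimal.ZERO)); // prints true
    System.out.println(isOverflowFixed(BigDecimal.ZERO)); // prints false
  }
}
```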






[jira] [Created] (IMPALA-8027) KRPC datastream timing out on both the receiver and sender side even in a minicluster

2018-12-28 Thread Bikramjeet Vig (JIRA)
Bikramjeet Vig created IMPALA-8027:
--

 Summary: KRPC datastream timing out on both the receiver and 
sender side even in a minicluster
 Key: IMPALA-8027
 URL: https://issues.apache.org/jira/browse/IMPALA-8027
 Project: IMPALA
  Issue Type: Bug
  Components: Distributed Exec
Affects Versions: Impala 3.2.0
Reporter: Bikramjeet Vig
Assignee: Michael Ho


KRPC datastreams seem to time out at the same time at both sender and receiver, 
causing two running queries to fail. This happened while running core tests on 
S3.

Logs from coordinator:
{noformat}
I1228 05:18:56.202587 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed 
out waiting for receiver fragment instance: 8f46b2518734bef1:6ef2d404, 
dest node: 2
I1228 05:18:56.203061 13396 rpcz_store.cc:265] Call 
impala.DataStreamService.TransmitData from 127.0.0.1:53118 (request call id 
11274) took 120782ms. Request Metrics: {}
I1228 05:18:56.203114 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed 
out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, 
dest node: 2
I1228 05:18:56.203136 13396 rpcz_store.cc:265] Call 
impala.DataStreamService.TransmitData from 127.0.0.1:53110 (request call id 
8637) took 123811ms. Request Metrics: {}
I1228 05:18:56.203155 13396 krpc-data-stream-mgr.cc:353] Sender 127.0.0.1 timed 
out waiting for receiver fragment instance: 194f5b70907ac97c:84a116d6, 
dest node: 2
I1228 05:18:56.203167 13396 rpcz_store.cc:265] Call 
impala.DataStreamService.TransmitData from 127.0.0.1:53118 (request call id 
11273) took 123776ms. Request Metrics: {}
I1228 05:18:56.203181 13396 krpc-data-stream-mgr.cc:408] Reduced stream ID 
cache from 413 items, to 410, eviction took: 1ms
I1228 05:18:56.204746 13377 coordinator.cc:707] Backend completed: 
host=impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22001 
remaining=2 query_id=8f46b2518734bef1:6ef2d404
I1228 05:18:56.204756 13377 coordinator-backend-state.cc:262] 
query_id=8f46b2518734bef1:6ef2d404: first in-progress backend: 
impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22000
I1228 05:18:56.204769 13377 coordinator.cc:522] ExecState: query 
id=8f46b2518734bef1:6ef2d404 
finstance=8f46b2518734bef1:6ef2d4040001 on 
host=impala-ec2-centos74-m5-4xlarge-ondemand-07b3.vpc.cloudera.com:22001 
(EXECUTING -> ERROR) status=Sender 127.0.0.1 timed out waiting for receiver 
fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2
{noformat}
Logs from executor:
{noformat}
E1228 05:18:56.203181 26715 krpc-data-stream-sender.cc:343] channel send to 
127.0.0.1:27000 failed: 
(fragment_instance_id=8f46b2518734bef1:6ef2d404): Sender 127.0.0.1 
timed out waiting for receiver fragment instance: 
8f46b2518734bef1:6ef2d404, dest node: 2
E1228 05:18:56.203256 26682 krpc-data-stream-sender.cc:343] channel send to 
127.0.0.1:27000 failed: 
(fragment_instance_id=194f5b70907ac97c:84a116d6): Sender 127.0.0.1 
timed out waiting for receiver fragment instance: 
194f5b70907ac97c:84a116d6, dest node: 2
I1228 05:18:56.203451 26715 query-state.cc:576] Instance completed. 
instance_id=8f46b2518734bef1:6ef2d4040001 #in-flight=3 
status=DATASTREAM_SENDER_TIMEOUT: Sender 127.0.0.1 timed out waiting for 
receiver fragment instance: 8f46b2518734bef1:6ef2d404, dest node: 2
I1228 05:18:56.203485 26713 query-state.cc:249] UpdateBackendExecState(): last 
report for 8f46b2518734bef1:6ef2d404
I1228 05:18:56.203514 26682 query-state.cc:576] Instance completed. 
instance_id=194f5b70907ac97c:84a116d60003 #in-flight=2 
status=DATASTREAM_SENDER_TIMEOUT: Sender 127.0.0.1 timed out waiting for 
receiver fragment instance: 194f5b70907ac97c:84a116d6, dest node: 2
I1228 05:18:56.203536 26680 query-state.cc:249] UpdateBackendExecState(): last 
report for 194f5b70907ac97c:84a116d6
{noformat}






[jira] [Work started] (IMPALA-8023) Fix PlannerTest to handle error lines consistently

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8023 started by Paul Rogers.
---
> Fix PlannerTest to handle error lines consistently
> --
>
> Key: IMPALA-8023
> URL: https://issues.apache.org/jira/browse/IMPALA-8023
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> {{PlannerTest}} works by running a query from a .test file, generating a 
> plan, and comparing that plan to a "golden" expected result. It works well for 
> most cases. We can use Eclipse's diff tools to compare the actual with 
> expected files, and to copy across any expected changes that result from 
> changes to the planner code.
> One case that does *not* work well is exceptions. When PlannerTest 
> encounters a failure, it emits a line such as the following to the actual 
> results file:
> {noformat}
> org.apache.impala.common.NotImplementedException: Scan of table 't' in format 
> 'RC_FILE' is not supported because the table has a column 's' with a complex 
> type 'STRUCT'.
> {noformat}
> Yet, in order for the comparison to pass, the golden file must contain the 
> error in the following form:
> {noformat}
> NotImplementedException: Scan of table 'functional.complextypes_fileformat' 
> in format 'TEXT' is not supported because the table has a column 's' with a 
> complex type 'STRUCT'.
> {noformat}
> Note that the actual output includes the package prefix, whereas the expected 
> error must *not* include that prefix.
> The result is that:
> * When comparing files, one must learn to ignore the differences between 
> these lines: the differences are *not* the reason why a test might fail, and
> * When "rebasing" a file, one must copy all expected changes *except* the 
> error lines.
> In short, this is a real nuisance. Use a filter mechanism to fix this once 
> and for all.






[jira] [Updated] (IMPALA-8023) Fix PlannerTest to handle error lines consistently

2018-12-28 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated IMPALA-8023:

Description: 
{{PlannerTest}} works by running a query from a .test file, generating a plan, 
and comparing that plan to a "golden" expected result. It works well for most 
cases. We can use Eclipse's diff tools to compare the actual with expected 
files, and to copy across any expected changes that result from changes to the 
planner code.

One case that does *not* work well is exceptions. When PlannerTest encounters a 
failure, it emits a line such as the following to the actual results 
file:

{noformat}
org.apache.impala.common.NotImplementedException: Scan of table 't' in format 
'RC_FILE' is not supported because the table has a column 's' with a complex 
type 'STRUCT'.
{noformat}

Yet, in order for the comparison to pass, the golden file must contain the 
error in the following form:

{noformat}
NotImplementedException: Scan of table 'functional.complextypes_fileformat' in 
format 'TEXT' is not supported because the table has a column 's' with a 
complex type 'STRUCT'.
{noformat}

Note that the actual output includes the package prefix, whereas the expected 
error must *not* include that prefix.

The result is that:

* When comparing files, one must learn to ignore the differences between these 
lines: the differences are *not* the reason why a test might fail, and
* When "rebasing" a file, one must copy all expected changes *except* the error 
lines.

In short, this is a real nuisance. Use a filter mechanism to fix this once and 
for all.

The problem is that the text appended to the "actual output" is not the same as 
that used for comparison. A simple two-line fix will eliminate this issue.

Current code in {{PlannerTestBase.handleException()}}:

{code:java}
actualOutput.append(e.toString() + "\n");
...
String actualErrorMsg = e.getClass().getSimpleName() + ": " + 
e.getMessage();
{code}

Proposed:

{code:java}
String actualErrorMsg = e.getClass().getSimpleName() + ": " + 
e.getMessage();
actualOutput.append(actualErrorMsg).append("\n");
{code}
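
The mismatch between the two forms can be reproduced outside the test framework. In this sketch, {{ErrorLineFormat}} and its methods are hypothetical names, and the JDK's {{UnsupportedOperationException}} stands in for Impala's {{NotImplementedException}}.

```java
// Hypothetical illustration, not Impala code.
final class ErrorLineFormat {
  // Throwable.toString() yields the fully qualified class name plus the message.
  static String fullForm(Throwable e) {
    return e.toString();
  }

  // The simple class name plus the message, as used for the comparison.
  static String shortForm(Throwable e) {
    return e.getClass().getSimpleName() + ": " + e.getMessage();
  }

  public static void main(String[] args) {
    Throwable e = new UnsupportedOperationException("Scan of table 't' is not supported.");
    System.out.println(fullForm(e));  // starts with "java.lang."
    System.out.println(shortForm(e)); // no package prefix
  }
}
```

Writing the short form to the actual output, as proposed above, makes the emitted line identical to the one used in the comparison.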

  was:
{{PlannerTest}} works by running a query from a .test file, generating a plan, 
and comparing that plan to a "golden" expected result. It work well for most 
cases. We can use Eclipse's diff tools to compare the actual with expected 
files, and to copy across any expected changes that result from changes to the 
planner code.

Once case that does *not* work are exceptions. When PlannerTest indicates 
encounters failure, it emits a line such as the following to the actual results 
file:

{noformat}
org.apache.impala.common.NotImplementedException: Scan of table 't' in format 
'RC_FILE' is not supported because the table has a column 's' with a complex 
type 'STRUCT'.
{noformat}

Yet, in order for the comparison to pass, the golden file must contain the 
error in the following form:

{noformat}
NotImplementedException: Scan of table 'functional.complextypes_fileformat' in 
format 'TEXT' is not supported because the table has a column 's' with a 
complex type 'STRUCT'.
{noformat}

Note that the actual output includes the package prefix, the expected error 
must *not* include that prefix.

The result is that:

* When comparing files, one must learn to ignore the differences between these 
lines: the differences are *not* the reason why a test might fail, and
* When "rebasing" a file, one must copy all expected changes *except* the error 
lines.

In short, this is a real nuisance. Use a filter mechanism to fix this once and 
for all.



> Fix PlannerTest to handle error lines consistently
> --
>
> Key: IMPALA-8023
> URL: https://issues.apache.org/jira/browse/IMPALA-8023
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> {{PlannerTest}} works by running a query from a .test file, generating a 
> plan, and comparing that plan to a "golden" expected result. It works well for 
> most cases. We can use Eclipse's diff tools to compare the actual with 
> expected files, and to copy across any expected changes that result from 
> changes to the planner code.
> One case that does *not* work well is exceptions. When PlannerTest 
> encounters a failure, it emits a line such as the following to the actual 
> results file:
> {noformat}
> org.apache.impala.common.NotImplementedException: Scan of table 't' in format 
> 'RC_FILE' is not supported because the table has a column 's' with a complex 
> type 'STRUCT'.
> {noformat}
> Yet, in order for the comparison to pass, the golden file must contain the 
> error in the following form:
> {noformat}
> NotImplementedException: Scan of table 

[jira] [Created] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate

2018-12-28 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8028:
-

 Summary: BETWEEN predicate failures - BetweenPredicate needs to be 
rewritten into a CompoundPredicate
 Key: IMPALA-8028
 URL: https://issues.apache.org/jira/browse/IMPALA-8028
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers


It appears that Impala fails to handle some BETWEEN predicates. These result in 
the error:
{noformat}
IllegalStateException: BetweenPredicate needs to be rewritten into a 
CompoundPredicate.
{noformat}
Attaching test cases in file.

Tested on
{noformat}
impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 
7effb62de5add60eb071ae5331e80a42cf7b0dc1)
{noformat}
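
For context, the rewrite that the error message refers to is the standard SQL equivalence between BETWEEN and a pair of comparisons. The sketch below works on plain strings and is only an illustration: {{BetweenRewrite}} and {{rewrite}} are hypothetical names and do not reflect how Impala's expression tree or rewrite rules are implemented.

```java
// Hypothetical string-level illustration; not Impala code.
final class BetweenRewrite {
  // a BETWEEN lo AND hi      becomes  (a >= lo AND a <= hi)
  // a NOT BETWEEN lo AND hi  becomes  (a < lo OR a > hi)
  static String rewrite(String expr, String lo, String hi, boolean negated) {
    return negated
        ? "(" + expr + " < " + lo + " OR " + expr + " > " + hi + ")"
        : "(" + expr + " >= " + lo + " AND " + expr + " <= " + hi + ")";
  }

  public static void main(String[] args) {
    System.out.println(rewrite("col0", "1", "10", false));
    System.out.println(rewrite("col0", "1", "10", true));
  }
}
```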






[jira] [Updated] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate

2018-12-28 Thread Greg Rahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn updated IMPALA-8028:
--
Labels: ansi-sql  (was: )

> BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a 
> CompoundPredicate
> 
>
> Key: IMPALA-8028
> URL: https://issues.apache.org/jira/browse/IMPALA-8028
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Greg Rahn
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ansi-sql
> Attachments: BetweenPredicate.sql
>
>
> It appears that Impala fails to handle some BETWEEN predicates. These result 
> in the error:
> {noformat}
> IllegalStateException: BetweenPredicate needs to be rewritten into a 
> CompoundPredicate.
> {noformat}
> Attaching test cases in file.
> Tested on
> {noformat}
> impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 
> 7effb62de5add60eb071ae5331e80a42cf7b0dc1)
> {noformat}






[jira] [Updated] (IMPALA-8029) Support DISTINCT with aggregates

2018-12-28 Thread Greg Rahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn updated IMPALA-8029:
--
Labels: ansi-sql  (was: )

> Support DISTINCT with aggregates
> 
>
> Key: IMPALA-8029
> URL: https://issues.apache.org/jira/browse/IMPALA-8029
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Greg Rahn
>Assignee: Paul Rogers
>Priority: Major
>  Labels: ansi-sql
>
> The following is valid syntax, but throws an error in Impala 3.2.0:
> {noformat}
> sql> select distinct sum(col0) from tab0;
> ERROR: AnalysisException: cannot combine SELECT DISTINCT with aggregate 
> functions or GROUP BY
> {noformat}






[jira] [Created] (IMPALA-8029) Support DISTINCT with aggregates

2018-12-28 Thread Greg Rahn (JIRA)
Greg Rahn created IMPALA-8029:
-

 Summary: Support DISTINCT with aggregates
 Key: IMPALA-8029
 URL: https://issues.apache.org/jira/browse/IMPALA-8029
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Greg Rahn
Assignee: Paul Rogers


The following is valid syntax, but throws an error in Impala 3.2.0:
{noformat}
sql> select distinct sum(col0) from tab0;
ERROR: AnalysisException: cannot combine SELECT DISTINCT with aggregate 
functions or GROUP BY
{noformat}






[jira] [Updated] (IMPALA-8028) BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a CompoundPredicate

2018-12-28 Thread Greg Rahn (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn updated IMPALA-8028:
--
Attachment: BetweenPredicate.sql

> BETWEEN predicate failures - BetweenPredicate needs to be rewritten into a 
> CompoundPredicate
> 
>
> Key: IMPALA-8028
> URL: https://issues.apache.org/jira/browse/IMPALA-8028
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.2.0
>Reporter: Greg Rahn
>Assignee: Paul Rogers
>Priority: Major
> Attachments: BetweenPredicate.sql
>
>
> It appears that Impala fails to handle some BETWEEN predicates. These result 
> in the error:
> {noformat}
> IllegalStateException: BetweenPredicate needs to be rewritten into a 
> CompoundPredicate.
> {noformat}
> Attaching test cases in file.
> Tested on
> {noformat}
> impalad version 3.2.0-cdh6.x-SNAPSHOT RELEASE (build 
> 7effb62de5add60eb071ae5331e80a42cf7b0dc1)
> {noformat}


