date:20181105

[jira] [Work started] (IMPALA-7233) Impala 3.1 Doc: Doc the support for IANA time zone database

2018-11-05 Thread Alex Rodoni (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7233 started by Alex Rodoni.
---
> Impala 3.1 Doc: Doc the support for IANA time zone database
> ---
>
> Key: IMPALA-7233
> URL: https://issues.apache.org/jira/browse/IMPALA-7233
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Resolved] (IMPALA-4063) Make fragment instance reports per-query (or per-host) instead of per-fragment instance.

2018-11-05 Thread Michael Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho resolved IMPALA-4063.

   Resolution: Fixed
Fix Version/s: Impala 3.2.0

The proposed Impala 3.1.0 release will not include the fix for IMPALA-7213, 
which the fix for this Jira depends on, so the fix version is marked as 3.2.0 
for now.

> Make fragment instance reports per-query (or per-host) instead of 
> per-fragment instance.
> 
>
> Key: IMPALA-4063
> URL: https://issues.apache.org/jira/browse/IMPALA-4063
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.7.0
>Reporter: Sailesh Mukil
>Assignee: Michael Ho
>Priority: Major
>  Labels: impala-scalability-sprint-08-13-2018, performance
> Fix For: Impala 3.2.0
>
>
> Currently we send a report per-fragment instance to the coordinator every 5 
> seconds (by default; modifiable via query option 'status_report_interval').
> For queries with a large number of fragment instances, this generates 
> tremendous amounts of network traffic to the coordinator, which will only be 
> aggravated with a higher DOP.
> We should instead queue per-fragment instance reports and send out a 
> per-query report to the coordinator.
> For code references, see:
> PlanFragmentExecutor::ReportProfile()
> PlanFragmentExecutor::SendReport()
> FragmentExecState::ReportStatusCb()
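
To make the traffic argument concrete, here is a rough back-of-the-envelope sketch in Python. The cluster sizes are made-up numbers and `reports_per_interval` is a hypothetical helper, not an Impala function; it only counts how many reports reach the coordinator in one reporting interval under each scheme.

```python
def reports_per_interval(num_backends: int, instances_per_backend: int, merged: bool) -> int:
    """Count status reports the coordinator receives in one reporting interval.

    merged=False: every fragment instance sends its own report.
    merged=True:  each backend sends a single report covering all of its
                  fragment instances for the query.
    """
    if merged:
        return num_backends
    return num_backends * instances_per_backend


# Hypothetical query: 100 backends, 16 fragment instances on each.
per_instance = reports_per_interval(100, 16, merged=False)
per_backend = reports_per_interval(100, 16, merged=True)
print(per_instance, per_backend)  # 1600 100
```

Under these assumed numbers the report count per interval drops by a factor equal to the number of instances per backend, which is why a higher DOP makes the per-instance scheme worse.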




[jira] [Commented] (IMPALA-4063) Make fragment instance reports per-query (or per-host) instead of per-fragment instance.

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675960#comment-16675960
 ] 

ASF subversion and git services commented on IMPALA-4063:
-

Commit 941038229ae7073ddf7b9c6f58e9eaf866b89b2c in impala's branch 
refs/heads/master from Michael Ho
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=9410382 ]

IMPALA-4063: Merge report of query fragment instances per executor

Previously, each fragment instance executing on an executor
independently reported its status to the coordinator periodically.
This created a huge number of RPCs to the coordinator under highly
concurrent workloads, causing lock contention in the coordinator's
backend states when multiple fragment instances sent reports at the
same time. In addition, due to the lack of coordination between query
fragment instances, a query could end without collecting the profiles
from all fragment instances when one of them hit an error before
another fragment instance managed to finish Prepare(), leading to
missing profiles for certain fragment instances.

This change fixes the problems above by making a thread per QueryState
(started by QueryExecMgr) responsible for periodically reporting
the status and profiles of all fragment instances of a query running
on a backend. As part of this refactoring, query fragment instances
no longer report their errors individually. Instead, there is a cumulative
status maintained per QueryState. It is set to the error status of the first
fragment instance that hits an error, or to any general error (e.g. failure
to start a thread) hit when starting fragment instances. With this change,
the status reporting threads are also removed.
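
The merged reporting and first-error-wins status described above can be sketched roughly as follows. This is an illustrative Python model only, not the C++ code in the commit: `QueryStateSketch`, its fields, and the report layout are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class QueryStateSketch:
    """Hypothetical stand-in for the per-query state on one backend."""
    query_id: str
    # Latest profile snapshot per fragment instance id.
    profiles: Dict[str, str] = field(default_factory=dict)
    # Cumulative status: None means OK, otherwise the first error seen.
    first_error: Optional[str] = None

    def update_instance(self, instance_id: str, profile: str,
                        error: Optional[str] = None) -> None:
        self.profiles[instance_id] = profile
        # Only the first error is kept; later errors do not overwrite it.
        if error is not None and self.first_error is None:
            self.first_error = error

    def build_report(self) -> dict:
        """Build one merged report covering every instance on this backend."""
        return {
            "query_id": self.query_id,
            "status": "OK" if self.first_error is None else self.first_error,
            "instance_profiles": dict(self.profiles),
        }


state = QueryStateSketch("q1")
state.update_instance("f0-i0", "rows=10")
state.update_instance("f1-i0", "rows=0", error="scan failed")
state.update_instance("f1-i1", "rows=3", error="out of memory")
report = state.build_report()
print(report["status"], len(report["instance_profiles"]))  # scan failed 3
```

In this model a single periodic caller of `build_report()` replaces the per-instance senders, and the report still carries a profile for every instance even when one of them failed early.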

Testing done: exhaustive tests

This patch is based on a patch by Sailesh Mukil

Change-Id: I5f95e026ba05631f33f48ce32da6db39c6f421fa
Reviewed-on: http://gerrit.cloudera.org:8080/11615
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Make fragment instance reports per-query (or per-host) instead of 
> per-fragment instance.
> 
>
> Key: IMPALA-4063
> URL: https://issues.apache.org/jira/browse/IMPALA-4063
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.7.0
>Reporter: Sailesh Mukil
>Assignee: Michael Ho
>Priority: Major
>  Labels: impala-scalability-sprint-08-13-2018, performance
>
> Currently we send a report per-fragment instance to the coordinator every 5 
> seconds (by default; modifiable via query option 'status_report_interval').
> For queries with a large number of fragment instances, this generates 
> tremendous amounts of network traffic to the coordinator, which will only be 
> aggravated with a higher DOP.
> We should instead queue per-fragment instance reports and send out a 
> per-query report to the coordinator.
> For code references, see:
> PlanFragmentExecutor::ReportProfile()
> PlanFragmentExecutor::SendReport()
> FragmentExecState::ReportStatusCb()




[jira] [Commented] (IMPALA-7103) Impala 3.1 Doc: Query option to enable EC

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675959#comment-16675959
 ] 

ASF subversion and git services commented on IMPALA-7103:
-

Commit f7d89ef4ed3c03f88730226aaa5ae80fee541cda in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=f7d89ef ]

IMPALA-7103: [DOCS] Document the ALLOW_ERASURE_CODED_FILES query option

Change-Id: I63b54031b725e528196d19eac9ddf36a19c43e28
Reviewed-on: http://gerrit.cloudera.org:8080/11855
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 


> Impala 3.1 Doc: Query option to enable EC
> -
>
> Key: IMPALA-7103
> URL: https://issues.apache.org/jira/browse/IMPALA-7103
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
> Fix For: Impala 3.1.0
>
>
> https://gerrit.cloudera.org/#/c/11855/




[jira] [Updated] (IMPALA-7815) Impala 3.1 Release Notes

2018-11-05 Thread Alex Rodoni (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7815:

Target Version: Impala 3.1.0

> Impala 3.1 Release Notes
> 
>
> Key: IMPALA-7815
> URL: https://issues.apache.org/jira/browse/IMPALA-7815
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
>
> Update:
> impala_keydefs.ditamap
> impala_release_notes.ditamap
> impala_fixed_issues.xml
> impala_incompatible_changes.xml
> impala_new_features.xml




[jira] [Assigned] (IMPALA-7528) Division by zero when computing cardinalities of many to many joins on NULL columns

2018-11-05 Thread Bikramjeet Vig (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-7528:
--

Assignee: Bikramjeet Vig  (was: Adrian Ng)

> Division by zero when computing cardinalities of many to many joins on NULL 
> columns
> ---
>
> Key: IMPALA-7528
> URL: https://issues.apache.org/jira/browse/IMPALA-7528
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.12.0
>Reporter: Balazs Jeszenszky
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: planner
>
> The following:
> {code:java}
> | F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1 |
> | Per-Host Resources: mem-estimate=33.94MB mem-reservation=1.94MB|
> | 02:HASH JOIN [INNER JOIN, BROADCAST]   |
> | |  hash predicates: b.code = a.code|
> | |  fk/pk conjuncts: none   |
> | |  runtime filters: RF000 <- a.code|
> | |  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB |
> | |  tuple-ids=1,0 row-size=163B cardinality=9223372036854775807 |
> | |  |
> | |--03:EXCHANGE [BROADCAST] |
> | |  |  mem-estimate=0B mem-reservation=0B   |
> | |  |  tuple-ids=0 row-size=82B cardinality=823 |
> | |  |   |
> | |  F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1  |
> | |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B |
> | |  00:SCAN HDFS [default.sample_07 a, RANDOM]  |
> | | partitions=1/1 files=1 size=44.98KB  |
> | | stats-rows=823 extrapolated-rows=disabled|
> | | table stats: rows=823 size=44.98KB   |
> | | column stats: all|
> | | mem-estimate=32.00MB mem-reservation=0B  |
> | | tuple-ids=0 row-size=82B cardinality=823 |
> | |  |
> | 01:SCAN HDFS [default.sample_08 b, RANDOM] |
> |partitions=1/1 files=1 size=44.99KB |
> |runtime filters: RF000 -> b.code|
> |stats-rows=823 extrapolated-rows=disabled   |
> |table stats: rows=823 size=44.99KB  |
> |column stats: all   |
> |mem-estimate=32.00MB mem-reservation=0B |
> |tuple-ids=1 row-size=82B cardinality=823|
> ++
> {code}
> is the result of both join columns having 0 as NDV.
> https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/org/apache/impala/planner/JoinNode.java#L368
> should handle this more gracefully.
> IMPALA-7310 makes it a bit more likely that someone will run into this. 
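
As an illustration of what "handle this more gracefully" could mean, here is a minimal Python sketch of a join cardinality estimate with a guard for zero NDVs. It uses the generic textbook formula |L| * |R| / max(NDV_L, NDV_R), which is an assumption and not necessarily the formula in JoinNode.java; the function name and the -1 "unknown" sentinel are also hypothetical.

```python
def estimate_join_cardinality(lhs_rows: int, rhs_rows: int,
                              lhs_ndv: int, rhs_ndv: int) -> int:
    """Generic estimate: |L| * |R| / max(NDV_L, NDV_R).

    Returns -1 (meaning "unknown") when both NDVs are zero, instead of
    dividing by zero.
    """
    max_ndv = max(lhs_ndv, rhs_ndv)
    if max_ndv <= 0:
        return -1
    return round(lhs_rows * rhs_rows / max_ndv)


print(estimate_join_cardinality(823, 823, 823, 823))  # 823
print(estimate_join_cardinality(823, 823, 0, 0))      # -1
```

Whether "unknown" is the right fallback is a design choice; the point of the sketch is only that the zero-NDV case is checked before the division.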




[jira] [Created] (IMPALA-7815) Impala 3.1 Release Notes

2018-11-05 Thread Alex Rodoni (JIRA)

Alex Rodoni created IMPALA-7815:
---

 Summary: Impala 3.1 Release Notes
 Key: IMPALA-7815
 URL: https://issues.apache.org/jira/browse/IMPALA-7815
 Project: IMPALA
  Issue Type: Task
  Components: Docs
Reporter: Alex Rodoni
Assignee: Alex Rodoni


Update:

impala_keydefs.ditamap
impala_release_notes.ditamap
impala_fixed_issues.xml
impala_incompatible_changes.xml
impala_new_features.xml




[jira] [Closed] (IMPALA-7103) Impala 3.1 Doc: Query option to enable EC

2018-11-05 Thread Alex Rodoni (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-7103.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Impala 3.1 Doc: Query option to enable EC
> -
>
> Key: IMPALA-7103
> URL: https://issues.apache.org/jira/browse/IMPALA-7103
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
> Fix For: Impala 3.1.0
>
>
> https://gerrit.cloudera.org/#/c/11855/




[jira] [Assigned] (IMPALA-7814) AggregationNode's memory estimate should be based on NDV only for non-grouping aggs

2018-11-05 Thread Pooja Nilangekar (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pooja Nilangekar reassigned IMPALA-7814:


Assignee: Pooja Nilangekar

> AggregationNode's memory estimate should be based on NDV only for 
> non-grouping aggs 
> 
>
> Key: IMPALA-7814
> URL: https://issues.apache.org/jira/browse/IMPALA-7814
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Pooja Nilangekar
>Assignee: Pooja Nilangekar
>Priority: Major
>
> Currently, the AggregationNode always computes the NDV to estimate the number 
> of rows. However, for grouping aggregates, the entire input has to be 
> consumed before the output can be produced, hence its memory estimate should 
> not consider the NDV. This is acceptable for non-grouping aggregates because 
> they only need to store the value expression during the build phase, instead of 
> the entire tuple. 




[jira] [Updated] (IMPALA-7814) AggregationNode's memory estimate should be based on NDV only for non-grouping aggs

2018-11-05 Thread Pooja Nilangekar (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pooja Nilangekar updated IMPALA-7814:
-
Description: Currently, the AggregationNode always computes the NDV to 
estimate the number of rows. However, for grouping aggregates, the entire input 
has to be consumed before the output can be produced, hence its memory estimate 
should not consider the NDV. This is acceptable for non-grouping aggregates 
because they only need to store the value expression during the build phase, 
instead of the entire tuple. 
Summary: AggregationNode's memory estimate should be based on NDV only 
for non-grouping aggs   (was: Aggregation Node's memory estimate should be 
based on NDV only for non-grouping aggs )

> AggregationNode's memory estimate should be based on NDV only for 
> non-grouping aggs 
> 
>
> Key: IMPALA-7814
> URL: https://issues.apache.org/jira/browse/IMPALA-7814
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Pooja Nilangekar
>Priority: Major
>
> Currently, the AggregationNode always computes the NDV to estimate the number 
> of rows. However, for grouping aggregates, the entire input has to be 
> consumed before the output can be produced, hence its memory estimate should 
> not consider the NDV. This is acceptable for non-grouping aggregates because 
> they only need to store the value expression during the build phase, instead of 
> the entire tuple. 




[jira] [Updated] (IMPALA-7814) Aggregation Node's memory estimate should be based on NDV only for non-grouping aggs

2018-11-05 Thread Pooja Nilangekar (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pooja Nilangekar updated IMPALA-7814:
-
Summary: Aggregation Node's memory estimate should be based on NDV only for 
non-grouping aggs   (was: Aggregation Node)

> Aggregation Node's memory estimate should be based on NDV only for 
> non-grouping aggs 
> -
>
> Key: IMPALA-7814
> URL: https://issues.apache.org/jira/browse/IMPALA-7814
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Pooja Nilangekar
>Priority: Major
>





[jira] [Created] (IMPALA-7814) Aggregation Node

2018-11-05 Thread Pooja Nilangekar (JIRA)

Pooja Nilangekar created IMPALA-7814:


 Summary: Aggregation Node
 Key: IMPALA-7814
 URL: https://issues.apache.org/jira/browse/IMPALA-7814
 Project: IMPALA
  Issue Type: Sub-task
Reporter: Pooja Nilangekar







[jira] [Resolved] (IMPALA-402) Add test for dynamic partition expr involving rand()

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-402.
--
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Add test for dynamic partition expr involving rand()
> 
>
> Key: IMPALA-402
> URL: https://issues.apache.org/jira/browse/IMPALA-402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 1.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, 
> Impala 2.8.0, Impala 2.9.0
> Environment: CentOS 6.3
>Reporter: Benyi Wang
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> I found two problems:
> * "Insert overwrite table" doesn't clean up the directory (external table)
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1
> -rw-r--r--   2 impala supergroup  16088 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119435_641430213_data.0
> -rw-r--r--   2 impala supergroup 100691 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119436_1260163059_data.0
> -rw-r--r--   2 impala supergroup  43875 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119437_929705780_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=2
> -rw-r--r--   2 impala supergroup  8 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=2/-8660787917599456385--5527614477985301990_1328141055_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=3
> -rw-r--r--   2 impala supergroup  8 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=3/-8660787917599456385--5527614477985301990_501684742_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b
> -rw-r--r--   2 impala supergroup  16130 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146409_792816606_data.0
> -rw-r--r--   2 impala supergroup 100728 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146410_157404218_data.0
> -rw-r--r--   2 impala supergroup  43796 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146411_157404218_data.0
> {code}
> * When I ran the following queries, all output files were put into the same 
> partition. 
> {code}
> create table tmp_abc (
>   customer_id string,
>   email string
> ) partitioned by (slot string)
> row format delimited fields terminated by '\t' lines terminated by '\n'
> stored as TextFile
> location '/user/benyiw/tmp_abc';
> insert overwrite table tmp_abc partition (slot) select customer_id, email, 
> case when slot1 < 0.10 then "a" when slot1 < 0.70 then "b" else "c" end as 
> slot from ( select customer_id, email, rand() as slot1 from (select 
> customer_id, max(email) as email, sum(case when seg_num >= 0 then 1 else 0 
> end) as included from customers where ( (seg_num in (1) and member = 'Y') or 
> (seg_num = -1) ) and site_key = 'a_site' and coll_def_id = 'everything' group 
> by customer_id having included > 0 ) a ) b
> {code}
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x   - impala supergroup  0 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a
> -rw-r--r--   2 impala supergroup  16021 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985492_909811936_data.0
> -rw-r--r--   2 impala supergroup 100713 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985493_272258764_data.0
> -rw-r--r--   2 impala supergroup  43920 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985494_272258764_data.0
> {code}
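
For reference, the bucketing done by the CASE expression in the query above can be modelled with a short Python sketch. It is not Impala code, and the report does not state why all rows landed in one partition; the second half of the sketch merely reproduces that symptom under the assumption that a single random value is reused for every row. `slot_for` is a hypothetical helper.

```python
import random
from collections import Counter


def slot_for(value: float) -> str:
    """Mirror of the CASE expression: < 0.10 -> a, < 0.70 -> b, else c."""
    if value < 0.10:
        return "a"
    if value < 0.70:
        return "b"
    return "c"


rng = random.Random(0)  # fixed seed so the run is repeatable

# Random value drawn once per row: rows spread over all three slots.
per_row = Counter(slot_for(rng.random()) for _ in range(10_000))

# Random value drawn a single time and reused: every row gets the same slot.
single = rng.random()
reused = Counter(slot_for(single) for _ in range(10_000))

print(sorted(per_row))  # ['a', 'b', 'c']
print(len(reused))      # 1
```

With a per-row random value one would expect roughly 10%, 60%, and 30% of the rows in slots a, b, and c, so a listing that shows only `slot=a` is a strong sign that the expression is not being evaluated per row.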




[jira] [Commented] (IMPALA-402) Add test for dynamic partition expr involving rand()

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675907#comment-16675907
 ] 

ASF subversion and git services commented on IMPALA-402:


Commit 58cd69ac48d4014ef956a7df9dce63c0b8f122c4 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=58cd69a ]

IMPALA-402: test for random partitioning in insert

This adds a basic regression test for the bug reported in IMPALA-402.

Testing:
Exhaustive build.

Looped the modified test overnight.

Change-Id: I4bbca5c64977cadf79dabd72f0c8876a40fdf410
Reviewed-on: http://gerrit.cloudera.org:8080/11799
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add test for dynamic partition expr involving rand()
> 
>
> Key: IMPALA-402
> URL: https://issues.apache.org/jira/browse/IMPALA-402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 1.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, 
> Impala 2.8.0, Impala 2.9.0
> Environment: CentOS 6.3
>Reporter: Benyi Wang
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> I found two problems:
> * "Insert overwrite table" doesn't clean up the directory (external table)
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1
> -rw-r--r--   2 impala supergroup  16088 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119435_641430213_data.0
> -rw-r--r--   2 impala supergroup 100691 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119436_1260163059_data.0
> -rw-r--r--   2 impala supergroup  43875 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119437_929705780_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=2
> -rw-r--r--   2 impala supergroup  8 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=2/-8660787917599456385--5527614477985301990_1328141055_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=3
> -rw-r--r--   2 impala supergroup  8 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=3/-8660787917599456385--5527614477985301990_501684742_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b
> -rw-r--r--   2 impala supergroup  16130 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146409_792816606_data.0
> -rw-r--r--   2 impala supergroup 100728 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146410_157404218_data.0
> -rw-r--r--   2 impala supergroup  43796 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146411_157404218_data.0
> {code}
> * When I ran the following queries, all output files were put into the same 
> partition. 
> {code}
> create table tmp_abc (
>   customer_id string,
>   email string
> ) partitioned by (slot string)
> row format delimited fields terminated by '\t' lines terminated by '\n'
> stored as TextFile
> location '/user/benyiw/tmp_abc';
> insert overwrite table tmp_abc partition (slot) select customer_id, email, 
> case when slot1 < 0.10 then "a" when slot1 < 0.70 then "b" else "c" end as 
> slot from ( select customer_id, email, rand() as slot1 from (select 
> customer_id, max(email) as email, sum(case when seg_num >= 0 then 1 else 0 
> end) as included from customers where ( (seg_num in (1) and member = 'Y') or 
> (seg_num = -1) ) and site_key = 'a_site' and coll_def_id = 'everything' group 
> by customer_id having included > 0 ) a ) b
> {code}
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x   - impala supergroup  0 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a
> -rw-r--r--   2 impala supergroup  16021 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985492_909811936_data.0
> -rw-r--r--   2 impala supergroup 100713 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985493_272258764_data.0
> -rw-r--r--   2 impala supergroup  43920 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985494_272258764_data.0
> {code}




[jira] [Commented] (IMPALA-6249) Expose several build flags via web UI

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675903#comment-16675903
 ] 

ASF subversion and git services commented on IMPALA-6249:
-

Commit 691f9d9ff98e90da7c5552d4a883b4fb28acb6b0 in impala's branch 
refs/heads/master from [~stakiar]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=691f9d9 ]

IMPALA-6249: Expose several build flags via web UI

Exposes a list of build flags via the impalad web UI. The build flags
can be viewed on the root page under the "Version" section. They can
be accessed via other tests through the debug version of the root page
(e.g. adding  to the URL). The build flags are listed in a JSON
array so that they can be parsed easily. This should help run Impala
tests against a remote Impala cluster.

The build flags are read in CMakeLists.txt and then stored in
preprocessor variables.

Three build flags are exposed as part of this commit:
- Is_NDEBUG = [true, false]
  - Whether NDEBUG was true or false at compile time
- CMake_Build_Type = [DEBUG, RELEASE, ADDRESS_SANITIZER, TIDY, UBSAN,
  UBSAN_FULL, TSAN, CODE_COVERAGE_RELEASE, CODE_COVERAGE_DEBUG]
  - The value of CMAKE_BUILD_TYPE at compile time
- Library_Link_Type = [DYNAMIC, STATIC]
  - Derived from the compile time value of BUILD_SHARED_LIBS
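
A test helper that consumes these flags might look roughly like the Python sketch below. The commit message only says the flags are listed in a JSON array; the payload layout, the `build_flags`, `flag_name`, and `flag_value` keys, and the `parse_build_flags` helper are all assumptions made for illustration, so the real page may differ.

```python
import json

# Hypothetical payload; the real field names and layout may differ.
SAMPLE = """
{
  "build_flags": [
    {"flag_name": "Is_NDEBUG", "flag_value": "false"},
    {"flag_name": "CMake_Build_Type", "flag_value": "DEBUG"},
    {"flag_name": "Library_Link_Type", "flag_value": "DYNAMIC"}
  ]
}
"""


def parse_build_flags(payload: str) -> dict:
    """Turn the JSON array of flags into a name -> value dictionary."""
    doc = json.loads(payload)
    return {entry["flag_name"]: entry["flag_value"] for entry in doc["build_flags"]}


flags = parse_build_flags(SAMPLE)
print(flags["CMake_Build_Type"])  # DEBUG
```

Keeping the parsing in one small function means a test against a remote cluster only needs to fetch the page and pass the body in, without depending on a local build directory.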

There are a few other minor changes that are part of this commit:

* The patch modifies environ.py so that it supports fetching build metadata
for both local and remote clusters.

* The tests under the tests/webserver directory were not being run because
'webserver' was not whitelisted in tests/run-tests.py. This patch fixes
that and addresses several test failures in run-tests.py.

* It reverts part of IMPALA-6947 so that there is no dependency from
start-impala-cluster.py to environ.py. The timeout discussed in IMPALA-6947
is now set at compile time.

Testing:

Added new tests to webserver/test_web_pages.py to ensure that the build
flags are being set. Some tests are only run when run against a local
cluster because we have no way of getting the build info from a remote
cluster, whereas local clusters contain a .cmake_build_type file.

Change-Id: I47e3ad4cbf844909bdaf22a6f9d7bd915dce3f19
Reviewed-on: http://gerrit.cloudera.org:8080/11410
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Expose several build flags via web UI
> -
>
> Key: IMPALA-6249
> URL: https://issues.apache.org/jira/browse/IMPALA-6249
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Sahil Takiar
>Priority: Minor
> Attachments: Screen Shot 2018-09-06 at 11.47.45 AM.png
>
>
> IMPALA-6241 added a .cmake_build_type file with the CMAKE_BUILD_TYPE value 
> for the last build. The file is used to detect the type of the build that the 
> python tests are running against. However, this assumes that the tests are 
> running from the same directory that the Impala cluster under test was built 
> from, which isn't necessarily true for all dev workflows and for remote 
> cluster tests.
> It would be convenient if CMAKE_BUILD_TYPE was exposed from the Impalad web 
> UI. Currently we expose DEBUG/RELEASE depending on the value of NDEBUG - see 
> GetVersionString() and impalad-host:25000/?json=true, but we could expose the 
> precise build type, then allow the python tests to parse it from the web UI.
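
A test could pull the build flags out of the JSON form of the root page roughly as follows. This is only a sketch: the flag names come from the commit message, but the JSON layout (a top-level "build_flags" array of objects with "flag_name" and "flag_value" keys) and the helper name parse_build_flags are assumptions for illustration, not the actual impalad output or the code in environ.py.

```python
import json

# Hypothetical payload: the key names below are assumptions, not impalad's real schema.
SAMPLE_ROOT_PAGE_JSON = """
{
  "build_flags": [
    {"flag_name": "Is_NDEBUG", "flag_value": "false"},
    {"flag_name": "CMake_Build_Type", "flag_value": "ADDRESS_SANITIZER"},
    {"flag_name": "Library_Link_Type", "flag_value": "DYNAMIC"}
  ]
}
"""


def parse_build_flags(root_page_json):
    """Return a dict mapping flag name to flag value from a root-page JSON string."""
    page = json.loads(root_page_json)
    return {
        entry["flag_name"]: entry["flag_value"]
        for entry in page.get("build_flags", [])
    }


flags = parse_build_flags(SAMPLE_ROOT_PAGE_JSON)
# A test could then skip or adjust timeouts based on the precise build type.
is_asan = flags.get("CMake_Build_Type") == "ADDRESS_SANITIZER"
print(flags, is_asan)
```

Because the function only takes a string, the same parsing would apply whether the page was fetched from a local or a remote cluster.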



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Commented] (IMPALA-402) Add test for dynamic partition expr involving rand()

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675908#comment-16675908
 ] 

ASF subversion and git services commented on IMPALA-402:


Commit 58cd69ac48d4014ef956a7df9dce63c0b8f122c4 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=58cd69a ]

IMPALA-402: test for random partitioning in insert

This adds a basic regression test for the bug reported in IMPALA-402.

Testing:
Exhaustive build.

Looped the modified test overnight.

Change-Id: I4bbca5c64977cadf79dabd72f0c8876a40fdf410
Reviewed-on: http://gerrit.cloudera.org:8080/11799
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add test for dynamic partition expr involving rand()
> 
>
> Key: IMPALA-402
> URL: https://issues.apache.org/jira/browse/IMPALA-402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 1.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, 
> Impala 2.8.0, Impala 2.9.0
> Environment: CentOS 6.3
>Reporter: Benyi Wang
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> I found two problems:
> * "Insert overwrite table" doesn't clean up the directory (external table)
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1
> -rw-r--r--   2 impala supergroup  16088 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119435_641430213_data.0
> -rw-r--r--   2 impala supergroup 100691 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119436_1260163059_data.0
> -rw-r--r--   2 impala supergroup  43875 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119437_929705780_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=2
> -rw-r--r--   2 impala supergroup  8 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=2/-8660787917599456385--5527614477985301990_1328141055_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=3
> -rw-r--r--   2 impala supergroup  8 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=3/-8660787917599456385--5527614477985301990_501684742_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b
> -rw-r--r--   2 impala supergroup  16130 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146409_792816606_data.0
> -rw-r--r--   2 impala supergroup 100728 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146410_157404218_data.0
> -rw-r--r--   2 impala supergroup  43796 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146411_157404218_data.0
> {code}
> * When I ran the following queries, all output files were put into the same 
> partition. 
> {code}
> create table tmp_abc (
>   customer_id string,
>   email string
> ) partitioned by (slot string)
> row format delimited fields terminated by '\t' lines terminated by '\n'
> stored as TextFile
> location '/user/benyiw/tmp_abc';
> insert overwrite table tmp_abc partition (slot) select customer_id, email, 
> case when slot1 < 0.10 then "a" when slot1 < 0.70 then "b" else "c" end as 
> slot from ( select customer_id, email, rand() as slot1 from (select 
> customer_id, max(email) as email, sum(case when seg_num >= 0 then 1 else 0 
> end) as included from customers where ( (seg_num in (1) and member = 'Y') or 
> (seg_num = -1) ) and site_key = 'a_site' and coll_def_id = 'everything' group 
> by customer_id having included > 0 ) a ) b
> {code}
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x   - impala supergroup  0 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a
> -rw-r--r--   2 impala supergroup  16021 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985492_909811936_data.0
> -rw-r--r--   2 impala supergroup 100713 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985493_272258764_data.0
> -rw-r--r--   2 impala supergroup  43920 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985494_272258764_data.0
> {code}
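
For reference, the CASE expression in the reported query maps slot1 to 'a' below 0.10, to 'b' from 0.10 up to 0.70, and to 'c' otherwise, so the rows would be expected to spread over three partitions in roughly 10%/60%/30% shares rather than all landing in slot=a. The quick simulation below is Python, not Impala code, and assumes rand() is uniform on [0, 1) and evaluated once per row; the function name slot_for is made up for the illustration.

```python
import random
from collections import Counter


def slot_for(slot1):
    """Mirror of the CASE expression in the reported query."""
    if slot1 < 0.10:
        return "a"
    if slot1 < 0.70:
        return "b"
    return "c"


NUM_ROWS = 100000
rng = random.Random(42)  # fixed seed so the run is repeatable
counts = Counter(slot_for(rng.random()) for _ in range(NUM_ROWS))
shares = {slot: counts[slot] / NUM_ROWS for slot in "abc"}
print(shares)
```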



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Commented] (IMPALA-6947) kudu: GetTableLocations RPC timing out with ASAN

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-6947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675904#comment-16675904
 ] 

ASF subversion and git services commented on IMPALA-6947:
-

Commit 691f9d9ff98e90da7c5552d4a883b4fb28acb6b0 in impala's branch 
refs/heads/master from [~stakiar]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=691f9d9 ]

IMPALA-6249: Expose several build flags via web UI

Exposes a list of build flags via the impalad web UI. The build flags
can be viewed on the root page under the "Version" section. They can
be accessed via other tests through the debug version of the root page
(e.g. adding  to the URL). The build flags are listed in a JSON
array so that they can be parsed easily. This should help run Impala
tests against a remote Impala cluster.

The build flags are read in CMakeLists.txt and then stored in
preprocessor variables.

Three build flags are exposed as part of this commit:
- Is_NDEBUG = [true, false]
  - Whether NDEBUG was true or false at compile time
- CMake_Build_Type = [DEBUG, RELEASE, ADDRESS_SANITIZER, TIDY, UBSAN,
  UBSAN_FULL, TSAN, CODE_COVERAGE_RELEASE, CODE_COVERAGE_DEBUG]
  - The value of CMAKE_BUILD_TYPE at compile time
- Library_Link_Type = [DYNAMIC, STATIC]
  - Derived from the compile time value of BUILD_SHARED_LIBS

There are a few other minor changes that are part of this commit:

* The patch modifies environ.py so that it supports fetching build metadata
for both local and remote clusters.

* The tests under the tests/webserver directory were not being run because
'webserver' was not whitelisted in tests/run-tests.py. This patch fixes
that and addresses several test failures in run-tests.py.

* It reverts part of IMPALA-6947 so that there is no dependency from
start-impala-cluster.py to environ.py. The timeout discussed in IMPALA-6947
is now set at compile time.

Testing:

Added new tests to webserver/test_web_pages.py to ensure that the build
flags are being set. Some tests are only run when run against a local
cluster because we have no way of getting the build info from a remote
cluster, whereas local clusters contain a .cmake_build_type file.

Change-Id: I47e3ad4cbf844909bdaf22a6f9d7bd915dce3f19
Reviewed-on: http://gerrit.cloudera.org:8080/11410
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> kudu: GetTableLocations RPC timing out with ASAN
> 
>
> Key: IMPALA-6947
> URL: https://issues.apache.org/jira/browse/IMPALA-6947
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> {noformat}
> query_test/test_kudu.py:84: in test_kudu_insert
> self.run_test_case('QueryTest/kudu_insert', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:398: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:613: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:341: in __execute_query
> self.wait_for_completion(handle)
> beeswax/impala_beeswax.py:361: in wait_for_completion
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E   Query aborted:Kudu error(s) reported, first error: Timed out: 
> GetTableLocations { table: 'impala::test_kudu_insert_70eff904.kudu_test', 
> partition-key: (HASH (a, b): 2), attempt: 1 } failed: GetTableLocations RPC 
> to 127.0.0.1:7051 timed out after 10.000s (SENT)
> E   
> E   Key already present in Kudu table 
> 'impala::test_kudu_insert_70eff904.kudu_test'. (1 of 3 similar)
> E   Error in Kudu table 'impala::test_kudu_insert_70eff904.kudu_test': Timed 
> out: GetTableLocations { table: 
> 'impala::test_kudu_insert_70eff904.kudu_test', partition-key: (HASH (a, b): 
> 2), attempt: 1 } failed: GetTableLocations RPC to 127.0.0.1:7051 timed out 
> after 10.000s (SENT) (1 of 21 similar)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Resolved] (IMPALA-402) Add test for dynamic partition expr involving rand()

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-402.
--
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Add test for dynamic partition expr involving rand()
> 
>
> Key: IMPALA-402
> URL: https://issues.apache.org/jira/browse/IMPALA-402
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 1.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, 
> Impala 2.8.0, Impala 2.9.0
> Environment: CentOS 6.3
>Reporter: Benyi Wang
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> I found two problems:
> * "Insert overwrite table" doesn't clean up the directory (external table)
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1
> -rw-r--r--   2 impala supergroup  16088 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119435_641430213_data.0
> -rw-r--r--   2 impala supergroup 100691 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119436_1260163059_data.0
> -rw-r--r--   2 impala supergroup  43875 2013-06-06 12:46 
> /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119437_929705780_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=2
> -rw-r--r--   2 impala supergroup  8 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=2/-8660787917599456385--5527614477985301990_1328141055_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=3
> -rw-r--r--   2 impala supergroup  8 2013-06-06 12:40 
> /user/benyiw/tmp_abc/slot=3/-8660787917599456385--5527614477985301990_501684742_data.0
> drwxr-xr-x   - impala supergroup  0 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b
> -rw-r--r--   2 impala supergroup  16130 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146409_792816606_data.0
> -rw-r--r--   2 impala supergroup 100728 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146410_157404218_data.0
> -rw-r--r--   2 impala supergroup  43796 2013-06-06 12:47 
> /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146411_157404218_data.0
> {code}
> * When I ran the following queries, all output files were put into the same 
> partition. 
> {code}
> create table tmp_abc (
>   customer_id string,
>   email string
> ) partitioned by (slot string)
> row format delimited fields terminated by '\t' lines terminated by '\n'
> stored as TextFile
> location '/user/benyiw/tmp_abc';
> insert overwrite table tmp_abc partition (slot) select customer_id, email, 
> case when slot1 < 0.10 then "a" when slot1 < 0.70 then "b" else "c" end as 
> slot from ( select customer_id, email, rand() as slot1 from (select 
> customer_id, max(email) as email, sum(case when seg_num >= 0 then 1 else 0 
> end) as included from customers where ( (seg_num in (1) and member = 'Y') or 
> (seg_num = -1) ) and site_key = 'a_site' and coll_def_id = 'everything' group 
> by customer_id having included > 0 ) a ) b
> {code}
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x   - impala supergroup  0 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a
> -rw-r--r--   2 impala supergroup  16021 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985492_909811936_data.0
> -rw-r--r--   2 impala supergroup 100713 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985493_272258764_data.0
> -rw-r--r--   2 impala supergroup  43920 2013-06-06 13:01 
> /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985494_272258764_data.0
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (IMPALA-5031) UBSAN clean and method for testing UBSAN cleanliness

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675906#comment-16675906
 ] 

ASF subversion and git services commented on IMPALA-5031:
-

Commit 78b6f1db69eafaf01408fda444e53513200852f3 in impala's branch 
refs/heads/master from [~jbapple]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=78b6f1d ]

IMPALA-5031: Make UBSAN-friendly arithmetic generic

ArithmeticUtil::AsUnsigned() makes it possible to do arithmetic on
signed integers in a way that does not invoke undefined behavior, but
it only works on integers. This patch adds ArithmeticUtil::Compute(),
which dispatches (at compile time) to the normal arithmetic evaluation
method if the type of the values is a floating point type, but uses
AsUnsigned() if the type of the values is an integral type.

Change-Id: I73bec71e59c5a921003d0ebca52a1d4e49bbef66
Reviewed-on: http://gerrit.cloudera.org:8080/11810
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins 


> UBSAN clean and method for testing UBSAN cleanliness
> 
>
> Key: IMPALA-5031
> URL: https://issues.apache.org/jira/browse/IMPALA-5031
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend, Infrastructure
>Affects Versions: Impala 2.9.0
>Reporter: Jim Apple
>Assignee: Jim Apple
>Priority: Minor
>
> http://releases.llvm.org/3.8.0/tools/clang/docs/UndefinedBehaviorSanitizer.html
>  builds are supported after https://gerrit.cloudera.org/#/c/6186/, but 
> Impala's test suite triggers many errors under UBSAN. Those errors should be 
> fixed and then there should be a way to run the test suite under UBSAN and 
> fail if there were any errors detected.
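
The dispatch described in the commit message above can be illustrated outside C++. The sketch below is Python, not Impala code: it models 64-bit two's-complement wraparound with a bit mask (standing in for what unsigned arithmetic provides in C++) and dispatches at run time on the operand type, whereas the patch dispatches at compile time. The names as_unsigned, to_signed and compute are hypothetical, and the integer path is only meaningful for addition, subtraction and multiplication.

```python
import operator

BITS = 64
MASK = (1 << BITS) - 1
INT64_MAX = (1 << (BITS - 1)) - 1
INT64_MIN = -(1 << (BITS - 1))


def as_unsigned(x):
    """Reinterpret a signed 64-bit value as unsigned (two's complement)."""
    return x & MASK


def to_signed(x):
    """Reinterpret an unsigned 64-bit value as signed."""
    x &= MASK
    return x - (1 << BITS) if x >= (1 << (BITS - 1)) else x


def compute(op, a, b):
    """Apply op directly for floats; wrap modulo 2**64 for integers.

    The integer path is valid for add, sub and mul only.
    """
    if isinstance(a, float) or isinstance(b, float):
        return op(a, b)
    return to_signed(op(as_unsigned(a), as_unsigned(b)))


print(compute(operator.add, INT64_MAX, 1))   # wraps around to INT64_MIN
print(compute(operator.mul, 1.5, 2.0))       # floats use normal arithmetic
```

In C++ the point of going through unsigned types is that unsigned overflow is well defined, while signed overflow is undefined behavior that UBSAN reports.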



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Created] (IMPALA-7813) Non-deterministic partition expression may be redundantly evaluated

2018-11-05 Thread Tim Armstrong (JIRA)

Tim Armstrong created IMPALA-7813:
-

 Summary: Non-deterministic partition expression may be redundantly 
evaluated
 Key: IMPALA-7813
 URL: https://issues.apache.org/jira/browse/IMPALA-7813
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.0, Impala 2.11.0
Reporter: Tim Armstrong


Following on from IMPALA-402, I believe there is another issue where the 
partition value evaluated in the exchange, the sort and the table sink may all 
turn out to be different values. This would break various clustering 
optimisations and result in small files.
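
To illustrate the concern: if each stage evaluates a non-deterministic expression on its own, the stages can disagree about which partition a row belongs to, whereas evaluating it once and passing the value along keeps them consistent. The sketch below is a Python simulation, not Impala planner code; the stage names come from the description above, and the function names and the three-way partition expression are made up for the illustration.

```python
import random


def partition_expr(rng):
    """A non-deterministic partition expression with three possible values."""
    return rng.randrange(3)


def evaluate_per_stage(rng, num_rows):
    """Each stage re-evaluates the expression; count rows where stages disagree."""
    mismatches = 0
    for _ in range(num_rows):
        exchange, sort, sink = (partition_expr(rng) for _ in range(3))
        if not (exchange == sort == sink):
            mismatches += 1
    return mismatches


def evaluate_once(rng, num_rows):
    """The expression is evaluated once per row and reused by all three stages."""
    mismatches = 0
    for _ in range(num_rows):
        value = partition_expr(rng)
        exchange = sort = sink = value
        if not (exchange == sort == sink):
            mismatches += 1
    return mismatches


rng = random.Random(0)
print("re-evaluated per stage:", evaluate_per_stage(rng, 10000))
print("evaluated once:", evaluate_once(rng, 10000))
```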



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (IMPALA-691) Process mem limit does not account for the JVM's memory usage

2018-11-05 Thread Tim Armstrong (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675898#comment-16675898
 ] 

Tim Armstrong commented on IMPALA-691:
--

It would be useful to add this as an option sometime soon - it makes it easier 
for people deploying Impala to plan for memory consumption. Currently you need 
to manually add the process memory limit and JVM heap together.

> Process mem limit does not account for the JVM's memory usage
> -
>
> Key: IMPALA-691
> URL: https://issues.apache.org/jira/browse/IMPALA-691
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 1.2.1, Impala 2.0, Impala 2.1, Impala 2.2, Impala 
> 2.3.0
>Reporter: Skye Wanderman-Milne
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: incompatibility, resource-management
>
> The JVM doesn't appear to use malloc, so its memory usage is not reported by 
> tcmalloc and we do not count it in the process mem limit. I verified this by 
> adding a large allocation in the FE, and noting that the total memory usage 
> (virtual or resident) reported in /memz is not affected, but the virtual and 
> resident memory usage reported by top is.
> This is problematic especially because Impala caches table metadata in the FE 
> (JVM) which can become quite big (few GBs) in extreme cases.
> *Workaround*
> As a workaround, we recommend reducing the process memory limit by 1-2GB to 
> "reserve" memory for the JVM. How much memory you should reserve typically 
> depends on the size of your catalog (number of 
> tables/partitions/columns/blocks, etc.).
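
The arithmetic behind the workaround is a simple subtraction, sketched below in Python. The 1-2GB reserve comes from the description above; the 64 GB budget and the function name process_mem_limit are made-up values for the example, not recommendations or Impala settings.

```python
GB = 1024 ** 3


def process_mem_limit(total_budget_bytes, jvm_reserve_bytes):
    """Memory left for the process memory limit after reserving room for the JVM."""
    if jvm_reserve_bytes < 0 or jvm_reserve_bytes >= total_budget_bytes:
        raise ValueError("JVM reserve must be non-negative and smaller than the budget")
    return total_budget_bytes - jvm_reserve_bytes


# Example only: 64 GB budget for the impalad, 2 GB reserved for the JVM.
limit = process_mem_limit(64 * GB, 2 * GB)
print(limit // GB)
```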



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Created] (IMPALA-7812) Count JVM memory usage against process memory limit by default

2018-11-05 Thread Tim Armstrong (JIRA)

Tim Armstrong created IMPALA-7812:
-

 Summary: Count JVM memory usage against process memory limit by 
default
 Key: IMPALA-7812
 URL: https://issues.apache.org/jira/browse/IMPALA-7812
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Tim Armstrong






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Created] (IMPALA-7811) Add flag to count JVM memory against process limit

2018-11-05 Thread Tim Armstrong (JIRA)

Tim Armstrong created IMPALA-7811:
-

 Summary: Add flag to count JVM memory against process limit
 Key: IMPALA-7811
 URL: https://issues.apache.org/jira/browse/IMPALA-7811
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Tim Armstrong
Assignee: Tim Armstrong






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (IMPALA-691) Process mem limit does not account for the JVM's memory usage

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-691:


Assignee: Tim Armstrong

> Process mem limit does not account for the JVM's memory usage
> -
>
> Key: IMPALA-691
> URL: https://issues.apache.org/jira/browse/IMPALA-691
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 1.2.1, Impala 2.0, Impala 2.1, Impala 2.2, Impala 
> 2.3.0
>Reporter: Skye Wanderman-Milne
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: incompatibility, resource-management
>
> The JVM doesn't appear to use malloc, so its memory usage is not reported by 
> tcmalloc and we do not count it in the process mem limit. I verified this by 
> adding a large allocation in the FE, and noting that the total memory usage 
> (virtual or resident) reported in /memz is not affected, but the virtual and 
> resident memory usage reported by top is.
> This is problematic especially because Impala caches table metadata in the FE 
> (JVM) which can become quite big (few GBs) in extreme cases.
> *Workaround*
> As a workaround, we recommend reducing the process memory limit by 1-2GB to 
> "reserve" memory for the JVM. How much memory you should reserve typically 
> depends on the size of your catalog (number of 
> tables/partitions/columns/blocks, etc.).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

[jira] [Updated] (IMPALA-7810) query-state.cc:295] Check failed: profile_buf == nullptr

2018-11-05 Thread Michael Ho (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-7810:
---
Labels: broken-build  (was: )

> query-state.cc:295] Check failed: profile_buf == nullptr
> 
>
> Key: IMPALA-7810
> URL: https://issues.apache.org/jira/browse/IMPALA-7810
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Michael Ho
>Priority: Blocker
>  Labels: broken-build
>
> Apparently, some custom cluster tests hit the following DCHECK in 
> {{QueryState::ConstructReport()}}. This is a bad DCHECK that was supposed to 
> be removed when that function was refactored during the last couple of 
> review iterations.
> {noformat}
> // Debug action to simulate failure to serialize the profile.
> if (!DebugAction(query_options(), "REPORT_EXEC_STATUS_PROFILE").ok()) {
>   DCHECK(profile_buf == nullptr);
>   return;
> }
> {noformat}
> Anyhow, this is accidentally fixed by the patch for IMPALA-4063, as the 
> offending DCHECK is removed in that patch. The fix for IMPALA-4063 is being 
> merged now.
> {noformat}
> #0  0x7f2d7ebde1f7 in raise () from /lib64/libc.so.6
> #1  0x7f2d7ebdf8e8 in abort () from /lib64/libc.so.6
> #2  0x0451c8e4 in google::DumpStackTraceAndExit() ()
> #3  0x0451333d in google::LogMessage::Fail() ()
> #4  0x04514be2 in google::LogMessage::SendToLog() ()
> #5  0x04512d17 in google::LogMessage::Flush() ()
> #6  0x045162de in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x01eea5a7 in impala::QueryState::ConstructReport 
> (this=0xf86a800, done=false, status=..., fis=0xee01b00, 
> report=0x7f2cdabce200, serializer=0x7f2cdabce1c0, profile_buf=0x7f2cdabce338, 
> profile_len=0x7f2cdabce334) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/query-state.cc:295
> #8  0x01eeacfe in impala::QueryState::ReportExecStatusAux 
> (this=0xf86a800, done=false, status=..., fis=0xee01b00, 
> instances_started=true) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/query-state.cc:334
> #9  0x01eea20d in impala::QueryState::ReportExecStatus 
> (this=0xf86a800, done=false, status=..., fis=0xee01b00) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/query-state.cc:261
> #10 0x01ee0600 in impala::FragmentInstanceState::SendReport 
> (this=0xee01b00, done=false, status=...) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/fragment-instance-state.cc:408
> #11 0x01edfed5 in impala::FragmentInstanceState::ReportProfileThread 
> (this=0xee01b00) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/fragment-instance-state.cc:389
> #12 0x01edc339 in 
> impala::FragmentInstanceStateoperator()(void) const 
> (__closure=0x7f2cdabceba8) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/fragment-instance-state.cc:246
> #13 0x01ee2219 in 
> boost::detail::function::void_function_obj_invoker0,
>  void>::invoke(boost::detail::function::function_buffer &) 
> (function_obj_ptr=...) at 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
> #14 0x01d0595c in boost::function0::operator() 
> (this=0x7f2cdabceba0) at 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
> #15 0x02185773 in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) (name=..., category=..., 
> functor=..., parent_thread_info=0x7f2cdc3d1850, 
> thread_started=0x7f2cdc3d0b90) at 
> /data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/thread.cc:359
> #16 0x0218da93 in boost::_bi::list5, 
> boost::_bi::value, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), boost::_bi::list0&, int) 
> (this=0xe7399c0, f=@0xe7399b8: 0x218540c 
>  boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*)>, a=...) at 
> /data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525
> #17 0x0218d9b7 in boost::_bi::bind_t const&,

[jira] [Resolved] (IMPALA-7777) Fix crash due to arithmetic overflows in Exchange Node

2018-11-05 Thread Sahil Takiar (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-7777.
--
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Fix crash due to arithmetic overflows in Exchange Node
> --
>
> Key: IMPALA-7777
> URL: https://issues.apache.org/jira/browse/IMPALA-7777
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> A follow up to IMPALA-5004. Impala allows a value of LIMIT and OFFSET up to 
> 2^63. However, if a user tries to run a query with a large offset (e.g. 
> slightly lower than 2^63), the query will crash the impalad due to a 
> {{DCHECK_LE}} in {{row-batch.h}}.




[jira] [Created] (IMPALA-7810) query-state.cc:295] Check failed: profile_buf == nullptr

2018-11-05 Thread Michael Ho (JIRA)

Michael Ho created IMPALA-7810:
--

 Summary: query-state.cc:295] Check failed: profile_buf == nullptr
 Key: IMPALA-7810
 URL: https://issues.apache.org/jira/browse/IMPALA-7810
 Project: IMPALA
  Issue Type: Bug
  Components: Distributed Exec
Affects Versions: Impala 3.2.0
Reporter: Michael Ho
Assignee: Michael Ho


Apparently, some custom cluster tests hit the following DCHECK in 
{{QueryState::ConstructReport()}}. This is a bad DCHECK that was supposed to 
be removed when that function was refactored during the last couple of 
review iterations.

{noformat}
// Debug action to simulate failure to serialize the profile.
if (!DebugAction(query_options(), "REPORT_EXEC_STATUS_PROFILE").ok()) {
  DCHECK(profile_buf == nullptr);
  return;
}
{noformat}

Anyhow, this is incidentally fixed by the patch for IMPALA-4063, which removes 
the offending DCHECK. The fix for IMPALA-4063 is being merged now.

{noformat}
#0  0x7f2d7ebde1f7 in raise () from /lib64/libc.so.6
#1  0x7f2d7ebdf8e8 in abort () from /lib64/libc.so.6
#2  0x0451c8e4 in google::DumpStackTraceAndExit() ()
#3  0x0451333d in google::LogMessage::Fail() ()
#4  0x04514be2 in google::LogMessage::SendToLog() ()
#5  0x04512d17 in google::LogMessage::Flush() ()
#6  0x045162de in google::LogMessageFatal::~LogMessageFatal() ()
#7  0x01eea5a7 in impala::QueryState::ConstructReport (this=0xf86a800, 
done=false, status=..., fis=0xee01b00, report=0x7f2cdabce200, 
serializer=0x7f2cdabce1c0, profile_buf=0x7f2cdabce338, 
profile_len=0x7f2cdabce334) at 
/data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/query-state.cc:295
#8  0x01eeacfe in impala::QueryState::ReportExecStatusAux 
(this=0xf86a800, done=false, status=..., fis=0xee01b00, instances_started=true) 
at 
/data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/query-state.cc:334
#9  0x01eea20d in impala::QueryState::ReportExecStatus (this=0xf86a800, 
done=false, status=..., fis=0xee01b00) at 
/data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/query-state.cc:261
#10 0x01ee0600 in impala::FragmentInstanceState::SendReport 
(this=0xee01b00, done=false, status=...) at 
/data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/fragment-instance-state.cc:408
#11 0x01edfed5 in impala::FragmentInstanceState::ReportProfileThread 
(this=0xee01b00) at 
/data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/fragment-instance-state.cc:389
#12 0x01edc339 in 
impala::FragmentInstanceStateoperator()(void) const 
(__closure=0x7f2cdabceba8) at 
/data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/runtime/fragment-instance-state.cc:246
#13 0x01ee2219 in 
boost::detail::function::void_function_obj_invoker0,
 void>::invoke(boost::detail::function::function_buffer &) 
(function_obj_ptr=...) at 
/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
#14 0x01d0595c in boost::function0::operator() 
(this=0x7f2cdabceba0) at 
/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
#15 0x02185773 in impala::Thread::SuperviseThread(std::string const&, 
std::string const&, boost::function, impala::ThreadDebugInfo const*, 
impala::Promise*) (name=..., category=..., 
functor=..., parent_thread_info=0x7f2cdc3d1850, thread_started=0x7f2cdc3d0b90) 
at 
/data/jenkins/workspace/impala-private-parameterized/repos/Impala/be/src/util/thread.cc:359
#16 0x0218da93 in boost::_bi::list5, 
boost::_bi::value, boost::_bi::value >, 
boost::_bi::value, 
boost::_bi::value*> 
>::operator(), impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, void 
(*&)(std::string const&, std::string const&, boost::function, 
impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, int) (this=0xe7399c0, 
f=@0xe7399b8: 0x218540c , impala::ThreadDebugInfo const*, 
impala::Promise*)>, a=...) at 
/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525
#17 0x0218d9b7 in boost::_bi::bind_t, impala::ThreadDebugInfo const*, 
impala::Promise*), 
boost::_bi::list5, 
boost::_bi::value, boost::_bi::value >, 
boost::_bi::value, 
boost::_bi::value*> > 
>::operator()() (this=0xe7399b8) at 
/data/jenkins/workspace/impala-private-parameterized/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
#18 0x0218d97a in boost::detail::thread_data, 
impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list5,

[jira] [Updated] (IMPALA-7808) Refactor Analyzer for easier debugging

2018-11-05 Thread Paul Rogers (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated IMPALA-7808:

Description: 
The analysis steps in {{SelectStmt}} and {{AnalysisContext}} are large and 
cumbersome. There is ample evidence in the literature that simpler, smaller 
functions are easier to understand and debug than larger, more complex 
functions. This ticket requests breaking up the large functions in these two 
cases into smaller, easier-understood units in preparation for tracking down 
issues related to missing rewrites of the {{WHERE}} and {{GROUP BY}} clauses.

One might argue that large functions perform better by eliminating unnecessary 
function calls. However, the planner is not performance sensitive, and the 
dozen extra calls that this change introduces will not change performance given 
the thousands of calls already made.

Experience has shown that the JIT compiler in the JVM actually does a better 
job optimizing smaller functions, and gives up when functions get too large. So, 
by creating smaller functions, we may actually allow the JIT compiler to 
generate better code.

And, this refactoring is in support of a possible outcome that the planner can 
handle rewrites without making multiple passes through the analyzer: that 
savings will far outweigh the few extra calls this change introduces.

  was:
The analysis steps in {{SelectStmt}} and {{AnalysisContext}} are large and 
cumbersome. There is ample evidence in the literature that simpler, smaller 
functions are easier to understand and debug than larger, more complex 
functions. This ticket requests breaking up the large functions in these two 
cases into smaller, easier-understood units in preparation for tracking down 
issues related to missing rewrites of the {{WHERE}} and {{GROUP BY}} clauses.

One might argue that large functions perform better by eliminating unnecessary 
function calls. However, the planner is not performance sensitive, and the 
dozen extra calls that this change introduces will not change performance given 
the thousands of calls already made.

Experience has shown that the JIT compiler in the JVM actually does a better 
job optimizing smaller functions, and gives up when functions get too large. So, 
by creating smaller functions, we may actually allow the JIT compiler to 
generate better code.

And, this refactoring is in support of a likely outcome that the planner can 
handle rewrites without making multiple passes through the analyzer: that 
savings will far outweigh the few extra calls this change introduces.


> Refactor Analyzer for easier debugging
> --
>
> Key: IMPALA-7808
> URL: https://issues.apache.org/jira/browse/IMPALA-7808
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The analysis steps in {{SelectStmt}} and {{AnalysisContext}} are large and 
> cumbersome. There is ample evidence in the literature that simpler, smaller 
> functions are easier to understand and debug than larger, more complex 
> functions. This ticket requests breaking up the large functions in these two 
> cases into smaller, easier-understood units in preparation for tracking down 
> issues related to missing rewrites of the {{WHERE}} and {{GROUP BY}} clauses.
> One might argue that large functions perform better by eliminating 
> unnecessary function calls. However, the planner is not performance 
> sensitive, and the dozen extra calls that this change introduces will not 
> change performance given the thousands of calls already made.
> Experience has shown that the JIT compiler in the JVM actually does a better 
> job optimizing smaller functions, and gives up when functions get too large. 
> So, by creating smaller functions, we may actually allow the JIT compiler to 
> generate better code.
> And, this refactoring is in support of a possible outcome that the planner 
> can handle rewrites without making multiple passes through the analyzer: that 
> savings will far outweigh the few extra calls this change introduces.




[jira] [Work started] (IMPALA-7809) test_concurrent_schema_change incompatible with Kudu 1.9

2018-11-05 Thread Michael Brown (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7809 started by Michael Brown.
-
> test_concurrent_schema_change incompatible with Kudu 1.9
> 
>
> Key: IMPALA-7809
> URL: https://issues.apache.org/jira/browse/IMPALA-7809
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Michael Brown
>Assignee: Michael Brown
>Priority: Blocker
>  Labels: broken-build
>
> With Kudu 1.9, test_concurrent_schema_change fails because the 
> message format is slightly different.
> {noformat}
> query_test/test_kudu.py:442: in test_concurrent_schema_change assert "has 
> fewer columns (1) than the SELECT / VALUES clause returns (2)" in msg \ E   
> assert ('has fewer columns (1) than the SELECT / VALUES clause returns (2)' 
> in 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first 
> error: Invalid argument: Failed to write batch ...redacted.com:31201): Client 
> provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n' or "(type: TINYINT) is not compatible with column 'col1' (type: 
> STRING)" in 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, 
> first error: Invalid argument: Failed to write batch ...redacted.com:31201): 
> Client provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n' or 'has fewer columns than expected.' in 
> 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
> Invalid argument: Failed to write batch ...redacted.com:31201): Client 
> provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n' or 'Column col1 has unexpected type.' in 
> 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
> Invalid argument: Failed to write batch ...redacted.com:31201): Client 
> provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n' or 'Client provided column col1[int64 NULLABLE] not present in 
> tablet' in 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, 
> first error: Invalid argument: Failed to write batch ...redacted.com:31201): 
> Client provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n')
> Stacktrace
> query_test/test_kudu.py:442: in test_concurrent_schema_change
> assert "has fewer columns (1) than the SELECT / VALUES clause returns 
> (2)" in msg \
> E   assert ('has fewer columns (1) than the SELECT / VALUES clause returns 
> (2)' in 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, 
> first error: Invalid argument: Failed to write batch ...redacted.com:31201): 
> Client provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n' or "(type: TINYINT) is not compatible with column 'col1' (type: 
> STRING)" in 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, 
> first error: Invalid argument: Failed to write batch ...redacted.com:31201): 
> Client provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n' or 'has fewer columns than expected.' in 
> 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
> Invalid argument: Failed to write batch ...redacted.com:31201): Client 
> provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n' or 'Column col1 has unexpected type.' in 
> 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
> Invalid argument: Failed to write batch ...redacted.com:31201): Client 
> provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n' or 'Client provided column col1[int64 NULLABLE] not present in 
> tablet' in 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, 
> first error: Invalid argument: Failed to write batch ...redacted.com:31201): 
> Client provided column col1 INT64 NULLABLE not present in tablet (1 of 2 
> similar)\n\n')
> {noformat}
> We see that the message returned includes {{Client provided column col1 INT64 
> NULLABLE not present in tablet}} and the code is looking for:
> {noformat}
>  437 for error in insert_thread.errors:
>  438   msg = str(error)
>  439   # The first two are AnalysisExceptions, the next two come from KuduTableSink::Open()
>  440   # if the schema has changed since analysis, the last comes from the Kudu server if
>  441   # the schema changes between KuduTableSink::Open() and when the write ops are sent.
>  442   assert "has fewer columns (1) than the SELECT / VALUES clause returns (2)" in msg \
>  443 or "(type: TINYINT) is not compatible with column 'col1' (type: STRING)" in msg \
>  444 or "has

[jira] [Updated] (IMPALA-7808) Refactor Analyzer for easier debugging

2018-11-05 Thread Paul Rogers (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated IMPALA-7808:

Description: 
The analysis steps in {{SelectStmt}} and {{AnalysisContext}} are large and 
cumbersome. There is ample evidence in the literature that simpler, smaller 
functions are easier to understand and debug than larger, more complex 
functions. This ticket requests breaking up the large functions in these two 
cases into smaller, easier-understood units in preparation for tracking down 
issues related to missing rewrites of the {{WHERE}} and {{GROUP BY}} clauses.

One might argue that large functions perform better by eliminating unnecessary 
function calls. However, the planner is not performance sensitive, and the 
dozen extra calls that this change introduces will not change performance given 
the thousands of calls already made.

Experience has shown that the JIT compiler in the JVM actually does a better 
job optimizing smaller functions, and gives up when functions get too large. So, 
by creating smaller functions, we may actually allow the JIT compiler to 
generate better code.

And, this refactoring is in support of a likely outcome that the planner can 
handle rewrites without making multiple passes through the analyzer: that 
savings will far outweigh the few extra calls this change introduces.

  was:
The analysis steps in {{Analyzer}} and {{AnalysisContext}} are large and 
cumbersome. There is ample evidence in the literature that simpler, smaller 
functions are easier to understand and debug than larger, more complex 
functions. This ticket requests breaking up the large functions in these two 
cases into smaller, easier-understood units in preparation for tracking down 
issues related to missing rewrites of the {{WHERE}} and {{GROUP BY}} clauses.

One might argue that large functions perform better by eliminating unnecessary 
function calls. However, the planner is not performance sensitive, and the 
dozen extra calls that this change introduces will not change performance given 
the thousands of calls already made.

Experience has shown that the JIT compiler in the JVM actually does a better 
job optimizing smaller functions, and gives up when functions get too large. So, 
by creating smaller functions, we may actually allow the JIT compiler to 
generate better code.

And, this refactoring is in support of a likely outcome that the planner can 
handle rewrites without making multiple passes through the analyzer: that 
savings will far outweigh the few extra calls this change introduces.


> Refactor Analyzer for easier debugging
> --
>
> Key: IMPALA-7808
> URL: https://issues.apache.org/jira/browse/IMPALA-7808
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The analysis steps in {{SelectStmt}} and {{AnalysisContext}} are large and 
> cumbersome. There is ample evidence in the literature that simpler, smaller 
> functions are easier to understand and debug than larger, more complex 
> functions. This ticket requests breaking up the large functions in these two 
> cases into smaller, easier-understood units in preparation for tracking down 
> issues related to missing rewrites of the {{WHERE}} and {{GROUP BY}} clauses.
> One might argue that large functions perform better by eliminating 
> unnecessary function calls. However, the planner is not performance 
> sensitive, and the dozen extra calls that this change introduces will not 
> change performance given the thousands of calls already made.
> Experience has shown that the JIT compiler in the JVM actually does a better 
> job optimizing smaller functions, and gives up when functions get too large. 
> So, by creating smaller functions, we may actually allow the JIT compiler to 
> generate better code.
> And, this refactoring is in support of a likely outcome that the planner can 
> handle rewrites without making multiple passes through the analyzer: that 
> savings will far outweigh the few extra calls this change introduces.




[jira] [Created] (IMPALA-7809) test_concurrent_schema_change incompatible with Kudu 1.9

2018-11-05 Thread Michael Brown (JIRA)

Michael Brown created IMPALA-7809:
-

 Summary: test_concurrent_schema_change incompatible with Kudu 1.9
 Key: IMPALA-7809
 URL: https://issues.apache.org/jira/browse/IMPALA-7809
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.2.0
Reporter: Michael Brown
Assignee: Michael Brown


With Kudu 1.9, test_concurrent_schema_change fails because the 
message format is slightly different.

{noformat}
query_test/test_kudu.py:442: in test_concurrent_schema_change assert "has 
fewer columns (1) than the SELECT / VALUES clause returns (2)" in msg \ E   
assert ('has fewer columns (1) than the SELECT / VALUES clause returns (2)' in 
'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
Invalid argument: Failed to write batch ...redacted.com:31201): Client provided 
column col1 INT64 NULLABLE not present in tablet (1 of 2 similar)\n\n' or 
"(type: TINYINT) is not compatible with column 'col1' (type: STRING)" in 
'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
Invalid argument: Failed to write batch ...redacted.com:31201): Client provided 
column col1 INT64 NULLABLE not present in tablet (1 of 2 similar)\n\n' or 'has 
fewer columns than expected.' in 'ImpalaBeeswaxException:\n Query aborted:Kudu 
error(s) reported, first error: Invalid argument: Failed to write batch 
...redacted.com:31201): Client provided column col1 INT64 NULLABLE not present 
in tablet (1 of 2 similar)\n\n' or 'Column col1 has unexpected type.' in 
'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
Invalid argument: Failed to write batch ...redacted.com:31201): Client provided 
column col1 INT64 NULLABLE not present in tablet (1 of 2 similar)\n\n' or 
'Client provided column col1[int64 NULLABLE] not present in tablet' in 
'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
Invalid argument: Failed to write batch ...redacted.com:31201): Client provided 
column col1 INT64 NULLABLE not present in tablet (1 of 2 similar)\n\n')
Stacktrace
query_test/test_kudu.py:442: in test_concurrent_schema_change
assert "has fewer columns (1) than the SELECT / VALUES clause returns (2)" 
in msg \
E   assert ('has fewer columns (1) than the SELECT / VALUES clause returns (2)' 
in 'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first 
error: Invalid argument: Failed to write batch ...redacted.com:31201): Client 
provided column col1 INT64 NULLABLE not present in tablet (1 of 2 similar)\n\n' 
or "(type: TINYINT) is not compatible with column 'col1' (type: STRING)" in 
'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
Invalid argument: Failed to write batch ...redacted.com:31201): Client provided 
column col1 INT64 NULLABLE not present in tablet (1 of 2 similar)\n\n' or 'has 
fewer columns than expected.' in 'ImpalaBeeswaxException:\n Query aborted:Kudu 
error(s) reported, first error: Invalid argument: Failed to write batch 
...redacted.com:31201): Client provided column col1 INT64 NULLABLE not present 
in tablet (1 of 2 similar)\n\n' or 'Column col1 has unexpected type.' in 
'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
Invalid argument: Failed to write batch ...redacted.com:31201): Client provided 
column col1 INT64 NULLABLE not present in tablet (1 of 2 similar)\n\n' or 
'Client provided column col1[int64 NULLABLE] not present in tablet' in 
'ImpalaBeeswaxException:\n Query aborted:Kudu error(s) reported, first error: 
Invalid argument: Failed to write batch ...redacted.com:31201): Client provided 
column col1 INT64 NULLABLE not present in tablet (1 of 2 similar)\n\n')
{noformat}

We see that the message returned includes {{Client provided column col1 INT64 
NULLABLE not present in tablet}} and the code is looking for:
{noformat}
 437 for error in insert_thread.errors:
 438   msg = str(error)
 439   # The first two are AnalysisExceptions, the next two come from KuduTableSink::Open()
 440   # if the schema has changed since analysis, the last comes from the Kudu server if
 441   # the schema changes between KuduTableSink::Open() and when the write ops are sent.
 442   assert "has fewer columns (1) than the SELECT / VALUES clause returns (2)" in msg \
 443 or "(type: TINYINT) is not compatible with column 'col1' (type: STRING)" in msg \
 444 or "has fewer columns than expected." in msg \
 445 or "Column col1 has unexpected type." in msg \
 446 or "Client provided column col1[int64 NULLABLE] not present in tablet" in msg
{noformat}
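
One way the check above could tolerate both Kudu message formats is to match the Kudu server message with a pattern instead of a fixed substring. This is only a sketch under assumptions: the helper name is_expected_schema_change_error and the regular expression are hypothetical and are not taken from the actual fix. The message variants are the ones quoted in this report.

```python
import re

# Fixed substrings copied from the assertion in test_kudu.py quoted above.
EXPECTED_SUBSTRINGS = [
    "has fewer columns (1) than the SELECT / VALUES clause returns (2)",
    "(type: TINYINT) is not compatible with column 'col1' (type: STRING)",
    "has fewer columns than expected.",
    "Column col1 has unexpected type.",
]

# Hypothetical pattern: accepts both "col1[int64 NULLABLE]" (the format the
# test currently looks for) and "col1 INT64 NULLABLE" (the format reported
# with Kudu 1.9).
KUDU_MISSING_COLUMN = re.compile(
    r"Client provided column col1(\[int64 NULLABLE\]| INT64 NULLABLE) "
    r"not present in tablet")


def is_expected_schema_change_error(msg):
    """Returns True if msg contains any of the tolerated error messages."""
    if any(s in msg for s in EXPECTED_SUBSTRINGS):
        return True
    return KUDU_MISSING_COLUMN.search(msg) is not None
```

With such a helper, the loop over insert_thread.errors could assert is_expected_schema_change_error(str(error)) for each error, which keeps the list of tolerated messages in one place.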




[jira] [Created] (IMPALA-7808) Refactor Analyzer for easier debugging

2018-11-05 Thread Paul Rogers (JIRA)

Paul Rogers created IMPALA-7808:
---

 Summary: Refactor Analyzer for easier debugging
 Key: IMPALA-7808
 URL: https://issues.apache.org/jira/browse/IMPALA-7808
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Affects Versions: Impala 3.0
Reporter: Paul Rogers
Assignee: Paul Rogers


The analysis steps in {{Analyzer}} and {{AnalysisContext}} are large and 
cumbersome. There is ample evidence in the literature that simpler, smaller 
functions are easier to understand and debug than larger, more complex 
functions. This ticket requests breaking up the large functions in these two 
cases into smaller, easier-understood units in preparation for tracking down 
issues related to missing rewrites of the {{WHERE}} and {{GROUP BY}} clauses.

One might argue that large functions perform better by eliminating unnecessary 
function calls. However, the planner is not performance sensitive, and the 
dozen extra calls that this change introduces will not change performance given 
the thousands of calls already made.

Experience has shown that the JIT compiler in the JVM actually does a better 
job optimizing smaller functions, and gives up when functions get too large. So, 
by creating smaller functions, we may actually allow the JIT compiler to 
generate better code.

And, this refactoring is in support of a likely outcome that the planner can 
handle rewrites without making multiple passes through the analyzer: that 
savings will far outweigh the few extra calls this change introduces.




[jira] [Commented] (IMPALA-7777) Fix crash due to arithmetic overflows in Exchange Node

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-7777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675803#comment-16675803
 ] 

ASF subversion and git services commented on IMPALA-7777:
-

Commit 31669a6703474f27259c8ad52208cd26d5788a1c in impala's branch 
refs/heads/master from stakiar
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=31669a6 ]

IMPALA-7777: Fix crash due to arithmetic overflows in Exchange Node

Fixes an arithmetic overflow in ExchangeNode::GetNextMerging. Prior to
this patch, the code read:

int rows_to_keep = num_rows_skipped_ - offset_;

Where num_rows_skipped_ and offset_ were of type int64_t. The result was
cast to an int which can lead to an overflow if the result exceeds the
value of 2^31. The value of rows_to_keep would be passed into
row-batch.h::CopyRows which would crash due to a DCHECK_LE error.

This crash arises when the value of the OFFSET is a large number, for
example, the query:

select int_col from functional.alltypes order by 1 limit
1 offset 9223372036854775800;

Would crash the Impalad executor for this query.

The fix is to change rows_to_keep to an int64_t to avoid the overflow,
which prevents the DCHECK_LE from failing.

Change-Id: I8bb8064aae6ad25c8a19f6a8869086be7e70400a
Reviewed-on: http://gerrit.cloudera.org:8080/11844
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
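
The narrowing described in the commit message above can be illustrated outside of Impala. The sketch below is in Python rather than C++, and the helper names (to_int32, rows_to_keep_narrow, rows_to_keep_wide) are hypothetical. It only mimics the two's-complement wraparound typically seen when a 64-bit value is stored in a 32-bit int, and is not code from the patch.

```python
def to_int32(value):
    """Wraps an integer to a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF          # keep only the low 32 bits
    if value >= 0x80000000:      # sign bit set: reinterpret as negative
        value -= 0x100000000
    return value


def rows_to_keep_narrow(num_rows_skipped, offset):
    # Mimics the old code: the 64-bit difference is stored in a 32-bit int.
    return to_int32(num_rows_skipped - offset)


def rows_to_keep_wide(num_rows_skipped, offset):
    # Mimics the fix: the difference is kept at full width. Python integers
    # are unbounded, so this assumes the inputs fit in 64 bits.
    return num_rows_skipped - offset
```

For small differences both helpers agree. Once the difference no longer fits in 32 bits, the narrow variant returns a different value, which is the kind of result that a bounds check such as DCHECK_LE would then reject.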


> Fix crash due to arithmetic overflows in Exchange Node
> --
>
> Key: IMPALA-7777
> URL: https://issues.apache.org/jira/browse/IMPALA-7777
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> A follow up to IMPALA-5004. Impala allows a value of LIMIT and OFFSET up to 
> 2^63. However, if a user tries to run a query with a large offset (e.g. 
> slightly lower than 2^63), the query will crash the impalad due to a 
> {{DCHECK_LE}} in {{row-batch.h}}.




[jira] [Resolved] (IMPALA-7775) StatestoreSslTest and SessionExpiryTest can crash in pthread_mutex_lock

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-7775.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> StatestoreSslTest and SessionExpiryTest can crash in pthread_mutex_lock
> ---
>
> Key: IMPALA-7775
> URL: https://issues.apache.org/jira/browse/IMPALA-7775
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 3.1.0
>
>
> {noformat}
> 20:17:28 [==] Running 2 tests from 2 test cases.
> 20:17:28 [--] Global test environment set-up.
> 20:17:28 [--] 1 test from StatestoreTest
> 20:17:28 [ RUN  ] StatestoreTest.SmokeTest
> 20:17:28 [   OK ] StatestoreTest.SmokeTest (24 ms)
> 20:17:28 [--] 1 test from StatestoreTest (24 ms total)
> 20:17:28 
> 20:17:28 [--] 1 test from StatestoreSslTest
> 20:17:28 [ RUN  ] StatestoreSslTest.SmokeTest
> 20:17:28 terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl
>  >'
> 20:17:28   what():  boost: mutex lock failed in pthread_mutex_lock: Invalid 
> argument
> 20:17:28 Wrote minidump to 
> /home/ubuntu/Impala/logs/be_tests/minidumps/statestore-test/63ff46ee-a127-4ef6-5bccb5ba-dc73c28a.dmp
> 20:17:28 Wrote minidump to 
> /home/ubuntu/Impala/logs/be_tests/minidumps/statestore-test/63ff46ee-a127-4ef6-5bccb5ba-dc73c28a.dmp
> 20:17:28 
> {noformat}
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3441
> This smells like a lifecycle bug in the backend test.




[jira] [Resolved] (IMPALA-6436) Impala Catalog generates a core file / mini dump when the HMS is not available

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-6436.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Impala Catalog generates a core file / mini dump when the HMS is not available
> --
>
> Key: IMPALA-6436
> URL: https://issues.apache.org/jira/browse/IMPALA-6436
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.10.0
>Reporter: Luis E Martinez-Poblete
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: supportability
> Fix For: Impala 3.1.0
>
>
> Synopsis:
>  =
>  Impala Catalog generates a core file / mini dump when the HMS is not 
> available
> Problem:
>  
> The Catalog server created multiple core files. The investigation determined 
> that the core files were generated because the Hive Metastore was not 
> available and the option "Enable Core Dump" was enabled when starting the 
> Impala service.
> Below is the back trace of the core file:
> #0 0x7f72e93ee5d7 in raise () from /root/191729/slib/lib64/libc.so.6
>  #1 0x7f72e93efcc8 in abort () from /root/191729/slib/lib64/libc.so.6
>  #2 0x01ba5754 in google::DumpStackTraceAndExit() ()
>  #3 0x01b9c1cd in google::LogMessage::Fail() ()
>  #4 0x01b9da72 in google::LogMessage::SendToLog() ()
>  #5 0x01b9bba7 in google::LogMessage::Flush() ()
>  #6 0x01b9f16e in google::LogMessageFatal::~LogMessageFatal() ()
>  #7 0x0083067e in impala::Catalog::(GetCatalogVersion (this=0x0, 
> version=0x7ffc2aa6b750) at 
> /usr/src/debug/impala-2.10.0-cdh5.13.1/be/src/catalog/catalog.cc:88
>  #8 0x008143c9 in impala::CatalogServer::Start() () at 
> /usr/src/debug/impala-2.10.0-cdh5.13.1/be/src/catalog/catalog-server.cc:175
> The corresponding entries in the Catalog server log show the following fatal 
> error:
> F0111 09:48:05.017491 14571 catalog.cc:76] java.lang.IllegalStateException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
>  at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99)
>  at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:72)
>  at org.apache.impala.catalog.MetaStoreClientPool.initClients(MetaStoreClientPool.java:168)
>  at org.apache.impala.catalog.Catalog.<init>(Catalog.java:103)
>  at org.apache.impala.catalog.CatalogServiceCatalog.<init>(CatalogServiceCatalog.java:163)
>  at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:104)
>  Caused by: java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> I was able to reproduce this issue. When the option "Enable Core Dump" is 
> enabled and the Hive Metastore is not available, the system generates a core 
> file. If the option "Enable Core Dump" is disabled, the system generates a 
> mini dump.
> Crashing due to an error is not expected. Impala should fail in a more 
> user-friendly way.
> Reproduction case:
>  ==
>  1) Enable the option "Enable Core Dump" for the Impala service in CM.
>  2) Stop Hive and Impala services.
>  3) Start Impala Catalog server




[jira] [Commented] (IMPALA-5031) UBSAN clean and method for testing UBSAN cleanliness

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675800#comment-16675800
 ] 

ASF subversion and git services commented on IMPALA-5031:
-

Commit a03e22011dfc6c93818e923d0dc29e6f61fefbe9 in impala's branch 
refs/heads/master from [~jbapple]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a03e220 ]

IMPALA-5031: memcpy cannot take null arguments

This patch fixes UBSAN "null pointer passed as argument" errors in
data loading. These are undefined behavior according to "7.1.4 Use of
library functions" in the C99 standard (which is included in C++14 in
section [intro.refs]):

If an argument to a function has an invalid value (such as a value
outside the domain of the function, or a pointer outside the
address space of the program, or a null pointer, or a pointer to
non-modifiable storage when the corresponding parameter is not
const-qualified) or a type (after promotion) not expected by a
function with variable number of arguments, the behavior is
undefined.

The interesting parts of the backtraces for the errors fixed in this
patch are below:

runtime/string-buffer.h:54:12: runtime error: null pointer passed as 
argument 1, which is declared to never be null
/usr/include/string.h:43:45: note: nonnull attribute specified here
StringBuffer::Append(char const*, long) runtime/string-buffer.h:54:5
ColumnStatsBase::CopyToBuffer(StringBuffer*, StringValue*) 
exec/parquet-column-stats.cc:151:51
ColumnStats::MaterializeStringValuesToInternalBuffers() 
exec/parquet-column-stats.inline.h:237:70
HdfsParquetTableWriter::BaseColumnWriter::MaterializeStatsValues() 
exec/hdfs-parquet-table-writer.cc:149:63
HdfsParquetTableWriter::AppendRows(RowBatch*, vector const&, bool*) 
exec/hdfs-parquet-table-writer.cc:1129:53
HdfsTableSink::WriteRowsToPartition(RuntimeState*, RowBatch*, 
pair >, vector > >*) exec/hdfs-table-sink.cc:256:71
HdfsTableSink::Send(RuntimeState*, RowBatch*) exec/hdfs-table-sink.cc:591:45

util/streaming-sampler.h:111:22: runtime error: null pointer passed as 
argument 2, which is declared to never be null
/usr/include/string.h:43:45: note: nonnull attribute specified here
StreamingSampler::SetSamples(int, vector const&) 
util/streaming-sampler.h:111:5
RuntimeProfile::Update(vector const&, int*) 
util/runtime-profile.cc:313:30
RuntimeProfile::Update(TRuntimeProfileTree const&) 
util/runtime-profile.cc:246:3

Coordinator::BackendState::InstanceStats::Update(TFragmentInstanceExecStatus 
const&, Coordinator::ExecSummary*, ProgressUpdater*) 
runtime/coordinator-backend-state.cc:474:13
Coordinator::BackendState::ApplyExecStatusReport(TReportExecStatusParams 
const&, Coordinator::ExecSummary*, ProgressUpdater*) 
runtime/coordinator-backend-state.cc:287:21
Coordinator::UpdateBackendExecStatus(TReportExecStatusParams const&) 
runtime/coordinator.cc:679:22
ClientRequestState::UpdateBackendExecStatus(TReportExecStatusParams const&) 
service/client-request-state.cc:1254:18
ImpalaServer::ReportExecStatus(TReportExecStatusResult&, 
TReportExecStatusParams const&) service/impala-server.cc:1343:18
ImpalaInternalService::ReportExecStatus(TReportExecStatusResult&, 
TReportExecStatusParams const&) service/impala-internal-service.cc:87:19

Change-Id: Ib9acc8c32409e67253a987eb3d1fd7d921efcb51
Reviewed-on: http://gerrit.cloudera.org:8080/11812
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins 


> UBSAN clean and method for testing UBSAN cleanliness
> 
>
> Key: IMPALA-5031
> URL: https://issues.apache.org/jira/browse/IMPALA-5031
> Project: IMPALA
>  Issue Type: Task
>  Components: Backend, Infrastructure
>Affects Versions: Impala 2.9.0
>Reporter: Jim Apple
>Assignee: Jim Apple
>Priority: Minor
>
> UBSAN (http://releases.llvm.org/3.8.0/tools/clang/docs/UndefinedBehaviorSanitizer.html) 
> builds are supported after https://gerrit.cloudera.org/#/c/6186/, but 
> Impala's test suite triggers many errors under UBSAN. Those errors should be 
> fixed, and then there should be a way to run the test suite under UBSAN and 
> fail if any errors are detected.




[jira] [Commented] (IMPALA-6436) Impala Catalog generates a core file / mini dump when the HMS is not available

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675801#comment-16675801
 ] 

ASF subversion and git services commented on IMPALA-6436:
-

Commit f08642bf43102cf326f4f00e9b9e3536d6906b2c in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=f08642b ]

IMPALA-6436: exit instead of abort for catalog startup failure

Rename EXIT_WITH_EXC to ABORT_WITH_EXC to make the behaviour more
obvious at callsites.

Handle exceptions from Catalog constructor by logging the backtrace and
exiting cleanly, rather than aborting. This will prevent generation of a
coredump or minidump.

Testing:
Tested starting the catalogd locally without the HMS running and a
low connection timeout:

  start-impala-cluster.py --catalogd_args=--initial_hms_cnxn_timeout_s=2

Confirmed that the backtrace was logged to catalogd.ERROR and that no
core or minidump was generated.

Change-Id: I4026dccb39843b847426112fc0fe9ba897e48dcc
Reviewed-on: http://gerrit.cloudera.org:8080/11871
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Impala Catalog generates a core file / mini dump when the HMS is not available
> --
>
> Key: IMPALA-6436
> URL: https://issues.apache.org/jira/browse/IMPALA-6436
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.10.0
>Reporter: Luis E Martinez-Poblete
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: supportability
>
> Synopsis:
>  =
>  Impala Catalog generates a core file / mini dump when the HMS is not 
> available
> Problem:
>  
> The Catalog server created multiple core files. The investigation determined 
> that the core files were generated because the Hive Metastore was not 
> available and the option "Enable Core Dump" was enabled when starting the 
> Impala service.
> Below is the back trace of the core file:
> #0 0x7f72e93ee5d7 in raise () from /root/191729/slib/lib64/libc.so.6
>  #1 0x7f72e93efcc8 in abort () from /root/191729/slib/lib64/libc.so.6
>  #2 0x01ba5754 in google::DumpStackTraceAndExit() ()
>  #3 0x01b9c1cd in google::LogMessage::Fail() ()
>  #4 0x01b9da72 in google::LogMessage::SendToLog() ()
>  #5 0x01b9bba7 in google::LogMessage::Flush() ()
>  #6 0x01b9f16e in google::LogMessageFatal::~LogMessageFatal() ()
>  #7 0x0083067e in impala::Catalog::(GetCatalogVersion (this=0x0, 
> version=0x7ffc2aa6b750) at 
> /usr/src/debug/impala-2.10.0-cdh5.13.1/be/src/catalog/catalog.cc:88
>  #8 0x008143c9 in impala::CatalogServer::Start() () at 
> /usr/src/debug/impala-2.10.0-cdh5.13.1/be/src/catalog/catalog-server.cc:175
> The corresponding entries in the Catalog server log show the following fatal 
> error:
> F0111 09:48:05.017491 14571 catalog.cc:76] java.lang.IllegalStateException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
>  at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99)
>  at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:72)
>  at org.apache.impala.catalog.MetaStoreClientPool.initClients(MetaStoreClientPool.java:168)
>  at org.apache.impala.catalog.Catalog.<init>(Catalog.java:103)
>  at org.apache.impala.catalog.CatalogServiceCatalog.<init>(CatalogServiceCatalog.java:163)
>  at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:104)
>  Caused by: java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> I was able to reproduce this issue. When the option "Enable Core Dump" is 
> enabled and the Hive Metastore is not available, the system generates a core 
> file. If the option "Enable Core Dump" is disabled, the system generates a 
> mini dump.
> Crashing due to an error is not expected. Impala should fail in a more 
> user-friendly way.
> Reproduction case:
>  ==
>  1) Enable the option "Enable Core Dump" for the Impala service in CM.
>  2) Stop Hive and Impala services.
>  3) Start Impala Catalog server




[jira] [Commented] (IMPALA-7775) StatestoreSslTest and SessionExpiryTest can crash in pthread_mutex_lock

2018-11-05 Thread ASF subversion and git services (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675802#comment-16675802
 ] 

ASF subversion and git services commented on IMPALA-7775:
-

Commit 572722e3505a663560e3fff1fe04d8fe932c0959 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=572722e ]

IMPALA-7775: fix some lifecycle issues in statestore/session tests

Background threads from the Statestore's thread pool continued
running in the background and could dereference invalid memory.
We make sure these threads are cleaned up before moving on to
the next test. Note that we don't clean up all background
threads, just the ones that had caused issues here.

I refactored the memory management a bit to put all objects
that we can't safely free into a single ObjectPool.

The statestore tests also had an issue with the lifetime of the
string flags FLAGS_ssl_*_certificate. Those were overwritten
with new values while the thread pool threads were still running,
which could cause use-after-free bugs.

Testing:
Looped the tests under ASAN with the "stress" utility running at the
same time to flush out races.

Ran core tests.

Change-Id: I3b25c8b8a96bfa1183ce273b3bb4debde234dd01
Reviewed-on: http://gerrit.cloudera.org:8080/11864
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
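
The teardown ordering described in the commit message above can be illustrated
with a small Java analogue. This is only a sketch under stated assumptions: the
actual fix lives in Impala's C++ backend tests, and the names below
(PoolTeardownSketch, runAndDrain) are hypothetical. It shows the general
pattern of draining a thread pool before the state its tasks touch is
discarded.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not code from the Impala patch.
final class PoolTeardownSketch {
  // Runs `tasks` increments on a pool and returns only after every pool
  // thread has finished, so the caller may safely discard shared state.
  static int runAndDrain(int tasks) throws InterruptedException {
    AtomicInteger completed = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(2);
    for (int i = 0; i < tasks; i++) {
      pool.execute(completed::incrementAndGet);
    }
    pool.shutdown(); // stop accepting work; already queued tasks still run
    if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
      pool.shutdownNow();
      throw new IllegalStateException("pool threads still running at teardown");
    }
    return completed.get();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(runAndDrain(100));
  }
}
```

In a test suite, the equivalent of runAndDrain's shutdown-and-wait step would
run in the per-test teardown, before the next test replaces any values the
background threads may still read.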


> StatestoreSslTest and SessionExpiryTest can crash in pthread_mutex_lock
> ---
>
> Key: IMPALA-7775
> URL: https://issues.apache.org/jira/browse/IMPALA-7775
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build
>
> {noformat}
> 20:17:28 [==] Running 2 tests from 2 test cases.
> 20:17:28 [--] Global test environment set-up.
> 20:17:28 [--] 1 test from StatestoreTest
> 20:17:28 [ RUN  ] StatestoreTest.SmokeTest
> 20:17:28 [   OK ] StatestoreTest.SmokeTest (24 ms)
> 20:17:28 [--] 1 test from StatestoreTest (24 ms total)
> 20:17:28 
> 20:17:28 [--] 1 test from StatestoreSslTest
> 20:17:28 [ RUN  ] StatestoreSslTest.SmokeTest
> 20:17:28 terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl
>  >'
> 20:17:28   what():  boost: mutex lock failed in pthread_mutex_lock: Invalid 
> argument
> 20:17:28 Wrote minidump to 
> /home/ubuntu/Impala/logs/be_tests/minidumps/statestore-test/63ff46ee-a127-4ef6-5bccb5ba-dc73c28a.dmp
> 20:17:28 Wrote minidump to 
> /home/ubuntu/Impala/logs/be_tests/minidumps/statestore-test/63ff46ee-a127-4ef6-5bccb5ba-dc73c28a.dmp
> 20:17:28 
> {noformat}
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3441
> This smells like a lifecycle bug in the backend test.




[jira] [Created] (IMPALA-7807) Analysis test fixture to enable deeper testing

2018-11-05 Thread Paul Rogers (JIRA)

Paul Rogers created IMPALA-7807:
---

 Summary: Analysis test fixture to enable deeper testing
 Key: IMPALA-7807
 URL: https://issues.apache.org/jira/browse/IMPALA-7807
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Affects Versions: Impala 3.0
Reporter: Paul Rogers
Assignee: Paul Rogers


The Impala front-end provides a number of JUnit tests such as 
{{ExprRewriteRulesTest}}. These tests verify rewrites by providing layers of 
functions that build up a query, analyze the query, run rewrite rules, and test 
one part of the result.

The tests are fine as far as they go, but they do not cover all cases. For 
example, they test rewrites in the {{SELECT}} clause, but not in {{ORDER BY}} or 
{{GROUP BY}}. (Testing those clauses uncovered previously hidden bugs.) In some 
cases, we want to test rewrite rules in detail, but the existing tests only 
support a wholesale rewrite.

Since the existing tests are function based, it is hard to inject new behavior 
somewhere in the process, for example, to test the {{WHERE}} clause rather than 
{{SELECT}}. To do that, we need to copy the {{SELECT}} functions and make 
changes to test {{WHERE}}.

Since copying code is generally undesirable, a better approach is to use a 
"test fixture": a class that performs the required steps, maintains 
intermediate state for inspection, and acts as the foundation for various kinds 
of tests (such as the various clauses mentioned above).

In practice, all that is required is moving some code from functions on the 
test class to be methods on a fixture class, which also holds onto state that 
would otherwise be lost in function calls.
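
A minimal sketch of what such a fixture might look like follows. Everything in
it is an assumption made for illustration: the class name
RewriteFixtureSketch, its methods, and the string-based "clauses" are
hypothetical and are not taken from Impala's test code, and the toy rule only
stands in for a real rewrite rule. The point is the shape: the fixture performs
the steps, and keeps both the input and the result available for inspection.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical fixture; names and the string-based "clauses" are
// illustrative only and do not come from Impala's test code.
final class RewriteFixtureSketch {
  private final Map<String, String> clauses = new LinkedHashMap<>();
  private final Map<String, String> rewritten = new LinkedHashMap<>();

  // Step 1: register the expression used in one clause of the query.
  RewriteFixtureSketch clause(String name, String expr) {
    clauses.put(name, expr);
    return this;
  }

  // Step 2: run a rewrite rule against one clause and keep the result.
  RewriteFixtureSketch rewrite(String name, UnaryOperator<String> rule) {
    String expr = clauses.get(name);
    if (expr == null) {
      throw new IllegalArgumentException("no such clause: " + name);
    }
    rewritten.put(name, rule.apply(expr));
    return this;
  }

  // Intermediate state stays available for inspection by any test.
  String original(String name) { return clauses.get(name); }
  String result(String name) { return rewritten.get(name); }

  public static void main(String[] args) {
    // Toy rule standing in for a real rewrite rule.
    UnaryOperator<String> dropTrueAnd = e -> e.replace("TRUE AND ", "");
    RewriteFixtureSketch fixture = new RewriteFixtureSketch()
        .clause("SELECT", "TRUE AND a")
        .clause("WHERE", "TRUE AND id = 1")
        .rewrite("SELECT", dropTrueAnd)
        .rewrite("WHERE", dropTrueAnd);
    System.out.println(fixture.result("SELECT"));
    System.out.println(fixture.result("WHERE"));
  }
}
```

With this shape, testing a different clause means passing a different clause
name to the same fixture rather than copying a set of helper functions.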






[jira] [Updated] (IMPALA-7805) NumericLiteral toSql() should render zero as 0, not 0-E38, 0.000, etc.

2018-11-05 Thread Paul Rogers (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated IMPALA-7805:

Description: 
Testing of other issues revealed a somewhat bizarre aspect of how the planner 
expression nodes render 0. {{NumericLiteral.toSql()}} uses the Java 
{{BigDecimal}} class to convert a numeric value to a string for use in 
explained plans.

The default Java behavior is to consider scale when rendering numbers, 
including 0. Thus, depending on precision and scale, you may get:

{noformat}
0
0.0
0.00
0.000
...
0E-38
{noformat}

Mathematically, zero is zero. Unlike Java, SQL attaches no significance to the 
decimal point. (In Java, 0 is an int literal and 0.0 is a double.) Nor does SQL attach 
significance to the number of zeros past the decimal point. And, of course, 
we're only talking about the output of {{EXPLAIN}}, which is never parsed 
anyway (except in tests.)

To make testing easier, change the behavior to always emit "0" when the value 
is zero, regardless of precision or scale.

  was:
Testing of other issues revealed a somewhat bizarre aspect of how the planner 
expression nodes render 0. {{NumericLiteral.toSql()}} uses the Java 
{{BigDecimal}} class to convert a numeric value to a string for use in 
explained plans.

The default Java behavior is to consider scale when rendering numbers, 
including 0. Thus, depending on precision and scale, you may get:

{noformat}
0
0.0
0.00
0.000
...
0E-38
{noformat}

Mathematically, zero is zero. Unlike Java, SQL attaches no significance to the 
decimal point. (In Java, 0 is an int literal and 0.0 is a double.) Nor does SQL attach 
significance to the number of zeros past the decimal point.

To make testing easier, change the behavior to always emit "0" when the value 
is zero, regardless of precision or scale.


> NumericLiteral toSql() should render zero as 0, not 0-E38, 0.000, etc.
> --
>
> Key: IMPALA-7805
> URL: https://issues.apache.org/jira/browse/IMPALA-7805
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Testing of other issues revealed a somewhat bizarre aspect of how the planner 
> expression nodes render 0. {{NumericLiteral.toSql()}} uses the Java 
> {{BigDecimal}} class to convert a numeric value to a string for use in 
> explained plans.
> The default Java behavior is to consider scale when rendering numbers, 
> including 0. Thus, depending on precision and scale, you may get:
> {noformat}
> 0
> 0.0
> 0.00
> 0.000
> ...
> 0E-38
> {noformat}
> Mathematically, zero is zero. Unlike Java, SQL attaches no significance to 
> the decimal point. (In Java, 0 is an int literal and 0.0 is a double.) Nor does SQL 
> attach significance to the number of zeros past the decimal point. And, of 
> course, we're only talking about the output of {{EXPLAIN}}, which is never 
> parsed anyway (except in tests.)
> To make testing easier, change the behavior to always emit "0" when the value 
> is zero, regardless of precision or scale.
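
The rendering behavior described above can be reproduced with plain
{{BigDecimal}}. The following is a minimal sketch, not the actual
{{NumericLiteral.toSql()}} code: the class ZeroLiteralSketch and its toSql
helper are hypothetical names used only to show the default output for a
scaled zero and one possible way to normalize it.

```java
import java.math.BigDecimal;

// Hypothetical helper; not Impala's NumericLiteral implementation.
final class ZeroLiteralSketch {
  // Renders any representation of zero as "0"; other values keep the
  // default BigDecimal rendering, which depends on the scale.
  static String toSql(BigDecimal value) {
    // signum() ignores scale, so 0, 0.000 and 0E-38 are all caught.
    return value.signum() == 0 ? "0" : value.toString();
  }

  public static void main(String[] args) {
    BigDecimal scaledZero = BigDecimal.ZERO.setScale(38);
    System.out.println(scaledZero);                    // prints 0E-38
    System.out.println(new BigDecimal("0.000"));       // prints 0.000
    System.out.println(toSql(scaledZero));             // prints 0
    System.out.println(toSql(new BigDecimal("1.50"))); // prints 1.50
  }
}
```

Using signum() rather than equals() matters here, because
{{BigDecimal.equals()}} also compares scale and would treat 0 and 0.000 as
different values.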




[jira] [Updated] (IMPALA-6015) Simplify ownership of FilterContexts and MemPools in ScannerContext once non-MT scan node is removed

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-6015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-6015:
--
Priority: Trivial  (was: Minor)

> Simplify ownership of FilterContexts and MemPools in ScannerContext once 
> non-MT scan node is removed
> 
>
> Key: IMPALA-6015
> URL: https://issues.apache.org/jira/browse/IMPALA-6015
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Tim Armstrong
>Priority: Trivial
>
> The ownership of these objects is a bit complicated because they can be 
> either stack-allocated in the scanner thread or owned by a single-threaded 
> scan node.
> It should be possible to simplify this once the multithreaded scan node is 
> removed.
> See https://gerrit.cloudera.org/#/c/8025/11/be/src/exec/scanner-context.h@337 
> for some context.




[jira] [Assigned] (IMPALA-4261) Create tests for multi-threaded query execution

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-4261:
-

Assignee: (was: Alexander Behm)

> Create tests for multi-threaded query execution
> ---
>
> Key: IMPALA-4261
> URL: https://issues.apache.org/jira/browse/IMPALA-4261
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 2.7.0
>Reporter: Marcel Kornacker
>Priority: Major
>
> Add tests that exercise the currently supported subset of multi-threaded 
> execution functionality. 
> At present, this is a small subset of overall functionality, so we can't 
> simply use the existing tests.




[jira] [Updated] (IMPALA-6436) Impala Catalog generates a core file / mini dump when the HMS is not available

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-6436:
--
Target Version: Impala 3.1.0  (was: Impala 3.2.0)

> Impala Catalog generates a core file / mini dump when the HMS is not available
> --
>
> Key: IMPALA-6436
> URL: https://issues.apache.org/jira/browse/IMPALA-6436
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.10.0
>Reporter: Luis E Martinez-Poblete
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: supportability
>
> Synopsis:
>  =
>  Impala Catalog generates a core file / mini dump when the HMS is not 
> available
> Problem:
>  
> The Catalog server created multiple core files. The investigation determined 
> that the core files were generated because the Hive Metastore was not 
> available and the option "Enable Core Dump" was enabled when starting the 
> Impala service.
> Below is the back trace of the core file:
> #0 0x7f72e93ee5d7 in raise () from /root/191729/slib/lib64/libc.so.6
>  #1 0x7f72e93efcc8 in abort () from /root/191729/slib/lib64/libc.so.6
>  #2 0x01ba5754 in google::DumpStackTraceAndExit() ()
>  #3 0x01b9c1cd in google::LogMessage::Fail() ()
>  #4 0x01b9da72 in google::LogMessage::SendToLog() ()
>  #5 0x01b9bba7 in google::LogMessage::Flush() ()
>  #6 0x01b9f16e in google::LogMessageFatal::~LogMessageFatal() ()
>  #7 0x0083067e in impala::Catalog::(GetCatalogVersion (this=0x0, 
> version=0x7ffc2aa6b750) at 
> /usr/src/debug/impala-2.10.0-cdh5.13.1/be/src/catalog/catalog.cc:88
>  #8 0x008143c9 in impala::CatalogServer::Start() () at 
> /usr/src/debug/impala-2.10.0-cdh5.13.1/be/src/catalog/catalog-server.cc:175
> The corresponding entries in the Catalog server log show the following fatal 
> error:
> F0111 09:48:05.017491 14571 catalog.cc:76] java.lang.IllegalStateException: 
> java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
>  at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99)
>  at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:72)
>  at org.apache.impala.catalog.MetaStoreClientPool.initClients(MetaStoreClientPool.java:168)
>  at org.apache.impala.catalog.Catalog.<init>(Catalog.java:103)
>  at org.apache.impala.catalog.CatalogServiceCatalog.<init>(CatalogServiceCatalog.java:163)
>  at org.apache.impala.service.JniCatalog.<init>(JniCatalog.java:104)
>  Caused by: java.lang.RuntimeException: Unable to instantiate 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> I was able to reproduce this issue. When the option "Enable Core Dump" is 
> enabled and the Hive Metastore is not available, the system generates a core 
> file. If the option "Enable Core Dump" is disabled, the system generates a 
> mini dump.
> Crashing due to an error is not expected. Impala should fail in a more 
> user-friendly way.
> Reproduction case:
>  ==
>  1) Enable the option "Enable Core Dump" for the Impala service in CM.
>  2) Stop Hive and Impala services.
>  3) Start Impala Catalog server




[jira] [Updated] (IMPALA-7775) StatestoreSslTest and SessionExpiryTest can crash in pthread_mutex_lock

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7775:
--
Target Version: Impala 3.1.0  (was: Impala 3.2.0)

> StatestoreSslTest and SessionExpiryTest can crash in pthread_mutex_lock
> ---
>
> Key: IMPALA-7775
> URL: https://issues.apache.org/jira/browse/IMPALA-7775
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build
>
> {noformat}
> 20:17:28 [==] Running 2 tests from 2 test cases.
> 20:17:28 [--] Global test environment set-up.
> 20:17:28 [--] 1 test from StatestoreTest
> 20:17:28 [ RUN  ] StatestoreTest.SmokeTest
> 20:17:28 [   OK ] StatestoreTest.SmokeTest (24 ms)
> 20:17:28 [--] 1 test from StatestoreTest (24 ms total)
> 20:17:28 
> 20:17:28 [--] 1 test from StatestoreSslTest
> 20:17:28 [ RUN  ] StatestoreSslTest.SmokeTest
> 20:17:28 terminate called after throwing an instance of 
> 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::lock_error>
>  >'
> 20:17:28   what():  boost: mutex lock failed in pthread_mutex_lock: Invalid 
> argument
> 20:17:28 Wrote minidump to 
> /home/ubuntu/Impala/logs/be_tests/minidumps/statestore-test/63ff46ee-a127-4ef6-5bccb5ba-dc73c28a.dmp
> 20:17:28 Wrote minidump to 
> /home/ubuntu/Impala/logs/be_tests/minidumps/statestore-test/63ff46ee-a127-4ef6-5bccb5ba-dc73c28a.dmp
> 20:17:28 
> {noformat}
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3441
> This smells like a lifecycle bug in the backend test.




[jira] [Closed] (IMPALA-7123) Impala 3.1 or 4.0 Doc: Restrict Impala to only support timezones that work in Hive (IANA + Java)

2018-11-05 Thread Alex Rodoni (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-7123.
---
Resolution: Won't Fix

> Impala 3.1 or 4.0 Doc: Restrict Impala to only support timezones that work in 
> Hive (IANA + Java)
> 
>
> Key: IMPALA-7123
> URL: https://issues.apache.org/jira/browse/IMPALA-7123
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc
>





[jira] [Work started] (IMPALA-7764) Add test coverage for SentryProxy

2018-11-05 Thread Fredy Wijaya (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7764 started by Fredy Wijaya.

> Add test coverage for SentryProxy
> -
>
> Key: IMPALA-7764
> URL: https://issues.apache.org/jira/browse/IMPALA-7764
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
>
> There are currently no unit tests in SentryProxy, which can lead to a number 
> of bugs when changing code in SentryProxy. There aren't many end-to-end tests 
> related to SentryProxy so that needs to be improved.




[jira] [Created] (IMPALA-7806) Impala 3.1 Doc: Check the existing known issues against 3.1 fixes

2018-11-05 Thread Alex Rodoni (JIRA)

Alex Rodoni created IMPALA-7806:
---

 Summary: Impala 3.1 Doc: Check the existing known issues against 
3.1 fixes
 Key: IMPALA-7806
 URL: https://issues.apache.org/jira/browse/IMPALA-7806
 Project: IMPALA
  Issue Type: Task
  Components: Docs
Reporter: Alex Rodoni
Assignee: Alex Rodoni






[jira] [Commented] (IMPALA-7805) NumericLiteral toSql() should render zero as 0, not 0-E38, 0.000, etc.

2018-11-05 Thread Tim Armstrong (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-7805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675650#comment-16675650
 ] 

Tim Armstrong commented on IMPALA-7805:
---

This would be a step towards sanity. IMPALA-5821 helps by representing type 
information about numeric literals in a consistent way.

> NumericLiteral toSql() should render zero as 0, not 0-E38, 0.000, etc.
> --
>
> Key: IMPALA-7805
> URL: https://issues.apache.org/jira/browse/IMPALA-7805
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Testing of other issues revealed a somewhat bizarre aspect of how the planner 
> expression nodes render 0. {{NumericLiteral.toSql()}} uses the Java 
> {{BigDecimal}} class to convert a numeric value to a string for use in 
> explained plans.
> The default Java behavior is to consider scale when rendering numbers, 
> including 0. Thus, depending on precision and scale, you may get:
> {noformat}
> 0
> 0.0
> 0.00
> 0.000
> ...
> 0E-38
> {noformat}
> Mathematically, zero is zero. Unlike Java, SQL attaches no significance to 
> the decimal point. (In Java, 0 is an int literal and 0.0 is a double.) Nor 
> does SQL attach significance to the number of zeros past the decimal point.
> To make testing easier, change the behavior to always emit "0" when the value 
> is zero, regardless of precision or scale.




[jira] [Created] (IMPALA-7805) NumericLiteral toSql() should render zero as 0, not 0-E38, 0.000, etc.

2018-11-05 Thread Paul Rogers (JIRA)

Paul Rogers created IMPALA-7805:
---

 Summary: NumericLiteral toSql() should render zero as 0, not 
0-E38, 0.000, etc.
 Key: IMPALA-7805
 URL: https://issues.apache.org/jira/browse/IMPALA-7805
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Affects Versions: Impala 3.0
Reporter: Paul Rogers
Assignee: Paul Rogers


Testing of other issues revealed a somewhat bizarre aspect of how the planner 
expression nodes render 0. {{NumericLiteral.toSql()}} uses the Java 
{{BigDecimal}} class to convert a numeric value to a string for use in 
explained plans.

The default Java behavior is to consider scale when rendering numbers, 
including 0. Thus, depending on precision and scale, you may get:

{noformat}
0
0.0
0.00
0.000
...
0E-38
{noformat}

Mathematically, zero is zero. Unlike Java, SQL attaches no significance to the 
decimal point. (In Java, 0 is an int literal and 0.0 is a double.) Nor does SQL 
attach significance to the number of zeros past the decimal point.

To make testing easier, change the behavior to always emit "0" when the value 
is zero, regardless of precision or scale.
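
The scale-dependent rendering described above can be reproduced with plain 
{{BigDecimal}}. The following is a minimal sketch, not Impala's actual 
{{NumericLiteral.toSql()}}: the class name ZeroLiteralSketch and its toSql() 
helper are hypothetical and only illustrate one way the proposed rule could 
look.

```java
import java.math.BigDecimal;

class ZeroLiteralSketch {
  // Hypothetical helper for illustration; this is not Impala's NumericLiteral.toSql().
  static String toSql(BigDecimal value) {
    // signum() ignores scale, so 0, 0.000 and 0E-38 are all detected as zero.
    if (value.signum() == 0) return "0";
    return value.toString();
  }

  public static void main(String[] args) {
    BigDecimal scale3 = new BigDecimal("0.000");
    BigDecimal scale38 = BigDecimal.ZERO.setScale(38);

    // Default Java rendering keeps the scale.
    System.out.println(scale3);   // 0.000
    System.out.println(scale38);  // 0E-38

    // The proposed rendering collapses every zero to "0".
    System.out.println(toSql(scale3));   // 0
    System.out.println(toSql(scale38));  // 0

    // Non-zero values are rendered as before.
    System.out.println(toSql(new BigDecimal("1.80")));  // 1.80
  }
}
```

Checking signum() rather than comparing strings avoids having to enumerate 
every possible scale.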




[jira] [Updated] (IMPALA-7793) Do not rewrite CAST(non-null-literal AS type) exprs

2018-11-05 Thread Paul Rogers (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated IMPALA-7793:

Summary: Do not rewrite CAST(non-null-literal AS type) exprs  (was: CASE 
must not rewrite CAST(literal AS type) exprs)

> Do not rewrite CAST(non-null-literal AS type) exprs
> ---
>
> Key: IMPALA-7793
> URL: https://issues.apache.org/jira/browse/IMPALA-7793
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Priority: Major
>
> The test suite {{QueryTest/decimal-exprs}} contains the following test:
> {code:sql}
> set decimal_v2=false;
> set ENABLE_EXPR_REWRITES=false;
> select coalesce(1.8, cast(0 as decimal(38,38)))
> {code}
> Which produces this result:
> {noformat}
> +--+
> | coalesce(1.8, cast(0 as decimal(38,38))) |
> +--+
> | 0.00 |
> +--+
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {noformat}
> Notice that the "1.8" overflowed when being put into a {{DECIMAL(38,38)}} 
> type. (The precision and scale are both 38, meaning all digits are after the 
> decimal point.)
> The {{coalesce()}} function caught the overflow, treated it as a {{NULL}}, 
> and selected the second value from the list, which is 0.
> Very good. Now, try the equivalent CASE form (from IMPALA-7655):
> {noformat}
> select CASE WHEN 1.8 IS NOT NULL THEN 1.8 ELSE cast(0 as decimal(38,38)) END;
> +---+
> | case when 1.8 is not null then 1.8 else cast(0 as decimal(38,38)) end |
> +---+
> | NULL  |
> +---+
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {noformat}
> Apparently, the overflow somehow caused the {{ELSE}} clause to not fire.
> This one is likely a bug in the BE code generation. However, I tried the 
> {{CASE}} query with a variety of options:
> {noformat}
> set disable_codegen=true;
> and
> set disable_codegen=false;
> set disable_codegen_rows_threshold=0;
> and
> set disable_codegen_rows_threshold=10;
> {noformat}
> In all cases, the {{CASE}} produced the wrong result. I also tried wrapping 
> the expression {{1.8 IS NOT NULL}} in a variety of forms: {{IS TRUE}}, {{IS NOT 
> FALSE}}. None of these worked correctly.
> The result of this bug is that the above-mentioned test case fails in a build 
> that contains IMPALA-7655.




[jira] [Commented] (IMPALA-7793) CASE must not rewrite CAST(literal AS type) exprs

2018-11-05 Thread Paul Rogers (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-7793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675587#comment-16675587
 ] 

Paul Rogers commented on IMPALA-7793:
-

This issue identified two problems.

First, V2 of Decimal support does not perform type propagation, but V1 does. 
This would appear to be a regression. That is, as shown above, in V1 the 
planner worked out that it needed to cast 1.8 to DECIMAL(38,38) (which then 
overflowed). In V2, no such type inference is done and the user must explicitly 
include a cast. A bit of research is needed to determine whether this is a 
feature (perhaps to be compatible with some other SQL version) or a bug (a 
regression).

Second, during expression rewrite, the constant folding rule will try to 
execute {{CAST(1.8 AS DECIMAL(38,38))}} but will fail (due to overflow). The 
constant folding rule then simply leaves the expression unchanged. All 
subsequent rules need to be very careful to treat this kind of expression as 
non-constant. For example, the {{coalesce()}} example above works only because 
the planner does not rewrite it. Add a bit of logic to remove constants (as was 
done as part of the project that found the bug) and the statement is rewritten 
as just the cast (the first non-null value), which causes the behavior 
described here.

The upshot is that this is not a problem with {{CASE}} per se; it is a subtle 
issue with how the planner needs to handle expressions for which constant 
folding fails.

The fix is to mark a seemingly constant expression as non-constant (using 
existing flags) if it fails to execute, so that downstream rules do not treat 
the expression as a constant and do rewrites based on that assumption.
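
As a rough illustration of that fix, the sketch below models a cast of a 
literal, a folding step that flags the expression when evaluation overflows, 
and a downstream rule that consults the flag. All names here (FoldingSketch, 
CastExpr, fold, coalesceCanPickFirst) are hypothetical and are not Impala 
planner classes; the overflow check is also a simplification of real 
{{DECIMAL}} semantics.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class FoldingSketch {
  // Hypothetical stand-in for CAST(literal AS DECIMAL(p,s)); not an Impala class.
  static final class CastExpr {
    final BigDecimal literal;
    final int precision;
    final int scale;
    BigDecimal folded;          // set only when folding succeeds
    boolean isConstant = true;  // a cast of a literal looks constant at first

    CastExpr(String literal, int precision, int scale) {
      this.literal = new BigDecimal(literal);
      this.precision = precision;
      this.scale = scale;
    }

    // Returns the cast value, or null when it does not fit (overflow).
    BigDecimal eval() {
      BigDecimal v = literal.setScale(scale, RoundingMode.HALF_UP);
      return v.precision() > precision ? null : v;
    }
  }

  // Constant folding: on failure, leave the expression unchanged but flag it.
  static void fold(CastExpr e) {
    BigDecimal v = e.eval();
    if (v == null) {
      e.isConstant = false;
    } else {
      e.folded = v;
    }
  }

  // Downstream rule: coalesce(first, ...) may collapse to its first argument
  // only when that argument is a successfully folded constant.
  static boolean coalesceCanPickFirst(CastExpr first) {
    return first.isConstant && first.folded != null;
  }

  public static void main(String[] args) {
    CastExpr overflows = new CastExpr("1.8", 38, 38);
    CastExpr fits = new CastExpr("0", 38, 38);
    fold(overflows);
    fold(fits);
    System.out.println(coalesceCanPickFirst(overflows));  // false
    System.out.println(coalesceCanPickFirst(fits));       // true
  }
}
```

With the flag cleared, the rewrite rule skips the overflowing cast and the 
expression is evaluated at run time, as in the working {{coalesce()}} case.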

> CASE must not rewrite CAST(literal AS type) exprs
> -
>
> Key: IMPALA-7793
> URL: https://issues.apache.org/jira/browse/IMPALA-7793
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Priority: Major
>
> The test suite {{QueryTest/decimal-exprs}} contains the following test:
> {code:sql}
> set decimal_v2=false;
> set ENABLE_EXPR_REWRITES=false;
> select coalesce(1.8, cast(0 as decimal(38,38)))
> {code}
> Which produces this result:
> {noformat}
> +--+
> | coalesce(1.8, cast(0 as decimal(38,38))) |
> +--+
> | 0.00 |
> +--+
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {noformat}
> Notice that the "1.8" overflowed when being put into a {{DECIMAL(38,38)}} 
> type. (The precision and scale are both 38, meaning all digits are after the 
> decimal point.)
> The {{coalesce()}} function caught the overflow, treated it as a {{NULL}}, 
> and selected the second value from the list, which is 0.
> Very good. Now, try the equivalent CASE form (from IMPALA-7655):
> {noformat}
> select CASE WHEN 1.8 IS NOT NULL THEN 1.8 ELSE cast(0 as decimal(38,38)) END;
> +---+
> | case when 1.8 is not null then 1.8 else cast(0 as decimal(38,38)) end |
> +---+
> | NULL  |
> +---+
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {noformat}
> Apparently, the overflow somehow caused the {{ELSE}} clause to not fire.
> This one is likely a bug in the BE code generation. However, I tried the 
> {{CASE}} query with a variety of options:
> {noformat}
> set disable_codegen=true;
> and
> set disable_codegen=false;
> set disable_codegen_rows_threshold=0;
> and
> set disable_codegen_rows_threshold=10;
> {noformat}
> In all cases, the {{CASE}} produced the wrong result. I also tried wrapping 
> the expression {{1.8 IS NOT NULL}} in a variety of forms: {{IS TRUE}}, {{IS NOT 
> FALSE}}. None of these worked correctly.
> The result of this bug is that the above-mentioned test case fails in a build 
> that contains IMPALA-7655.




[jira] [Updated] (IMPALA-7793) CASE must not rewrite CAST(literal AS type) exprs

2018-11-05 Thread Paul Rogers (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated IMPALA-7793:

Summary: CASE must not rewrite CAST(literal AS type) exprs  (was: CASE 
statement does not handle NULL from UDF overflow)

> CASE must not rewrite CAST(literal AS type) exprs
> -
>
> Key: IMPALA-7793
> URL: https://issues.apache.org/jira/browse/IMPALA-7793
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Priority: Major
>
> The test suite {{QueryTest/decimal-exprs}} contains the following test:
> {code:sql}
> set decimal_v2=false;
> set ENABLE_EXPR_REWRITES=false;
> select coalesce(1.8, cast(0 as decimal(38,38)))
> {code}
> Which produces this result:
> {noformat}
> +--+
> | coalesce(1.8, cast(0 as decimal(38,38))) |
> +--+
> | 0.00 |
> +--+
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {noformat}
> Notice that the "1.8" overflowed when being put into a {{DECIMAL(38,38)}} 
> type. (The precision and scale are both 38, meaning all digits are after the 
> decimal point.)
> The {{coalesce()}} function caught the overflow, treated it as a {{NULL}}, 
> and selected the second value from the list, which is 0.
> Very good. Now, try the equivalent CASE form (from IMPALA-7655):
> {noformat}
> select CASE WHEN 1.8 IS NOT NULL THEN 1.8 ELSE cast(0 as decimal(38,38)) END;
> +---+
> | case when 1.8 is not null then 1.8 else cast(0 as decimal(38,38)) end |
> +---+
> | NULL  |
> +---+
> WARNINGS: UDF WARNING: Decimal expression overflowed, returning NULL
> {noformat}
> Apparently, the overflow somehow caused the {{ELSE}} clause to not fire.
> This one is likely a bug in the BE code generation. However, I tried the 
> {{CASE}} query with a variety of options:
> {noformat}
> set disable_codegen=true;
> and
> set disable_codegen=false;
> set disable_codegen_rows_threshold=0;
> and
> set disable_codegen_rows_threshold=10;
> {noformat}
> In all cases, the {{CASE}} produced the wrong result. I also tried wrapping 
> the expression {{1.8 IS NOT NULL}} in a variety of forms: {{IS TRUE}}, {{IS NOT 
> FALSE}}. None of these worked correctly.
> The result of this bug is that the above-mentioned test case fails in a build 
> that contains IMPALA-7655.




[jira] [Closed] (IMPALA-5563) Timezone lookup may be ambiguous

2018-11-05 Thread Attila Jeges (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges closed IMPALA-5563.


> Timezone lookup may be ambiguous
> 
>
> Key: IMPALA-5563
> URL: https://issues.apache.org/jira/browse/IMPALA-5563
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.0
>Reporter: Matthew Jacobs
>Assignee: Attila Jeges
>Priority: Major
>  Labels: timezone
> Fix For: Impala 3.1.0
>
>
> When using functions like {{to_utc_timestamp}} that take a string timezone 
> parameter, if the timezone is not a 'region specifier' [1] (i.e. a key into 
> the timezone database entries), then Impala attempts to match the string 
> parameter against a number of other properties of the timezones in the 
> database [2]:
> * a zone's std abbreviation
> * a zone's std full name
> * a zone's dst abbreviation
> * a zone's dst full name
> {code}
> time_zone_ptr TimezoneDatabase::FindTimezone(
>     const string& tz, const TimestampValue& tv, bool tv_in_utc) {
>   ...
>   // See if they specified a zone id
>   time_zone_ptr tzp = tz_database_.time_zone_from_region(tz);
>   if (tzp != NULL) return tzp;
>   for (vector<string>::const_iterator iter = tz_region_list_.begin();
>        iter != tz_region_list_.end(); ++iter) {
>     time_zone_ptr tzp = tz_database_.time_zone_from_region(*iter);
>     DCHECK(tzp != NULL);
>     if (tzp->dst_zone_abbrev() == tz) return tzp;
>     if (tzp->std_zone_abbrev() == tz) return tzp;
>     if (tzp->dst_zone_name() == tz) return tzp;
>     if (tzp->std_zone_name() == tz) return tzp;
>   }
>   return time_zone_ptr();
> }
> {code}
> This can result in ambiguous zones being used because the properties listed 
> above are not unique, e.g.
> {code}
> mj@mj-desktop:~/dev/Impala$ grep CEST be/src/exprs/timezone_db.cc 
> \"Africa/Ceuta\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Antarctica/Troll\",\"UTC\",\"Coordinated Universal 
> Time\",\"CEST\",\"Central European Summer 
> Time\",\"+00:00:00\",\"+02:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Arctic/Longyearbyen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Atlantic/Jan_Mayen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"CET\",\"CET\",\"Central European Time\",\"CEST\",\"Central European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+02:00:00\",\"-1;0;10\",\"+02:00:00\"\n\
> \"ECT\",\"CET\",\"Central European Time\",\"CEST\",\"Central European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Amsterdam\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Andorra\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Belgrade\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Berlin\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Bratislava\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Brussels\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Budapest\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Busingen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Copenhagen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Gibraltar\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
>

[jira] [Resolved] (IMPALA-5563) Timezone lookup may be ambiguous

2018-11-05 Thread Attila Jeges (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges resolved IMPALA-5563.
--
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

Fixing IMPALA-3307 fixed this issue as well.
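
For readers skimming the thread, the ambiguity in the quoted description can be 
sketched in a few lines. This is an illustrative model only, not Impala code: 
the class TzAmbiguitySketch and its methods are hypothetical, and the table 
holds just three rows taken from the grep output in the description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TzAmbiguitySketch {
  // region -> {std abbreviation, dst abbreviation}; rows taken from the
  // grep output in the issue description.
  static final Map<String, String[]> ZONES = new LinkedHashMap<>();
  static {
    ZONES.put("Africa/Ceuta", new String[] {"CET", "CEST"});
    ZONES.put("Antarctica/Troll", new String[] {"UTC", "CEST"});
    ZONES.put("Europe/Berlin", new String[] {"CET", "CEST"});
  }

  // Models the first-match-wins fallback: exact region key first, then the
  // first zone whose abbreviation equals the input.
  static String firstMatch(String tz) {
    if (ZONES.containsKey(tz)) return tz;
    for (Map.Entry<String, String[]> e : ZONES.entrySet()) {
      String[] abbrevs = e.getValue();
      if (abbrevs[1].equals(tz) || abbrevs[0].equals(tz)) return e.getKey();
    }
    return null;
  }

  // Lists every zone the abbreviation could refer to.
  static List<String> allMatches(String tz) {
    List<String> matches = new ArrayList<>();
    for (Map.Entry<String, String[]> e : ZONES.entrySet()) {
      String[] abbrevs = e.getValue();
      if (abbrevs[0].equals(tz) || abbrevs[1].equals(tz)) matches.add(e.getKey());
    }
    return matches;
  }

  public static void main(String[] args) {
    System.out.println(firstMatch("CEST"));   // Africa/Ceuta
    System.out.println(allMatches("CEST"));   // all three zones in the table
  }
}
```

Which zone wins depends only on iteration order, so an abbreviation such as 
CEST silently resolves to whichever matching entry comes first.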

> Timezone lookup may be ambiguous
> 
>
> Key: IMPALA-5563
> URL: https://issues.apache.org/jira/browse/IMPALA-5563
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.0
>Reporter: Matthew Jacobs
>Assignee: Attila Jeges
>Priority: Major
>  Labels: timezone
> Fix For: Impala 3.1.0
>
>
> When using functions like {{to_utc_timestamp}} that take a string timezone 
> parameter, if the timezone is not a 'region specifier' [1] (i.e. a key into 
> the timezone database entries), then Impala attempts to match the string 
> parameter against a number of other properties of the timezones in the 
> database [2]:
> * a zone's std abbreviation
> * a zone's std full name
> * a zone's dst abbreviation
> * a zone's dst full name
> {code}
> time_zone_ptr TimezoneDatabase::FindTimezone(
>     const string& tz, const TimestampValue& tv, bool tv_in_utc) {
>   ...
>   // See if they specified a zone id
>   time_zone_ptr tzp = tz_database_.time_zone_from_region(tz);
>   if (tzp != NULL) return tzp;
>   for (vector<string>::const_iterator iter = tz_region_list_.begin();
>        iter != tz_region_list_.end(); ++iter) {
>     time_zone_ptr tzp = tz_database_.time_zone_from_region(*iter);
>     DCHECK(tzp != NULL);
>     if (tzp->dst_zone_abbrev() == tz) return tzp;
>     if (tzp->std_zone_abbrev() == tz) return tzp;
>     if (tzp->dst_zone_name() == tz) return tzp;
>     if (tzp->std_zone_name() == tz) return tzp;
>   }
>   return time_zone_ptr();
> }
> {code}
> This can result in ambiguous zones being used because the properties listed 
> above are not unique, e.g.
> {code}
> mj@mj-desktop:~/dev/Impala$ grep CEST be/src/exprs/timezone_db.cc 
> \"Africa/Ceuta\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Antarctica/Troll\",\"UTC\",\"Coordinated Universal 
> Time\",\"CEST\",\"Central European Summer 
> Time\",\"+00:00:00\",\"+02:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Arctic/Longyearbyen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Atlantic/Jan_Mayen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"CET\",\"CET\",\"Central European Time\",\"CEST\",\"Central European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+02:00:00\",\"-1;0;10\",\"+02:00:00\"\n\
> \"ECT\",\"CET\",\"Central European Time\",\"CEST\",\"Central European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Amsterdam\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Andorra\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Belgrade\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Berlin\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Bratislava\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Brussels\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Budapest\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Busingen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Copenhagen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Gibraltar\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European

[jira] [Resolved] (IMPALA-5563) Timezone lookup may be ambiguous

2018-11-05 Thread Attila Jeges (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges resolved IMPALA-5563.
--
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

Fixing IMPALA-3307 fixed this issue as well.

> Timezone lookup may be ambiguous
> 
>
> Key: IMPALA-5563
> URL: https://issues.apache.org/jira/browse/IMPALA-5563
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.0
>Reporter: Matthew Jacobs
>Assignee: Attila Jeges
>Priority: Major
>  Labels: timezone
> Fix For: Impala 3.1.0
>
>
> When using functions like {{to_utc_timestamp}} that take a string timezone 
> parameter, if the timezone is not a 'region specifier' [1] (i.e. a key into 
> the timezone database entries), then Impala attempts to match the string 
> parameter against a number of other properties of the timezones in the 
> database [2]:
> * a zone's std abbreviation
> * a zone's std full name
> * a zone's dst abbreviation
> * a zone's dst full name
> {code}
> time_zone_ptr TimezoneDatabase::FindTimezone(
>     const string& tz, const TimestampValue& tv, bool tv_in_utc) {
>   ...
>   // See if they specified a zone id
>   time_zone_ptr tzp = tz_database_.time_zone_from_region(tz);
>   if (tzp != NULL) return tzp;
>   for (vector<string>::const_iterator iter = tz_region_list_.begin();
>        iter != tz_region_list_.end(); ++iter) {
>     time_zone_ptr tzp = tz_database_.time_zone_from_region(*iter);
>     DCHECK(tzp != NULL);
>     if (tzp->dst_zone_abbrev() == tz) return tzp;
>     if (tzp->std_zone_abbrev() == tz) return tzp;
>     if (tzp->dst_zone_name() == tz) return tzp;
>     if (tzp->std_zone_name() == tz) return tzp;
>   }
>   return time_zone_ptr();
> }
> {code}
> This can result in ambiguous zones being used because the properties listed 
> above are not unique, e.g.
> {code}
> mj@mj-desktop:~/dev/Impala$ grep CEST be/src/exprs/timezone_db.cc 
> \"Africa/Ceuta\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Antarctica/Troll\",\"UTC\",\"Coordinated Universal 
> Time\",\"CEST\",\"Central European Summer 
> Time\",\"+00:00:00\",\"+02:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Arctic/Longyearbyen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Atlantic/Jan_Mayen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"CET\",\"CET\",\"Central European Time\",\"CEST\",\"Central European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+02:00:00\",\"-1;0;10\",\"+02:00:00\"\n\
> \"ECT\",\"CET\",\"Central European Time\",\"CEST\",\"Central European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Amsterdam\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Andorra\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Belgrade\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Berlin\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Bratislava\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Brussels\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Budapest\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Busingen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Copenhagen\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European Summer 
> Time\",\"+01:00:00\",\"+01:00:00\",\"-1;0;3\",\"+01:00:00\",\"-1;0;10\",\"+01:00:00\"\n\
> \"Europe/Gibraltar\",\"CET\",\"Central European Time\",\"CEST\",\"Central 
> European
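
The ambiguity described in this issue can be reproduced with a small sketch. It is written in Python rather than Impala's C++, it uses three rows copied from the grep output above (offsets and DST rules left out), and the names `Zone`, `find_first`, and `find_all` are illustrative assumptions, not Impala code. `find_first` mimics the first-match fallback of `FindTimezone`; `find_all` lists every zone that matches the same string.

```python
from typing import List, NamedTuple, Optional


class Zone(NamedTuple):
    region: str
    std_abbrev: str
    std_name: str
    dst_abbrev: str
    dst_name: str


# Three rows taken from the grep output; all other columns are omitted.
ZONES = [
    Zone("Africa/Ceuta", "CET", "Central European Time",
         "CEST", "Central European Summer Time"),
    Zone("Antarctica/Troll", "UTC", "Coordinated Universal Time",
         "CEST", "Central European Summer Time"),
    Zone("Europe/Berlin", "CET", "Central European Time",
         "CEST", "Central European Summer Time"),
]


def find_first(tz: str) -> Optional[str]:
    """Region id first, then the first zone whose abbreviation or name matches."""
    for zone in ZONES:
        if zone.region == tz:
            return zone.region
    for zone in ZONES:
        if tz in (zone.dst_abbrev, zone.std_abbrev, zone.dst_name, zone.std_name):
            return zone.region
    return None


def find_all(tz: str) -> List[str]:
    """Every region that the same string could refer to."""
    return [zone.region for zone in ZONES
            if tz == zone.region or tz in zone[1:]]
```

With this table, `find_first("CEST")` returns `Africa/Ceuta` only because that row comes first, while `find_all("CEST")` reports all three regions. In the listing above, `Antarctica/Troll` has a different base offset (+00:00:00) from the other two, so which row wins can change the result.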

[jira] [Commented] (IMPALA-6433) Add read support for PageHeaderV2 to the parquet scanner

2018-11-05 Thread Tim Armstrong (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675448#comment-16675448
 ] 

Tim Armstrong commented on IMPALA-6433:
---

Thanks! I was trying to figure out if it might be a decent ramp-up task, but 
yeah, definitely if you can fix it that would be awesome.

> Add read support for PageHeaderV2 to the parquet scanner
> 
>
> Key: IMPALA-6433
> URL: https://issues.apache.org/jira/browse/IMPALA-6433
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Lars Volker
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: parquet
>





[jira] [Commented] (IMPALA-6897) Catalog server should flag tables with large number of small files

2018-11-05 Thread Tim Armstrong (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675447#comment-16675447
 ] 

Tim Armstrong commented on IMPALA-6897:
---

I guess I don't understand why we'd include it in profiles if it's just a 
global property of the system and not related to the actual files that the 
query is scanning. Can't we just make a catalog server metric or something?

> Catalog server should flag tables with large number of small files
> --
>
> Key: IMPALA-6897
> URL: https://issues.apache.org/jira/browse/IMPALA-6897
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.13.0
>Reporter: bharath v
>Priority: Major
>  Labels: ramp-up, supportability
>
> Since the Catalog has all the file metadata available, it should help 
> flag tables with a large number of small files. This information can be 
> propagated to the coordinators and should be reflected in the query profiles, 
> in the same way as we do for "missing stats".
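
A minimal sketch of the check proposed above, in Python and under stated assumptions: the `TableFiles` record, the 16 MB cutoff, and the 1000-file threshold are all hypothetical and do not come from the issue or from Impala's catalog code.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Hypothetical thresholds; the issue does not define what counts as "small"
# or how many small files make a "large number".
SMALL_FILE_BYTES = 16 * 1024 * 1024
MIN_SMALL_FILES = 1000


@dataclass
class TableFiles:
    """Per-table file metadata as the catalog might hold it (assumed shape)."""
    name: str
    file_sizes: List[int]


def tables_with_many_small_files(
        tables: Iterable[TableFiles],
        small_file_bytes: int = SMALL_FILE_BYTES,
        min_small_files: int = MIN_SMALL_FILES) -> Dict[str, int]:
    """Map each flagged table name to its count of small files."""
    flagged = {}
    for table in tables:
        small = sum(1 for size in table.file_sizes if size < small_file_bytes)
        if small >= min_small_files:
            flagged[table.name] = small
    return flagged
```

The returned counts could be exposed either per query or as a catalog-wide metric, which is the design question raised in the comment above.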




[jira] [Updated] (IMPALA-5956) Add TPC-DS q31 and q89 to test suite

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-5956:
--
Fix Version/s: Impala 3.1.0

> Add TPC-DS q31 and q89 to test suite
> 
>
> Key: IMPALA-5956
> URL: https://issues.apache.org/jira/browse/IMPALA-5956
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Wood
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: tpcds
> Fix For: Impala 3.1.0
>
> Attachments: q59-flap.out, q89-flap1.out, q89-flap2.out, ttq-243.out, 
> ttq-256.out
>
>
> When run, especially as part of the TPC-DS suite, query #89 returns varying 
> results in the least significant digit of a calculation. Using the output of 
> the previous run as the expected result for the next run fails. This SELECT 
> item is a ROUND() expression, so it's not clear why the result is not 
> deterministic.




[jira] [Updated] (IMPALA-7532) Add retry/back-off to fetch-from-catalog RPCs

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7532:
--
Fix Version/s: Impala 3.1.0

> Add retry/back-off to fetch-from-catalog RPCs
> -
>
> Key: IMPALA-7532
> URL: https://issues.apache.org/jira/browse/IMPALA-7532
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Tianyi Wang
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Currently if there is an error connecting to the catalog server, the 'fetch 
> from catalog' implementation will retry with no apparent backoff. We should 
> retry for some period of time with backoff in between the attempts, so that 
> impala can ride over short interruptions of the catalog service.
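
The retry behaviour requested above can be sketched as follows. This is a Python illustration under stated assumptions, not the fix that was committed: `call_with_backoff`, its parameters, and the use of `ConnectionError` as the retryable failure are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(rpc: Callable[[], T],
                      max_wait_s: float = 60.0,
                      initial_delay_s: float = 0.1,
                      max_delay_s: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> T:
    """Retry rpc() on connection errors, doubling the delay between attempts.

    Gives up, re-raising the last error, once the next sleep would run past
    max_wait_s. sleep and clock are injectable so the loop can be tested.
    """
    deadline = clock() + max_wait_s
    delay = initial_delay_s
    while True:
        try:
            return rpc()
        except ConnectionError:
            if clock() + delay > deadline:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay_s)
```

Capping both the per-attempt delay and the total wait keeps short catalog interruptions invisible to the caller while still failing in bounded time. Adding random jitter to each delay would be a common refinement when many clients retry at once.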




[jira] [Updated] (IMPALA-7735) Expose admission control status in impala-shell

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7735:
--
Fix Version/s: Impala 3.1.0

> Expose admission control status in impala-shell
> ---
>
> Key: IMPALA-7735
> URL: https://issues.apache.org/jira/browse/IMPALA-7735
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: admission-control
> Fix For: Impala 3.1.0
>
> Attachments: Screenshot1.png
>
>
> Following on from IMPALA-7545 we should also expose this in impala-shell. I 
> left some notes on that JIRA.




[jira] [Updated] (IMPALA-7504) ParseKerberosPrincipal() should use krb5_parse_name() instead

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7504:
--
Fix Version/s: (was: Impala 3.1.0)

> ParseKerberosPrincipal() should use krb5_parse_name() instead
> -
>
> Key: IMPALA-7504
> URL: https://issues.apache.org/jira/browse/IMPALA-7504
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Security
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Michael Ho
>Priority: Minor
>  Labels: ramp-up
>
> [~tlipcon] pointed out during code review that we should be using 
> krb5_parse_name() to parse the principal instead of implementing our own 
> parsing:
> bq. I wonder whether we should just be using krb5_parse_name here instead of 
> implementing our own parsing? According to 
> [http://web.mit.edu/kerberos/krb5-1.15/doc/appdev/refs/api/krb5_parse_name.html]
>  there are various escapings, etc, that this function isn't currently 
> supporting.
> We currently do the following to parse the principal:
> {noformat}
>   vector<string> names;
>   split(names, principal, is_any_of("/"));
>   if (names.size() != 2) return Status(TErrorCode::BAD_PRINCIPAL_FORMAT, 
>       principal);
>   *service_name = names[0];
>   string remaining_principal = names[1];
>   split(names, remaining_principal, is_any_of("@"));
>   if (names.size() != 2) return Status(TErrorCode::BAD_PRINCIPAL_FORMAT, 
>       principal);
> {noformat}
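
To make the limitation concrete, here is a Python transcription of the split-based logic quoted above. It is a sketch, not Impala code: `parse_principal_naive` is a hypothetical name, and the final assignment of hostname and realm is an assumption, because the quoted snippet stops after the second size check.

```python
from typing import Optional, Tuple


def parse_principal_naive(principal: str) -> Optional[Tuple[str, str, str]]:
    """Split "service/host@REALM" on "/" and then "@", as in the snippet.

    Returns None where the C++ code would return BAD_PRINCIPAL_FORMAT.
    """
    names = principal.split("/")
    if len(names) != 2:
        return None
    service_name, remaining_principal = names
    names = remaining_principal.split("@")
    if len(names) != 2:
        return None
    hostname, realm = names  # assumed continuation of the quoted code
    return service_name, hostname, realm
```

A plain `service/host@REALM` principal parses as expected, but any principal whose components contain an extra `/` or `@` is rejected, even when the separator is backslash-escaped. Escaped separators are one kind of input that, according to the linked documentation, krb5_parse_name() is meant to handle.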




[jira] [Updated] (IMPALA-7665) Bringing up stopped statestore causes queries to fail

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7665:
--
Fix Version/s: (was: Impala 3.1.0)

> Bringing up stopped statestore causes queries to fail
> -
>
> Key: IMPALA-7665
> URL: https://issues.apache.org/jira/browse/IMPALA-7665
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: statestore
>
> I can reproduce this by running a long-running query then cycling the 
> statestore:
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ impala-shell.sh -q 
> "select distinct * from tpch10_parquet.lineitem"
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
> c486fb9ea4330e1008fa9b7ceaa60492e43ee120)
> Query: select distinct * from tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 17:06:48 (Coordinator: 
> http://tarmstrong-box:25000)
> {noformat}
> If I kill the statestore, the query runs fine, but if I start up the 
> statestore again, it fails.
> {noformat}
> # In one terminal, start up the statestore
> $ 
> /home/tarmstrong/Impala/incubator-impala/be/build/latest/statestore/statestored
>  -log_filename=statestored 
> -log_dir=/home/tarmstrong/Impala/incubator-impala/logs/cluster -v=1 
> -logbufsecs=5 -max_log_files=10
> # The running query then fails
> WARNINGS: Failed due to unreachable impalad(s): tarmstrong-box:22001, 
> tarmstrong-box:22002
> {noformat}
> Note that I've seen different subsets of impalads reported as failed, e.g. 
> "Failed due to unreachable impalad(s): tarmstrong-box:22001"




[jira] [Updated] (IMPALA-7665) Bringing up stopped statestore causes queries to fail

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7665:
--
Target Version: Impala 3.2.0  (was: Impala 3.1.0)

> Bringing up stopped statestore causes queries to fail
> -
>
> Key: IMPALA-7665
> URL: https://issues.apache.org/jira/browse/IMPALA-7665
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: statestore
>
> I can reproduce this by running a long-running query then cycling the 
> statestore:
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ impala-shell.sh -q 
> "select distinct * from tpch10_parquet.lineitem"
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
> c486fb9ea4330e1008fa9b7ceaa60492e43ee120)
> Query: select distinct * from tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 17:06:48 (Coordinator: 
> http://tarmstrong-box:25000)
> {noformat}
> If I kill the statestore, the query runs fine, but if I start up the 
> statestore again, it fails.
> {noformat}
> # In one terminal, start up the statestore
> $ 
> /home/tarmstrong/Impala/incubator-impala/be/build/latest/statestore/statestored
>  -log_filename=statestored 
> -log_dir=/home/tarmstrong/Impala/incubator-impala/logs/cluster -v=1 
> -logbufsecs=5 -max_log_files=10
> # The running query then fails
> WARNINGS: Failed due to unreachable impalad(s): tarmstrong-box:22001, 
> tarmstrong-box:22002
> {noformat}
> Note that I've seen different subsets of impalads reported as failed, e.g. 
> "Failed due to unreachable impalad(s): tarmstrong-box:22001"




[jira] [Updated] (IMPALA-7350) More accurate memory estimates for admission

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7350:
--
Fix Version/s: (was: Impala 3.1.0)

> More accurate memory estimates for admission
> 
>
> Key: IMPALA-7350
> URL: https://issues.apache.org/jira/browse/IMPALA-7350
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
>
> For IMPALA-7349, we will be relying more on memory estimates. This is an 
> umbrella JIRA to track improvements to memory estimates where the current 
> estimates are way off and result in over- or under-admission. Over-admission 
> is probably the more significant concern.




[jira] [Updated] (IMPALA-7350) More accurate memory estimates for admission

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7350:
--
Target Version: Impala 3.2.0  (was: Impala 3.1.0)

> More accurate memory estimates for admission
> 
>
> Key: IMPALA-7350
> URL: https://issues.apache.org/jira/browse/IMPALA-7350
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
>
> For IMPALA-7349, we will be relying more on memory estimates. This is an 
> umbrella JIRA to track improvements to memory estimates where the current 
> estimates are way off and result in over- or under-admission. Over-admission 
> is probably the more significant concern.




[jira] [Updated] (IMPALA-6656) Metrics for time spent in BufferAllocator

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-6656:
--
Target Version: Impala 3.2.0  (was: Impala 3.1.0)

> Metrics for time spent in BufferAllocator
> -
>
> Key: IMPALA-6656
> URL: https://issues.apache.org/jira/browse/IMPALA-6656
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: observability, resource-management
>
> We should track the total time spent and the time spent in TCMalloc so we can 
> understand where time is going globally. 
> I think we should shard these metrics across the arenas so we can see if the 
> problem is just per-arena, and also to avoid contention between threads when 
> updating the metrics.
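
The sharding idea above can be sketched as one counter per arena, each behind its own lock, so that threads working in different arenas never contend on the same lock. This is a Python illustration with a hypothetical `ShardedTimer` class; it does not reflect Impala's metrics classes or its BufferAllocator.

```python
import threading
from typing import List


class ShardedTimer:
    """Accumulates elapsed nanoseconds in one slot per shard (arena)."""

    def __init__(self, num_shards: int) -> None:
        self._locks = [threading.Lock() for _ in range(num_shards)]
        self._total_ns = [0] * num_shards

    def add(self, shard: int, elapsed_ns: int) -> None:
        # Only the lock of the caller's own shard is taken.
        with self._locks[shard]:
            self._total_ns[shard] += elapsed_ns

    def per_shard(self) -> List[int]:
        """Per-arena totals, to see whether one arena is the hot spot."""
        values = []
        for lock, index in zip(self._locks, range(len(self._total_ns))):
            with lock:
                values.append(self._total_ns[index])
        return values

    def total(self) -> int:
        """Global total across all arenas."""
        return sum(self.per_shard())
```

Reading the per-shard values answers the "is the problem just per-arena" question, and summing them gives the global figure without a shared hot counter on the update path.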




[jira] [Updated] (IMPALA-6656) Metrics for time spent in BufferAllocator

2018-11-05 Thread Tim Armstrong (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-6656:
--
Fix Version/s: (was: Impala 3.1.0)

> Metrics for time spent in BufferAllocator
> -
>
> Key: IMPALA-6656
> URL: https://issues.apache.org/jira/browse/IMPALA-6656
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: observability, resource-management
>
> We should track the total time spent and the time spent in TCMalloc so we can 
> understand where time is going globally. 
> I think we should shard these metrics across the arenas so we can see if the 
> problem is just per-arena, and also to avoid contention between threads when 
> updating the metrics.




[jira] [Resolved] (IMPALA-7710) test_owner_privileges_with_grant failed with AuthorizationException

2018-11-05 Thread Fredy Wijaya (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya resolved IMPALA-7710.
--
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> test_owner_privileges_with_grant failed with AuthorizationException 
> 
>
> Key: IMPALA-7710
> URL: https://issues.apache.org/jira/browse/IMPALA-7710
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Adam Holley
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 3.1.0
>
>
> A build with the fix of IMPALA-7633 failed like the following. 
> {noformat}
> authorization.test_owner_privileges.TestOwnerPrivileges.test_owner_privileges_with_grant[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] (from pytest)
> Failing for the past 1 build (Since Failed#35 )
> Took 1 min 39 sec.
> add description
> Error Message
> ImpalaBeeswaxException: ImpalaBeeswaxException:  INNER EXCEPTION: <class 
> 'beeswaxd.ttypes.BeeswaxException'>  MESSAGE: AuthorizationException: User 
> 'oo_user1' does not have privileges to execute 'DROP' on: 
> test_owner_privileges_with_grant_77e49af8.owner_priv_view
> Stacktrace
> authorization/test_owner_privileges.py:165: in test_owner_privileges_with_grant
>     sentry_refresh_timeout_s=SENTRY_REFRESH_TIMEOUT_S)
> authorization/test_owner_privileges.py:225: in __execute_owner_privilege_tests
>     test_obj.obj_name), user="oo_user1")
> common/sentry_cache_test_suite.py:106: in user_query
>     return self.execute_query_expect_success(client, query, user=user)
> common/impala_test_suite.py:523: in wrapper
>     return function(*args, **kwargs)
> common/impala_test_suite.py:531: in execute_query_expect_success
>     result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:621: in __execute_query
>     return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
>     return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
>     handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
>     return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
>     raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
> E    MESSAGE: AuthorizationException: User 'oo_user1' does not have 
> privileges to execute 'DROP' on: 
> test_owner_privileges_with_grant_77e49af8.owner_priv_view
> {noformat}




[jira] [Commented] (IMPALA-4909) Redhat timezone update rpm causes queries to disappear from CM screen

2018-11-05 Thread hakki (JIRA)



[ 
https://issues.apache.org/jira/browse/IMPALA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675092#comment-16675092
 ] 

hakki commented on IMPALA-4909:
---

Rather, it seems to be a Cloudera Management Service (CMS) issue. After 
placing the new Joda jar file under /usr/share/cmf/common_jars and 
restarting the CMS, the issue was resolved.

> Redhat timezone update rpm causes queries to disappear from CM screen
> -
>
> Key: IMPALA-4909
> URL: https://issues.apache.org/jira/browse/IMPALA-4909
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.2, Impala 2.6.0
>Reporter: hakki
>Priority: Minor
>
> When timezone update packages (tzdata-2016g-2.el6.noarch.rpm for Red Hat 
> and tzdata2016g.tar.gz for Java) are applied to the Red Hat 6.6 hosts on 
> which the Impala daemons run, queries do not appear on the Cloudera Manager 
> Impala queries screen.
> Note: The timezone update package is also applied to Java. The Cloudera 
> Manager server is located on the same servers as the Impala daemons, catalog 
> server, and statestore.
> Steps to reproduce:
> 1- Install CDH-5.4.7 with parcels and the embedded PostgreSQL database. (All 
> hosts run Red Hat 6.6; the default timezone was initially EEST.)
> 2- After installation, apply tzdata-2016g-2.el6.noarch.rpm to all servers.
> 3- Apply the Java tz update package to Java (java version "1.7.0_67" Java(TM) 
> SE Runtime Environment (build 1.7.0_67-b01)).
> 4- Run a query from Impala.
> 5- Open the Impala queries screen in Cloudera Manager.




[jira] [Resolved] (IMPALA-7543) Enhance scan ranges to support sub-ranges

2018-11-05 Thread Zoltán Borók-Nagy (JIRA)



 [ 
https://issues.apache.org/jira/browse/IMPALA-7543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-7543.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Enhance scan ranges to support sub-ranges
> -
>
> Key: IMPALA-7543
> URL: https://issues.apache.org/jira/browse/IMPALA-7543
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> For IMPALA-5843 we need to have smarter scan ranges that only read a list of 
> inner ranges.
> It'll be useful for Parquet files that have page index, so Impala will only 
> read the relevant pages.
> More information can be found in 
> [https://docs.google.com/document/d/1D-el8njq_I-JKd3NDcW1mRXID_n0dBDKIkjWxwULVus]
>  
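
The sub-range idea above can be illustrated with a small sketch: a scan range that, when given a list of inner ranges, reads only those bytes instead of the whole range. This is Python with a hypothetical `ScanRange` class and an in-memory buffer standing in for a file; it is not the interface added to Impala's I/O layer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ScanRange:
    """A byte range of a file, optionally restricted to inner sub-ranges.

    sub_ranges holds (offset, length) pairs in absolute file offsets; each
    must lie inside [offset, offset + length).
    """
    offset: int
    length: int
    sub_ranges: List[Tuple[int, int]] = field(default_factory=list)

    def read(self, data: bytes) -> bytes:
        end = self.offset + self.length
        if not self.sub_ranges:
            # No sub-ranges: read the whole range, as before.
            return data[self.offset:end]
        out = bytearray()
        for sub_offset, sub_length in self.sub_ranges:
            if sub_offset < self.offset or sub_offset + sub_length > end:
                raise ValueError("sub-range lies outside the scan range")
            out += data[sub_offset:sub_offset + sub_length]
        return bytes(out)
```

For a Parquet column chunk with a page index, the sub-ranges would correspond to the pages that survive filtering, so the bytes of skipped pages are never read.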





[jira] [Assigned] (IMPALA-6433) Add read support for PageHeaderV2 to the parquet scanner

2018-11-05 Thread JIRA



 [ 
https://issues.apache.org/jira/browse/IMPALA-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy reassigned IMPALA-6433:
-

Assignee: Zoltán Borók-Nagy

> Add read support for PageHeaderV2 to the parquet scanner
> 
>
> Key: IMPALA-6433
> URL: https://issues.apache.org/jira/browse/IMPALA-6433
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Lars Volker
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: parquet
>





[jira] [Commented] (IMPALA-6433) Add read support for PageHeaderV2 to the parquet scanner

2018-11-05 Thread JIRA



[ 
https://issues.apache.org/jira/browse/IMPALA-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16674910#comment-16674910
 ] 

Zoltán Borók-Nagy commented on IMPALA-6433:
---

[~tarmstrong] it's not necessary to have it for page skipping, but I can pick 
this up since I'm digging into that part of the code.

> Add read support for PageHeaderV2 to the parquet scanner
> 
>
> Key: IMPALA-6433
> URL: https://issues.apache.org/jira/browse/IMPALA-6433
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Lars Volker
>Priority: Major
>  Labels: parquet
>




