[jira] [Created] (IMPALA-7675) The result of UpdateTableUsage() RPC is not correctly handled.

2018-10-05 Thread Tianyi Wang (JIRA)
Tianyi Wang created IMPALA-7675:
---

 Summary: The result of UpdateTableUsage() RPC is not correctly 
handled.
 Key: IMPALA-7675
 URL: https://issues.apache.org/jira/browse/IMPALA-7675
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Tianyi Wang
Assignee: Tianyi Wang


ImpaladTableUsageTracker.report() doesn't handle the result of the
UpdateTableUsage() RPC correctly and triggers a NullPointerException:

{noformat}
W1003 11:07:39.252918  6910 ImpaladTableUsageTracker.java:116] Unable to report 
table usage information to catalog server.
Java exception follows:
java.lang.NullPointerException
at 
org.apache.impala.catalog.ImpaladTableUsageTracker.report(ImpaladTableUsageTracker.java:110)
at 
org.apache.impala.catalog.ImpaladTableUsageTracker.access$000(ImpaladTableUsageTracker.java:44)
at 
org.apache.impala.catalog.ImpaladTableUsageTracker$1.run(ImpaladTableUsageTracker.java:56)
at java.lang.Thread.run(Thread.java:748)
{noformat}
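
A minimal sketch of the kind of null handling report() appears to need before dereferencing the RPC result; the type and field names below are placeholders, not the actual Impala/Thrift classes:

{code:java}
import java.util.Collections;
import java.util.Map;

// Placeholder for the Thrift response of the UpdateTableUsage() RPC; the real
// response type and its fields may differ.
class TUpdateTableUsageResponse {
  Map<String, Integer> staleTables;  // may legitimately be null or empty
}

class TableUsageReporter {
  // Sketch: guard against a null/empty result instead of dereferencing it
  // unconditionally, which is what triggers the NullPointerException above.
  void handleResponse(TUpdateTableUsageResponse resp) {
    if (resp == null || resp.staleTables == null) return;
    for (Map.Entry<String, Integer> e : resp.staleTables.entrySet()) {
      System.out.println("Catalogd reported table: " + e.getKey());
    }
  }

  public static void main(String[] args) {
    TableUsageReporter reporter = new TableUsageReporter();
    reporter.handleResponse(null);  // must not throw
    TUpdateTableUsageResponse resp = new TUpdateTableUsageResponse();
    resp.staleTables = Collections.singletonMap("default.foo", 1);
    reporter.handleResponse(resp);
  }
}
{code}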







[jira] [Created] (IMPALA-7674) Impala should compress older log files

2018-10-05 Thread Zoram Thanga (JIRA)
Zoram Thanga created IMPALA-7674:


 Summary: Impala should compress older log files
 Key: IMPALA-7674
 URL: https://issues.apache.org/jira/browse/IMPALA-7674
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 2.12.0, Impala 3.0, Impala 3.1.0
Reporter: Zoram Thanga
Assignee: Zoram Thanga


By default, Impala keeps ten log files of each severity level (INFO, WARN, 
ERROR), and the size limit of each is around 200MB. Cleanup, i.e. deletion of 
old log files, is controlled by the FLAGS_max_log_files parameter. 

On busy clusters we've found that log deletion can throw away debug information 
too quickly, often making troubleshooting harder than it needs to be.

We can compress the log files to:

# Reduce the disk space consumption by 10x or more.
# Keep more log files around for the same disk space budget.
# Have 10x or more historical diagnostics data available.
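
As a rough illustration only (not Impala's implementation; glog does the rotation itself), a gzip pass over an already-rotated log file might look like this:

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

public class LogCompressor {
  // Compresses a rotated log file to <name>.gz and removes the original,
  // trading a little CPU for roughly 10x less disk usage on text logs.
  static void compress(Path logFile) throws IOException {
    Path gz = Paths.get(logFile.toString() + ".gz");
    try (InputStream in = Files.newInputStream(logFile);
         OutputStream out = new GZIPOutputStream(Files.newOutputStream(gz))) {
      in.transferTo(out);  // InputStream.transferTo() requires Java 9+
    }
    Files.delete(logFile);
  }

  public static void main(String[] args) throws IOException {
    compress(Paths.get(args[0]));
  }
}
{code}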






[jira] [Assigned] (IMPALA-7653) Improve accuracy of compute incremental stats cardinality estimation

2018-10-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-7653:
-

Assignee: Pooja Nilangekar

> Improve accuracy of compute incremental stats cardinality estimation
> 
>
> Key: IMPALA-7653
> URL: https://issues.apache.org/jira/browse/IMPALA-7653
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Balazs Jeszenszky
>Assignee: Pooja Nilangekar
>Priority: Major
>  Labels: resource-management
>
> Currently, the operators of a compute [incremental] stats' subquery rely on 
> combined selectivities - as usual - to estimate cardinality, e.g. during 
> aggregation. For example, note the expected cardinality of the aggregation on 
> this subquery:
> {code}
> F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=4
> Per-Host Resources: mem-estimate=305.20GB mem-reservation=136.00MB
> 01:AGGREGATE [STREAMING]
> |  output: [...]
> |  group by: col_a, col_b, col_c
> |  mem-estimate=76.21GB mem-reservation=34.00MB spill-buffer=2.00MB
> |  tuple-ids=1 row-size=104.83KB cardinality=693000
> |
> 00:SCAN HDFS [default.test, RANDOM]
>partitions=1/554 files=1 size=109.65MB
>stats-rows=1506374 extrapolated-rows=disabled
>table stats: rows=821958291 size=unavailable
>column stats: all
>mem-estimate=88.00MB mem-reservation=0B
>tuple-ids=0 row-size=2.06KB cardinality=1506374
> {code}
> This was generated as a result of compute incremental stats on a single 
> partition, so the output of that aggregation is a single row. Due to the 
> width of the intermediate rows, such overestimations lead to bloated memory 
> estimates. Since the number of partitions to be updated is known at 
> plan-time, Impala could use that to set the aggregation's cardinality.
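
A rough sketch of that suggestion (hypothetical names, not the planner's actual API): cap the NDV-based estimate by the number of partitions selected for the update, since the per-partition aggregation emits one row per partition:

{code:java}
// Sketch only: illustrative names, not Impala's AggregationNode code.
public class IncrementalStatsCardinality {
  static long estimateAggCardinality(long ndvBasedEstimate, long numPartitionsToUpdate) {
    // The per-partition aggregation of COMPUTE INCREMENTAL STATS produces at
    // most one row per updated partition, so the partition count is an upper bound.
    if (numPartitionsToUpdate <= 0) return ndvBasedEstimate;
    return Math.min(ndvBasedEstimate, numPartitionsToUpdate);
  }

  public static void main(String[] args) {
    // With the plan above: NDV-based estimate 693000, but only 1 partition updated.
    System.out.println(estimateAggCardinality(693_000L, 1L));  // prints 1
  }
}
{code}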






[jira] [Updated] (IMPALA-7653) Improve accuracy of compute incremental stats cardinality estimation

2018-10-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7653:
--
Labels: resource-management  (was: )

> Improve accuracy of compute incremental stats cardinality estimation
> 
>
> Key: IMPALA-7653
> URL: https://issues.apache.org/jira/browse/IMPALA-7653
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Balazs Jeszenszky
>Priority: Major
>  Labels: resource-management
>
> Currently, the operators of a compute [incremental] stats' subquery rely on 
> combined selectivities - as usual - to estimate cardinality, e.g. during 
> aggregation. For example, note the expected cardinality of the aggregation on 
> this subquery:
> {code}
> F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=4
> Per-Host Resources: mem-estimate=305.20GB mem-reservation=136.00MB
> 01:AGGREGATE [STREAMING]
> |  output: [...]
> |  group by: col_a, col_b, col_c
> |  mem-estimate=76.21GB mem-reservation=34.00MB spill-buffer=2.00MB
> |  tuple-ids=1 row-size=104.83KB cardinality=693000
> |
> 00:SCAN HDFS [default.test, RANDOM]
>partitions=1/554 files=1 size=109.65MB
>stats-rows=1506374 extrapolated-rows=disabled
>table stats: rows=821958291 size=unavailable
>column stats: all
>mem-estimate=88.00MB mem-reservation=0B
>tuple-ids=0 row-size=2.06KB cardinality=1506374
> {code}
> This was generated as a result of compute incremental stats on a single 
> partition, so the output of that aggregation is a single row. Due to the 
> width of the intermediate rows, such overestimations lead to bloated memory 
> estimates. Since the number of partitions to be updated is known at 
> plan-time, Impala could use that to set the aggregation's cardinality.






[jira] [Created] (IMPALA-7673) Parse --var variable values to replace variables within the value

2018-10-05 Thread Aaron Baff (JIRA)
Aaron Baff created IMPALA-7673:
--

 Summary: Parse --var variable values to replace variables within 
the value
 Key: IMPALA-7673
 URL: https://issues.apache.org/jira/browse/IMPALA-7673
 Project: IMPALA
  Issue Type: Improvement
  Components: Frontend
Affects Versions: Impala 2.11.0
 Environment: CentOS Linux release 7.4.1708
CDH 5.14.4
Reporter: Aaron Baff


Related to IMPALA-2180

While moving a query's SET variables to impala-shell --var options, a later 
variable that references an earlier one does not have that reference substituted, 
as it is with SET.

For example:

--var="DATA_DATE_START='2018-09-28'

--var="START_ACTION_CLICK_RANGE=from_timestamp(date_sub(to_timestamp(\${var:DATA_DATE_START},'-MM-dd'),
 93), '-MM-dd')"

In the query that gets run, the ${var:START_ACTION_CLICK_RANGE} gets replaced 
with

from_timestamp(date_sub(to_timestamp(${var:DATA_DATE_START},'yyyy-MM-dd'), 93), 
'yyyy-MM-dd')

not with

from_timestamp(date_sub(to_timestamp('2018-09-28','yyyy-MM-dd'), 93), 
'yyyy-MM-dd')

as I would expect it to.
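
The behaviour being asked for is essentially repeated substitution until no ${var:...} references remain. A standalone sketch of that fixed-point expansion (impala-shell itself is Python; this is only an illustration of the logic):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VarExpander {
  private static final Pattern REF = Pattern.compile("\\$\\{var:([A-Za-z0-9_]+)\\}");

  // Repeatedly replaces ${var:NAME} references until the value stops changing.
  // Unknown variables are left untouched; cyclic definitions are not handled here.
  static String expand(String value, Map<String, String> vars) {
    String prev;
    do {
      prev = value;
      Matcher m = REF.matcher(value);
      StringBuffer sb = new StringBuffer();
      while (m.find()) {
        String replacement = vars.getOrDefault(m.group(1), m.group(0));
        m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
      }
      m.appendTail(sb);
      value = sb.toString();
    } while (!value.equals(prev));
    return value;
  }

  public static void main(String[] args) {
    Map<String, String> vars = new HashMap<>();
    vars.put("DATA_DATE_START", "'2018-09-28'");
    vars.put("START_ACTION_CLICK_RANGE",
        "from_timestamp(date_sub(to_timestamp(${var:DATA_DATE_START},'yyyy-MM-dd'), 93), 'yyyy-MM-dd')");
    // Prints the fully substituted expression, i.e. the expected behaviour above.
    System.out.println(expand("${var:START_ACTION_CLICK_RANGE}", vars));
  }
}
{code}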






[jira] [Updated] (IMPALA-7664) Flag to tell impala-shell not to try to escape anything with -q/-o

2018-10-05 Thread Aaron Baff (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Baff updated IMPALA-7664:
---
Labels:   (was: shell)

> Flag to tell impala-shell not to try to escape anything with -q/-o
> -
>
> Key: IMPALA-7664
> URL: https://issues.apache.org/jira/browse/IMPALA-7664
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.11.0
> Environment: CentOS Linux release 7.4.1708
> CDH 5.14.4
>Reporter: Aaron Baff
>Priority: Minor
>
> I'm running into a real pain point where I'm assembling a SQL query to store 
> into our RDBMS, and some of the data contains a double quote. When I run it in 
> impala-shell with the -o output file option, it turns the double quote into 
> two double quotes and wraps the entire string in double quotes.
> While I understand this is technically the correct way to handle a double 
> quote in a tab/comma delimited file, it is the opposite of what I want, since 
> it's only a single field and in the SQL query it's already properly enclosed 
> in a single-quoted string.
> So it'd be really nice to have an option or variable setting to make 
> impala-shell NOT do this: just output the value of the field as-is, without 
> any escaping or quoting. Maybe this is an extreme corner case, but I imagine 
> some other folks have probably hit it. As it is, I'm going to have to figure 
> out a workaround, which will probably be some kind of sed on the output file.
>  
> Since this is on CDH, here's the full version info:
> Shell version: Impala Shell v2.11.0-cdh5.14.4 (20e6356) built on Tue Jun 12 
> 03:43:08 PDT 2018
> Server version: impalad version 2.11.0-cdh5.14.4 RELEASE (build 
> 20e635646a13347800fad36a7d0b1da25ab32404)






[jira] [Closed] (IMPALA-7651) Add Kudu support to scheduler-related query hints and options

2018-10-05 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-7651.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Add Kudu support to scheduler-related query hints and options
> -
>
> Key: IMPALA-7651
> URL: https://issues.apache.org/jira/browse/IMPALA-7651
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> Scheduling works the same way for HDFS and Kudu; however, our docs pages 
> don't mention the latter: 
> https://impala.apache.org/docs/build/html/topics/impala_schedule_random_replica.html
> We should add Kudu to the docs for the SCHEDULE_RANDOM_REPLICA query option 
> and the RANDOM_REPLICA hint.






[jira] [Commented] (IMPALA-4137) Rolling restart of Impala

2018-10-05 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640246#comment-16640246
 ] 

Tim Armstrong commented on IMPALA-4137:
---

Just an update here. We have the graceful shutdown command, which is an 
important building block. Alan's comments about the statestore and catalogd need 
to be addressed (since we don't tolerate restarts of those well in general):

Statestore:
I think statestore restarts should be tolerated OK in theory, but there's a bug 
in practice: IMPALA-7665. The other caveat is that if the statestore is down for 
a long time and another impalad goes down, we don't detect it and remove it from the 
cluster membership (not sure if we have a JIRA for that yet). So if we fix 
IMPALA-7665 we should be able to restart the statestore, so long as we're not 
bouncing impala daemons at the same time. So we're not that far off from clean 
statestore restarts, but statestore HA where we can tolerate extended outages 
is a bigger task.

Catalog:
Also in theory we should be able to restart the catalog without disrupting 
running queries and only delaying queries that need new metadata. This would 
need more testing to make sure that it works well in all cases. I think the 
metadata delay is also somewhat unacceptable for a true rolling restart. The 
likely solution to that is IMPALA-7127, which removes the catalog dependency and 
has coordinators load data directly.

> Rolling restart of Impala
> -
>
> Key: IMPALA-4137
> URL: https://issues.apache.org/jira/browse/IMPALA-4137
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.2.4
>Reporter: Alan Jackoway
>Priority: Major
>  Labels: resource-management
>
> Apologies if a jira exists for this. I could not find one.
> It would be very helpful to us to be able to do a rolling restart (and 
> hopefully a rolling upgrade) of Impala.
> Based on my understanding of impala internals, I think this would require:
> * Highly available statestore and catalog. Currently catalog's metadata 
> reload is the long pole in our impala restarts.
> * Impalads being able to stop without killing queries they are working on. 
> Most of our queries are short so for us it would be sufficient to give the 
> impala daemon a way to stop taking new work, then restart when it completes 
> all work it has assigned.






[jira] [Created] (IMPALA-7672) Play nice with load balancers when shutting down coordinator

2018-10-05 Thread Tim Armstrong (JIRA)
Tim Armstrong created IMPALA-7672:
-

 Summary: Play nice with load balancers when shutting down 
coordinator
 Key: IMPALA-7672
 URL: https://issues.apache.org/jira/browse/IMPALA-7672
 Project: IMPALA
  Issue Type: Sub-task
  Components: Distributed Exec
Reporter: Tim Armstrong


This is a placeholder to figure out what we need to do to get load balancers 
like HAProxy and F5 to cleanly switch to alternative coordinators when we do a 
graceful shutdown. E.g. do we need to stop accepting new TCP connections?






[jira] [Commented] (IMPALA-7665) Bringing up stopped statestore causes queries to fail

2018-10-05 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640232#comment-16640232
 ] 

Tim Armstrong commented on IMPALA-7665:
---

This would be required to allow safely restarting the statestore without 
disrupting running queries. The original intent of the statestore design (I 
think) was that the cluster should tolerate statestore restarts by operating on 
stale membership data.

> Bringing up stopped statestore causes queries to fail
> -
>
> Key: IMPALA-7665
> URL: https://issues.apache.org/jira/browse/IMPALA-7665
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: statestore
>
> I can reproduce this by running a long-running query then cycling the 
> statestore:
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ impala-shell.sh -q 
> "select distinct * from tpch10_parquet.lineitem"
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
> c486fb9ea4330e1008fa9b7ceaa60492e43ee120)
> Query: select distinct * from tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 17:06:48 (Coordinator: 
> http://tarmstrong-box:25000)
> {noformat}
> If I kill the statestore, the query runs fine, but if I start up the 
> statestore again, it fails.
> {noformat}
> # In one terminal, start up the statestore
> $ 
> /home/tarmstrong/Impala/incubator-impala/be/build/latest/statestore/statestored
>  -log_filename=statestored 
> -log_dir=/home/tarmstrong/Impala/incubator-impala/logs/cluster -v=1 
> -logbufsecs=5 -max_log_files=10
> # The running query then fails
> WARNINGS: Failed due to unreachable impalad(s): tarmstrong-box:22001, 
> tarmstrong-box:22002
> {noformat}
> Note that I've seen different subsets of impalads reported as failed, e.g. 
> "Failed due to unreachable impalad(s): tarmstrong-box:22001"






[jira] [Updated] (IMPALA-4137) Rolling restart of Impala

2018-10-05 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-4137:
--
Labels: resource-management  (was: )

> Rolling restart of Impala
> -
>
> Key: IMPALA-4137
> URL: https://issues.apache.org/jira/browse/IMPALA-4137
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec
>Affects Versions: Impala 2.2.4
>Reporter: Alan Jackoway
>Priority: Major
>  Labels: resource-management
>
> Apologies if a jira exists for this. I could not find one.
> It would be very helpful to us to be able to do a rolling restart (and 
> hopefully a rolling upgrade) of Impala.
> Based on my understanding of impala internals, I think this would require:
> * Highly available statestore and catalog. Currently catalog's metadata 
> reload is the long pole in our impala restarts.
> * Impalads being able to stop without killing queries they are working on. 
> Most of our queries are short so for us it would be sufficient to give the 
> impala daemon a way to stop taking new work, then restart when it completes 
> all work it has assigned.






[jira] [Created] (IMPALA-7671) SHOW GRANT USER ON is broken

2018-10-05 Thread Fredy Wijaya (JIRA)
Fredy Wijaya created IMPALA-7671:


 Summary: SHOW GRANT USER ON  is broken
 Key: IMPALA-7671
 URL: https://issues.apache.org/jira/browse/IMPALA-7671
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.1.0
Reporter: Fredy Wijaya
Assignee: Fredy Wijaya


{noformat}
[localhost:21000] default> show grant user foobar;
Query: show grant user foobar
+----------------+----------------+--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
| principal_type | principal_name | scope  | database | table | column | uri | privilege | grant_option | create_time                   |
+----------------+----------------+--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
| USER           | foobar         | table  | default  | foo   |        |     | owner     | true         | Fri, Oct 05 2018 11:38:14.173 |
+----------------+----------------+--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+

[localhost:21000] default> show grant user foobar on table foo;
Query: show grant user foobar on table foo
Fetched 0 row(s) in 0.01s
{noformat}






[jira] [Work started] (IMPALA-7671) SHOW GRANT USER ON is broken

2018-10-05 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7671 started by Fredy Wijaya.

> SHOW GRANT USER ON  is broken
> -
>
> Key: IMPALA-7671
> URL: https://issues.apache.org/jira/browse/IMPALA-7671
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
>
> {noformat}
> [localhost:21000] default> show grant user foobar;
> Query: show grant user foobar
> +----------------+----------------+--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> | principal_type | principal_name | scope  | database | table | column | uri | privilege | grant_option | create_time                   |
> +----------------+----------------+--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> | USER           | foobar         | table  | default  | foo   |        |     | owner     | true         | Fri, Oct 05 2018 11:38:14.173 |
> +----------------+----------------+--------+----------+-------+--------+-----+-----------+--------------+-------------------------------+
> [localhost:21000] default> show grant user foobar on table foo;
> Query: show grant user foobar on table foo
> Fetched 0 row(s) in 0.01s
> {noformat}






[jira] [Assigned] (IMPALA-7669) Concurrent invalidate with compute (or drop) stats throws NPE.

2018-10-05 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v reassigned IMPALA-7669:
-

Assignee: bharath v

> Concurrent invalidate with compute (or drop) stats throws NPE.
> --
>
> Key: IMPALA-7669
> URL: https://issues.apache.org/jira/browse/IMPALA-7669
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: bharath v
>Priority: Critical
>
> *This is a Catalog V2 only bug*
> An NPE is thrown when trying to getPartialInfo() from an IncompleteTable (the 
> result of an invalidate) and cause_ is null.
> {noformat}
> @Override
>   public TGetPartialCatalogObjectResponse getPartialInfo(
>   TGetPartialCatalogObjectRequest req) throws TableLoadingException {
> Throwables.propagateIfPossible(cause_, TableLoadingException.class);
> throw new TableLoadingException(cause_.getMessage());  <-
>   }
> {noformat}
> {noformat}
> I1004 16:51:28.845305 85380 jni-util.cc:308] java.lang.NullPointerException
> at 
> org.apache.impala.catalog.IncompleteTable.getPartialInfo(IncompleteTable.java:140)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.getPartialCatalogObject(CatalogServiceCatalog.java:2171)
> at 
> org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:236)
> {noformat}
> Actual caller stack trace is this.
> {noformat}
> I1004 16:51:21.666422 67179 Frontend.java:1086] Analyzing query: compute 
> stats ads
> I1004 16:51:28.850023 67179 jni-util.cc:308] 
> org.apache.impala.catalog.local.LocalCatalogException: Could not load table 
> parnal.ads from metastore
> at 
> org.apache.impala.catalog.local.LocalTable.loadTableMetadata(LocalTable.java:128)
> at org.apache.impala.catalog.local.LocalTable.load(LocalTable.java:89)
> at org.apache.impala.catalog.local.LocalDb.getTable(LocalDb.java:119)
> at 
> org.apache.impala.analysis.StmtMetadataLoader.getMissingTables(StmtMetadataLoader.java:251)
> at 
> org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:140)
> at 
> org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:116)
> at 
> org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1118)
> at 
> org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1092)
> at 
> org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1064)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:158)
> Caused by: org.apache.thrift.TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[NullPointerException: null]), lookup_status:OK)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:354)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.access$100(CatalogdMetaProvider.java:163)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$5.call(CatalogdMetaProvider.java:565)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$5.call(CatalogdMetaProvider.java:560)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider$1.call(CatalogdMetaProvider.java:411)
> at 
> com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
> at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
> at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.loadWithCaching(CatalogdMetaProvider.java:407)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.loadTable(CatalogdMetaProvider.java:556)
> at 
> org.apache.impala.catalog.local.LocalTable.loadTableMetadata(LocalTable.java:126)
> ... 9 more
> {noformat}






[jira] [Assigned] (IMPALA-7670) Drop table with a concurrent refresh throws ConcurrentModificationException

2018-10-05 Thread bharath v (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v reassigned IMPALA-7670:
-

Assignee: Tianyi Wang

> Drop table with a concurrent refresh throws ConcurrentModificationException
> ---
>
> Key: IMPALA-7670
> URL: https://issues.apache.org/jira/browse/IMPALA-7670
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: Tianyi Wang
>Priority: Major
>
> *This is a Catalog V2 only bug*.
> Saw this in the Catalog server.
> {noformat}
> I1004 16:38:55.236702 85380 jni-util.cc:308] 
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
> at java.util.HashMap$ValueIterator.next(HashMap.java:1471)
> at 
> org.apache.impala.catalog.FeFsTable$Utils.getPartitionFromThriftPartitionSpec(FeFsTable.java:407)
> at 
> org.apache.impala.catalog.HdfsTable.getPartitionFromThriftPartitionSpec(HdfsTable.java:694)
> at 
> org.apache.impala.catalog.Catalog.getHdfsPartition(Catalog.java:407)
> at 
> org.apache.impala.catalog.Catalog.getHdfsPartition(Catalog.java:386)
> at 
> org.apache.impala.service.CatalogOpExecutor.bulkAlterPartitions(CatalogOpExecutor.java:3193)
> at 
> org.apache.impala.service.CatalogOpExecutor.dropTableStats(CatalogOpExecutor.java:1255)
> at 
> org.apache.impala.service.CatalogOpExecutor.dropStats(CatalogOpExecutor.java:1148)
> at 
> org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:301)
> at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:157)
> {noformat}
> Still need to dig into it, but seems like something is off with locking 
> somewhere.






[jira] [Commented] (IMPALA-6741) Profiles of running queries should tell last update time of counters

2018-10-05 Thread Michael Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640193#comment-16640193
 ] 

Michael Ho commented on IMPALA-6741:


[~jeszyb], do you think recording the elapsed time since the last update is a 
reasonable solution? We can add the warning as part of IMPALA-2990, but for the 
purpose of diagnostics, having the elapsed time since the last update may 
already be quite useful for identifying stuck fragment instances.
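
A minimal sketch of the idea (illustrative only, not Impala's RuntimeProfile code): remember when a counter was last written so a profile dump can report how long ago each fragment instance checked in:

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Sketch: a counter that records its last update time, so a profile could print
// "last updated N seconds ago" and stuck fragment instances stand out.
class TimestampedCounter {
  private final AtomicLong value = new AtomicLong();
  private volatile long lastUpdateNanos = System.nanoTime();

  void set(long v) {
    value.set(v);
    lastUpdateNanos = System.nanoTime();
  }

  long get() { return value.get(); }

  long secondsSinceLastUpdate() {
    return (System.nanoTime() - lastUpdateNanos) / 1_000_000_000L;
  }
}
{code}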

> Profiles of running queries should tell last update time of counters
> 
>
> Key: IMPALA-6741
> URL: https://issues.apache.org/jira/browse/IMPALA-6741
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Balazs Jeszenszky
>Assignee: Michael Ho
>Priority: Major
>
> When looking at the profile of a running query, it's impossible to tell the 
> degree of accuracy. We've seen issues both with instances not checking in 
> with the coordinator for a long time, and with hung instances that never 
> update their counters. There are some specific issues as well, see 
> IMPALA-5200. This means that profiles taken off of running queries can't be 
> used for perf troubleshooting with confidence.
> Ideally, Impala should guarantee counters to be written at a certain 
> interval, and warn for counters or instances that are out of sync for some 
> reason.






[jira] [Comment Edited] (IMPALA-6741) Profiles of running queries should tell last update time of counters

2018-10-05 Thread Michael Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640193#comment-16640193
 ] 

Michael Ho edited comment on IMPALA-6741 at 10/5/18 6:31 PM:
-

[~jeszyb], do you think recording the elapsed time since the last update is a 
reasonable solution? We can add the warning as part of IMPALA-2990, but for the 
purpose of diagnostics, having the elapsed time since the last update may 
already be quite useful for identifying stuck fragment instances. Also, 
backporting it should be pretty straightforward.


was (Author: kwho):
[~jeszyb], do you think recording the elapsed time since the last update is a 
reasonable solution? We can add the warning as part of IMPALA-2990, but for the 
purpose of diagnostics, having the elapsed time since the last update may 
already be quite useful for identifying stuck fragment instances.

> Profiles of running queries should tell last update time of counters
> 
>
> Key: IMPALA-6741
> URL: https://issues.apache.org/jira/browse/IMPALA-6741
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Balazs Jeszenszky
>Assignee: Michael Ho
>Priority: Major
>
> When looking at the profile of a running query, it's impossible to tell the 
> degree of accuracy. We've seen issues both with instances not checking in 
> with the coordinator for a long time, and with hung instances that never 
> update their counters. There are some specific issues as well, see 
> IMPALA-5200. This means that profiles taken off of running queries can't be 
> used for perf troubleshooting with confidence.
> Ideally, Impala should guarantee counters to be written at a certain 
> interval, and warn for counters or instances that are out of sync for some 
> reason.






[jira] [Created] (IMPALA-7670) Drop table with a concurrent refresh throws ConcurrentModificationException

2018-10-05 Thread bharath v (JIRA)
bharath v created IMPALA-7670:
-

 Summary: Drop table with a concurrent refresh throws 
ConcurrentModificationException
 Key: IMPALA-7670
 URL: https://issues.apache.org/jira/browse/IMPALA-7670
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.1.0
Reporter: bharath v


*This is a Catalog V2 only bug*.

Saw this in the Catalog server.

{noformat}
I1004 16:38:55.236702 85380 jni-util.cc:308] 
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1442)
at java.util.HashMap$ValueIterator.next(HashMap.java:1471)
at 
org.apache.impala.catalog.FeFsTable$Utils.getPartitionFromThriftPartitionSpec(FeFsTable.java:407)
at 
org.apache.impala.catalog.HdfsTable.getPartitionFromThriftPartitionSpec(HdfsTable.java:694)
at org.apache.impala.catalog.Catalog.getHdfsPartition(Catalog.java:407)
at org.apache.impala.catalog.Catalog.getHdfsPartition(Catalog.java:386)
at 
org.apache.impala.service.CatalogOpExecutor.bulkAlterPartitions(CatalogOpExecutor.java:3193)
at 
org.apache.impala.service.CatalogOpExecutor.dropTableStats(CatalogOpExecutor.java:1255)
at 
org.apache.impala.service.CatalogOpExecutor.dropStats(CatalogOpExecutor.java:1148)
at 
org.apache.impala.service.CatalogOpExecutor.execDdlRequest(CatalogOpExecutor.java:301)
at org.apache.impala.service.JniCatalog.execDdl(JniCatalog.java:157)
{noformat}

Still need to dig into it, but seems like something is off with locking 
somewhere.
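
For reference, this is the classic hazard (standalone sketch, not Impala code): iterating a HashMap's values while another thread mutates the same map can throw ConcurrentModificationException unless readers and writers share a lock or the reader works on a snapshot:

{code:java}
import java.util.HashMap;
import java.util.Map;

public class CmeDemo {
  public static void main(String[] args) throws InterruptedException {
    Map<Long, String> partitions = new HashMap<>();
    for (long i = 0; i < 100_000; i++) partitions.put(i, "part-" + i);

    // Writer simulating a concurrent REFRESH/DROP that mutates the partition map.
    Thread writer = new Thread(() -> partitions.remove(42L));
    writer.start();

    // Reader simulating getPartitionFromThriftPartitionSpec() scanning values();
    // without synchronization this iteration can throw
    // java.util.ConcurrentModificationException.
    long totalLen = 0;
    for (String p : partitions.values()) totalLen += p.length();

    writer.join();
    System.out.println(totalLen);
  }
}
{code}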






[jira] [Created] (IMPALA-7669) Concurrent invalidate with compute (or drop) stats throws NPE.

2018-10-05 Thread bharath v (JIRA)
bharath v created IMPALA-7669:
-

 Summary: Concurrent invalidate with compute (or drop) stats throws 
NPE.
 Key: IMPALA-7669
 URL: https://issues.apache.org/jira/browse/IMPALA-7669
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.1.0
Reporter: bharath v


*This is a Catalog V2 only bug*

An NPE is thrown when trying to getPartialInfo() from an IncompleteTable (the 
result of an invalidate) and cause_ is null.

{noformat}
@Override
  public TGetPartialCatalogObjectResponse getPartialInfo(
  TGetPartialCatalogObjectRequest req) throws TableLoadingException {
Throwables.propagateIfPossible(cause_, TableLoadingException.class);
throw new TableLoadingException(cause_.getMessage());  <-
  }
{noformat}

{noformat}
I1004 16:51:28.845305 85380 jni-util.cc:308] java.lang.NullPointerException
at 
org.apache.impala.catalog.IncompleteTable.getPartialInfo(IncompleteTable.java:140)
at 
org.apache.impala.catalog.CatalogServiceCatalog.getPartialCatalogObject(CatalogServiceCatalog.java:2171)
at 
org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:236)
{noformat}

Actual caller stack trace is this.

{noformat}
I1004 16:51:21.666422 67179 Frontend.java:1086] Analyzing query: compute stats 
ads
I1004 16:51:28.850023 67179 jni-util.cc:308] 
org.apache.impala.catalog.local.LocalCatalogException: Could not load table 
parnal.ads from metastore
at 
org.apache.impala.catalog.local.LocalTable.loadTableMetadata(LocalTable.java:128)
at org.apache.impala.catalog.local.LocalTable.load(LocalTable.java:89)
at org.apache.impala.catalog.local.LocalDb.getTable(LocalDb.java:119)
at 
org.apache.impala.analysis.StmtMetadataLoader.getMissingTables(StmtMetadataLoader.java:251)
at 
org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:140)
at 
org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:116)
at 
org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1118)
at 
org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1092)
at 
org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1064)
at 
org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:158)
Caused by: org.apache.thrift.TException: 
TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
error_msgs:[NullPointerException: null]), lookup_status:OK)
at 
org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:354)
at 
org.apache.impala.catalog.local.CatalogdMetaProvider.access$100(CatalogdMetaProvider.java:163)
at 
org.apache.impala.catalog.local.CatalogdMetaProvider$5.call(CatalogdMetaProvider.java:565)
at 
org.apache.impala.catalog.local.CatalogdMetaProvider$5.call(CatalogdMetaProvider.java:560)
at 
org.apache.impala.catalog.local.CatalogdMetaProvider$1.call(CatalogdMetaProvider.java:411)
at 
com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
at 
com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
at 
com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
at 
org.apache.impala.catalog.local.CatalogdMetaProvider.loadWithCaching(CatalogdMetaProvider.java:407)
at 
org.apache.impala.catalog.local.CatalogdMetaProvider.loadTable(CatalogdMetaProvider.java:556)
at 
org.apache.impala.catalog.local.LocalTable.loadTableMetadata(LocalTable.java:126)
... 9 more

{noformat}
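
A sketch of the obvious defensive fix for the snippet above (illustrative only; the actual patch may handle this differently): treat a null cause_ as "metadata not loaded" instead of calling getMessage() on it:

{code:java}
// Illustrative null-safe variant of IncompleteTable.getPartialInfo(); the
// surrounding Impala types are assumed, and this is not the committed fix.
@Override
public TGetPartialCatalogObjectResponse getPartialInfo(
    TGetPartialCatalogObjectRequest req) throws TableLoadingException {
  if (cause_ == null) {
    // The table was invalidated but has no load error recorded yet.
    throw new TableLoadingException("Table metadata is not yet loaded");
  }
  Throwables.propagateIfPossible(cause_, TableLoadingException.class);
  throw new TableLoadingException(cause_.getMessage());
}
{code}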








[jira] [Resolved] (IMPALA-7326) test_kudu_partition_ddl failed with exception message: "Table already exists"

2018-10-05 Thread Thomas Tauber-Marshall (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-7326.

Resolution: Cannot Reproduce

> test_kudu_partition_ddl failed with exception message: "Table already exists"
> -
>
> Key: IMPALA-7326
> URL: https://issues.apache.org/jira/browse/IMPALA-7326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build
>
> cc'ing [~twm378]. Does it look like some known issue? Putting it in the 
> catalog category for now but please feel free to update the component as you 
> see fit.
> {noformat}
> query_test/test_kudu.py:96: in test_kudu_partition_ddl
> self.run_test_case('QueryTest/kudu_partition_ddl', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:397: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:612: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error creating Kudu table 
> 'impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range'
> E   CAUSED BY: NonRecoverableException: Table 
> impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range already exists 
> with id 3e81a4ceff27471cad9fcb3bc0b977c3
> {noformat}






[jira] [Commented] (IMPALA-7638) Lower default timeout for connection setup

2018-10-05 Thread Sailesh Mukil (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639961#comment-16639961
 ] 

Sailesh Mukil commented on IMPALA-7638:
---

Sorry, just saw this.
[~kwho] If it's the first time the client is connecting to that specific Impala 
server, *and* the KDC is under heavy load at that time, the negotiation could 
take quite a while, since the client doesn't have a ticket to talk to that 
server yet. But I agree that as more RPCs move over to KRPC, the number of 
requests to the KDC drops, and this value can be brought down.

If we're looking for the right number, I think the best way would be to 
empirically find out on a large cluster with the latest Impala given that more 
RPCs have moved over to KRPC.

> Lower default timeout for connection setup
> --
>
> Key: IMPALA-7638
> URL: https://issues.apache.org/jira/browse/IMPALA-7638
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Lars Volker
>Priority: Major
> Fix For: Impala 2.11.0
>
>
> IMPALA-5394 added the sasl_connect_tcp_timeout_ms flag with a default timeout 
> of 5 minutes. This seems too long as broken clients will prevent new clients 
> from establishing connections for this time. In addition to increasing the 
> acceptor thread pool size (IMPALA-7565) we should lower this timeout 
> considerably, e.g. to 5 seconds.






[jira] [Assigned] (IMPALA-6757) Improve description of catalog.curr-version and catalog.curr-topic metrics

2018-10-05 Thread Vincent Tran (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Tran reassigned IMPALA-6757:


Assignee: Vincent Tran

> Improve description of catalog.curr-version and catalog.curr-topic metrics
> --
>
> Key: IMPALA-6757
> URL: https://issues.apache.org/jira/browse/IMPALA-6757
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.12.0
>Reporter: Lars Volker
>Assignee: Vincent Tran
>Priority: Major
>  Labels: ramp-up, supportability
>
> IMPALA-6075 added metrics to expose the current catalog version of a backend. 
> The descriptions should be clearer; it took a search through the source code 
> to figure out what they mean.
> * catalog.curr-version: The current version of the catalog metadata.
> * catalog.curr-topic: The statestore catalog topic version in which the last 
> update was received.
>  






[jira] [Commented] (IMPALA-6758) Add metric for current catalog version to catalog

2018-10-05 Thread Vincent Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639907#comment-16639907
 ] 

Vincent Tran commented on IMPALA-6758:
--

Will do.

> Add metric for current catalog version to catalog
> -
>
> Key: IMPALA-6758
> URL: https://issues.apache.org/jira/browse/IMPALA-6758
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.12.0
>Reporter: Lars Volker
>Assignee: Vincent Tran
>Priority: Critical
>  Labels: newbie, observability
>
> IMPALA-6075 added metrics to expose the current catalog version of a backend. 
> However, the catalog itself does not seem to expose the same metric. The 
> statestore only exposes the topic version in which the catalog updates are 
> transported (catalog.curr-topic).
>  






[jira] [Work started] (IMPALA-7663) count_user_privilege isn't 0 at the end of test_owner_privileges_without_grant

2018-10-05 Thread Adam Holley (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7663 started by Adam Holley.
---
> count_user_privilege isn't 0 at the end of test_owner_privileges_without_grant
> --
>
> Key: IMPALA-7663
> URL: https://issues.apache.org/jira/browse/IMPALA-7663
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Tianyi Wang
>Assignee: Adam Holley
>Priority: Critical
>
> {noformat}
> 08:35:35 === FAILURES 
> ===
> 08:35:35  
> TestOwnerPrivileges.test_owner_privileges_without_grant[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] 
> 08:35:35 authorization/test_owner_privileges.py:340: in 
> test_owner_privileges_without_grant
> 08:35:35 sentry_refresh_timeout_s=SENTRY_REFRESH_TIMEOUT_S)
> 08:35:35 authorization/test_owner_privileges.py:378: in 
> __execute_owner_privilege_tests_oo_nogrant
> 08:35:35 assert self.count_user_privileges(result) == 0
> 08:35:35 E   assert 1 == 0
> 08:35:35 E+  where 1 =  0x48a4ed8>( 0x7124690>)
> 08:35:35 E+where  = 
>  0x710d110>.count_user_privileges
> {noformat}






[jira] [Commented] (IMPALA-7359) Make local timezone deterministic during tests

2018-10-05 Thread Csaba Ringhofer (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639855#comment-16639855
 ] 

Csaba Ringhofer commented on IMPALA-7359:
-

[~tarmstrong]
Implementing IMPALA-7362 made this change possible, but the test cluster still 
uses the local timezone.

My plan is to implement IMPALA-7557, which should make timezone conversions 
free if the timezone is UTC. After that the defaults of the test cluster could 
be changed like this:
- convert_legacy_hive_parquet_utc_timestamps=true
- use_local_tz_for_unix_timestamp_conversions=true
- default timezone would be UTC (but $TZ would remain local time)
This would mean that Impala would behave as it does currently (with the 
exception of the now() function), and that functions affected by local time 
could be tested without a cluster restart (by setting the timezone query option 
to something other than UTC).

So my plan looks like this:
1. implement IMPALA-7557 first
2. ask the community's opinion about this change
3. (if accepted) change the defaults in the test cluster (but only test 
cluster, so not in impalad itself)
4. change the related custom cluster tests to be normal query tests

> Make local timezone deterministic during tests
> --
>
> Key: IMPALA-7359
> URL: https://issues.apache.org/jira/browse/IMPALA-7359
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Major
>
> Currently Impala uses the timezone of the machine where impalad runs for 
> local<-> utc time conversions. This makes it hard to test these functions, as 
> they give different results depending on the location. Some tests solve this 
> by starting a custom cluster after setting env var TZ, while others do not 
> check the results precisely (only check that local!=utc, which is not true in 
> some regions).
> I would prefer to avoid starting custom clusters for these tests, because 
> doing that makes the tests significantly slower.
> I see 3 possible solutions (easiest to hardest):
> - Start the minicluster with a specific timezone by default (e.g 
> America/Los_Angeles). This could surprise people a bit, but wouldn't cause 
> real issues as the minicluster is not meant for production. The simplest way 
> to do this is setting env var TZ, but this could have some unintended side 
> effects if other parts of the program use TZ too (e.g. logging).
> - Do the same but use a new startup flag instead of TZ. It would be possible 
> to ensure that there are no side effects this way.
> - Add a query option to set local time. HIVE has a similar feature, see 
> HIVE-16614.






[jira] [Commented] (IMPALA-7665) Bringing up stopped statestore causes queries to fail

2018-10-05 Thread Michael Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639408#comment-16639408
 ] 

Michael Ho commented on IMPALA-7665:


[~tarmstrong], as for what we have planned for IMPALA-2990 so far, there is no 
change in terms of the handling of the membership update from Statestore. In 
other words, if the Statestore update indicates a particular Impalad is removed 
from membership, the queries running on those impalads will still be cancelled. For 
IMPALA-2990, the paths which may change are mostly in the coordinator logic 
which may have to cancel the query if certain backends haven't been heard from 
for a long time.

> Bringing up stopped statestore causes queries to fail
> -
>
> Key: IMPALA-7665
> URL: https://issues.apache.org/jira/browse/IMPALA-7665
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Priority: Major
>  Labels: statestore
>
> I can reproduce this by running a long-running query then cycling the 
> statestore:
> {noformat}
> tarmstrong@tarmstrong-box:~/Impala/incubator-impala$ impala-shell.sh -q 
> "select distinct * from tpch10_parquet.lineitem"
> Starting Impala Shell without Kerberos authentication
> Connected to localhost:21000
> Server version: impalad version 3.1.0-SNAPSHOT DEBUG (build 
> c486fb9ea4330e1008fa9b7ceaa60492e43ee120)
> Query: select distinct * from tpch10_parquet.lineitem
> Query submitted at: 2018-10-04 17:06:48 (Coordinator: 
> http://tarmstrong-box:25000)
> {noformat}
> If I kill the statestore, the query runs fine, but if I start up the 
> statestore again, it fails.
> {noformat}
> # In one terminal, start up the statestore
> $ 
> /home/tarmstrong/Impala/incubator-impala/be/build/latest/statestore/statestored
>  -log_filename=statestored 
> -log_dir=/home/tarmstrong/Impala/incubator-impala/logs/cluster -v=1 
> -logbufsecs=5 -max_log_files=10
> # The running query then fails
> WARNINGS: Failed due to unreachable impalad(s): tarmstrong-box:22001, 
> tarmstrong-box:22002
> {noformat}
> Note that I've seen different subsets of impalads reported as failed, e.g. 
> "Failed due to unreachable impalad(s): tarmstrong-box:22001"






[jira] [Resolved] (IMPALA-7521) CLONE - Speed up sub-second unix time->TimestampValue conversions

2018-10-05 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer resolved IMPALA-7521.
-
Resolution: Implemented

> CLONE - Speed up sub-second unix time->TimestampValue conversions
> -
>
> Key: IMPALA-7521
> URL: https://issues.apache.org/jira/browse/IMPALA-7521
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Major
>  Labels: performance, timestamp
>
> Currently Impala converts from sub-second unix time to TimestampValue (which 
> is split into date_ and time_, similarly to boost::posix_time::ptime) by first 
> splitting the input into seconds and sub-seconds parts, converting the seconds 
> part with boost::posix_time::from_time_t(), and then adding the sub-seconds 
> part to this timestamp. This can be done much faster by splitting the 
> sub-second input into date_ and time_ directly.
> Avoiding boost::posix_time::from_time_t() would also be nice because it can 
> only deal with timestamps from 1677 to 2262, which adds extra complexity to 
> the related code.
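
As a standalone illustration of the technique (Impala's implementation is C++; the names here are only for the example), the sub-second input can be split directly into a days-since-epoch part and a nanoseconds-of-day part:

{code:java}
public class UnixTimeSplit {
  static final long NANOS_PER_DAY = 24L * 60 * 60 * 1_000_000_000L;

  // Splits nanoseconds since the unix epoch into (days since epoch, nanos of day),
  // analogous to TimestampValue's date_ and time_ parts. floorDiv/floorMod keep
  // pre-1970 timestamps in the correct day.
  static long[] split(long nanosSinceEpoch) {
    return new long[] {
        Math.floorDiv(nanosSinceEpoch, NANOS_PER_DAY),
        Math.floorMod(nanosSinceEpoch, NANOS_PER_DAY)
    };
  }

  public static void main(String[] args) {
    long[] parts = split(1_538_740_000_123_456_789L);  // an instant on 2018-10-05 UTC
    System.out.println("days=" + parts[0] + " nanosOfDay=" + parts[1]);
  }
}
{code}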






[jira] [Created] (IMPALA-7668) close() URLClassLoaders after usage.

2018-10-05 Thread bharath v (JIRA)
bharath v created IMPALA-7668:
-

 Summary: close() URLClassLoaders after usage.
 Key: IMPALA-7668
 URL: https://issues.apache.org/jira/browse/IMPALA-7668
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Reporter: bharath v
Assignee: bharath v
 Fix For: Impala 3.1.0


There are a few places in the code that use URLClassLoaders to load Java 
classes at runtime. One example is when loading Java UDFs at startup.

{code:java}
public static List extractFunctions(String db,
  ...
  URL[] classLoaderUrls = new URL[] {new URL(localJarPath.toString())};
  URLClassLoader urlClassLoader = new URLClassLoader(classLoaderUrls);
{code}

Starting with JDK 7, URLClassLoader lets the caller close all the closeables it 
opened, avoiding bugs like FD leaks:

https://docs.oracle.com/javase/7/docs/api/java/net/URLClassLoader.html#close()

With certain JDK versions we have seen lingering FDs from this code: the FDs of 
temporary jars (copied to /tmp) are not closed, so their disk space is never 
reclaimed, causing disk space issues.
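
A minimal sketch of the pattern the fix would move towards (JDK 7+; illustrative, not the actual patch): URLClassLoader implements Closeable, so try-with-resources releases the jar file handles it opened:

{code:java}
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

public class UdfLoaderSketch {
  public static void main(String[] args) throws Exception {
    // args[0] = path to a local jar, args[1] = fully qualified class name.
    URL[] urls = new URL[] { Paths.get(args[0]).toUri().toURL() };
    // Closing the loader releases the FDs of the jars it opened, so temporary
    // copies under /tmp can actually be deleted and their space reclaimed.
    try (URLClassLoader loader = new URLClassLoader(urls)) {
      Class<?> clazz = Class.forName(args[1], true, loader);
      System.out.println("Loaded " + clazz.getName());
    }
  }
}
{code}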


