[jira] [Updated] (IMPALA-8072) Clean up config files in docker containers

2019-01-24 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-8072:
--
Description: 
Currently the docker containers include a bunch of config files copied
indiscriminately from the dev environment. Most of these aren't valid for a
production container, and it's expected that the real config files will be
mounted at /opt/impala/conf.

We should instead include a more reasonable set of default configs (e.g. for 
admission control), plus placeholders for other config files that may need to 
be overridden with site-specific configs.

  was:
Currently the docker containers include a bunch of config files copied 
indiscriminately from the dev environment. Mostly these aren't valid for a 
production container and it's expected that the real config files will be 
mounted at /opt/impala/conf.

We should stop including those config files or at least include a more 
reasonable set of default configs.


> Clean up config files in docker containers
> --
>
> Key: IMPALA-8072
> URL: https://issues.apache.org/jira/browse/IMPALA-8072
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: docker
>
> Currently the docker containers include a bunch of config files copied 
> indiscriminately from the dev environment. Most of these aren't valid for a 
> production container, and it's expected that the real config files will be 
> mounted at /opt/impala/conf.
> We should instead include a more reasonable set of default configs (e.g. for 
> admission control), plus placeholders for other config files that may need to 
> be overridden with site-specific configs.
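As a rough illustration of the "defaults plus placeholders" layout, here is a minimal Python sketch that stages such a conf directory; all file names and flag contents are invented for illustration and are not the actual Impala container layout:

```python
import os
import tempfile

# Hypothetical sketch: ship a small set of default configs plus empty
# placeholder files that a site-specific mount at /opt/impala/conf would
# override. Names and contents are invented, not Impala's real layout.
DEFAULTS = {"impalad_flags": "--logbufsecs=5\n"}
PLACEHOLDERS = ["hdfs-site.xml", "hive-site.xml"]

def stage_conf(conf_dir):
    """Write default configs and empty placeholders into conf_dir."""
    os.makedirs(conf_dir, exist_ok=True)
    for name, body in DEFAULTS.items():
        with open(os.path.join(conf_dir, name), "w") as f:
            f.write(body)
    for name in PLACEHOLDERS:
        # An empty file documents the override point without copying
        # dev-environment contents into the image.
        open(os.path.join(conf_dir, name), "w").close()
    return sorted(os.listdir(conf_dir))

staged = stage_conf(tempfile.mkdtemp())
```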



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-8114) Build test failure in test_breakpad.py

2019-01-24 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-8114:
---

Assignee: Lars Volker  (was: Tim Armstrong)

> Build test failure in test_breakpad.py
> --
>
> Key: IMPALA-8114
> URL: https://issues.apache.org/jira/browse/IMPALA-8114
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Assignee: Lars Volker
>Priority: Blocker
>
> Recent builds have failed due to a failure in {{test_breakpad.py}}. Assigning 
> to Tim as the person who most recently touched this file.
> Test output:
> {noformat}
> 09:04:35  ERRORS 
> 
> 09:04:35 ___ ERROR at teardown of 
> TestBreakpadExhaustive.test_minidump_cleanup_thread ___
> 09:04:35 custom_cluster/test_breakpad.py:49: in teardown_method
> 09:04:35 self.kill_cluster(SIGKILL)
> 09:04:35 custom_cluster/test_breakpad.py:80: in kill_cluster
> 09:04:35 self.kill_processes(processes, signal)
> 09:04:35 custom_cluster/test_breakpad.py:85: in kill_processes
> 09:04:35 process.kill(signal)
> 09:04:35 common/impala_cluster.py:330: in kill
> 09:04:35 assert 0, "No processes %s found" % self.cmd
> 09:04:35 E   AssertionError: No processes 
> ['/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/impalad',
>  '-kudu_client_rpc_timeout_ms', '0', '-kudu_master_hosts', 'localhost', 
> '--mem_limit=12884901888', '-logbufsecs=5', '-v=1', '-max_log_files=0', 
> '-log_filename=impalad', 
> '-log_dir=/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/custom_cluster_tests',
>  '-beeswax_port=21000', '-hs2_port=21050', '-be_port=22000', 
> '-krpc_port=27000', '-state_store_subscriber_port=23000', 
> '-webserver_port=25000', '-max_minidumps=2', '-logbufsecs=1', 
> '-minidump_path=/tmp/tmpKaSw_w', '--default_query_options='] found
> {noformat}
> Distilled {{TEST-impala-custom-cluster.xml}} output:
> {noformat}
> -- 2019-01-23 08:00:43,585 INFO MainThread: Found 3 impalad/1 
> statestored/1 catalogd process(es)
> …
> -- 2019-01-23 08:00:43,667 INFO MainThread: Killing: 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/statestored
>  -logbufsecs=5 -v=1 -max_log_files=0 -log_filename=statestored 
> -log_dir=/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/custom_cluster_tests
>  -max_minidumps=2 -logbufsecs=1 -minidump_path=/tmp/tmpKaSw_w (PID: 16809) 
> with signal 10
> -- 2019-01-23 08:00:43,692 INFO MainThread: Found 6 impalad/1 
> statestored/1 catalogd process(es)
> ...
> E   AssertionError: No processes 
> [/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/impalad
> {noformat}
> Notice that the main thread appears to be killing the statestore but fails to 
> kill impalad. Also notice that a message saying all impalads are running 
> appears in the midst of the code that tries to shut down the cluster. Is this 
> test multi-threaded? Is there more than one “main thread”? Are these main 
> threads working at cross purposes? What recent change may have caused this?
> Also, looks like the script is sending signal 10 (SIGUSR1) while the 
> statestore (in its log) says it got a SIGTERM (15):
> {noformat}
> I0123 08:00:44.086009 16868 thrift-client.cc:78] Couldn't open transport for 
> impala-ec2-centoCaught signal: SIGTERM. Daemon will exit.
> {noformat}
> Not terribly familiar with this area of the product, so bumping it over to 
> the BE team.






[jira] [Created] (IMPALA-8116) Impala Doc: Create Impala Limitations doc

2019-01-24 Thread Alex Rodoni (JIRA)
Alex Rodoni created IMPALA-8116:
---

 Summary: Impala Doc: Create Impala Limitations doc
 Key: IMPALA-8116
 URL: https://issues.apache.org/jira/browse/IMPALA-8116
 Project: IMPALA
  Issue Type: Improvement
  Components: Docs
Affects Versions: Impala 3.1.0
Reporter: Alex Rodoni
Assignee: Alex Rodoni


Create a separate document that focuses on design limitations more than on 
bugs. It could also include functional limitations like "cannot write nested 
types", etc.







[jira] [Resolved] (IMPALA-8090) DiskIoMgrTest.SyncReadTest hits file_ != nullptr DCHECK in LocalFileReader::ReadFromPos()

2019-01-24 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8090.
---
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> DiskIoMgrTest.SyncReadTest hits file_ != nullptr DCHECK in 
> LocalFileReader::ReadFromPos()
> -
>
> Key: IMPALA-8090
> URL: https://issues.apache.org/jira/browse/IMPALA-8090
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: David Knupp
>Assignee: Tim Armstrong
>Priority: Critical
> Fix For: Impala 3.2.0
>
>
> *Test output*:
> {noformat}
> 45/99 Test #45: disk-io-mgr-test .***Exception: Other 43.29 
> sec
> Turning perftools heap leak checking off
> [==] Running 25 tests from 1 test case.
> [--] Global test environment set-up.
> [--] 25 tests from DiskIoMgrTest
> [ RUN  ] DiskIoMgrTest.SingleWriter
> 19/01/16 15:57:09 INFO util.JvmPauseMonitor: Starting JVM pause monitor
> [   OK ] DiskIoMgrTest.SingleWriter (3407 ms)
> [ RUN  ] DiskIoMgrTest.InvalidWrite
> [   OK ] DiskIoMgrTest.InvalidWrite (281 ms)
> [ RUN  ] DiskIoMgrTest.WriteErrors
> [   OK ] DiskIoMgrTest.WriteErrors (235 ms)
> [ RUN  ] DiskIoMgrTest.SingleWriterCancel
> [   OK ] DiskIoMgrTest.SingleWriterCancel (1165 ms)
> [ RUN  ] DiskIoMgrTest.SingleReader
> [   OK ] DiskIoMgrTest.SingleReader (5835 ms)
> [ RUN  ] DiskIoMgrTest.SingleReaderSubRanges
> [   OK ] DiskIoMgrTest.SingleReaderSubRanges (16404 ms)
> [ RUN  ] DiskIoMgrTest.AddScanRangeTest
> [   OK ] DiskIoMgrTest.AddScanRangeTest (1210 ms)
> [ RUN  ] DiskIoMgrTest.SyncReadTest
> *** Check failure stack trace: ***
> @  0x4825dcc
> @  0x4827671
> @  0x48257a6
> @  0x4828d6d
> @  0x1af39ec
> @  0x1ae90a4
> @  0x1ac30ea
> @  0x1accad3
> @  0x1acc660
> @  0x1acbf3e
> @  0x1acb62d
> @  0x1b03671
> @  0x1f79988
> @  0x1f82b60
> @  0x1f82a84
> @  0x1f82a47
> @  0x3751579
> @   0x3ea4807850
> @   0x3ea44e894c
> Wrote minidump to 
> /data/jenkins/workspace/<...>/repos/Impala/logs/be_tests/minidumps/disk-io-mgr-test/5bbf76f7-e5d6-4ac9-bdae9d9b-065c32ec.dmp
> {noformat}
> *Error*:
> {noformat}
> Operating system: Linux
>   0.0.0 Linux 2.6.32-358.14.1.el6.centos.plus.x86_64 #1 SMP 
> Tue Jul 16 21:33:24 UTC 2013 x86_64
> CPU: amd64
>  family 6 model 45 stepping 7
>  8 CPUs
> GPU: UNKNOWN
> Crash reason:  SIGABRT
> Crash address: 0x4522fa1
> Process uptime: not available
> Thread 205 (crashed)
>  0  libc-2.12.so + 0x328e5
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x06adf9c0
> rsi = 0x0563   rdi = 0x2fa1
> rbp = 0x7f8009b8ffe0   rsp = 0x7f8009b8fc78
>  r8 = 0x7f8009b8fd00r9 = 0x0563
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x06adfa40   r13 = 0x001f
> r14 = 0x06ae7384   r15 = 0x06adf9c0
> rip = 0x003ea44328e5
> Found by: given as instruction pointer in context
>  1  libc-2.12.so + 0x340c5
> rbp = 0x7f8009b8ffe0   rsp = 0x7f8009b8fc80
> rip = 0x003ea44340c5
> Found by: stack scanning
>  2  disk-io-mgr-test!boost::_bi::bind_t impala::io::DiskQueue, impala::io::DiskIoMgr*>, 
> boost::_bi::list2, 
> boost::_bi::value > >::operator()() 
> [bind_template.hpp : 20 + 0x21]
> rbp = 0x7f8009b8ffe0   rsp = 0x7f8009b8fc88
> rip = 0x01acbf3e
> Found by: stack scanning
>  3  disk-io-mgr-test!google::LogMessage::Flush() + 0x157
> rbx = 0x0007   rbp = 0x06adf980
> rsp = 0x7f8009b8fff0   rip = 0x048257a7
> Found by: call frame info
>  4  disk-io-mgr-test!google::LogMessageFatal::~LogMessageFatal() + 0xe
> rbx = 0x7f8009b90110   rbp = 0x7f8009b903f0
> rsp = 0x7f8009b90070   r12 = 0x0001
> r13 = 0x06aee8b8   r14 = 0x0c213538
> r15 = 0x0007   rip = 0x04828d6e
> Found by: call frame info
>  5  disk-io-mgr-test!impala::io::LocalFileReader::ReadFromPos(long, unsigned 
> char*, long, long*, bool*) [local-file-reader.cc : 67 + 0x10]
> rbx = 0x0001   rbp = 0x7f8009b903f0
> rsp = 0x7f8009b90090   r12 = 0x0001
> r13 = 0x06aee8b8   r14 = 0x0c213538
> r15 = 0x0007   rip = 0x01af39ed
> Found by: call frame 

[jira] [Commented] (IMPALA-8090) DiskIoMgrTest.SyncReadTest hits file_ != nullptr DCHECK in LocalFileReader::ReadFromPos()

2019-01-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751732#comment-16751732
 ] 

ASF subversion and git services commented on IMPALA-8090:
-

Commit aa603d8cb9e870919b60d80828951beee1db6622 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=aa603d8 ]

IMPALA-8090: race when reusing ScanRange in test

The issue occurs in test code where two reads are issued
with the same ScanRange object back-to-back, e.g. in
SyncReadTest. The DiskIoMgr enqueues the last buffer
or cancels the scan range *before* closing the
'file_reader_', which means that the client thread
thinks the ScanRange is done and can re-enqueue
it into the DiskIoMgr.

This is not an issue in Impala itself because scan
ranges are not reused in this fashion.

I evaluated adding a DCHECK but the lifecycle
of the ScanRange objects is complicated enough
that there wasn't a straightforward invariant
to enforce.

Testing:
Looped the previously-failing test overnight.

Change-Id: I3122e5b2efea60ffe82d780930301d5be108876b
Reviewed-on: http://gerrit.cloudera.org:8080/12238
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
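The interleaving described in the commit message can be modeled deterministically; the sketch below is a simplified stand-in for the C++ ScanRange/file reader lifecycle, not Impala's actual code. The client reuses the range as soon as the last buffer is enqueued, and the worker's late close then lands on the reopened reader, which is what the file_ != nullptr DCHECK catches:

```python
# Deterministic model of the race: the client's reuse happens between the
# worker's "range done" signal and the worker's close of the file reader.
class ScanRange:
    def __init__(self):
        self.reader = None          # stands in for file_reader_ / file_

    def open_reader(self):
        self.reader = object()

    def close_reader(self):
        self.reader = None

    def read(self):
        # Analogue of the file_ != nullptr DCHECK in ReadFromPos().
        assert self.reader is not None, "file_ != nullptr"

rng = ScanRange()
rng.open_reader()       # worker: first read in flight
                        # worker: enqueues last buffer -> client sees "done"
rng.open_reader()       # client: re-enqueues the same ScanRange, reopens
rng.close_reader()      # worker: delayed close lands AFTER the reopen
try:
    rng.read()          # next read hits the DCHECK analogue
    dcheck_fired = False
except AssertionError:
    dcheck_fired = True
```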


> DiskIoMgrTest.SyncReadTest hits file_ != nullptr DCHECK in 
> LocalFileReader::ReadFromPos()
> -
>
> Key: IMPALA-8090
> URL: https://issues.apache.org/jira/browse/IMPALA-8090
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: David Knupp
>Assignee: Tim Armstrong
>Priority: Critical
> Fix For: Impala 3.2.0
>
>
> *Test output*:
> {noformat}
> 45/99 Test #45: disk-io-mgr-test .***Exception: Other 43.29 
> sec
> Turning perftools heap leak checking off
> [==] Running 25 tests from 1 test case.
> [--] Global test environment set-up.
> [--] 25 tests from DiskIoMgrTest
> [ RUN  ] DiskIoMgrTest.SingleWriter
> 19/01/16 15:57:09 INFO util.JvmPauseMonitor: Starting JVM pause monitor
> [   OK ] DiskIoMgrTest.SingleWriter (3407 ms)
> [ RUN  ] DiskIoMgrTest.InvalidWrite
> [   OK ] DiskIoMgrTest.InvalidWrite (281 ms)
> [ RUN  ] DiskIoMgrTest.WriteErrors
> [   OK ] DiskIoMgrTest.WriteErrors (235 ms)
> [ RUN  ] DiskIoMgrTest.SingleWriterCancel
> [   OK ] DiskIoMgrTest.SingleWriterCancel (1165 ms)
> [ RUN  ] DiskIoMgrTest.SingleReader
> [   OK ] DiskIoMgrTest.SingleReader (5835 ms)
> [ RUN  ] DiskIoMgrTest.SingleReaderSubRanges
> [   OK ] DiskIoMgrTest.SingleReaderSubRanges (16404 ms)
> [ RUN  ] DiskIoMgrTest.AddScanRangeTest
> [   OK ] DiskIoMgrTest.AddScanRangeTest (1210 ms)
> [ RUN  ] DiskIoMgrTest.SyncReadTest
> *** Check failure stack trace: ***
> @  0x4825dcc
> @  0x4827671
> @  0x48257a6
> @  0x4828d6d
> @  0x1af39ec
> @  0x1ae90a4
> @  0x1ac30ea
> @  0x1accad3
> @  0x1acc660
> @  0x1acbf3e
> @  0x1acb62d
> @  0x1b03671
> @  0x1f79988
> @  0x1f82b60
> @  0x1f82a84
> @  0x1f82a47
> @  0x3751579
> @   0x3ea4807850
> @   0x3ea44e894c
> Wrote minidump to 
> /data/jenkins/workspace/<...>/repos/Impala/logs/be_tests/minidumps/disk-io-mgr-test/5bbf76f7-e5d6-4ac9-bdae9d9b-065c32ec.dmp
> {noformat}
> *Error*:
> {noformat}
> Operating system: Linux
>   0.0.0 Linux 2.6.32-358.14.1.el6.centos.plus.x86_64 #1 SMP 
> Tue Jul 16 21:33:24 UTC 2013 x86_64
> CPU: amd64
>  family 6 model 45 stepping 7
>  8 CPUs
> GPU: UNKNOWN
> Crash reason:  SIGABRT
> Crash address: 0x4522fa1
> Process uptime: not available
> Thread 205 (crashed)
>  0  libc-2.12.so + 0x328e5
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x06adf9c0
> rsi = 0x0563   rdi = 0x2fa1
> rbp = 0x7f8009b8ffe0   rsp = 0x7f8009b8fc78
>  r8 = 0x7f8009b8fd00r9 = 0x0563
> r10 = 0x0008   r11 = 0x0202
> r12 = 0x06adfa40   r13 = 0x001f
> r14 = 0x06ae7384   r15 = 0x06adf9c0
> rip = 0x003ea44328e5
> Found by: given as instruction pointer in context
>  1  libc-2.12.so + 0x340c5
> rbp = 0x7f8009b8ffe0   rsp = 0x7f8009b8fc80
> rip = 0x003ea44340c5
> Found by: stack scanning
>  2  disk-io-mgr-test!boost::_bi::bind_t impala::io::DiskQueue, impala::io::DiskIoMgr*>, 
> boost::_bi::list2, 
> boost::_bi::value > >::operator()() 
> [bind_template.hpp : 20 + 0x21]
> rbp = 0x7f8009b8ffe0   

[jira] [Resolved] (IMPALA-8107) Support EXEC_TIME_LIMIT_S in resource pool setting

2019-01-24 Thread Quanlong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-8107.

Resolution: Resolved

Already supported via IMPALA-2538. Closing this JIRA.

> Support EXEC_TIME_LIMIT_S in resource pool setting
> --
>
> Key: IMPALA-8107
> URL: https://issues.apache.org/jira/browse/IMPALA-8107
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Quanlong Huang
>Priority: Major
>  Labels: admission-control
>
> The timeout limit should differ for different kinds of queries. For 
> example, a resource pool for ad-hoc queries may set EXEC_TIME_LIMIT_S to 60s, 
> while a resource pool for building pre-aggregations or other ETL may need a 
> larger EXEC_TIME_LIMIT_S like 30 minutes.








[jira] [Commented] (IMPALA-8115) some jenkins workers slow to do real work due to dpkg lock conflicts

2019-01-24 Thread Michael Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751703#comment-16751703
 ] 

Michael Brown commented on IMPALA-8115:
---

I noticed this because I posted https://gerrit.cloudera.org/#/c/12272/ at 2:42 
PT, and at 3:45 PT there was still no update that the pre-review job had 
completed.
https://jenkins.impala.io/job/gerrit-code-review-checks/1879/console shows a 
slow timeline:
{noformat}
Triggered by Gerrit: http://gerrit.cloudera.org:8080/12272
Running in Durability level: MAX_SURVIVABILITY
[Pipeline] timestamps
[Pipeline] {
[Pipeline] ansiColor
[Pipeline] {
[Pipeline] timeout
22:42:09 Timeout set to expire in 10 hr
[Pipeline] {
[Pipeline] parallel
[Pipeline] { (Branch: Tidy)
[Pipeline] { (Branch: BuildOnly)
[Pipeline] { (Branch: Python26Compatibility)
[Pipeline] { (Branch: Rat)
[Pipeline] build (Building clang-tidy-ub1604)
22:42:09 Scheduling project: clang-tidy-ub1604
[Pipeline] build (Building ubuntu-16.04-build-only)
22:42:09 Scheduling project: ubuntu-16.04-build-only
[Pipeline] build (Building python26-incompatibility-check)
22:42:09 Scheduling project: python26-incompatibility-check
[Pipeline] build (Building rat-check-ub1604)
22:42:09 Scheduling project: rat-check-ub1604
22:44:22 Starting building: python26-incompatibility-check #1548
[Pipeline] }
23:35:15 Starting building: ubuntu-16.04-build-only #5027
23:36:34 Starting building: rat-check-ub1604 #5404
[Pipeline] }
23:37:14 Starting building: clang-tidy-ub1604 #4922
{noformat}

It's taken nearly an hour to start rat-check-ub1604 #5404 and clang-tidy-ub1604 
#4922.

My 15-minute math doesn't account for all of this. I'm not yet sure where else 
we could accelerate/improve the process.
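As an aside on the dpkg half of the slowdown: instead of a sleep-and-retry loop, a setup script could block on the lock itself. A hedged sketch of the idea in Python, exercised here on a scratch file (the real contended file is /var/lib/dpkg/lock; this is not what the Impala scripts currently do):

```python
import fcntl
import os
import tempfile

def take_lock(path, blocking=True):
    """Take an exclusive flock on `path`. With blocking=True the caller
    sleeps in the kernel until the holder releases the lock, instead of
    failing and retrying."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
    try:
        fcntl.flock(fd, flags)
    except OSError:
        os.close(fd)
        raise
    return fd

# Demonstrate the conflict on a scratch file, not /var/lib/dpkg/lock.
scratch = tempfile.NamedTemporaryFile(delete=False).name
holder = take_lock(scratch)
try:
    take_lock(scratch, blocking=False)   # second taker, non-blocking
    conflicted = False
except OSError:
    conflicted = True                    # EWOULDBLOCK: lock is held
os.close(holder)
```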

> some jenkins workers slow to do real work due to dpkg lock conflicts
> 
>
> Key: IMPALA-8115
> URL: https://issues.apache.org/jira/browse/IMPALA-8115
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Michael Brown
>Priority: Major
>
> A Jenkins worker for label {{ubuntu-16.04}} took about 15 minutes to start 
> doing real work. I noticed that it was retrying {{apt-get update}}:
> {noformat}
> ++ sudo apt-get --yes install openjdk-8-jdk
> E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily 
> unavailable)
> E: Unable to lock the administration directory (/var/lib/dpkg/), is another 
> process using it?
> ++ date
> Thu Jan 24 23:37:33 UTC 2019
> ++ sudo apt-get update
> ++ sleep 10
> ++ sudo apt-get --yes install openjdk-8-jdk
> [etc]
> {noformat}
> I ssh'd into a host and saw that, yes, something else was holding onto the 
> dpkg lock (confirmed with lsof, not pasted here; dpkg process PID 11459 was 
> the culprit):
> {noformat}
> root   1750  0.0  0.0   4508  1664 ?Ss   23:21   0:00 /bin/sh 
> /usr/lib/apt/apt.systemd.daily
> root   1804 12.3  0.1 141076 80452 ?S23:22   1:24  \_ 
> /usr/bin/python3 /usr/bin/unattended-upgrade
> root   3263  0.0  0.1 140960 72896 ?S23:23   0:00  \_ 
> /usr/bin/python3 /usr/bin/unattended-upgrade
> root  11459  0.6  0.0  45920 25184 pts/1Ss+  23:24   0:03  \_ 
> /usr/bin/dpkg --status-fd 10 --unpack --auto-deconfigure 
> /var/cache/apt/archives/tzdata_2018i-0ubuntu0.16.04_all.deb 
> /var/cache/apt/archives/distro-info-data_0.28ubuntu0.9_all.deb 
> /var/cache/apt/archives/file_1%3a5.25-2ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libmagic1_1%3a5.25-2ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libisc-export160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb
>  
> /var/cache/apt/archives/libdns-export162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb
>  /var/cache/apt/archives/isc-dhcp-client_4.3.3-5ubuntu12.9_amd64.deb 
> /var/cache/apt/archives/isc-dhcp-common_4.3.3-5ubuntu12.9_amd64.deb 
> /var/cache/apt/archives/libidn11_1.32-3ubuntu1.2_amd64.deb 
> /var/cache/apt/archives/libpng12-0_1.2.54-1ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libtasn1-6_4.7-3ubuntu0.16.04.3_amd64.deb 
> /var/cache/apt/archives/libapparmor-perl_2.10.95-0ubuntu2.10_amd64.deb 
> /var/cache/apt/archives/apparmor_2.10.95-0ubuntu2.10_amd64.deb 
> /var/cache/apt/archives/curl_7.47.0-1ubuntu2.11_amd64.deb 
> /var/cache/apt/archives/libgssapi-krb5-2_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libkrb5-3_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libkrb5support0_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libk5crypto3_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libcurl3-gnutls_7.47.0-1ubuntu2.11_amd64.deb 
> /var/cache/apt/archives/apt-transport-https_1.2.29ubuntu0.1_amd64.deb 
> /var/cache/apt/archives/libicu55_55.1-7ubuntu0.4_amd64.deb 
> /var/cache/apt/archives/libxml2_2.9.3+dfsg1-1ubuntu0.6_amd64.deb 
> 

[jira] [Commented] (IMPALA-8107) Support EXEC_TIME_LIMIT_S in resource pool setting

2019-01-24 Thread Quanlong Huang (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751711#comment-16751711
 ] 

Quanlong Huang commented on IMPALA-8107:


Oh, I really missed the impala.admission-control.pool-default-query-options 
settings! Thanks for pointing that out!
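Per-pool defaults of this kind boil down to a pool-to-options mapping merged underneath the client's explicit settings. A toy Python model (the pool names and values are invented; this is not Impala's admission-control code):

```python
# Toy model: per-pool default query options, with options the client sets
# explicitly taking precedence, mirroring how a pool-level EXEC_TIME_LIMIT_S
# default would interact with a session-level SET.
POOL_DEFAULTS = {
    "adhoc": {"EXEC_TIME_LIMIT_S": "60"},
    "etl": {"EXEC_TIME_LIMIT_S": "1800"},
}

def effective_options(pool, client_options):
    """Merge the pool's defaults under the client's explicit settings."""
    opts = dict(POOL_DEFAULTS.get(pool, {}))
    opts.update(client_options)       # explicit settings win
    return opts

adhoc = effective_options("adhoc", {})
etl_override = effective_options("etl", {"EXEC_TIME_LIMIT_S": "300"})
```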

> Support EXEC_TIME_LIMIT_S in resource pool setting
> --
>
> Key: IMPALA-8107
> URL: https://issues.apache.org/jira/browse/IMPALA-8107
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Quanlong Huang
>Priority: Major
>  Labels: admission-control
>
> The timeout limit should differ for different kinds of queries. For 
> example, a resource pool for ad-hoc queries may set EXEC_TIME_LIMIT_S to 60s, 
> while a resource pool for building pre-aggregations or other ETL may need a 
> larger EXEC_TIME_LIMIT_S like 30 minutes.






[jira] [Updated] (IMPALA-8115) some jenkins workers slow to do real work due to dpkg lock conflicts

2019-01-24 Thread Michael Brown (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Brown updated IMPALA-8115:
--
Summary: some jenkins workers slow to do real work due to dpkg lock 
conflicts  (was: some jenkins workers slow to spawn to to dpkg lock conflicts)

> some jenkins workers slow to do real work due to dpkg lock conflicts
> 
>
> Key: IMPALA-8115
> URL: https://issues.apache.org/jira/browse/IMPALA-8115
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Michael Brown
>Priority: Major
>
> A Jenkins worker for label {{ubuntu-16.04}} took about 15 minutes to start 
> doing real work. I noticed that it was retrying {{apt-get update}}:
> {noformat}
> ++ sudo apt-get --yes install openjdk-8-jdk
> E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily 
> unavailable)
> E: Unable to lock the administration directory (/var/lib/dpkg/), is another 
> process using it?
> ++ date
> Thu Jan 24 23:37:33 UTC 2019
> ++ sudo apt-get update
> ++ sleep 10
> ++ sudo apt-get --yes install openjdk-8-jdk
> [etc]
> {noformat}
> I ssh'd into a host and saw that, yes, something else was holding onto the 
> dpkg lock (confirmed with lsof, not pasted here; dpkg process PID 11459 was 
> the culprit):
> {noformat}
> root   1750  0.0  0.0   4508  1664 ?Ss   23:21   0:00 /bin/sh 
> /usr/lib/apt/apt.systemd.daily
> root   1804 12.3  0.1 141076 80452 ?S23:22   1:24  \_ 
> /usr/bin/python3 /usr/bin/unattended-upgrade
> root   3263  0.0  0.1 140960 72896 ?S23:23   0:00  \_ 
> /usr/bin/python3 /usr/bin/unattended-upgrade
> root  11459  0.6  0.0  45920 25184 pts/1Ss+  23:24   0:03  \_ 
> /usr/bin/dpkg --status-fd 10 --unpack --auto-deconfigure 
> /var/cache/apt/archives/tzdata_2018i-0ubuntu0.16.04_all.deb 
> /var/cache/apt/archives/distro-info-data_0.28ubuntu0.9_all.deb 
> /var/cache/apt/archives/file_1%3a5.25-2ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libmagic1_1%3a5.25-2ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libisc-export160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb
>  
> /var/cache/apt/archives/libdns-export162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb
>  /var/cache/apt/archives/isc-dhcp-client_4.3.3-5ubuntu12.9_amd64.deb 
> /var/cache/apt/archives/isc-dhcp-common_4.3.3-5ubuntu12.9_amd64.deb 
> /var/cache/apt/archives/libidn11_1.32-3ubuntu1.2_amd64.deb 
> /var/cache/apt/archives/libpng12-0_1.2.54-1ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libtasn1-6_4.7-3ubuntu0.16.04.3_amd64.deb 
> /var/cache/apt/archives/libapparmor-perl_2.10.95-0ubuntu2.10_amd64.deb 
> /var/cache/apt/archives/apparmor_2.10.95-0ubuntu2.10_amd64.deb 
> /var/cache/apt/archives/curl_7.47.0-1ubuntu2.11_amd64.deb 
> /var/cache/apt/archives/libgssapi-krb5-2_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libkrb5-3_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libkrb5support0_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libk5crypto3_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/libcurl3-gnutls_7.47.0-1ubuntu2.11_amd64.deb 
> /var/cache/apt/archives/apt-transport-https_1.2.29ubuntu0.1_amd64.deb 
> /var/cache/apt/archives/libicu55_55.1-7ubuntu0.4_amd64.deb 
> /var/cache/apt/archives/libxml2_2.9.3+dfsg1-1ubuntu0.6_amd64.deb 
> /var/cache/apt/archives/bind9-host_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/dnsutils_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libisc160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libdns162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libisccc140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libisccfg140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/liblwres141_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/libbind9-140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
> /var/cache/apt/archives/openssl_1.0.2g-1ubuntu4.14_amd64.deb 
> /var/cache/apt/archives/ca-certificates_20170717~16.04.1_all.deb 
> /var/cache/apt/archives/libasprintf0v5_0.19.7-2ubuntu3.1_amd64.deb 
> /var/cache/apt/archives/gettext-base_0.19.7-2ubuntu3.1_amd64.deb 
> /var/cache/apt/archives/krb5-locales_1.13.2+dfsg-5ubuntu2.1_all.deb 
> /var/cache/apt/archives/libelf1_0.165-3ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libglib2.0-data_2.48.2-0ubuntu4.1_all.deb 
> /var/cache/apt/archives/libnuma1_2.0.11-1ubuntu1.1_amd64.deb 
> /var/cache/apt/archives/libpolkit-gobject-1-0_0.105-14.1ubuntu0.4_amd64.deb 
> /var/cache/apt/archives/libx11-data_2%3a1.6.3-1ubuntu2.1_all.deb 
> /var/cache/apt/archives/libx11-6_2%3a1.6.3-1ubuntu2.1_amd64.deb 
> /var/cache/apt/archives/openssh-sftp-server_1%3a7.2p2-4ubuntu2.6_amd64.deb 

[jira] [Updated] (IMPALA-8115) some jenkins workers slow to spawn to to dpkg lock conflicts

2019-01-24 Thread Michael Brown (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Brown updated IMPALA-8115:
--
Description: 
A Jenkins worker for label {{ubuntu-16.04}} took about 15 minutes to start 
doing real work. I noticed that it was retrying {{apt-get update}}:
{noformat}
++ sudo apt-get --yes install openjdk-8-jdk
E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily 
unavailable)
E: Unable to lock the administration directory (/var/lib/dpkg/), is another 
process using it?
++ date
Thu Jan 24 23:37:33 UTC 2019
++ sudo apt-get update
++ sleep 10
++ sudo apt-get --yes install openjdk-8-jdk
[etc]
{noformat}

I ssh'd into a host and saw that, yes, something else was holding onto the dpkg 
lock (confirmed with lsof, not pasted here; dpkg process PID 11459 was the 
culprit):

{noformat}
root   1750  0.0  0.0   4508  1664 ?Ss   23:21   0:00 /bin/sh 
/usr/lib/apt/apt.systemd.daily
root   1804 12.3  0.1 141076 80452 ?S23:22   1:24  \_ 
/usr/bin/python3 /usr/bin/unattended-upgrade
root   3263  0.0  0.1 140960 72896 ?S23:23   0:00  \_ 
[jira] [Created] (IMPALA-8115) some jenkins workers slow to spawn due to dpkg lock conflicts

2019-01-24 Thread Michael Brown (JIRA)
Michael Brown created IMPALA-8115:
-

 Summary: some jenkins workers slow to spawn due to dpkg lock 
conflicts
 Key: IMPALA-8115
 URL: https://issues.apache.org/jira/browse/IMPALA-8115
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Reporter: Michael Brown


A Jenkins worker for label {{ubuntu-16.04}} took about 15 minutes to start 
doing real work. I noticed that it was retrying {{apt-get update}}:
{noformat}
++ sudo apt-get --yes install openjdk-8-jdk
E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily 
unavailable)
E: Unable to lock the administration directory (/var/lib/dpkg/), is another 
process using it?
++ date
Thu Jan 24 23:37:33 UTC 2019
++ sudo apt-get update
++ sleep 10
++ sudo apt-get --yes install openjdk-8-jdk
[etc]
{noformat}

I ssh'd into a host and saw that, yes, something else was holding onto the dpkg 
lock (confirmed with lsof, output not pasted here; dpkg process PID 11459 was 
the culprit):

{noformat}
root   1750  0.0  0.0   4508  1664 ?Ss   23:21   0:00 /bin/sh 
/usr/lib/apt/apt.systemd.daily
root   1804 12.3  0.1 141076 80452 ?S23:22   1:24  \_ 
/usr/bin/python3 /usr/bin/unattended-upgrade
root   3263  0.0  0.1 140960 72896 ?S23:23   0:00  \_ 
/usr/bin/python3 /usr/bin/unattended-upgrade
root  11459  0.6  0.0  45920 25184 pts/1Ss+  23:24   0:03  \_ 
/usr/bin/dpkg --status-fd 10 --unpack --auto-deconfigure 
/var/cache/apt/archives/tzdata_2018i-0ubuntu0.16.04_all.deb 
/var/cache/apt/archives/distro-info-data_0.28ubuntu0.9_all.deb 
/var/cache/apt/archives/file_1%3a5.25-2ubuntu1.1_amd64.deb 
/var/cache/apt/archives/libmagic1_1%3a5.25-2ubuntu1.1_amd64.deb 
/var/cache/apt/archives/libisc-export160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb
 
/var/cache/apt/archives/libdns-export162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb
 /var/cache/apt/archives/isc-dhcp-client_4.3.3-5ubuntu12.9_amd64.deb 
/var/cache/apt/archives/isc-dhcp-common_4.3.3-5ubuntu12.9_amd64.deb 
/var/cache/apt/archives/libidn11_1.32-3ubuntu1.2_amd64.deb 
/var/cache/apt/archives/libpng12-0_1.2.54-1ubuntu1.1_amd64.deb 
/var/cache/apt/archives/libtasn1-6_4.7-3ubuntu0.16.04.3_amd64.deb 
/var/cache/apt/archives/libapparmor-perl_2.10.95-0ubuntu2.10_amd64.deb 
/var/cache/apt/archives/apparmor_2.10.95-0ubuntu2.10_amd64.deb 
/var/cache/apt/archives/curl_7.47.0-1ubuntu2.11_amd64.deb 
/var/cache/apt/archives/libgssapi-krb5-2_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
/var/cache/apt/archives/libkrb5-3_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
/var/cache/apt/archives/libkrb5support0_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
/var/cache/apt/archives/libk5crypto3_1.13.2+dfsg-5ubuntu2.1_amd64.deb 
/var/cache/apt/archives/libcurl3-gnutls_7.47.0-1ubuntu2.11_amd64.deb 
/var/cache/apt/archives/apt-transport-https_1.2.29ubuntu0.1_amd64.deb 
/var/cache/apt/archives/libicu55_55.1-7ubuntu0.4_amd64.deb 
/var/cache/apt/archives/libxml2_2.9.3+dfsg1-1ubuntu0.6_amd64.deb 
/var/cache/apt/archives/bind9-host_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
/var/cache/apt/archives/dnsutils_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
/var/cache/apt/archives/libisc160_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
/var/cache/apt/archives/libdns162_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
/var/cache/apt/archives/libisccc140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
/var/cache/apt/archives/libisccfg140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
/var/cache/apt/archives/liblwres141_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
/var/cache/apt/archives/libbind9-140_1%3a9.10.3.dfsg.P4-8ubuntu1.11_amd64.deb 
/var/cache/apt/archives/openssl_1.0.2g-1ubuntu4.14_amd64.deb 
/var/cache/apt/archives/ca-certificates_20170717~16.04.1_all.deb 
/var/cache/apt/archives/libasprintf0v5_0.19.7-2ubuntu3.1_amd64.deb 
/var/cache/apt/archives/gettext-base_0.19.7-2ubuntu3.1_amd64.deb 
/var/cache/apt/archives/krb5-locales_1.13.2+dfsg-5ubuntu2.1_all.deb 
/var/cache/apt/archives/libelf1_0.165-3ubuntu1.1_amd64.deb 
/var/cache/apt/archives/libglib2.0-data_2.48.2-0ubuntu4.1_all.deb 
/var/cache/apt/archives/libnuma1_2.0.11-1ubuntu1.1_amd64.deb 
/var/cache/apt/archives/libpolkit-gobject-1-0_0.105-14.1ubuntu0.4_amd64.deb 
/var/cache/apt/archives/libx11-data_2%3a1.6.3-1ubuntu2.1_all.deb 
/var/cache/apt/archives/libx11-6_2%3a1.6.3-1ubuntu2.1_amd64.deb 
/var/cache/apt/archives/openssh-sftp-server_1%3a7.2p2-4ubuntu2.6_amd64.deb 
/var/cache/apt/archives/openssh-server_1%3a7.2p2-4ubuntu2.6_amd64.deb 
/var/cache/apt/archives/openssh-client_1%3a7.2p2-4ubuntu2.6_amd64.deb 
/var/cache/apt/archives/rsync_3.1.1-3ubuntu1.2_amd64.deb 
/var/cache/apt/archives/tcpdump_4.9.2-0ubuntu0.16.04.1_amd64.deb 
/var/cache/apt/archives/wget_1.17.1-1ubuntu1.4_amd64.deb 
/var/cache/apt/archives/python3-problem-report_2.20.1-0ubuntu2.18_all.deb 
/var/cache/apt/archives/python3-apport_2.20.1-0ubuntu2.18_all.deb 
/var/cache/apt/archives/apport_2.20.1-0ubuntu2.18_all.deb 
/var/cache/apt/archives/dns-root-data_2018013001~16.04.1_all.deb 
/var/cache/apt/archives/dnsmasq-base_2.75-1ubuntu0.16.04.5_amd64.deb
{noformat}
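The fixed sleep-and-retry loop shown above can instead poll the lock itself. A minimal sketch (hypothetical, not the actual Jenkins bootstrap script; demonstrated on a temp file because taking `/var/lib/dpkg/lock` requires root):

```python
import fcntl
import tempfile
import time

def wait_for_lock(path, retries=50, delay=0.1):
    """Poll until an exclusive, non-blocking flock on `path` succeeds.

    Returns the number of retries it took; raises if the lock is still
    held after all attempts. The flock is released when the file closes.
    """
    for attempt in range(retries):
        with open(path, "w") as f:
            try:
                fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return attempt  # lock acquired
            except (BlockingIOError, OSError):
                time.sleep(delay)  # someone else holds it; try again
    raise TimeoutError("lock on %s still held after %d attempts" % (path, retries))

with tempfile.NamedTemporaryFile() as tmp:
    print(wait_for_lock(tmp.name))  # uncontended: acquired on first try -> 0
```

Polling the lock bounds the wait by actual contention rather than a fixed 10-second sleep per retry.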


[jira] [Updated] (IMPALA-6852) Add a section to the query profile covering number of fragments instances and total bytes scanned per host

2019-01-24 Thread Bikramjeet Vig (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig updated IMPALA-6852:
---
Summary: Add a section to the query profile covering number of fragments 
instances and total bytes scanned per host  (was: Add a section to the query 
profile covering number of fragments and total bytes scanned per host)

> Add a section to the query profile covering number of fragments instances and 
> total bytes scanned per host
> --
>
> Key: IMPALA-6852
> URL: https://issues.apache.org/jira/browse/IMPALA-6852
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0
>Reporter: Mostafa Mokhtar
>Assignee: Alice Fan
>Priority: Major
>  Labels: ramp-up, supportability
>
> Root-causing performance issues usually comes down to skew in the amount of 
> work done across hosts. Uneven assignment of fragments or bytes scanned per 
> host is a common cause.
> Making this information readily available in the query profile will help 
> speed up root-cause analysis (RCA).
> Proposal is to add two tables to the query profile that cover
> * Number of fragments per host
> * Number of bytes scanned per host
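Illustrative only (plain Python; the record layout is hypothetical, not Impala's profile format): the two proposed tables reduce to a per-host count and a per-host sum over fragment-instance records.

```python
from collections import Counter, defaultdict

# Hypothetical per-fragment-instance records: (host, bytes_scanned).
instances = [
    ("host-1", 64_000_000), ("host-1", 61_000_000),
    ("host-2", 63_000_000),
    ("host-3", 190_000_000),  # the skewed host a support engineer looks for
]

# Table 1: number of fragment instances per host.
fragments_per_host = Counter(host for host, _ in instances)

# Table 2: total bytes scanned per host.
bytes_scanned_per_host = defaultdict(int)
for host, nbytes in instances:
    bytes_scanned_per_host[host] += nbytes

print(fragments_per_host["host-1"])      # 2
print(bytes_scanned_per_host["host-3"])  # 190000000
```

With both tables in the profile, the skew on host-3 is visible at a glance instead of requiring manual aggregation across fragment sections.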



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-7957) UNION ALL query returns incorrect results

2019-01-24 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers reassigned IMPALA-7957:
---

Assignee: Paul Rogers  (was: Janaki Lahorani)

> UNION ALL query returns incorrect results
> -
>
> Key: IMPALA-7957
> URL: https://issues.apache.org/jira/browse/IMPALA-7957
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.12.0
>Reporter: Luis E Martinez-Poblete
>Assignee: Paul Rogers
>Priority: Blocker
>  Labels: correctness
>
> Synopsis:
> =
> UNION ALL query returns incorrect results
> Problem:
> 
> Customer reported a UNION ALL query returning incorrect results. The UNION 
> ALL query has 2 legs, but Impala is only returning information from one leg.
> Issue can be reproduced in the latest version of Impala. Below is the 
> reproduction case:
> {noformat}
> create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
> insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
> insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
> insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);
> SELECT t.c1
> FROM
>  (SELECT c1, c2
>  FROM mytest_t) t
> LEFT JOIN
>  (SELECT c1, c2
>  FROM mytest_t
>  WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
> UNION ALL
> VALUES (NULL)
> {noformat}
> The above query produces the following execution plan:
> {noformat}
> +------------------------------------------------------------------------------------+
> | Explain String                                                                     |
> +------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=34.02MB Threads=5                        |
> | Per-Host Resource Estimates: Memory=2.06GB                                         |
> | WARNING: The following tables are missing relevant table and/or column statistics. |
> | default.mytest_t                                                                   |
> |                                                                                    |
> | PLAN-ROOT SINK                                                                     |
> | |                                                                                  |
> | 06:EXCHANGE [UNPARTITIONED]                                                        |
> | |                                                                                  |
> | 00:UNION                                                                           |
> | |  constant-operands=1                                                             |
> | |                                                                                  |
> | 04:SELECT                                                                          |
> | |  predicates: default.mytest_t.c1 = default.mytest_t.c2                           |
> | |                                                                                  |
> | 03:HASH JOIN [LEFT OUTER JOIN, BROADCAST]                                          |
> | |  hash predicates: c2 = c2                                                        |
> | |                                                                                  |
> | |--05:EXCHANGE [BROADCAST]                                                         |
> | |  |                                                                               |
> | |  02:SCAN HDFS [default.mytest_t]                                                 |
> | |     partitions=1/1 files=3 size=192B                                             |
> | |     predicates: c2 = c1                                                          |
> | |                                                                                  |
> | 01:SCAN HDFS [default.mytest_t]                                                    |
> |    partitions=1/1 files=3 size=192B                                                |
> +------------------------------------------------------------------------------------+
> {noformat}
> The issue is in operator 4:
> {noformat}
> | 04:SELECT |
> | | predicates: default.mytest_t.c1 = default.mytest_t.c2 |
> {noformat}
> It's definitely a bug with predicate placement - that c1 = c2 predicate 
> shouldn't be evaluated outside the right branch of the LEFT JOIN.
> Thanks,
> Luis Martinez.
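A toy model (plain Python, not Impala code) of why hoisting that predicate above the LEFT JOIN loses rows: unmatched left rows survive the join with NULLs on the right, and a post-join equality filter silently drops them because NULL comparisons are never TRUE.

```python
# Left rows are (c1, c2) pairs; the right input is empty after its own
# WHERE c2 = c1 filter, so a LEFT OUTER JOIN must still return every left
# row padded with NULL (modeled here as None).
left_rows = [(1, 2), (3, 4), (5, 6)]
right_rows = []  # WHERE c2 = c1 matched nothing

joined = [(l, None) for l in left_rows]  # LEFT OUTER JOIN, no matches

# Buggy plan: evaluate the right branch's c1 = c2 predicate on the join
# output. None never satisfies the comparison, so every row is dropped.
buggy = [row for row in joined
         if row[1] is not None and row[1][0] == row[1][1]]

print(len(joined), len(buggy))  # 3 0
```

This mirrors the reported symptom: the join leg of the UNION ALL contributes zero rows, so only the `VALUES (NULL)` leg appears in the result.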



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-8064) test_min_max_filters is flaky

2019-01-24 Thread Paul Rogers (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers reassigned IMPALA-8064:
---

Assignee: Pooja Nilangekar  (was: Janaki Lahorani)

> test_min_max_filters is flaky 
> --
>
> Key: IMPALA-8064
> URL: https://issues.apache.org/jira/browse/IMPALA-8064
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Pooja Nilangekar
>Assignee: Pooja Nilangekar
>Priority: Blocker
>  Labels: broken-build, flaky-test
> Attachments: profile.txt
>
>
> The following configuration of the test_min_max_filters:
> {code:java}
> query_test.test_runtime_filters.TestMinMaxFilters.test_min_max_filters[protocol:
>  beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 
> 0} | table_format: kudu/none]{code}
> It produces a higher aggregated SUM over ProbeRows than expected:
> {code:java}
> query_test/test_runtime_filters.py:113: in test_min_max_filters 
> self.run_test_case('QueryTest/min_max_filters', vector) 
> common/impala_test_suite.py:518: in run_test_case 
> update_section=pytest.config.option.update_results) 
> common/test_result_verifier.py:612: in verify_runtime_profile % 
> (function, field, expected_value, actual_value, actual)) 
> E   AssertionError: Aggregation of SUM over ProbeRows did not match expected results.
> E   EXPECTED VALUE:
> E   619
> E   ACTUAL VALUE:
> E   652
> {code}
> This test was introduced in the patch for IMPALA-6533. The failure occurred 
> during an ASAN build. 
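One plausible mechanism (speculation, not confirmed in the report), sketched as a toy model: min/max runtime filters are best-effort, so if the filter reaches the probe-side scan late, as can happen under ASAN's slowdown, more rows get probed than the expected count assumes.

```python
# Toy model: rows that the probe side would scan, and the min/max bounds
# that the build side eventually publishes as a runtime filter.
rows = list(range(1000))
fmin, fmax = 100, 400  # hypothetical filter bounds

# Filter arrived before the scan: probe-side rows are pruned up front.
probed_with_filter = [r for r in rows if fmin <= r <= fmax]

# Filter arrived after the scan already ran: nothing was pruned, so the
# join probes every row and SUM(ProbeRows) comes out higher.
probed_filter_late = rows

assert len(probed_filter_late) > len(probed_with_filter)
print(len(probed_with_filter), len(probed_filter_late))  # 301 1000
```

If this is the cause, the fix is usually to loosen the expected aggregate in the test rather than to change the filter, since late arrival is legal behavior.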



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8114) Build test failure in test_breakpad.py

2019-01-24 Thread Paul Rogers (JIRA)
Paul Rogers created IMPALA-8114:
---

 Summary: Build test failure in test_breakpad.py
 Key: IMPALA-8114
 URL: https://issues.apache.org/jira/browse/IMPALA-8114
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.1.0
Reporter: Paul Rogers
Assignee: Tim Armstrong


Recent builds have failed due to a failure in {{test_breakpad.py}}. Assigning 
to Tim as the person who most recently touched this file.

Test output:

{noformat}
09:04:35  ERRORS 

09:04:35 ___ ERROR at teardown of 
TestBreakpadExhaustive.test_minidump_cleanup_thread ___
09:04:35 custom_cluster/test_breakpad.py:49: in teardown_method
09:04:35 self.kill_cluster(SIGKILL)
09:04:35 custom_cluster/test_breakpad.py:80: in kill_cluster
09:04:35 self.kill_processes(processes, signal)
09:04:35 custom_cluster/test_breakpad.py:85: in kill_processes
09:04:35 process.kill(signal)
09:04:35 common/impala_cluster.py:330: in kill
09:04:35 assert 0, "No processes %s found" % self.cmd
09:04:35 E   AssertionError: No processes 
['/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/impalad',
 '-kudu_client_rpc_timeout_ms', '0', '-kudu_master_hosts', 'localhost', 
'--mem_limit=12884901888', '-logbufsecs=5', '-v=1', '-max_log_files=0', 
'-log_filename=impalad', 
'-log_dir=/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/custom_cluster_tests',
 '-beeswax_port=21000', '-hs2_port=21050', '-be_port=22000', 
'-krpc_port=27000', '-state_store_subscriber_port=23000', 
'-webserver_port=25000', '-max_minidumps=2', '-logbufsecs=1', 
'-minidump_path=/tmp/tmpKaSw_w', '--default_query_options='] found
{noformat}

Distilled {{TEST-impala-custom-cluster.xml}} output:

{noformat}
-- 2019-01-23 08:00:43,585 INFO MainThread: Found 3 impalad/1 statestored/1 
catalogd process(es)
…
-- 2019-01-23 08:00:43,667 INFO MainThread: Killing: 
/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/statestored
 -logbufsecs=5 -v=1 -max_log_files=0 -log_filename=statestored 
-log_dir=/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/logs/custom_cluster_tests
 -max_minidumps=2 -logbufsecs=1 -minidump_path=/tmp/tmpKaSw_w (PID: 16809) with 
signal 10
-- 2019-01-23 08:00:43,692 INFO MainThread: Found 6 impalad/1 statestored/1 
catalogd process(es)
...
E   AssertionError: No processes 
[/data/jenkins/workspace/impala-cdh6.x-exhaustive-release/repos/Impala/be/build/latest/service/impalad
{noformat}

Notice that the main thread appears to be killing the statestore, but fails to 
kill impalad. Notice also that a message saying all impalads are running appears 
in the midst of the code that tries to shut down the cluster. Is this test 
multi-threaded? Is there more than one “main thread”? Are these main threads 
working at cross purposes? What recent change may have caused this?

Also, looks like the script is sending signal 10 (SIGUSR1) while the statestore 
(in its log) says it got a SIGTERM (15):

{noformat}
I0123 08:00:44.086009 16868 thrift-client.cc:78] Couldn't open transport for 
impala-ec2-centoCaught signal: SIGTERM. Daemon will exit.
{noformat}

Not terribly familiar with this area of the product, so bumping it over to the 
BE team.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org





[jira] [Created] (IMPALA-8113) test_aggregation and test_avro_primitive_in_list fail in S3

2019-01-24 Thread Michael Brown (JIRA)
Michael Brown created IMPALA-8113:
-

 Summary: test_aggregation and test_avro_primitive_in_list fail in 
S3
 Key: IMPALA-8113
 URL: https://issues.apache.org/jira/browse/IMPALA-8113
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.2.0
Reporter: Michael Brown
Assignee: Michael Brown


Likely more victims of our infra in S3.
{noformat}
query_test/test_aggregation.py:138: in test_aggregation
result = self.execute_query(query, vector.get_value('exec_option'))
common/impala_test_suite.py:597: in wrapper
return function(*args, **kwargs)
common/impala_test_suite.py:628: in execute_query
return self.__execute_query(self.client, query, query_options)
common/impala_test_suite.py:695: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:174: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:182: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:359: in __execute_query
self.wait_for_finished(handle)
beeswax/impala_beeswax.py:380: in wait_for_finished
raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EQuery aborted:Disk I/O error: Error reading from HDFS file: 
s3a://impala-test-uswest2-1/test-warehouse/alltypesagg_parquet/year=2010/month=1/day=8/5642b2da93dae1ad-494132e5_592013737_data.0.parq
E   Error(255): Unknown error 255
E   Root cause: SdkClientException: Data read has a different length than the 
expected: dataLength=0; expectedLength=45494; includeSkipped=true; 
in.getClass()=class com.amazonaws.services.s3.AmazonS3Client$2; 
markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; 
resetCount=0
{noformat}

{noformat}
query_test/test_nested_types.py:263: in test_avro_primitive_in_list
"AvroPrimitiveInList.parquet", vector)
query_test/test_nested_types.py:287: in __test_primitive_in_list
result = self.execute_query("select item from %s.col1" % full_name, qopts)
common/impala_test_suite.py:597: in wrapper
return function(*args, **kwargs)
common/impala_test_suite.py:628: in execute_query
return self.__execute_query(self.client, query, query_options)
common/impala_test_suite.py:695: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:174: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:182: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:359: in __execute_query
self.wait_for_finished(handle)
beeswax/impala_beeswax.py:380: in wait_for_finished
raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
EQuery aborted:Disk I/O error: Failed to open HDFS file 
s3a://impala-test-uswest2-1/test-warehouse/test_avro_primitive_in_list_38f182c4.db/AvroPrimitiveInList/AvroPrimitiveInList.parquet
E   Error(2): No such file or directory
E   Root cause: FileNotFoundException: No such file or directory: 
s3a://impala-test-uswest2-1/test-warehouse/test_avro_primitive_in_list_38f182c4.db/AvroPrimitiveInList/AvroPrimitiveInList.parquet
{noformat}
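The `SdkClientException` root cause in the first failure means the S3 stream returned fewer bytes than the object length the SDK expected. A generic defensive reader (a sketch, not the actual AWS SDK or Impala I/O code) makes the same check explicit:

```python
import io

def read_exact(stream, expected_length):
    """Read exactly expected_length bytes or raise, mirroring the SDK's check."""
    data = stream.read(expected_length)
    if len(data) != expected_length:
        raise IOError("Data read has a different length than the expected: "
                      "dataLength=%d; expectedLength=%d"
                      % (len(data), expected_length))
    return data

ok = read_exact(io.BytesIO(b"a" * 8), 8)  # full read succeeds
try:
    # Empty stream with a large expected length, like the failure above.
    read_exact(io.BytesIO(b""), 45494)
except IOError as e:
    print(e)  # prints the SDK-style length-mismatch message
```

A `dataLength=0` mismatch like the one in the log typically points at the connection being dropped or the object changing underneath the reader, which fits the "infra" suspicion.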



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org





[jira] [Updated] (IMPALA-8112) test_cancel_select with debug action failed with unexpected error

2019-01-24 Thread Michael Brown (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Brown updated IMPALA-8112:
--
Labels: flaky  (was: )

> test_cancel_select with debug action failed with unexpected error
> -
>
> Key: IMPALA-8112
> URL: https://issues.apache.org/jira/browse/IMPALA-8112
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Michael Brown
>Assignee: Andrew Sherman
>Priority: Critical
>  Labels: flaky
>
> Stacktrace
> {noformat}
> query_test/test_cancellation.py:241: in test_cancel_select
> self.execute_cancel_test(vector)
> query_test/test_cancellation.py:213: in execute_cancel_test
> assert 'Cancelled' in str(thread.fetch_results_error)
> E   assert 'Cancelled' in "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: 
> Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected 
> (error 107)\n"
> E+  where "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: 
> Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected 
> (error 107)\n" = str(ImpalaBeeswaxException())
> E+where ImpalaBeeswaxException() =  140481071658752)>.fetch_results_error
> {noformat}
> Standard Error
> {noformat}
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> -- executing against localhost:21000
> use tpch_kudu;
> -- 2019-01-18 17:50:03,100 INFO MainThread: Started query 
> 4e4b3ab4cc7d:11efc3f5
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET cpu_limit_s=10;
> SET debug_action=0:GETNEXT:WAIT|COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL;
> SET exec_single_node_rows_threshold=0;
> SET buffer_pool_limit=0;
> -- executing async: localhost:21000
> select l_returnflag from lineitem;
> -- 2019-01-18 17:50:03,139 INFO MainThread: Started query 
> fa4ddb9e62a01240:54c86ad
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> -- connecting to: localhost:21000
> -- fetching results from:  object at 0x6235e90>
> -- getting state for operation: 
> 
> -- canceling operation:  object at 0x6235e90>
> -- 2019-01-18 17:50:08,196 INFO Thread-4: Starting new HTTP connection 
> (1): localhost
> -- closing query for operation handle: 
> 
> {noformat}
> [~asherman] please take a look since it looks like you touched code around 
> this area last.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Issue Comment Deleted] (IMPALA-8112) test_cancel_select with debug action failed with unexpected error

2019-01-24 Thread Michael Brown (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Brown updated IMPALA-8112:
--
Comment: was deleted

(was: This has happened once on a downstream CI job 
https://master-02.jenkins.cloudera.com/job/impala-cdh6.x-core/532/ .)

> test_cancel_select with debug action failed with unexpected error
> -
>
> Key: IMPALA-8112
> URL: https://issues.apache.org/jira/browse/IMPALA-8112
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Michael Brown
>Assignee: Andrew Sherman
>Priority: Critical
>  Labels: flaky
>
> Stacktrace
> {noformat}
> query_test/test_cancellation.py:241: in test_cancel_select
> self.execute_cancel_test(vector)
> query_test/test_cancellation.py:213: in execute_cancel_test
> assert 'Cancelled' in str(thread.fetch_results_error)
> E   assert 'Cancelled' in "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: 
> Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected 
> (error 107)\n"
> E+  where "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: 
> Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected 
> (error 107)\n" = str(ImpalaBeeswaxException())
> E+where ImpalaBeeswaxException() =  140481071658752)>.fetch_results_error
> {noformat}
> Standard Error
> {noformat}
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> -- executing against localhost:21000
> use tpch_kudu;
> -- 2019-01-18 17:50:03,100 INFO MainThread: Started query 
> 4e4b3ab4cc7d:11efc3f5
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET cpu_limit_s=10;
> SET debug_action=0:GETNEXT:WAIT|COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL;
> SET exec_single_node_rows_threshold=0;
> SET buffer_pool_limit=0;
> -- executing async: localhost:21000
> select l_returnflag from lineitem;
> -- 2019-01-18 17:50:03,139 INFO MainThread: Started query 
> fa4ddb9e62a01240:54c86ad
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> -- connecting to: localhost:21000
> -- fetching results from:  object at 0x6235e90>
> -- getting state for operation: 
> 
> -- canceling operation:  object at 0x6235e90>
> -- 2019-01-18 17:50:08,196 INFO Thread-4: Starting new HTTP connection 
> (1): localhost
> -- closing query for operation handle: 
> 
> {noformat}
> [~asherman] please take a look since it looks like you touched code around 
> this area last.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8112) test_cancel_select with debug action failed with unexpected error

2019-01-24 Thread Michael Brown (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751610#comment-16751610
 ] 

Michael Brown commented on IMPALA-8112:
---

This has happened once on a downstream CI job 
https://master-02.jenkins.cloudera.com/job/impala-cdh6.x-core/532/ .

> test_cancel_select with debug action failed with unexpected error
> -
>
> Key: IMPALA-8112
> URL: https://issues.apache.org/jira/browse/IMPALA-8112
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Michael Brown
>Assignee: Andrew Sherman
>Priority: Critical
>
> Stacktrace
> {noformat}
> query_test/test_cancellation.py:241: in test_cancel_select
> self.execute_cancel_test(vector)
> query_test/test_cancellation.py:213: in execute_cancel_test
> assert 'Cancelled' in str(thread.fetch_results_error)
> E   assert 'Cancelled' in "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: 
> Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected 
> (error 107)\n"
> E+  where "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: 
> Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected 
> (error 107)\n" = str(ImpalaBeeswaxException())
> E+where ImpalaBeeswaxException() =  140481071658752)>.fetch_results_error
> {noformat}
> Standard Error
> {noformat}
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> -- executing against localhost:21000
> use tpch_kudu;
> -- 2019-01-18 17:50:03,100 INFO MainThread: Started query 
> 4e4b3ab4cc7d:11efc3f5
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET cpu_limit_s=10;
> SET debug_action=0:GETNEXT:WAIT|COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL;
> SET exec_single_node_rows_threshold=0;
> SET buffer_pool_limit=0;
> -- executing async: localhost:21000
> select l_returnflag from lineitem;
> -- 2019-01-18 17:50:03,139 INFO MainThread: Started query 
> fa4ddb9e62a01240:54c86ad
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> -- connecting to: localhost:21000
> -- fetching results from:  object at 0x6235e90>
> -- getting state for operation: 
> 
> -- canceling operation:  object at 0x6235e90>
> -- 2019-01-18 17:50:08,196 INFO Thread-4: Starting new HTTP connection 
> (1): localhost
> -- closing query for operation handle: 
> 
> {noformat}
> [~asherman] please take a look since it looks like you touched code around 
> this area last.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8112) test_cancel_select with debug action failed with unexpected error

2019-01-24 Thread Michael Brown (JIRA)
Michael Brown created IMPALA-8112:
-

 Summary: test_cancel_select with debug action failed with 
unexpected error
 Key: IMPALA-8112
 URL: https://issues.apache.org/jira/browse/IMPALA-8112
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.2.0
Reporter: Michael Brown
Assignee: Andrew Sherman


Stacktrace
{noformat}
query_test/test_cancellation.py:241: in test_cancel_select
self.execute_cancel_test(vector)
query_test/test_cancellation.py:213: in execute_cancel_test
assert 'Cancelled' in str(thread.fetch_results_error)
E   assert 'Cancelled' in "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: 
Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected 
(error 107)\n"
E+  where "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: 
Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected 
(error 107)\n" = str(ImpalaBeeswaxException())
E+where ImpalaBeeswaxException() = .fetch_results_error
{noformat}

Standard Error
{noformat}
SET 
client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
-- executing against localhost:21000
use tpch_kudu;

-- 2019-01-18 17:50:03,100 INFO MainThread: Started query 
4e4b3ab4cc7d:11efc3f5
SET 
client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
SET batch_size=0;
SET num_nodes=0;
SET disable_codegen_rows_threshold=0;
SET disable_codegen=False;
SET abort_on_error=1;
SET cpu_limit_s=10;
SET debug_action=0:GETNEXT:WAIT|COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL;
SET exec_single_node_rows_threshold=0;
SET buffer_pool_limit=0;
-- executing async: localhost:21000
select l_returnflag from lineitem;

-- 2019-01-18 17:50:03,139 INFO MainThread: Started query 
fa4ddb9e62a01240:54c86ad
SET 
client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
-- connecting to: localhost:21000
-- fetching results from: 
-- getting state for operation: 
-- canceling operation: 
-- 2019-01-18 17:50:08,196 INFO Thread-4: Starting new HTTP connection (1): 
localhost
-- closing query for operation handle: 

{noformat}

[~asherman] please take a look since it looks like you touched code around this 
area last.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates

2019-01-24 Thread Bikramjeet Vig (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751586#comment-16751586
 ] 

Bikramjeet Vig commented on IMPALA-7351:


[~tarmstrong] Only PlanRootSink is left, and since IMPALA-4268 can change its 
memory requirement drastically, I have deferred closing this until then. I also 
added that task as a dependency.

> Add memory estimates for plan nodes and sinks with missing estimates
> 
>
> Key: IMPALA-7351
> URL: https://issues.apache.org/jira/browse/IMPALA-7351
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, 
> etc are missing memory estimates entirely. 
> We should add a basic estimate for all these cases based on experiments and 
> data from real workloads. In some cases 0 may be the right estimate (e.g. for 
> streaming nodes like SelectNode that just pass through data) but we should 
> remove TODOs and document the reasoning in those cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8111) Document workaround for some authentication issues with KRPC

2019-01-24 Thread Alex Rodoni (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751567#comment-16751567
 ] 

Alex Rodoni commented on IMPALA-8111:
-

[~kwho] I will add the 3 issues to the known issues doc.

> Document workaround for some authentication issues with KRPC
> 
>
> Key: IMPALA-8111
> URL: https://issues.apache.org/jira/browse/IMPALA-8111
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Affects Versions: Impala 2.12.0, Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc, in_32
>
> There have been complaints from users who could not use Impala after 
> upgrading to an Impala version with KRPC enabled, due to authentication 
> issues. Please document these in the known issues or best practices guide.
> 1. https://issues.apache.org/jira/browse/IMPALA-7585:
>  *Symptoms*: When using Impala with LDAP enabled, a user may hit the 
> following:
> {noformat}
> Not authorized: Client connection negotiation failed: client connection to 
> 127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username.
> {noformat}
> *Root cause*: The following sequence can lead to the user "impala" not being 
> created in /etc/passwd.
> {quote}time 1: no impala in LDAP; things get installed; impala created in 
> /etc/passwd
>  time 2: impala added to LDAP
>  time 3: new machine added
> {quote}
> *Workaround*:
>  - Manually edit /etc/passwd to add the impala user
>  - Upgrade to a version of Impala with the patch IMPALA-7585
> 2. https://issues.apache.org/jira/browse/IMPALA-7298
>  *Symptoms*: When running with Kerberos enabled, a user may hit the following 
> error:
> {noformat}
> WARNINGS: TransmitData() to X.X.X.X:27000 failed: Not authorized: Client 
> connection negotiation failed: client connection to X.X.X.X:27000: Server 
> impala/x.x@vpc.cloudera.com not found in Kerberos database
> {noformat}
> *Root cause*:
>  KrpcDataStreamSender passes a resolved IP address when creating a proxy. 
> Instead, we should pass both the resolved address and the hostname when 
> creating the proxy so that we won't end up using the IP address as the 
> hostname in the Kerberos principal.
> *Workaround*:
>  - Set rdns=true in /etc/krb5.conf
>  - Upgrade to a version of Impala with the fix of IMPALA-7298
> 3. https://issues.apache.org/jira/browse/KUDU-2198
>  *Symptoms*: When running with Kerberos enabled, a user may hit the following 
> error message where the reported username is some random string that doesn't match 
> the primary in the Kerberos principal:
> {noformat}
> WARNINGS: TransmitData() to X.X.X.X:27000 failed: Remote error: Not 
> authorized: {username='', principal='impala/redacted'} is not 
> allowed to access DataStreamService
> {noformat}
> *Root cause*:
>  Due to system "auth_to_local" mapping, the principal may be mapped to some 
> local name.
> *Workaround*:
>  - Start Impala with the flag {{--use_system_auth_to_local=false}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7351) Add memory estimates for plan nodes and sinks with missing estimates

2019-01-24 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751495#comment-16751495
 ] 

Tim Armstrong commented on IMPALA-7351:
---

[~bikramjeet.vig] can we close this now?

> Add memory estimates for plan nodes and sinks with missing estimates
> 
>
> Key: IMPALA-7351
> URL: https://issues.apache.org/jira/browse/IMPALA-7351
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Major
>  Labels: admission-control, resource-management
>
> Many plan nodes and sinks, e.g. KuduScanNode, KuduTableSink, ExchangeNode, 
> etc are missing memory estimates entirely. 
> We should add a basic estimate for all these cases based on experiments and 
> data from real workloads. In some cases 0 may be the right estimate (e.g. for 
> streaming nodes like SelectNode that just pass through data) but we should 
> remove TODOs and document the reasoning in those cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8111) Document workaround for some authentication issues with KRPC

2019-01-24 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8111:

Labels: future_release_doc in_32  (was: )

> Document workaround for some authentication issues with KRPC
> 
>
> Key: IMPALA-8111
> URL: https://issues.apache.org/jira/browse/IMPALA-8111
> Project: IMPALA
>  Issue Type: Task
>  Components: Docs
>Affects Versions: Impala 2.12.0, Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc, in_32
>
> There have been complaints from users who could not use Impala after 
> upgrading to an Impala version with KRPC enabled, due to authentication 
> issues. Please document these in the known issues or best practices guide.
> 1. https://issues.apache.org/jira/browse/IMPALA-7585:
>  *Symptoms*: When using Impala with LDAP enabled, a user may hit the 
> following:
> {noformat}
> Not authorized: Client connection negotiation failed: client connection to 
> 127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username.
> {noformat}
> *Root cause*: The following sequence can lead to the user "impala" not being 
> created in /etc/passwd.
> {quote}time 1: no impala in LDAP; things get installed; impala created in 
> /etc/passwd
>  time 2: impala added to LDAP
>  time 3: new machine added
> {quote}
> *Workaround*:
>  - Manually edit /etc/passwd to add the impala user
>  - Upgrade to a version of Impala with the patch IMPALA-7585
> 2. https://issues.apache.org/jira/browse/IMPALA-7298
>  *Symptoms*: When running with Kerberos enabled, a user may hit the following 
> error:
> {noformat}
> WARNINGS: TransmitData() to X.X.X.X:27000 failed: Not authorized: Client 
> connection negotiation failed: client connection to X.X.X.X:27000: Server 
> impala/x.x@vpc.cloudera.com not found in Kerberos database
> {noformat}
> *Root cause*:
>  KrpcDataStreamSender passes a resolved IP address when creating a proxy. 
> Instead, we should pass both the resolved address and the hostname when 
> creating the proxy so that we won't end up using the IP address as the 
> hostname in the Kerberos principal.
> *Workaround*:
>  - Set rdns=true in /etc/krb5.conf
>  - Upgrade to a version of Impala with the fix of IMPALA-7298
> 3. https://issues.apache.org/jira/browse/KUDU-2198
>  *Symptoms*: When running with Kerberos enabled, a user may hit the following 
> error message where the reported username is some random string that doesn't match 
> the primary in the Kerberos principal:
> {noformat}
> WARNINGS: TransmitData() to X.X.X.X:27000 failed: Remote error: Not 
> authorized: {username='', principal='impala/redacted'} is not 
> allowed to access DataStreamService
> {noformat}
> *Root cause*:
>  Due to system "auth_to_local" mapping, the principal may be mapped to some 
> local name.
> *Workaround*:
>  - Start Impala with the flag {{--use_system_auth_to_local=false}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB

2019-01-24 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751462#comment-16751462
 ] 

Tim Armstrong commented on IMPALA-8109:
---

bytes_read_ used to be an int, but was switched to an int64_t by IMPALA-7543. I 
think that probably fixed this - I didn't see another way that the offset could 
overflow. [~boroknagyz] do you remember if you reproduced a bug like this when 
making that change or just did it as part of code cleanup?

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0
>Reporter: hakki
>Priority: Minor
>
> When querying a partition containing gzip files, the query fails with the 
> error below: 
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: 
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: 
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz is a 
> delimited text file larger than 2 GB (approx. 2.4 GB). The uncompressed 
> size is ~13 GB.
> The impalad version is 2.12.0-cdh5.15.0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8111) Document workaround for some authentication issues with KRPC

2019-01-24 Thread Michael Ho (JIRA)
Michael Ho created IMPALA-8111:
--

 Summary: Document workaround for some authentication issues with 
KRPC
 Key: IMPALA-8111
 URL: https://issues.apache.org/jira/browse/IMPALA-8111
 Project: IMPALA
  Issue Type: Task
  Components: Docs
Affects Versions: Impala 3.1.0, Impala 2.12.0
Reporter: Michael Ho
Assignee: Alex Rodoni


There have been complaints from users who could not use Impala after upgrading 
to an Impala version with KRPC enabled, due to authentication issues. Please 
document these in the known issues or best practices guide.

1. https://issues.apache.org/jira/browse/IMPALA-7585:
 *Symptoms*: When using Impala with LDAP enabled, a user may hit the following:
{noformat}
Not authorized: Client connection negotiation failed: client connection to 
127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username.
{noformat}
*Root cause*: The following sequence can lead to the user "impala" not being 
created in /etc/passwd.
{quote}time 1: no impala in LDAP; things get installed; impala created in 
/etc/passwd
 time 2: impala added to LDAP
 time 3: new machine added
{quote}
*Workaround*:
 - Manually edit /etc/passwd to add the impala user
 - Upgrade to a version of Impala with the patch IMPALA-7585

2. https://issues.apache.org/jira/browse/IMPALA-7298
 *Symptoms*: When running with Kerberos enabled, a user may hit the following 
error:
{noformat}
WARNINGS: TransmitData() to X.X.X.X:27000 failed: Not authorized: Client 
connection negotiation failed: client connection to X.X.X.X:27000: Server 
impala/x.x@vpc.cloudera.com not found in Kerberos database
{noformat}
*Root cause*:
 KrpcDataStreamSender passes a resolved IP address when creating a proxy. 
Instead, we should pass both the resolved address and the hostname when 
creating the proxy so that we won't end up using the IP address as the hostname 
in the Kerberos principal.

*Workaround*:
 - Set rdns=true in /etc/krb5.conf
 - Upgrade to a version of Impala with the fix of IMPALA-7298

3. https://issues.apache.org/jira/browse/KUDU-2198
 *Symptoms*: When running with Kerberos enabled, a user may hit the following 
error message where the reported username is some random string that doesn't match 
the primary in the Kerberos principal:
{noformat}
WARNINGS: TransmitData() to X.X.X.X:27000 failed: Remote error: Not authorized: 
{username='', principal='impala/redacted'} is not allowed to 
access DataStreamService
{noformat}
*Root cause*:
 Due to system "auth_to_local" mapping, the principal may be mapped to some 
local name.

*Workaround*:
 - Start Impala with the flag {{--use_system_auth_to_local=false}}
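For reference, the rdns workaround in item 2 corresponds to a one-line setting in the {{[libdefaults]}} section of /etc/krb5.conf. This is a minimal sketch assuming a standard MIT Kerberos layout; the rest of the file is site-specific:

{noformat}
[libdefaults]
  rdns = true
{noformat}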



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IMPALA-8111) Document workaround for some authentication issues with KRPC

2019-01-24 Thread Michael Ho (JIRA)
Michael Ho created IMPALA-8111:
--

 Summary: Document workaround for some authentication issues with 
KRPC
 Key: IMPALA-8111
 URL: https://issues.apache.org/jira/browse/IMPALA-8111
 Project: IMPALA
  Issue Type: Task
  Components: Docs
Affects Versions: Impala 3.1.0, Impala 2.12.0
Reporter: Michael Ho
Assignee: Alex Rodoni


There have been complaints from users about not being able to use Impala after 
upgrading to Impala version with KRPC enabled due to authentication issues. 
Please document them in the known issues or best practice guide.

1. https://issues.apache.org/jira/browse/IMPALA-7585:
 *Symptoms*: When using Impala with LDAP enabled, a user may hit the following:
{noformat}
Not authorized: Client connection negotiation failed: client connection to 
127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username.
{noformat}
*Root cause*: The following sequence can lead to the user "impala" not being 
created in /etc/passwd.
{quote}time 1: no impala in LDAP; things get installed; impala created in 
/etc/passwd
 time 2: impala added to LDAP
 time 3: new machine added
{quote}
*Workaround*:
 - Manually edit /etc/passwd to add the impala user
 - Upgrade to a version of Impala with the patch IMPALA-7585

2. https://issues.apache.org/jira/browse/IMPALA-7298
 *Symptoms*: When running with Kerberos enabled, a user may hit the following 
error:
{noformat}
WARNINGS: TransmitData() to X.X.X.X:27000 failed: Not authorized: Client 
connection negotiation failed: client connection to X.X.X.X:27000: Server 
impala/x.x@vpc.cloudera.com not found in Kerberos database
{noformat}
*Root cause*:
 KrpcDataStreamSender passes a resolved IP address when creating a proxy. 
Instead, we should pass both the resolved address and the hostname when 
creating the proxy so that we won't end up using the IP address as the hostname 
in the Kerberos principal.

*Workaround*:
 - Set rdns=true in /etc/krb5.conf
 - Upgrade to a version of Impala with the fix of IMPALA-7298
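The rdns workaround corresponds to a krb5.conf entry like the following (a sketch; [libdefaults] is the standard section for this setting, but verify against your existing krb5.conf layout):

```ini
[libdefaults]
    # Allow reverse-DNS resolution so the service principal is built from
    # the hostname rather than the raw IP address.
    rdns = true
```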

3. https://issues.apache.org/jira/browse/KUDU-2198
 *Symptoms*: When running with Kerberos enabled, a user may hit the following 
error message, in which the reported username is some random string that 
doesn't match the primary in the Kerberos principal:
{noformat}
WARNINGS: TransmitData() to X.X.X.X:27000 failed: Remote error: Not authorized: 
{username='', principal='impala/redacted'} is not allowed to 
access DataStreamService
{noformat}
*Root cause*:
 Due to the system's "auth_to_local" mapping, the principal may be mapped to an 
unexpected local name.

*Workaround*:
 - Start Impala with the flag {{--use_system_auth_to_local=false}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8107) Support EXEC_TIME_LIMIT_S in resource pool setting

2019-01-24 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751393#comment-16751393
 ] 

Tim Armstrong commented on IMPALA-8107:
---

You can specify default query options for any resource pool - 
https://impala.apache.org/docs/build/html/topics/impala_admission.html

That's generally how I'd expect this to be used.
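Concretely, per-pool default query options live in the admission control configuration; a sketch of what such an entry might look like in llama-site.xml (property name per the linked admission control docs; the pool name root.adhoc is a placeholder):

```xml
<!-- Hypothetical pool name; applies a 60s execution limit only to
     queries admitted into the ad-hoc pool. -->
<property>
  <name>impala.admission-control.pool-default-query-options.root.adhoc</name>
  <value>EXEC_TIME_LIMIT_S=60</value>
</property>
```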

> Support EXEC_TIME_LIMIT_S in resource pool setting
> --
>
> Key: IMPALA-8107
> URL: https://issues.apache.org/jira/browse/IMPALA-8107
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Quanlong Huang
>Priority: Major
>  Labels: admission-control
>
> Timeout limit should be different for different kinds of queries. For 
> example, a resource pool for ad-hoc queries may set EXEC_TIME_LIMIT_S to 60s, 
> while a resource pool for building pre-aggregations or other ETL may need a 
> larger EXEC_TIME_LIMIT_S, like 30 minutes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7832) Support IF NOT EXISTS in alter table add columns

2019-01-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751384#comment-16751384
 ] 

ASF subversion and git services commented on IMPALA-7832:
-

Commit bfb9ccc8e02be20fb8b57bae4d55e4094ab7ea3f in impala's branch 
refs/heads/master from Fredy Wijaya
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=bfb9ccc ]

IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

This patch adds IF NOT EXISTS support in ALTER TABLE ADD COLUMN and
ALTER TABLE ADD COLUMNS. If IF NOT EXISTS is specified and a column
already exists with this name, no error is thrown. If IF NOT EXISTS
is specified for multiple columns and some already exist, no error is
thrown and only the columns that do not yet exist are added.

Syntax:
ALTER TABLE tbl ADD COLUMN [IF NOT EXISTS] i int
ALTER TABLE tbl ADD [IF NOT EXISTS] COLUMNS (i int, j int)

Testing:
- Added new FE tests
- Ran all FE tests
- Updated E2E DDL tests
- Ran all E2E DDL tests

Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Reviewed-on: http://gerrit.cloudera.org:8080/12181
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Support IF NOT EXISTS in alter table add columns
> 
>
> Key: IMPALA-7832
> URL: https://issues.apache.org/jira/browse/IMPALA-7832
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Fredy Wijaya
>Priority: Minor
>  Labels: ramp-up
> Fix For: Impala 3.2.0
>
>
> alter table  add [if not exists] columns (  [,  
> ...])
> would add the column only if a column of the same name does not already exist
> Probably worth checking out what other databases do in different situations, 
> eg. if the column already exists but with a different type, if "replace" is 
> used instead of "add", etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7905) ToSqlUtils does not correctly quote lower-case Hive keywords

2019-01-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751382#comment-16751382
 ] 

ASF subversion and git services commented on IMPALA-7905:
-

Commit 85a8b34645a46038fd217c03e64326b72d9669b5 in impala's branch 
refs/heads/master from Paul Rogers
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=85a8b34 ]

IMPALA-7905: Hive keywords not quoted for identifiers

Impala often generates SQL for statements using the toSql() call.
Generated SQL is often used during testing or when writing the query
plan. Impala keywords such as "create", when used as identifiers,
must be quoted:

SELECT `select`, `from` FROM `order` ...

The code in ToSqlUtils.getIdentSql() quotes the identifier if it is
an Impala or Hive keyword, or if it does not follow the identifier
pattern. The code uses the Hive lexer to detect a keyword. But, the
code contained a flaw: the lexer expects a case-insensitive input.
We provide a case sensitive input. As a result, "MONTH" is caught as a
Hive keyword and quoted, but "month" is not. This patch fixes that flaw.

This patch also fixes:

IMPALA-8051: Compute stats fails on a column with comment character in
name

The code uses the Hive lexical analyzer to check names. Since "#" and
"--" are comment characters, a name like "foo#" is parsed as "foo" which
does not need quotes, hence we don't quote "foo#", which causes issues.
Added a special check for "#" and "--" to resolve this issue.
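The two quoting fixes above can be sketched together as follows (illustrative Python only; the real implementation is the Java ToSqlUtils.getIdentSql() using the Hive lexer, and the keyword set here is a tiny assumed subset):

```python
# Tiny illustrative subset of the Hive/Impala keyword list.
HIVE_KEYWORDS = {"SELECT", "FROM", "ORDER", "MONTH", "YEAR", "KEY"}

def get_ident_sql(ident):
    # The original flaw: a case-sensitive keyword check caught "MONTH" but
    # not "month". Normalizing case before the lookup fixes that.
    # The second fix: identifiers containing comment characters ("#", "--")
    # would be silently truncated by the lexer, so quote them explicitly.
    needs_quotes = (
        ident.upper() in HIVE_KEYWORDS
        or "#" in ident
        or "--" in ident
    )
    return "`" + ident + "`" if needs_quotes else ident

print(get_ident_sql("month"))  # `month`
print(get_ident_sql("foo#"))   # `foo#`
print(get_ident_sql("col1"))   # col1
```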

Testing:

* Refactored getIdentSql() for easier testing.
* Added tests to the recently added ToSqlUtilsTest for this case and
  several others.
* Making this change caused the columns `month`, `year`, and `key` to be
  quoted when before they were not. Updated many tests as a result.
* Added a new identSql() function, for use in tests, to match the
  quoting that Impala uses, and to handle the wildcard, and multi-part
  names. Used this in ToSqlTest to handle the quoted names.
* PlannerTest emits statement SQL to the output file wrapped to 80
  columns and sometimes leaves trailing spaces at the end of the line.
  Some tools remove that trailing space, resulting in trivial file
  differences.  Fixed this to remove trailing spaces in order to simplify
  file comparisons.
* Tweaked the "In pipelines" output to avoid trailing spaces when no
  pipelines are listed.
* Reran all FE tests.

Change-Id: I06cc20b052a3a66535a171c36b4b31477c0ba6d0
Reviewed-on: http://gerrit.cloudera.org:8080/12009
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> ToSqlUtils does not correctly quote lower-case Hive keywords
> 
>
> Key: IMPALA-7905
> URL: https://issues.apache.org/jira/browse/IMPALA-7905
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Not sure yet how to reproduce this error via the shell, but here is the code 
> analysis.
> The {{ToSqlUtils}} class generates a {{CREATE TABLE}} statement, which uses a 
> method {{getIdentSql()}} to possibly quote a table or column name. This same 
> method is used in multiple places in the {{toSql()}} logic for various 
> statements.
> The comment for the method says:
> bq. returns an identifier lexable by Impala and Hive, possibly by enclosing 
> the original identifier in "`" quotes.
> To check for a Hive-compatible identifier, the code uses the Hive lexer:
> {code:java}
> HiveLexer hiveLexer = new HiveLexer(new ANTLRStringStream(ident));
> {code}
> A unit test shows that this logic fails to catch lower case keywords: 
> "select", say, while it does catch upper-case keywords: "SELECT".
> Checking the Hive source, it appears we're using the lexer wrong:
> {code:java}
> HiveLexerX lexer = new HiveLexerX(new ANTLRNoCaseStringStream(command));
> {code}
> The fix is simple: upper-case the symbol before using the Hive lexer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7698) Add centos/redhat 6/7 support to bootstrap_system.sh

2019-01-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751381#comment-16751381
 ] 

ASF subversion and git services commented on IMPALA-7698:
-

Commit 81d0bcb3c967bf76b6858b221416e6fcb863b187 in impala's branch 
refs/heads/master from Philip Zeyliger
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=81d0bcb ]

Support centos:7 for test-with-docker.

As a follow-on to IMPALA-7698, adds various incantations
so that centos:7 can build under test-with-docker.

The core issue is that the centos:7 image doesn't let you start sshd
(necessary for the HBase startup scripts, and probably could be worked
around) or postgresql (harder to work around) with systemctl, because
systemd isn't "running." To avoid this, we start them manually
with /usr/sbin/sshd and pg_ctl.

Change-Id: I7577949b6eaaa2239bcf0fadf64e1490c2106b08
Reviewed-on: http://gerrit.cloudera.org:8080/12139
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add centos/redhat 6/7 support to bootstrap_system.sh
> 
>
> Key: IMPALA-7698
> URL: https://issues.apache.org/jira/browse/IMPALA-7698
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> {{bootstrap_system.sh}} currently only works on Ubuntu. Making it work on 
> CentOS/Redhat would open the door to running automated tests on those 
> platforms more readily, including using {{test-with-docker}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8051) Compute stats fails on a column with comment character in name

2019-01-24 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751383#comment-16751383
 ] 

ASF subversion and git services commented on IMPALA-8051:
-

Commit 85a8b34645a46038fd217c03e64326b72d9669b5 in impala's branch 
refs/heads/master from Paul Rogers
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=85a8b34 ]

IMPALA-7905: Hive keywords not quoted for identifiers

Impala often generates SQL for statements using the toSql() call.
Generated SQL is often used during testing or when writing the query
plan. Impala keywords such as "create", when used as identifiers,
must be quoted:

SELECT `select`, `from` FROM `order` ...

The code in ToSqlUtils.getIdentSql() quotes the identifier if it is
an Impala or Hive keyword, or if it does not follow the identifier
pattern. The code uses the Hive lexer to detect a keyword. But, the
code contained a flaw: the lexer expects a case-insensitive input.
We provide a case sensitive input. As a result, "MONTH" is caught as a
Hive keyword and quoted, but "month" is not. This patch fixes that flaw.

This patch also fixes:

IMPALA-8051: Compute stats fails on a column with comment character in
name

The code uses the Hive lexical analyzer to check names. Since "#" and
"--" are comment characters, a name like "foo#" is parsed as "foo" which
does not need quotes, hence we don't quote "foo#", which causes issues.
Added a special check for "#" and "--" to resolve this issue.

Testing:

* Refactored getIdentSql() for easier testing.
* Added tests to the recently added ToSqlUtilsTest for this case and
  several others.
* Making this change caused the columns `month`, `year`, and `key` to be
  quoted when before they were not. Updated many tests as a result.
* Added a new identSql() function, for use in tests, to match the
  quoting that Impala uses, and to handle the wildcard, and multi-part
  names. Used this in ToSqlTest to handle the quoted names.
* PlannerTest emits statement SQL to the output file wrapped to 80
  columns and sometimes leaves trailing spaces at the end of the line.
  Some tools remove that trailing space, resulting in trivial file
  differences.  Fixed this to remove trailing spaces in order to simplify
  file comparisons.
* Tweaked the "In pipelines" output to avoid trailing spaces when no
  pipelines are listed.
* Reran all FE tests.

Change-Id: I06cc20b052a3a66535a171c36b4b31477c0ba6d0
Reviewed-on: http://gerrit.cloudera.org:8080/12009
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Compute stats fails on a column with comment character in name
> --
>
> Key: IMPALA-8051
> URL: https://issues.apache.org/jira/browse/IMPALA-8051
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> Problem - "compute stats" query executed on a table containing a special 
> character "#" in one of its columns is failing with below error:
> WARNINGS: AnalysisException: Syntax error in line 1:
> ...length(cola)), NDV(colb#) AS colb#, CAST(-1 as BIG...
>  ^
> Encountered: Unexpected character
> Expected: ADD, ALTER, AND, ARRAY, AS, ASC, BETWEEN, BIGINT, BINARY, 
> BLOCK_SIZE, BOOLEAN, CACHED, CASCADE, CHANGE, CHAR, COMMENT, COMPRESSION, 
> CROSS, DATE, DATETIME, DECIMAL, DEFAULT, DESC, DIV, REAL, DROP, ELSE, 
> ENCODING, END, FLOAT, FOLLOWING, FROM, FULL, GROUP, IGNORE, HAVING, ILIKE, 
> IN, INNER, INTEGER, IREGEXP, IS, JOIN, LEFT, LIKE, LIMIT, LOCATION, MAP, NOT, 
> NULL, NULLS, OFFSET, ON, OR, ORDER, PARTITION, PARTITIONED, PRECEDING, 
> PRIMARY, PURGE, RANGE, RECOVER, REGEXP, RENAME, REPLACE, RESTRICT, RIGHT, 
> RLIKE, ROW, ROWS, SELECT, SET, SMALLINT, SORT, STORED, STRAIGHT_JOIN, STRING, 
> STRUCT, TABLESAMPLE, TBLPROPERTIES, THEN, TIMESTAMP, TINYINT, TO, UNCACHED, 
> UNION, USING, VALUES, VARCHAR, WHEN, WHERE, WITH, COMMA, IDENTIFIER
> Steps to reproduce the issue -
> # Create a table containing special character in one of it columns from Hive. 
> For example:
> {code:sql}
> CREATE TABLE test_special_character (`id#` int);
> {code}
> # Execute "INVALIDATE METADATA test_special_character" from Impala.
> # Execute "COMPUTE STATS test_special_character" from Impala and it'll lead 
> to above mentioned error.
> Impala does not allow creating tables with columns containing special 
> characters, but Hive allows it by escaping them with back ticks (``). 
> However, Impala can still load the table's metadata and read from a column 
> containing a special character by escaping it with back ticks (``). For 
> example, the below query can be executed from Impala -
> {code:sql}
> select `id#` from test_special_character;
> {code}

[jira] [Resolved] (IMPALA-7832) Support IF NOT EXISTS in alter table add columns

2019-01-24 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya resolved IMPALA-7832.
--
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Support IF NOT EXISTS in alter table add columns
> 
>
> Key: IMPALA-7832
> URL: https://issues.apache.org/jira/browse/IMPALA-7832
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Fredy Wijaya
>Priority: Minor
>  Labels: ramp-up
> Fix For: Impala 3.2.0
>
>
> alter table  add [if not exists] columns (  [,  
> ...])
> would add the column only if a column of the same name does not already exist
> Probably worth checking out what other databases do in different situations, 
> eg. if the column already exists but with a different type, if "replace" is 
> used instead of "add", etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Created] (IMPALA-8110) Parquet stat filtering does not handle narrowed int types correctly

2019-01-24 Thread Csaba Ringhofer (JIRA)
Csaba Ringhofer created IMPALA-8110:
---

 Summary: Parquet stat filtering does not handle narrowed int types 
correctly
 Key: IMPALA-8110
 URL: https://issues.apache.org/jira/browse/IMPALA-8110
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Csaba Ringhofer


Impala can read int32 Parquet columns as tinyint/smallint SQL columns. If the 
value does not fit into the 8/16-bit signed range, the value overflows, e.g. 
writing 128 as int32 and then rereading it as int8 returns -128. This is 
expected as far as I understand, but min/max stat filtering does not handle 
this case correctly:

{code:sql}
create table tnarrow (i int) stored as parquet;
insert into tnarrow values (1), (201);
alter table tnarrow change column i i tinyint;
set PARQUET_READ_STATISTICS=0;
select * from tnarrow where i < 0;
-- returns 1 row: -56
set PARQUET_READ_STATISTICS=1;
-- returns 0 rows
{code}

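The narrowing overflow described above can be reproduced outside Impala (a minimal Python sketch of signed 8-bit wraparound; the -128 case is from the report, the helper name is mine):

```python
import ctypes

def narrow_to_int8(v):
    # Re-read an int32 value through a signed 8-bit type: values outside
    # [-128, 127] wrap via two's complement, e.g. 128 -> -128.
    return ctypes.c_int8(v).value

print(narrow_to_int8(128))  # -128
```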


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org




[jira] [Updated] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB

2019-01-24 Thread hakki (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hakki updated IMPALA-8109:
--
Description: 
When querying a partition containing gzip files, the query fails with the error 
below: 
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: 
hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: 
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz is a delimited 
text file bigger than 2 GB (approx. 2.4 GB); the uncompressed size is ~13 GB.

The impalad version is 2.12.0-cdh5.15.0.

  was:
When querying a partition containing gzip files, the query fails with the error 
below: 
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: 
hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: 
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has a 
size of bigger than 2 GB (approx: 2.4 GB)
The exact version is : 2.12.0-cdh5.15.0


> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0
>Reporter: hakki
>Priority: Minor
>
> When querying a partition containing gzip files, the query fails with the 
> error below: 
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: 
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: 
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz is a 
> delimited text file bigger than 2 GB (approx. 2.4 GB); the uncompressed size 
> is ~13 GB.
> The impalad version is 2.12.0-cdh5.15.0
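The negative seek offset in the error (-2147483648 is exactly INT32_MIN) is consistent with a file offset past 2 GB being truncated to a signed 32-bit integer somewhere in the read path. That is an inference, not something stated in the report, but the wraparound itself is easy to show:

```python
import ctypes

def as_int32(offset):
    # Truncate a file offset to a signed 32-bit integer, as a 32-bit
    # seek API would: 2**31 (just past 2 GB) wraps to INT32_MIN.
    return ctypes.c_int32(offset).value

print(as_int32(2**31))  # -2147483648
```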



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB

2019-01-24 Thread hakki (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hakki updated IMPALA-8109:
--
Description: 
When querying a partition containing gzip files, the query fails with the error 
below: 
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: 
hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: 
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has a 
size of bigger than 2 GB (approx: 2.4 GB)
The exact version is : 2.12.0-cdh5.15.0

  was:
When querying a partition containing gzip files, the query fails with the error 
below: 
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: 
hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: 
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has a 
size of bigger than 2 GB (approx: 2.4 GB)


> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0
>Reporter: hakki
>Priority: Minor
>
> When querying a partition containing gzip files, the query fails with the 
> error below: 
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: 
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: 
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has 
> a size of bigger than 2 GB (approx: 2.4 GB)
> The exact version is : 2.12.0-cdh5.15.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB

2019-01-24 Thread hakki (JIRA)
hakki created IMPALA-8109:
-

 Summary: Impala cannot read the gzip files bigger than 2 GB
 Key: IMPALA-8109
 URL: https://issues.apache.org/jira/browse/IMPALA-8109
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 2.12.0
Reporter: hakki


When querying a partition containing gzip files, the query fails with the error 
below: 
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: 
hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz: 
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has a 
size of bigger than 2 GB (approx: 2.4 GB)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Created] (IMPALA-8108) Impala query returns TIMESTAMP values in different types

2019-01-24 Thread Robbie Zhang (JIRA)
Robbie Zhang created IMPALA-8108:


 Summary: Impala query returns TIMESTAMP values in different types
 Key: IMPALA-8108
 URL: https://issues.apache.org/jira/browse/IMPALA-8108
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Robbie Zhang


When a timestamp's fractional seconds are all zeros (.000 or .00 or .0), the 
timestamp is displayed with no fraction of a second. For example:
{code:java}
select cast(ts as timestamp) from 
 (values 
 ('2019-01-11 10:40:18' as ts),
 ('2019-01-11 10:40:19.0'),
 ('2019-01-11 10:40:19.00'), 
 ('2019-01-11 10:40:19.000'),
 ('2019-01-11 10:40:19.'),
 ('2019-01-11 10:40:19.0'),
 ('2019-01-11 10:40:19.00'),
 ('2019-01-11 10:40:19.000'),
 ('2019-01-11 10:40:19.'),
 ('2019-01-11 10:40:19.0'),
 ('2019-01-11 10:40:19.1')
 ) t;{code}
The output is:
{code:java}
+-----------------------+
| cast(ts as timestamp) |
+-----------------------+
| 2019-01-11 10:40:18   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19   |
| 2019-01-11 10:40:19.1 |
+-----------------------+
{code}

As we can see, values of the same column are returned in two different types. 
The inconsistency breaks some downstream use cases. 

The reason is that Impala uses 
boost::posix_time::to_simple_string(time_duration) to convert the timestamp to 
a string, and to_simple_string() removes fractional seconds if they are all 
zeros. Perhaps we can append ".0" if the length of the string is 8 (HH:MM:SS).

For now we can work around it by using from_timestamp(ts, '-mm-dd 
hh:mm.ss.s') to unify the output (convert to string), or by using 
millisecond(ts) to get the fractional seconds.
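The suggested fix (append ".0" when the formatted time is plain HH:MM:SS) can be sketched in a few lines (illustrative Python; the actual code path is C++ around to_simple_string(), and the function name here is mine):

```python
def normalize_time(t):
    # If to_simple_string() dropped an all-zero fraction, the time part is
    # exactly 8 characters (HH:MM:SS); append ".0" so every row of the
    # column comes back in a consistent shape.
    return t + ".0" if len(t) == 8 else t

print(normalize_time("10:40:19"))    # 10:40:19.0
print(normalize_time("10:40:19.1"))  # 10:40:19.1
```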



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org




[jira] [Created] (IMPALA-8107) Support EXEC_TIME_LIMIT_S in resource pool setting

2019-01-24 Thread Quanlong Huang (JIRA)
Quanlong Huang created IMPALA-8107:
--

 Summary: Support EXEC_TIME_LIMIT_S in resource pool setting
 Key: IMPALA-8107
 URL: https://issues.apache.org/jira/browse/IMPALA-8107
 Project: IMPALA
  Issue Type: New Feature
Reporter: Quanlong Huang


Timeout limit should be different for different kinds of queries. For example, 
a resource pool for ad-hoc queries may set EXEC_TIME_LIMIT_S to 60s, while a 
resource pool for building pre-aggregations or other ETL may need a larger 
EXEC_TIME_LIMIT_S, like 30 minutes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org




[jira] [Work started] (IMPALA-4018) Add support for SQL:2016 datetime templates/patterns/masks to CAST(... AS ... FORMAT <template>)

2019-01-24 Thread Gabor Kaszab (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-4018 started by Gabor Kaszab.

> Add support for SQL:2016 datetime templates/patterns/masks to CAST(... AS ... 
> FORMAT <template>)
> 
>
> Key: IMPALA-4018
> URL: https://issues.apache.org/jira/browse/IMPALA-4018
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Affects Versions: Impala 2.2.4
>Reporter: Greg Rahn
>Assignee: Gabor Kaszab
>Priority: Critical
>  Labels: ansi-sql, compatibility, sql-language
>
> *Summary*
> The format masks/templates for currently are implemented using the [Java 
> SimpleDateFormat 
> patterns|http://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html],
>  and although this is what Hive has implemented, it is not what most standard 
> SQL systems implement.  For example see 
> [Vertica|https://my.vertica.com/docs/7.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Formatting/TemplatePatternsForDateTimeFormatting.htm],
>  
> [Netezza|http://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_ntz_sql_extns_templ_patterns_date_time_conv.html],
>   
> [Oracle|https://docs.oracle.com/database/121/SQLRF/sql_elements004.htm#SQLRF00212],
>  and 
> [PostgreSQL|https://www.postgresql.org/docs/9.5/static/functions-formatting.html#FUNCTIONS-FORMATTING-DATETIME-TABLE].
>  
> *Examples of incompatibilities*
> {noformat}
> -- PostgreSQL/Netezza/Vertica/Oracle
> select to_timestamp('May 15, 2015 12:00:00', 'mon dd, yyyy hh:mi:ss');
> -- Impala
> select to_timestamp('May 15, 2015 12:00:00', 'MMM dd, yyyy HH:mm:ss');
> -- PostgreSQL/Netezza/Vertica/Oracle
> select to_timestamp('2015-02-14 20:19:07','yyyy-mm-dd hh24:mi:ss');
> -- Impala
> select to_timestamp('2015-02-14 20:19:07','yyyy-MM-dd HH:mm:ss');
> -- Vertica/Oracle
> select to_timestamp('2015-02-14 20:19:07.123456','yyyy-mm-dd hh24:mi:ss.ff');
> -- Impala
> select to_timestamp('2015-02-14 20:19:07.123456','yyyy-MM-dd HH:mm:ss.SSSSSS');
> {noformat}
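The gap between the two pattern dialects can be bridged mechanically. A hedged Python sketch of such a translation follows; the mapping table covers only the tokens appearing in the examples above and is not a complete or official correspondence:

```python
# Assumed mapping from SQL-standard tokens to Java SimpleDateFormat tokens
# (illustrative subset only).
STANDARD_TO_SDF = {
    "YYYY": "yyyy", "MM": "MM", "DD": "dd",
    "HH24": "HH", "HH12": "hh", "MI": "mm", "SS": "ss",
    "MON": "MMM", "FF": "SSSSSS",
}

def translate(fmt: str) -> str:
    """Greedy longest-match rewrite of a standard-style format string."""
    tokens = sorted(STANDARD_TO_SDF, key=len, reverse=True)
    out, i = [], 0
    while i < len(fmt):
        for tok in tokens:
            if fmt.upper().startswith(tok, i):
                out.append(STANDARD_TO_SDF[tok])
                i += len(tok)
                break
        else:
            out.append(fmt[i])  # delimiter or unknown char: pass through
            i += 1
    return "".join(out)

print(translate("yyyy-mm-dd hh24:mi:ss"))  # -> yyyy-MM-dd HH:mm:ss
```

Longest match matters here: HH24 must be tried before a bare HH, otherwise "HH24" would translate incorrectly.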
> *Considerations*
> Because this is a change in default behavior for to_timestamp(), if possible, 
> having a feature flag to revert to the legacy Java SimpleDateFormat patterns 
> should be strongly considered.  This would allow users to choose the behavior 
> they desire and scope it to a session if need be.
> SQL:2016 defines the following datetime templates
> {noformat}
> <datetime template> ::=
>   { <datetime template element> }...
> <datetime template element> ::=
>   <datetime template field>
>   | <datetime template delimiter>
> <datetime template field> ::=
>   <datetime template year>
>   | <datetime template rounded year>
>   | <datetime template month>
>   | <datetime template day of month>
>   | <datetime template day of year>
>   | <datetime template 12-hour>
>   | <datetime template 24-hour>
>   | <datetime template minute>
>   | <datetime template second of minute>
>   | <datetime template second of day>
>   | <datetime template fraction>
>   | <datetime template am/pm>
>   | <datetime template time zone hour>
>   | <datetime template time zone minute>
> <datetime template delimiter> ::=
>   <minus sign>
>   | <period>
>   | <solidus>
>   | <comma>
>   | <apostrophe>
>   | <semicolon>
>   | <colon>
>   | <space>
> <datetime template year> ::=
>   YYYY | YYY | YY | Y
> <datetime template rounded year> ::=
>   RRRR | RR
> <datetime template month> ::=
>   MM
> <datetime template day of month> ::=
>   DD
> <datetime template day of year> ::=
>   DDD
> <datetime template 12-hour> ::=
>   HH | HH12
> <datetime template 24-hour> ::=
>   HH24
> <datetime template minute> ::=
>   MI
> <datetime template second of minute> ::=
>   SS
> <datetime template second of day> ::=
>   SSSSS
> <datetime template fraction> ::=
>   FF1 | FF2 | FF3 | FF4 | FF5 | FF6 | FF7 | FF8 | FF9
> <datetime template am/pm> ::=
>   A.M. | P.M.
> <datetime template time zone hour> ::=
>   TZH
> <datetime template time zone minute> ::=
>   TZM
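The template fields above can be recognized with a simple longest-match scanner. A hedged Python sketch (the field list is taken from the grammar, while the delimiter set and error handling are simplified):

```python
# Field tokens from the SQL:2016 datetime template grammar (subset listed above).
FIELDS = ["YYYY", "YYY", "YY", "Y", "RRRR", "RR", "MM", "DD", "DDD",
          "HH24", "HH12", "HH", "MI", "SSSSS", "SS",
          "FF1", "FF2", "FF3", "FF4", "FF5", "FF6", "FF7", "FF8", "FF9",
          "A.M.", "P.M.", "TZH", "TZM"]
DELIMS = set("-./,';: ")

def tokenize(template: str):
    """Split a datetime template into field and delimiter tokens,
    always preferring the longest field match (HH24 before HH, etc.)."""
    ordered = sorted(FIELDS, key=len, reverse=True)
    toks, i = [], 0
    while i < len(template):
        for f in ordered:
            if template.startswith(f, i):
                toks.append(f)
                i += len(f)
                break
        else:
            if template[i] in DELIMS:
                toks.append(template[i])
                i += 1
            else:
                raise ValueError(f"unexpected character {template[i]!r}")
    return toks

print(tokenize("YYYY-MM-DD HH24:MI:SS.FF6"))
```

A real implementation would also have to resolve ambiguities such as MI (minute) versus MM (month) by position, which this sketch sidesteps.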
> SQL:2016 also introduced the FORMAT clause for CAST which is the standard way 
> to do string <> datetime conversions
> {noformat}
> <cast specification> ::=
>   CAST <left paren> <cast operand>
>    AS <cast target>
>   [ FORMAT <cast template> ]
>   <right paren>
> <cast operand> ::=
>   <value expression>
>   | <implicitly typed value specification>
> <cast target> ::=
>   <domain name>
>   | <data type>
> <cast template> ::=
>   <character string literal>
> For example:
> {noformat}
> CAST(<datetime> AS <character string type> [FORMAT <template>])
> CAST(<character string> AS <datetime type> [FORMAT <template>])
> cast(dt as string format 'DD-MM-YYYY')
> cast('01-05-2017' as date format 'DD-MM-YYYY')
> {noformat}
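The intended semantics of a cast with a format like 'DD-MM-YYYY' can be illustrated with Python's strptime. The token mapping DD -> %d, MM -> %m, YYYY -> %Y is an assumption made for this sketch, not text from the standard:

```python
from datetime import datetime, date

def cast_to_date(value: str, template: str) -> date:
    """Hypothetical illustration of
    cast('01-05-2017' as date format 'DD-MM-YYYY')
    expressed via strptime; longest tokens are replaced first."""
    strptime_fmt = (template.replace("YYYY", "%Y")
                            .replace("DD", "%d")
                            .replace("MM", "%m"))
    return datetime.strptime(value, strptime_fmt).date()

print(cast_to_date("01-05-2017", "DD-MM-YYYY"))  # -> 2017-05-01
```

The string-to-datetime direction shown here is the one the incompatibility examples earlier in this issue exercise; the datetime-to-string direction would use strftime analogously.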



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org