[jira] [Commented] (IMPALA-5384) Simplify coordinator locking protocol

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480165#comment-16480165
 ] 

ASF subversion and git services commented on IMPALA-5384:
-

Commit c1c122a10177920903009420d2faac673d867c4b in impala's branch 
refs/heads/master from [~dhecht]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c1c122a ]

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

This is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.
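
The "exactly once" transition described above can be sketched as follows.
This is only an illustration in Python, not the Coordinator's C++ code: the
state names, the class and the method names are all hypothetical, and the
real implementation may structure its locking differently.

```python
import enum
import threading


class ExecState(enum.Enum):
    # Hypothetical state names: one executing state, several terminal ones.
    EXECUTING = "executing"
    RETURNED_RESULTS = "returned_results"
    CANCELLED = "cancelled"
    ERROR = "error"


class CoordinatorSketch:
    """Illustrative only: a single place that moves to a terminal state."""

    def __init__(self):
        self._lock = threading.Lock()
        self.state = ExecState.EXECUTING
        self.terminal_transitions = 0

    def set_terminal_state(self, new_state):
        """Try to leave EXECUTING; return True only for the winning caller."""
        if new_state is ExecState.EXECUTING:
            raise ValueError("EXECUTING is not a terminal state")
        with self._lock:
            if self.state is not ExecState.EXECUTING:
                # Another caller already chose the terminal state.
                return False
            self.state = new_state
        # Only the winning caller gets here, so this runs once per object.
        self._handle_terminal_transition()
        return True

    def _handle_terminal_transition(self):
        # Stand-in for cleanup work such as cancelling backends.
        self.terminal_transitions += 1
```

Competing callers (client cancellation, an error report, end of stream) can
all call set_terminal_state(); the first one wins and the rest become no-ops.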

Derived from a patch for IMPALA-5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed
- loop query_test & failure for 12 hours with no dchecks or crashes
  (This had previously reproduced IMPALA-7030 and IMPALA-7033 with
  the previous version of this change).

Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Reviewed-on: http://gerrit.cloudera.org:8080/10440
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins 


> Simplify coordinator locking protocol
> -
>
> Key: IMPALA-5384
> URL: https://issues.apache.org/jira/browse/IMPALA-5384
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 2.9.0
>Reporter: Marcel Kornacker
>Assignee: Dan Hecht
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The coordinator has a central lock (lock_) which is used very liberally to 
> synchronize state changes that don't need to be synchronized, creating a 
> concurrency bottleneck.
> Also, the coordinator contains a number of data structures related to INSERT 
> finalization that don't need to be part of and synchronized with the rest of 
> the coordinator state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6941) Allow loading more text scanner plugins

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480164#comment-16480164
 ] 

ASF subversion and git services commented on IMPALA-6941:
-

Commit f4f28d310c08b97171a50147e283c1153fc57679 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=f4f28d3 ]

IMPALA-6941: load more text scanner compression plugins

Add extensions for LZ4 and ZSTD (which are supported by Hadoop).
Even without a plugin this results in better behaviour because
we don't try to treat the files with unknown extensions as
uncompressed text.

Also allow loading tables containing files with unsupported
compression types. Before this change, the behaviour was weird when
we knew of the file extension but didn't support querying the table -
the catalog would load the table but the impalad would fail
processing the catalog update. The simplest way to fix it
is to just allow loading the tables.

Similarly, make the "LOAD DATA" operation more permissive -
we can copy files into a directory even if we can't
decompress them.

Switch to always checking the plugin version - running a mismatched
plugin is inherently unsafe.

Testing:
Positive case where LZO is loaded is exercised. Added
coverage for negative case where LZO is disabled.

Fixed test gaps:
* Querying LZO table with LZO plugin not available.
* Interacting with tables with known but unsupported text
  compressions.
* Querying files with unknown compression suffixes (which are
  treated as uncompressed text).

Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Reviewed-on: http://gerrit.cloudera.org:8080/10165
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Allow loading more text scanner plugins
> ---
>
> Key: IMPALA-6941
> URL: https://issues.apache.org/jira/browse/IMPALA-6941
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> It would be nice if Impala supported loading plugins for scanning additional 
> text formats aside from LZO - the current logic is fairly specialized but 
> could easily be extended to load libraries for codecs like LZ4 and ZSTD if 
> available. It's kind of weird that we only support that one format.
> This might help a bit with IMPALA-6941 and IMPALA-3898 since we could test 
> the plugin-loading mechanism without relying on the external Impala-lzo 
> codebase.






[jira] [Commented] (IMPALA-7030) crash in impala::PartitionedAggregationNode::ProcessBatchNoGrouping

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480166#comment-16480166
 ] 

ASF subversion and git services commented on IMPALA-7030:
-

Commit c1c122a10177920903009420d2faac673d867c4b in impala's branch 
refs/heads/master from [~dhecht]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c1c122a ]

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

This is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA-5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed
- loop query_test & failure for 12 hours with no dchecks or crashes
  (This had previously reproduced IMPALA-7030 and IMPALA-7033 with
  the previous version of this change).

Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Reviewed-on: http://gerrit.cloudera.org:8080/10440
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins 


>  crash in impala::PartitionedAggregationNode::ProcessBatchNoGrouping
> 
>
> Key: IMPALA-7030
> URL: https://issues.apache.org/jira/browse/IMPALA-7030
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Michael Brown
>Assignee: Dan Hecht
>Priority: Blocker
> Attachments: crash.dump.gz, gdb.out.gz, hs_err_pid1621.log.gz
>
>
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2176/
> {noformat}
> #0  0x7fc896430428 in __GI_raise (sig=sig@entry=6) at 
> ../sysdeps/unix/sysv/linux/raise.c:54
> #1  0x7fc89643202a in __GI_abort () at abort.c:89
> #2  0x7fc899379c59 in os::abort(bool) (dump_core=) at 
> /build/openjdk-8-wnL82d/openjdk-8-8u171-b11/src/hotspot/src/os/linux/vm/os_linux.cpp:1509
> #3  0x7fc89952f047 in VMError::report_and_die() 
> (this=this@entry=0x7fc7e90287d0) at 
> /build/openjdk-8-wnL82d/openjdk-8-8u171-b11/src/hotspot/src/share/vm/utilities/vmError.cpp:1060
> #4  0x7fc8993836ef in JVM_handle_linux_signal(int, siginfo_t*, void*, 
> int) (sig=sig@entry=11, info=info@entry=0x7fc7e9028a70, 
> ucVoid=ucVoid@entry=0x7fc7e9028940, 
> abort_if_unrecognized=abort_if_unrecognized@entry=1)
> at 
> /build/openjdk-8-wnL82d/openjdk-8-8u171-b11/src/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:541
> #5  0x7fc899376d88 in signalHandler(int, siginfo_t*, void*) (sig=11, 
> info=0x7fc7e9028a70, uc=0x7fc7e9028940) at 
> /build/openjdk-8-wnL82d/openjdk-8-8u171-b11/src/hotspot/src/os/linux/vm/os_linux.cpp:4432
> #6  0x7fc8967d6390 in  () at 
> /lib/x86_64-linux-gnu/libpthread.so.0
> #7  0x7fc8584ca000 in 
> impala::PartitionedAggregationNode::ProcessBatchNoGrouping(impala::RowBatch*) 
> [clone .1] ()
> #8  0x02cd5bcf in 
> impala::PartitionedAggregationNode::Open(impala::RuntimeState*) 
> (this=0x15795200, state=0x14e95d40) at 
> /home/ubuntu/Impala/be/src/exec/partitioned-aggregation-node.cc:314
> #9  0x01c94775 in impala::FragmentInstanceState::Open() 
> (this=0x1cc19e00) at 
> /home/ubuntu/Impala/be/src/runtime/fragment-instance-state.cc:268
> #10 0x01c91faf in impala::FragmentInstanceState::Exec() 
> (this=0x1cc19e00) at 
> /home/ubuntu/Impala/be/src/runtime/fragment-instance-state.cc:81
> #11 0x01ca175b in 
> impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) 
> (this=0x3a1f6000, fis=0x1cc19e00) at 
> /home/ubuntu/Impala/be/src/runtime/query-state.cc:401
> #12 0x01c9ffce in impala::QueryState::::operator()(void) 
> const (__closure=0x7fc7e9029ce8) at 
> /home/ubuntu/Impala/be/src/runtime/query-state.cc:341
> #13 0x01ca2479 in 
> boost::detail::function::void_function_obj_invoker0,
>  void>::invoke(boost::detail::function::function_buffer &) 
> (function_obj_ptr=...)
> at 
> /home/ubuntu/Impala/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
> #14 0x01bd9e58 in boost::function0::operator()() const 
> (this=0x7fc7e9029ce0) at 
> /home/ubuntu/Impala/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
> #15 0x01ec50a9 in 

[jira] [Commented] (IMPALA-3833) Fix invalid data handling in Sequence and RCFile scanners

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480168#comment-16480168
 ] 

ASF subversion and git services commented on IMPALA-3833:
-

Commit ab75dd12e49100f153911bef87a9dab810cf9b58 in impala's branch 
refs/heads/master from [~pranay_singh]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ab75dd1 ]

IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

Introduced new error message when scanning a corrupt Sequence or RCFile.
Added new checks to detect buffer overrun while handling Sequence or RCFile.

Testing:
  a) Made changes to fuzz test for RCFile/Sequence file, ran fuzz test in a loop
  with 200 iterations without failure.

  b) Ran exhaustive test on the changes without failure.

Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Reviewed-on: http://gerrit.cloudera.org:8080/8936
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Fix invalid data handling in Sequence and RCFile scanners
> -
>
> Key: IMPALA-3833
> URL: https://issues.apache.org/jira/browse/IMPALA-3833
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.7.0
>Reporter: Tim Armstrong
>Assignee: Pranay Singh
>Priority: Critical
>  Labels: crash, downgraded
>
> The fuzz testing found multiple crashes in sequence and RCFile scanners. 
> https://gerrit.cloudera.org/#/c/3448/
> I haven't triaged the crashes, but filing this issue to track them.






[jira] [Commented] (IMPALA-6827) airlines_parquet data not available in dropbox

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480160#comment-16480160
 ] 

ASF subversion and git services commented on IMPALA-6827:
-

Commit 04add98a341f3ae8e1e4e0613c82188eec5fc0d9 in impala's branch 
refs/heads/2.x from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=04add98 ]

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Reviewed-on: http://gerrit.cloudera.org:8080/10393
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins 


> airlines_parquet data not available in dropbox
> --
>
> Key: IMPALA-6827
> URL: https://issues.apache.org/jira/browse/IMPALA-6827
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: sathishkumar paramasivam
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Hi,
>  
> I am teaching myself Impala and am trying to download the 
> airlines_parquet dataset as described in the Impala user guide:
>  
> wget -O airlines_parquet.tar.gz https://www.dropbox.com/s/ol9x51tqp6cv4yc/airlines_parquet.tar.gz?dl=0
>  
> but the download does not complete; only an HTML file is saved, so I am not 
> able to run the tar command.
>  
> Could you please help with this?






[jira] [Commented] (IMPALA-6995) False-positive DCHECK when converting whitespace-only strings to timestamp

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480155#comment-16480155
 ] 

ASF subversion and git services commented on IMPALA-6995:
-

Commit 88facf3fbc56a62a383a4be6cd7a5ff77c0c9589 in impala's branch 
refs/heads/2.x from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=88facf3 ]

IMPALA-6995: avoid DCHECK in TimestampParser::Parse()

The bug was that the string's length is checked before trimming leading
and trailing spaces instead of afterwards. The bug has been present for
a long time but couldn't hit a DCHECK until recently.
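
The ordering problem can be illustrated with a small Python sketch. It is
not the TimestampParser code; both function names are hypothetical and the
only point is the order of the length check and the trim.

```python
def prepare_buggy(text):
    """Checks the length BEFORE trimming, as in the bug described above."""
    if len(text) == 0:
        return None          # reject empty input
    # A whitespace-only string passed the check, so this can return "".
    return text.strip(" ")


def prepare_fixed(text):
    """Trims first, then checks the length of what is left."""
    trimmed = text.strip(" ")
    if len(trimmed) == 0:
        return None          # whitespace-only input is rejected here
    return trimmed
```

With the buggy order, ' ' reaches the parsing step as a zero-length string,
which is the situation a later length assertion would trip on.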

Testing:
Added some backend tests that reproduce the crash.

Change-Id: I02a18ffd8893fe74f5830144300f745ce31477b1
Reviewed-on: http://gerrit.cloudera.org:8080/10349
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> False-positive DCHECK when converting whitespace-only strings to timestamp
> --
>
> Key: IMPALA-6995
> URL: https://issues.apache.org/jira/browse/IMPALA-6995
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: crash
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> {noformat}
> select cast(' ' as timestamp);
> {noformat}
> {noformat}
> F0508 14:32:07.245255 11824 timestamp-parse-util.cc:241] Check failed: 
> dt_ctx->fmt_len > 0 (0 vs. 0) 
> *** Check failure stack trace: ***
> @  0x428956d  google::LogMessage::Fail()
> @  0x428ae12  google::LogMessage::SendToLog()
> @  0x4288f47  google::LogMessage::Flush()
> @  0x428c50e  google::LogMessageFatal::~LogMessageFatal()
> @  0x1c3f485  impala::TimestampParser::ParseFormatTokensByStr()
> @  0x1c40553  impala::TimestampParser::Parse()
> @  0x1c4712a  impala::TimestampValue::Parse()
> @  0x2e5d8fa  impala::CastFunctions::CastToTimestampVal()
> @  0x2e45322  impala::ScalarFnCall::InterpretEval<>()
> @  0x2e27de5  impala::ScalarFnCall::GetTimestampVal()
> @  0x2de72de  impala::ScalarExprEvaluator::GetValue()
> @  0x2de6e69  impala::ScalarExprEvaluator::GetValue()
> @  0x1d1dbbf  
> Java_org_apache_impala_service_FeSupport_NativeEvalExprsWithoutRow
> @ 0x7fb7cc1d07e8  (unknown)
> Picked up JAVA_TOOL_OPTIONS: 
> -agentlib:jdwp=transport=dt_socket,address=3,server=y,suspend=n 
> Wrote minidump to 
> /home/tarmstrong/Impala/incubator-impala/logs/cluster/minidumps/impalad/42afc7f9-5b4a-4ed7-b34ad782-d7904747.dmp
> {noformat}
> It seems to work fine on a release build.






[jira] [Commented] (IMPALA-7000) Wrong info about Impala dedicated executors

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480161#comment-16480161
 ] 

ASF subversion and git services commented on IMPALA-7000:
-

Commit 1440401bb867eb35440ea00dadded4ae050e41eb in impala's branch 
refs/heads/2.x from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1440401 ]

IMPALA-7000: [DOCS] Correct info about dedicated executors

Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Reviewed-on: http://gerrit.cloudera.org:8080/10357
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Wrong info about Impala dedicated executors
> ---
>
> Key: IMPALA-7000
> URL: https://issues.apache.org/jira/browse/IMPALA-7000
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 2.12.0
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The following is not correct.
> "Then, you specify that the other hosts act as executors but not 
> coordinators. These hosts do not communicate with the statestored daemon or 
> process the final result sets from queries. You cannot connect to these hosts 
> through clients such as impala-shell or business intelligence tools."
> Executors still communicate with the statestore for other topics (membership, 
> admission control, etc.). The only part they don't get from the statestore is 
> the metadata topic.






[jira] [Commented] (IMPALA-7025) PlannerTest.testTableSample has wrong mem-reservation

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480153#comment-16480153
 ] 

ASF subversion and git services commented on IMPALA-7025:
-

Commit 14c715c02bdb2285f14e3defc55dae6cdc9acd6b in impala's branch 
refs/heads/2.x from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=14c715c ]

IMPALA-7025: ignore resources in some planner tests

The issue was that the tablesample test verified the mem-estimate
number, which depends on file sizes, which can vary slightly between
data loads.

Instead of trying to tweak the test to avoid the issue, provide
a mechanism to ignore the exact values of resources in planner tests
where they are not significant.
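
One way such a mechanism could work is to mask the resource values before
comparing actual and expected plans. The Python sketch below is only an
assumption about the general idea: the commit message does not describe the
actual planner-test change, and the regex, the "<ignored>" placeholder and
the function names are hypothetical.

```python
import re

# Hypothetical: the three resource fields that appear in the plan output.
RESOURCE_RE = re.compile(
    r"\b(mem-estimate|mem-reservation|thread-reservation)=\S+")


def mask_resources(plan_text):
    """Replace exact resource values so they no longer affect comparison."""
    return RESOURCE_RE.sub(r"\1=<ignored>", plan_text)


def plans_match(actual, expected, ignore_resources=False):
    """Compare two plan strings, optionally ignoring resource values."""
    if ignore_resources:
        actual, expected = mask_resources(actual), mask_resources(expected)
    return actual == expected
```

Fields that are not masked, such as the partition count, still have to match
exactly, which is in line with the manual checks listed under Testing.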

Testing:
Manually modified some values in tablesample.test, made sure that the
test still passed. Manually modified the partition count in the
expected output, made sure that the test failed.

Change-Id: I91e3e416ec6242fbf22d9f566fdd1ce225cb16ac
Reviewed-on: http://gerrit.cloudera.org:8080/10410
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> PlannerTest.testTableSample has wrong mem-reservation
> -
>
> Key: IMPALA-7025
> URL: https://issues.apache.org/jira/browse/IMPALA-7025
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Seen once on master exhaustive:
> {noformat}
> Error Message
> Section PLAN of query:
> select id from functional_parquet.alltypes tablesample system(10) 
> repeatable(1234)
> Actual does not match expected result:
> F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=16.00MB mem-reservation=8.00KB 
> thread-reservation=2
> ^^^
> PLAN-ROOT SINK
> |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> |
> 00:SCAN HDFS [functional_parquet.alltypes]
>partitions=3/24 files=3 size=23.70KB
>stored statistics:
>  table: rows=unavailable size=unavailable
>  partitions: 0/24 rows=unavailable
>  columns: unavailable
>extrapolated-rows=disabled max-scan-range-rows=unavailable
>mem-estimate=16.00MB mem-reservation=8.00KB thread-reservation=1
>tuple-ids=0 row-size=4B cardinality=unavailable
> Expected:
> F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=16.00MB mem-reservation=16.00KB 
> thread-reservation=2
> PLAN-ROOT SINK
> |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> |
> 00:SCAN HDFS [functional_parquet.alltypes]
>partitions=3/24 files=3 size=24.23KB
>stored statistics:
>  table: rows=unavailable size=unavailable
>  partitions: 0/24 rows=unavailable
>  columns: unavailable
>extrapolated-rows=disabled max-scan-range-rows=unavailable
>mem-estimate=16.00MB mem-reservation=16.00KB thread-reservation=1
>tuple-ids=0 row-size=4B cardinality=unavailable{noformat}
> This succeeded on the next build, so it is flaky and might not recur.
>  






[jira] [Commented] (IMPALA-5842) Write page index in Parquet files

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480162#comment-16480162
 ] 

ASF subversion and git services commented on IMPALA-5842:
-

Commit 5f9641043aed8590cad37f003921c462cda934af in impala's branch 
refs/heads/2.x from [~boroknagyz]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5f96410 ]

IMPALA-5842: Write page index in Parquet files

This commit builds on the previous work of
Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/

The commit implements the write path of PARQUET-922:
"Add column indexes to parquet.thrift". As specified in the
parquet-format, Impala writes the page indexes just before
the footer. This allows much more efficient page filtering
than using the same information from the 'statistics' field
of DataPageHeader.

I updated Pooja's python tests as well.

Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Reviewed-on: http://gerrit.cloudera.org:8080/9693
Reviewed-by: Zoltan Borok-Nagy 
Tested-by: Impala Public Jenkins 


> Write page index in Parquet files
> -
>
> Key: IMPALA-5842
> URL: https://issues.apache.org/jira/browse/IMPALA-5842
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Lars Volker
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: parquet
>
> Once PARQUET-922 has been resolved, we should start writing page indices to 
> Parquet files.






[jira] [Commented] (IMPALA-6997) Query execution should notice UDF MemLimitExceeded errors more quickly

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480152#comment-16480152
 ] 

ASF subversion and git services commented on IMPALA-6997:
-

Commit 8e5c18c3b789da8208611e77cd25899be78d4c8e in impala's branch 
refs/heads/2.x from [~joemcdonnell]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=8e5c18c ]

IMPALA-6997: Avoid redundant dumping in SetMemLimitExceeded()

When a UDF hits a MemLimitExceeded, the query does not
immediately abort. Instead, UDFs rely on the caller
checking the query_status_ periodically. This means that
on some codepaths, UDFs can call SetMemLimitExceeded()
many times (e.g. once per row) before the query fragment
exits.

RuntimeState::SetMemLimitExceeded() currently constructs
a MemLimitExceeded Status and dumps it for each call, even
if the query has already hit an error. This is expensive
and can delay a fragment from exiting when UDFs are
repeatedly hitting MemLimitExceeded.

This changes SetMemLimitExceeded() to avoid dumping if
the query_status_ is already not ok.
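
A minimal Python sketch of that early-out follows. It is not the
RuntimeState code: the class, the counter and the status string are
hypothetical, and None stands in for an OK status.

```python
class RuntimeStateSketch:
    """Illustrative only: skip the expensive dump once an error is set."""

    def __init__(self):
        self.query_status = None   # None stands for "ok"
        self.dump_count = 0        # counts the expensive dumps performed

    def _build_mem_limit_status(self):
        # Stand-in for constructing and logging the detailed error status.
        self.dump_count += 1
        return "Memory limit exceeded"

    def set_mem_limit_exceeded(self):
        # Check the status first: if the query already failed, do nothing.
        if self.query_status is not None:
            return self.query_status
        self.query_status = self._build_mem_limit_status()
        return self.query_status
```

Calling set_mem_limit_exceeded() once per row for a whole batch then pays
for the dump only on the first call.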

Change-Id: I92b87f370a68a2f695ebbc2520a98dd143730701
Reviewed-on: http://gerrit.cloudera.org:8080/10364
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Query execution should notice UDF MemLimitExceeded errors more quickly
> --
>
> Key: IMPALA-6997
> URL: https://issues.apache.org/jira/browse/IMPALA-6997
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
>
> When a UDF hits a memory limit, it calls RuntimeState::SetMemLimitExceeded() 
> which sets the query status, but it has no way of returning status directly. 
> It relies on the caller checking status periodically.
> HdfsTableSink::Send() checks for errors by calling 
> RuntimeState::CheckQueryState() once at the beginning. If it is evaluating a 
> UDF and that UDF hits the memory limit, it will need to process the whole 
> RowBatch before it aborts the query. This could be 1024 rows and each row may 
> hit a memory limit in that UDF. Other locations that process UDFs may be 
> processing considerably more rows.
> There are two general approaches:
>  # Code locations should check for status more frequently and thus abort 
> faster after a RuntimeState::SetMemLimitExceeded() call.
>  # RuntimeState::SetMemLimitExceeded() should be substantially cheaper, 
> allowing the rows to be processed faster.
> RuntimeState::SetMemLimitExceeded() currently calls 
> MemTracker::MemLimitExceeded() unconditionally. It then checks to see if it 
> should update query_status_ (i.e. query_status_ is currently ok). Then it 
> logs this error. This is wasteful, because MemTracker::MemLimitExceeded() is 
> not a cheap function, and this is flooding the log for each row. 
> RuntimeState::SetMemLimitExceeded() should check status before running 
> MemTracker::MemLimitExceeded(). If query_status_ is already not ok, it can 
> avoid the cost of the dump and logging.






[jira] [Commented] (IMPALA-7043) Failure in HBase splitting should not fail dataload

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480159#comment-16480159
 ] 

ASF subversion and git services commented on IMPALA-7043:
-

Commit 99e379dc57e561f476b3d35e5c52c0ae0f8ac767 in impala's branch 
refs/heads/2.x from [~joemcdonnell]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=99e379d ]

IMPALA-7043: HBase split failure should not fail dataload

HBase splitting can fail due to changes in HBase code. It
is useful to still do tests even if HBase splitting failed.
As it is today, buildall.sh will abort if
create-load-data.sh's invocation of split-hbase.sh fails.
No tests run, even though the HBase splitting affects only
a small portion of our tests.

This changes create-load-data.sh to keep going with
dataload if HBase splitting fails. It outputs the same
errors to the log as it would before this change.
It adds a message to explain that it is ignoring
the failure and there may be related test failures.
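
The "log it and keep going" pattern can be sketched as below. The real
change is in the create-load-data.sh shell script; this Python version is
only an illustration, and the function name, logger name and messages are
hypothetical.

```python
import logging
import subprocess

log = logging.getLogger("dataload-sketch")


def run_optional_step(cmd):
    """Run a step whose failure should be reported but not abort the load."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if result.returncode != 0:
        # Keep the step's own output in the log, as before the change.
        log.error(result.stdout)
        # Explain that the failure is being ignored on purpose.
        log.warning("Ignoring failure of %s; related tests may fail", cmd)
        return False
    return True
```

The caller can record the return value but carries on with the remaining
dataload steps either way.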

Change-Id: I7497fe8c9f1655a34b2743462d8b7248eb94554e
Reviewed-on: http://gerrit.cloudera.org:8080/10437
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 


> Failure in HBase splitting should not fail dataload
> ---
>
> Key: IMPALA-7043
> URL: https://issues.apache.org/jira/browse/IMPALA-7043
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
>
> Dataload splits two of the HBase tables to provide consistent state for 
> frontend tests. However, sometimes HBase will change and the splitting code 
> will fail. Since this is happening during dataload, the whole invocation of 
> buildall.sh fails. This means that no tests run and any minor problem with 
> HBase can impact all testing, even of things that are not impacted by the 
> HBase splitting.
>  
> The HBase splitting should not fail dataload. Some tests may fail, but the 
> tests that are unrelated can run and pass.






[jira] [Commented] (IMPALA-7017) TestMetadataReplicas.test_catalog_restart fails with exception

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480158#comment-16480158
 ] 

ASF subversion and git services commented on IMPALA-7017:
-

Commit bf7e766dbebd26fa48926c5f25ca5734be1d021b in impala's branch 
refs/heads/2.x from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=bf7e766 ]

IMPALA-7017: deflake/fix test_catalog_restart test

The custom_cluster/test_metadata_replicas.py:test_catalog_restart
test has been recently flaky/broken for two reasons:

1) Variable support for Hive and non-hdfs filesystems. Other tests that
depend on Hive have disabled tests for non-hdfs filesystems. Since the
functionality tested is not intended for all filesystems, this change
disables this test for all filesystems other than hdfs.

2) Several builds have been flaky when looking up catalogd's version.
This change adds a retry for obtaining the version.
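
A retry of this kind might look like the Python sketch below. It assumes
the flakiness shows up as a missing 'version' key (a KeyError), which is a
guess based on the exception text in the issue; the helper name and its
parameters are hypothetical and not taken from the test.

```python
import time


def get_with_retry(fetch, attempts=5, delay_s=0.0):
    """Call fetch() until it stops raising KeyError or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except KeyError as e:
            # The value is not available yet; wait briefly and try again.
            last_error = e
            time.sleep(delay_s)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

Here fetch would be whatever callable looks up catalogd's version; errors
other than KeyError are deliberately not retried.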

Change-Id: Iab6edb01f0bd7f5408cfef28fd05fdc95fb78469
Reviewed-on: http://gerrit.cloudera.org:8080/10397
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> TestMetadataReplicas.test_catalog_restart fails with exception
> --
>
> Key: IMPALA-7017
> URL: https://issues.apache.org/jira/browse/IMPALA-7017
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.13.0
>Reporter: Joe McDonnell
>Assignee: Vuk Ercegovac
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> An exhaustive build with Thrift RPC on the 2.x branch encountered an error on 
> custom_cluster.test_metadata_replicas.TestMetadataReplicas.test_catalog_restart:
> {noformat}
> custom_cluster/test_metadata_replicas.py:71: in test_catalog_restart
> assert False, "Unexpected exception: " + str(e)
> E   AssertionError: Unexpected exception: 'version'
> E   assert False{noformat}
> This has happened once. I will attach more log information below.
>  






[jira] [Commented] (IMPALA-6983) stress test binary search exits if process mem_limit is too low

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480154#comment-16480154
 ] 

ASF subversion and git services commented on IMPALA-6983:
-

Commit 6daf9800c07e6d291409dcf1abf06f3f97b85e9e in impala's branch 
refs/heads/2.x from [~mikesbrown]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=6daf980 ]

IMPALA-6983: stress: don't write a null runtime profile

This patch fixes a problem in which the writing of profiles always
assumed profiles would be collected during binary search. That's not the case
when queries are too expensive to run. The simple fix is to just check
for None and not perform the write.
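
A minimal sketch of the None check described above. The function name,
arguments, and file naming are hypothetical and do not mirror the real
concurrent_select.py code.

```python
import os


def write_profile(results_dir, query_name, profile):
    """Write a runtime profile to disk, skipping queries without one.

    profile is None when the query was too expensive to run, so no
    profile was collected for it.
    """
    if profile is None:
        # Nothing was collected; writing None would raise a TypeError.
        return None
    path = os.path.join(results_dir, query_name + "_profile.txt")
    with open(path, "w") as fh:
        fh.write(profile)
    return path
```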

Change-Id: Ic8299a8a97ad1f2bd1f2927e3111db8df1d3a3e5
Reviewed-on: http://gerrit.cloudera.org:8080/10381
Reviewed-by: Michael Brown 
Tested-by: Michael Brown 


> stress test binary search exits if process mem_limit is too low
> ---
>
> Key: IMPALA-6983
> URL: https://issues.apache.org/jira/browse/IMPALA-6983
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Dan Hecht
>Assignee: Michael Brown
>Priority: Major
> Fix For: Impala 3.1.0
>
>
> This occurred while running the stress test on TPC-H SF=20 with minicluster 
> process mem_limit=7857355161.
> {code:java}
> 2018-05-04 18:25:03,800 18531 MainThread 
> INFO:concurrent_select[1303]:Collecting runtime info for query q5:
> select
> n_name,
> sum(l_extendedprice * (1 - l_discount)) as revenue
> from
> customer,
> orders,
> lineitem,
> supplier,
> nation,
> region
> where
> c_custkey = o_custkey
> and l_orderkey = o_orderkey
> and l_suppkey = s_suppkey
> and c_nationkey = s_nationkey
> and s_nationkey = n_nationkey
> and n_regionkey = r_regionkey
> and r_name = 'ASIA'
> and o_orderdate >= '1994-01-01'
> and o_orderdate < '1995-01-01'
> group by
> n_name
> order by
> revenue desc
> 2018-05-04 18:25:07,790 18531 MainThread INFO:concurrent_select[1406]:Finding 
> a starting point for binary search
> 2018-05-04 18:25:07,790 18531 MainThread INFO:concurrent_select[1409]:Next 
> mem_limit: 7493
> 2018-05-04 18:28:06,380 18531 MainThread 
> WARNING:concurrent_select[1416]:Query couldn't be run even when using all 
> available memory
> select
> n_name,
> sum(l_extendedprice * (1 - l_discount)) as revenue
> from
> customer,
> orders,
> lineitem,
> supplier,
> nation,
> region
> where
> c_custkey = o_custkey
> and l_orderkey = o_orderkey
> and l_suppkey = s_suppkey
> and c_nationkey = s_nationkey
> and s_nationkey = n_nationkey
> and n_regionkey = r_regionkey
> and r_name = 'ASIA'
> and o_orderdate >= '1994-01-01'
> and o_orderdate < '1995-01-01'
> group by
> n_name
> order by
> revenue desc
> Traceback (most recent call last):
>   File "./tests/stress/concurrent_select.py", line 2265, in <module>
>     main()
>   File "./tests/stress/concurrent_select.py", line 2162, in main
>     queries, impala, converted_args, queries_with_runtime_info_by_db_sql_and_options)
>   File "./tests/stress/concurrent_select.py", line 1879, in populate_all_queries
>     os.path.join(converted_args.results_dir, PROFILES_DIR))
>   File "./tests/stress/concurrent_select.py", line 964, in write_runtime_info_profiles
>     fh.write(profile)
> TypeError: expected a string or other character buffer object{code}
> I don't understand the details of the {{concurrent_select.py}} control flow, 
> but it looks like in this case {{update_runtime_info()}} won't get called, 
> leading to this issue.






[jira] [Commented] (IMPALA-7047) REFRESH on unpartitioned tables calls getBlockLocations on every file

2018-05-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480133#comment-16480133
 ] 

Todd Lipcon commented on IMPALA-7047:
-

On a relatively small table with 374 files, REFRESH spends about a second in 
this code path (each RPC is 2-3ms due to RTT).

> REFRESH on unpartitioned tables calls getBlockLocations on every file
> -
>
> Key: IMPALA-7047
> URL: https://issues.apache.org/jira/browse/IMPALA-7047
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.13.0
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: metadata
>
> In HdfsTable.updateUnpartitionedTableFileMd() the existing default Partition 
> object is reset, and a new empty one is created. It then calls 
> refreshPartitionFileMetadata with this new partition, which has an empty list 
> of file descriptors. This ends up listing the directory and, for each file, 
> since the file is not found in the empty descriptor list, making a separate 
> RPC to HDFS to get the locations.
> This is quite wasteful compared with just using the API that returns the 
> located statuses for the directory.
> Alternatively, it seems like it should probably keep around the old file 
> descriptor list in the new Partition object so that the incremental refresh 
> path can work.






[jira] [Created] (IMPALA-7047) REFRESH on unpartitioned tables calls getBlockLocations on every file

2018-05-17 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-7047:
---

 Summary: REFRESH on unpartitioned tables calls getBlockLocations 
on every file
 Key: IMPALA-7047
 URL: https://issues.apache.org/jira/browse/IMPALA-7047
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 2.13.0
Reporter: Todd Lipcon


In HdfsTable.updateUnpartitionedTableFileMd() the existing default Partition 
object is reset, and a new empty one is created. It then calls 
refreshPartitionFileMetadata with this new partition, which has an empty list of 
file descriptors. This ends up listing the directory and, for each file, since 
the file is not found in the empty descriptor list, making a separate RPC to 
HDFS to get the locations.

This is quite wasteful compared with just using the API that returns the located 
statuses for the directory.

Alternatively, it seems like it should probably keep around the old file 
descriptor list in the new Partition object so that the incremental refresh 
path can work.
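
To make the cost difference concrete, here is a small simulation of the two
options described above. FakeFs and its methods are hypothetical stand-ins
that only count round trips; they are not the HDFS client API or Impala's
catalog code.

```python
class FakeFs:
    """In-memory stand-in for a remote filesystem that counts RPCs."""

    def __init__(self, files):
        self.files = dict(files)  # file name -> block locations
        self.rpc_count = 0

    def list_status(self, directory):
        self.rpc_count += 1
        return sorted(self.files)

    def get_block_locations(self, name):
        self.rpc_count += 1
        return self.files[name]

    def list_located_status(self, directory):
        # One call returns every file together with its locations.
        self.rpc_count += 1
        return sorted(self.files.items())


def refresh_per_file(fs, directory, known):
    """List the directory, then fetch locations for each unknown file."""
    result = {}
    for name in fs.list_status(directory):
        if name in known:
            result[name] = known[name]  # incremental path: no RPC needed
        else:
            result[name] = fs.get_block_locations(name)
    return result


def refresh_batched(fs, directory):
    """Fetch all files and their locations in a single listing call."""
    return dict(fs.list_located_status(directory))
```

With an empty known map, refresh_per_file issues one RPC per file plus the
listing. Passing the old descriptors as known, or using refresh_batched,
brings that down to a single call in this model.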






[jira] [Updated] (IMPALA-7029) LHS of a between expr is not cloned when rewritten into LHS >= x and LHS <= y

2018-05-17 Thread Tianyi Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyi Wang updated IMPALA-7029:

Fix Version/s: (was: Impala 2.13.0)

> LHS of a between expr is not cloned when rewritten into LHS >= x and LHS <= y
> -
>
> Key: IMPALA-7029
> URL: https://issues.apache.org/jira/browse/IMPALA-7029
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Tianyi Wang
>Priority: Minor
>
> When a BETWEEN predicate is rewritten into two comparison exprs, the LHS 
> obviously should be cloned, but it is not:
> https://github.com/apache/impala/blob/19bcc3099ef5e244fe74a0466b0b0eeb673acc8e/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java#L46






[jira] [Reopened] (IMPALA-7029) LHS of a between expr is not cloned when rewritten into LHS >= x and LHS <= y

2018-05-17 Thread Tianyi Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyi Wang reopened IMPALA-7029:
-

> LHS of a between expr is not cloned when rewritten into LHS >= x and LHS <= y
> -
>
> Key: IMPALA-7029
> URL: https://issues.apache.org/jira/browse/IMPALA-7029
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Tianyi Wang
>Priority: Minor
> Fix For: Impala 2.13.0
>
>
> When a BETWEEN predicate is rewritten into two comparison exprs, the LHS 
> obviously should be cloned, but it is not:
> https://github.com/apache/impala/blob/19bcc3099ef5e244fe74a0466b0b0eeb673acc8e/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java#L46






[jira] [Resolved] (IMPALA-7029) LHS of a between expr is not cloned when rewritten into LHS >= x and LHS <= y

2018-05-17 Thread Tianyi Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyi Wang resolved IMPALA-7029.
-
Resolution: Fixed

This is not an issue. I changed some expr in place while working on IMPALA-4025, 
which is wrong.
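
As a toy illustration of why in-place changes and a shared LHS do not mix,
the sketch below rewrites a BETWEEN into two comparisons with and without
cloning. The Expr class and rewrite_between are hypothetical and far simpler
than the frontend's expr classes or BetweenToCompoundRule.

```python
import copy


class Expr:
    """Tiny expression node: a leaf with a value, or an operator with children."""

    def __init__(self, op=None, children=(), value=None):
        self.op = op
        self.children = list(children)
        self.value = value

    def to_sql(self):
        if not self.children:
            return str(self.value)
        separator = " %s " % self.op
        return "(" + separator.join(c.to_sql() for c in self.children) + ")"


def rewrite_between(lhs, low, high, clone=True):
    """Rewrite 'lhs BETWEEN low AND high' into 'lhs >= low AND lhs <= high'."""
    left = copy.deepcopy(lhs) if clone else lhs
    right = copy.deepcopy(lhs) if clone else lhs
    return Expr("AND", [Expr(">=", [left, low]), Expr("<=", [right, high])])
```

Without cloning, both comparisons hold the same LHS object, so mutating one
of them in place silently changes the other. As long as nothing mutates
exprs in place, sharing the node is harmless.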

> LHS of a between expr is not cloned when rewritten into LHS >= x and LHS <= y
> -
>
> Key: IMPALA-7029
> URL: https://issues.apache.org/jira/browse/IMPALA-7029
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Tianyi Wang
>Priority: Minor
> Fix For: Impala 2.13.0
>
>
> When a BETWEEN predicate is rewritten into two comparison exprs, the LHS 
> obviously should be cloned, but it is not:
> https://github.com/apache/impala/blob/19bcc3099ef5e244fe74a0466b0b0eeb673acc8e/fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java#L46






[jira] [Commented] (IMPALA-7008) TestSpillingDebugActionDimensions.test_spilling test setup fails

2018-05-17 Thread Joe McDonnell (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479853#comment-16479853
 ] 

Joe McDonnell commented on IMPALA-7008:
---

[~dknupp] This is happening pretty frequently on s3/local jobs. Is there 
anything we can do to avoid it?

> TestSpillingDebugActionDimensions.test_spilling test setup fails
> 
>
> Key: IMPALA-7008
> URL: https://issues.apache.org/jira/browse/IMPALA-7008
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.13.0
>Reporter: Sailesh Mukil
>Priority: Blocker
>  Labels: broken-build, flaky
>
> We've seen multiple instances of this test failing with the following error:
> {code:java}
> Error Message
> test setup failure
> Stacktrace
> Slave 'gw0' crashed while running 
> "query_test/test_spilling.py::TestSpillingDebugActionDimensions::()::test_spilling[exec_option:
>  {'debug_action': None, 'default_spillable_buffer_size': '256k'} | 
> table_format: parquet/none]"
> {code}
> We need to investigate why this is happening.






[jira] [Resolved] (IMPALA-7035) Impala HDFS Encryption tests failing after OpenJDK update

2018-05-17 Thread Philip Zeyliger (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Zeyliger resolved IMPALA-7035.
-
Resolution: Fixed
  Assignee: Philip Zeyliger

> Impala HDFS Encryption tests failing after OpenJDK update
> -
>
> Key: IMPALA-7035
> URL: https://issues.apache.org/jira/browse/IMPALA-7035
> Project: IMPALA
>  Issue Type: Task
>Reporter: Philip Zeyliger
>Assignee: Philip Zeyliger
>Priority: Major
>
> I have seen {{impala-py.test tests/metadata/test_hdfs_encryption.py}} fail 
> with the following error:
> {{E AssertionError: Error creating encryption zone: RemoteException: Can't 
> recover key for testkey1 from keystore 
> [file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore|file:///home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore]}}
> I believe what's going on is described in 
> https://issues.apache.org/jira/browse/HDFS-13494. In short, the JDK now has a 
> special whitelist for an API as a result of a security vulnerability.
> A workaround in the KMS init script to configure $HADOOP_OPTS seems to do the 
> trick.
>  






[jira] [Commented] (IMPALA-7035) Impala HDFS Encryption tests failing after OpenJDK update

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479817#comment-16479817
 ] 

ASF subversion and git services commented on IMPALA-7035:
-

Commit 5b824408af17d084a5ea3464e0ff913f2c94e4c4 in impala's branch 
refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5b82440 ]

IMPALA-7035: Configure jceks.key.serialFilter for KMS.

Configures a Java property for KMS to account for JDK 8u171's security fixes. I
was seeing impala-py.test tests/metadata/test_hdfs_encryption.py fail with the
following error:

  AssertionError: Error creating encryption zone: RemoteException: Can't 
recover key for testkey1 from keystore 
file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore

The issue is described in HDFS-13494, and I imagine it'll be fixed in due time.
In the meantime, setting this property seems to do the trick.

Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Reviewed-on: http://gerrit.cloudera.org:8080/10418
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> Impala HDFS Encryption tests failing after OpenJDK update
> -
>
> Key: IMPALA-7035
> URL: https://issues.apache.org/jira/browse/IMPALA-7035
> Project: IMPALA
>  Issue Type: Task
>Reporter: Philip Zeyliger
>Priority: Major
>
> I have seen {{impala-py.test tests/metadata/test_hdfs_encryption.py}} fail 
> with the following error:
> {{E AssertionError: Error creating encryption zone: RemoteException: Can't 
> recover key for testkey1 from keystore 
> [file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore|file:///home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore]}}
> I believe what's going on is described in 
> https://issues.apache.org/jira/browse/HDFS-13494. In short, the JDK now has a 
> special whitelist for an API as a result of a security vulnerability.
> A workaround in the KMS init script to configure $HADOOP_OPTS seems to do the 
> trick.
>  






[jira] [Commented] (IMPALA-7000) Wrong info about Impala dedicated executors

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479802#comment-16479802
 ] 

ASF subversion and git services commented on IMPALA-7000:
-

Commit 05e0db3a0e9c00ff075452941082408d39f794f8 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=05e0db3 ]

IMPALA-7000: [DOCS] Correct info about dedicated executors

Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Reviewed-on: http://gerrit.cloudera.org:8080/10357
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 


> Wrong info about Impala dedicated executors
> ---
>
> Key: IMPALA-7000
> URL: https://issues.apache.org/jira/browse/IMPALA-7000
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 2.12.0
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The following is not correct.
> "Then, you specify that the other hosts act as executors but not 
> coordinators. These hosts do not communicate with the statestored daemon or 
> process the final result sets from queries. You cannot connect to these hosts 
> through clients such as impala-shell or business intelligence tools."
> Executors still communicate with the statestore for other topics (membership, 
> admission control, etc.). The only part they don't get from the statestore is 
> the metadata topic.






[jira] [Commented] (IMPALA-7003) Support erasure-coding in impala

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479800#comment-16479800
 ] 

ASF subversion and git services commented on IMPALA-7003:
-

Commit 51bca9099789fdb48d06a6f0574647d0a9029f0d in impala's branch 
refs/heads/2.x from [~tianyiwang]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=51bca90 ]

Ignore "IMPALA-7003: Deflake erasure coding data loading"

The commit message in IMPALA-7003 mistakenly didn't include
"not for 2.x".

Change-Id: Ic9beafd4b0f0fc163ebe969fc39b4fdb6b27c0fa
Reviewed-on: http://gerrit.cloudera.org:8080/10445
Reviewed-by: Joe McDonnell 
Tested-by: Tianyi Wang 


> Support erasure-coding in impala
> 
>
> Key: IMPALA-7003
> URL: https://issues.apache.org/jira/browse/IMPALA-7003
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend, Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Tianyi Wang
>Priority: Critical
>
> This is the parent Jira for the erasure coding feature






[jira] [Commented] (IMPALA-6827) airlines_parquet data not available in dropbox

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479801#comment-16479801
 ] 

ASF subversion and git services commented on IMPALA-6827:
-

Commit 015058d0f28e8e2b9b28f659a04a5748399ca8f1 in impala's branch 
refs/heads/master from [~arodoni_cloudera]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=015058d ]

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Reviewed-on: http://gerrit.cloudera.org:8080/10393
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins 


> airlines_parquet data not available in dropbox
> --
>
> Key: IMPALA-6827
> URL: https://issues.apache.org/jira/browse/IMPALA-6827
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: sathishkumar paramasivam
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Hi,
>  
> I am learning Impala on my own and am trying to download the 
> airlines_parquet dataset as described in the Impala user guide:
>  
> wget -O airlines_parquet.tar.gz https://www.dropbox.com/s/ol9x51tqp6cv4yc/airlines_parquet.tar.gz?dl=0
>  
> but the download does not complete; it only downloads an HTML file, so I am 
> not able to run the tar command.
>  
> Could you please help with this?






[jira] [Commented] (IMPALA-5842) Write page index in Parquet files

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479803#comment-16479803
 ] 

ASF subversion and git services commented on IMPALA-5842:
-

Commit ccf19f9f8f2914639b6997849a56c13cfd2399b8 in impala's branch 
refs/heads/master from [~boroknagyz]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=ccf19f9 ]

IMPALA-5842: Write page index in Parquet files

This commit builds on the previous work of
Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/

The commit implements the write path of PARQUET-922:
"Add column indexes to parquet.thrift". As specified in the
parquet-format, Impala writes the page indexes just before
the footer. This allows much more efficient page filtering
than using the same information from the 'statistics' field
of DataPageHeader.
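
The filtering benefit can be sketched as follows. This is a simplified
illustration: column_index here is just a list of per-page (min, max) pairs,
a hypothetical stand-in for the real Parquet structures, and pages_to_read is
not Impala code.

```python
def pages_to_read(column_index, lo, hi):
    """Return the indexes of pages whose [min, max] range overlaps [lo, hi].

    column_index is a list of (page_min, page_max) tuples, one per data page,
    as a reader could obtain from a page index without touching page headers.
    """
    return [
        i
        for i, (page_min, page_max) in enumerate(column_index)
        if page_max >= lo and page_min <= hi
    ]
```

Because all per-page bounds sit together near the footer, a reader can decide
which pages to skip before reading any of them, instead of visiting each
DataPageHeader in turn.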

I updated Pooja's python tests as well.

Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Reviewed-on: http://gerrit.cloudera.org:8080/9693
Reviewed-by: Zoltan Borok-Nagy 
Tested-by: Impala Public Jenkins 


> Write page index in Parquet files
> -
>
> Key: IMPALA-5842
> URL: https://issues.apache.org/jira/browse/IMPALA-5842
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Lars Volker
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: parquet
>
> Once PARQUET-922 has been resolved, we should start writing page indices to 
> Parquet files.






[jira] [Closed] (IMPALA-6827) airlines_parquet data not available in dropbox

2018-05-17 Thread Alex Rodoni (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-6827.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> airlines_parquet data not available in dropbox
> --
>
> Key: IMPALA-6827
> URL: https://issues.apache.org/jira/browse/IMPALA-6827
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: sathishkumar paramasivam
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Hi,
>  
> I am learning Impala on my own and am trying to download the 
> airlines_parquet dataset as described in the Impala user guide:
>  
> wget -O airlines_parquet.tar.gz https://www.dropbox.com/s/ol9x51tqp6cv4yc/airlines_parquet.tar.gz?dl=0
>  
> but the download does not complete; it only downloads an HTML file, so I am 
> not able to run the tar command.
>  
> Could you please help with this?






[jira] [Resolved] (IMPALA-7024) Convert Coordinator::wait_lock_ from boost::mutex to SpinLock

2018-05-17 Thread Dan Hecht (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Hecht resolved IMPALA-7024.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> Convert Coordinator::wait_lock_ from boost::mutex to SpinLock
> -
>
> Key: IMPALA-7024
> URL: https://issues.apache.org/jira/browse/IMPALA-7024
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 3.1.0
>Reporter: Dan Hecht
>Assignee: Dan Hecht
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> For consistency with other locks in this class, convert 
> Coordinator::wait_lock_. 






[jira] [Closed] (IMPALA-7000) Wrong info about Impala dedicated executors

2018-05-17 Thread Alex Rodoni (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-7000.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0

> Wrong info about Impala dedicated executors
> ---
>
> Key: IMPALA-7000
> URL: https://issues.apache.org/jira/browse/IMPALA-7000
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 2.12.0
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> The following is not correct.
> "Then, you specify that the other hosts act as executors but not 
> coordinators. These hosts do not communicate with the statestored daemon or 
> process the final result sets from queries. You cannot connect to these hosts 
> through clients such as impala-shell or business intelligence tools."
> Executors still communicate with the statestore for other topics (membership, 
> admission control, etc.). The only part they don't get from the statestore is 
> the metadata topic.






[jira] [Updated] (IMPALA-7000) Wrong info about Impala dedicated executors

2018-05-17 Thread Tim Armstrong (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-7000:
--
Summary: Wrong info about Impala dedicated executors  (was: Wrong info 
about Impala dedicated executore)

> Wrong info about Impala dedicated executors
> ---
>
> Key: IMPALA-7000
> URL: https://issues.apache.org/jira/browse/IMPALA-7000
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 2.12.0
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>
> The following is not correct.
> "Then, you specify that the other hosts act as executors but not 
> coordinators. These hosts do not communicate with the statestored daemon or 
> process the final result sets from queries. You cannot connect to these hosts 
> through clients such as impala-shell or business intelligence tools."
> Executors still communicate with the statestore for other topics (membership, 
> admission control, etc.). The only part they don't get from the statestore is 
> the metadata topic.






[jira] [Updated] (IMPALA-6235) Reduce runtime of test_admission_control in exhaustive build

2018-05-17 Thread Dan Hecht (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Hecht updated IMPALA-6235:
--
Labels: admission-control resource-management  (was: resource-management)

> Reduce runtime of test_admission_control in exhaustive build
> 
>
> Key: IMPALA-6235
> URL: https://issues.apache.org/jira/browse/IMPALA-6235
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.11.0
>Reporter: Tim Armstrong
>Priority: Minor
>  Labels: admission-control, resource-management
>
> The admission control test can take over half an hour to run with all the 
> exhaustive combinations. 
> {code}
> $ time impala-py.test tests/custom_cluster/test_admission_controller.py 
> --workload_exploration_strategy="functional-query:exhaustive" 
> ...
> real    41m44.964s
> user    1m55.404s
> sys     0m12.180s
> {code}
> The system is mostly idle during that time.
> We should speed up the test if possible without making it flaky and, if that 
> isn't possible, consider running the stress test as part of a separate build.






[jira] [Commented] (IMPALA-6929) Create Kudu table syntax does not allow multi-column range partitions

2018-05-17 Thread Thomas Tauber-Marshall (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479456#comment-16479456
 ] 

Thomas Tauber-Marshall commented on IMPALA-6929:


https://gerrit.cloudera.org/#/c/10441/

> Create Kudu table syntax does not allow multi-column range partitions
> -
>
> Key: IMPALA-6929
> URL: https://issues.apache.org/jira/browse/IMPALA-6929
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.11.0
>Reporter: Dan Burkert
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>
> The Impala CREATE TABLE syntax guide includes this bit of grammar in the Kudu 
> partitioning section:
> {code:java}
> range_clause ::=
>   RANGE [ (pk_col [, ...]) ]
>   (
> {
>   PARTITION constant_expression range_comparison_operator VALUES 
> range_comparison_operator constant_expression
>   | PARTITION VALUE = constant_expression_or_tuple
> }
>[, ...]
>   ){code}
> This is suspicious because {{constant_expression}} is used in the range 
> clause, and {{constant_expression_or_tuple}} is used in the single-value 
> clause.  I believe both should allow for tuples.
> In other words, today a CREATE TABLE statement such as
> {code:java}
> CREATE TABLE t (a BIGINT, b BIGINT, PRIMARY KEY (a, b))
> PARTITION BY RANGE (a, b) (
>     PARTITION (0, 0) <= VALUES < (10, 0)
> ) STORED AS KUDU;{code}
> results in a syntax error, and it should not.  CC [~twmarshall]
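
For intuition about what a tuple bound such as (0, 0) <= VALUES < (10, 0) is
meant to express, the sketch below assumes multi-column bounds are compared
lexicographically, column by column. That ordering is an assumption made
here, not something stated in the report; Python tuples happen to compare the
same way. in_range is a hypothetical helper.

```python
def in_range(row_key, lower, upper):
    """Check lower <= row_key < upper using lexicographic tuple comparison."""
    return lower <= row_key < upper
```

Under that assumption, a row with key (9, 999999) falls inside the partition
above, while (10, 0) is the first key outside it.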






[jira] [Created] (IMPALA-7046) Add targeted regression test for race in IMPALA-7033

2018-05-17 Thread Dan Hecht (JIRA)
Dan Hecht created IMPALA-7046:
-

 Summary: Add targeted regression test for race in IMPALA-7033
 Key: IMPALA-7046
 URL: https://issues.apache.org/jira/browse/IMPALA-7046
 Project: IMPALA
  Issue Type: Task
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Dan Hecht
Assignee: Dan Hecht


I'd like to add a regression test to trigger the race in IMPALA-7033 more 
reliably, but it will involve doing some sleeps at specific places, so I'd like 
to add it after [~bikramjeet.vig] commits a change that provides some 
infrastructure for that.

The race was:

1) Coordinator::Exec() takes the QueryState ExecResources reference count.

2) Coordinator sends out exec rpc to non-coordinator backend.

3) Some other backend sends a failure report which invokes 
HandleExecStateTransition, which drops the coordinator's reference to the exec 
resources.

4) Coordinator sends out exec rpc to coordinator backend, which takes the exec 
resources reference and releases it. We don't expect the reference count to 
become non-zero after it has already gone through a cycle.

The fix for this race is included in https://gerrit.cloudera.org/#/c/10440
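
The four steps above can be replayed with a toy model of the reference count. This is only a sketch: `ExecResourcesRefCount` and `replay_race` are hypothetical names, and the real QueryState code is C++ and differs in detail. The model assumes that once the count has returned to zero, taking another reference is an error.

```python
class ExecResourcesRefCount:
    """Toy model of a reference count that must not be revived after hitting zero."""

    def __init__(self):
        self.count = 0
        self.released = False  # set once the count has dropped back to zero

    def acquire(self):
        if self.released:
            raise RuntimeError("reference taken after resources were released")
        self.count += 1

    def release(self):
        assert self.count > 0, "release without matching acquire"
        self.count -= 1
        if self.count == 0:
            self.released = True


def replay_race():
    """Replay steps 1-4 in order; return the error message raised at step 4."""
    refs = ExecResourcesRefCount()
    refs.acquire()        # 1) Coordinator::Exec() takes the reference
    # 2) exec rpc sent to a non-coordinator backend (no count change here)
    refs.release()        # 3) failure report drops the coordinator's reference
    try:
        refs.acquire()    # 4) coordinator backend takes the reference again
    except RuntimeError as e:
        return str(e)
    return None
```

In this model the second `acquire()` is the point where the invariant breaks, which is the moment a targeted test would need to hit with sleeps.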






[jira] [Commented] (IMPALA-7023) TestInsertQueries.test_insert_overwrite fails by hitting memory limit

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479332#comment-16479332
 ] 

ASF subversion and git services commented on IMPALA-7023:
-

Commit 1e6544f7da1d756b437d8b0f12a6446f10f1f836 in impala's branch 
refs/heads/master from [~joemcdonnell]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1e6544f ]

IMPALA-7023: Wait for fragments to finish for test_insert.py

The arrangement of tests in test_insert.py changed with
IMPALA-7010, splitting out the memory limit tests into
test_insert_mem_limit(). On exhaustive, the combination
of test dimensions means test_insert_mem_limit() executes
11 different combinations. Each of these statements can
use a large amount of memory and this is not cleaned
up immediately. This has been causing
test_insert_overwrite(), which immediately follows
test_insert_mem_limit(), to hit the process memory limit.

This changes test_insert_mem_limit() to make it wait
for its fragments to finish.
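
A wait of this kind is typically a bounded polling loop. The sketch below is an assumption about the general shape, not the code in test_insert.py: `wait_for_zero` is a hypothetical helper, and `get_count` stands in for whatever reports the number of fragments still running.

```python
import time


def wait_for_zero(get_count, timeout_s=60.0, interval_s=0.5, sleep=time.sleep):
    """Poll get_count() until it returns 0 or timeout_s elapses.

    Returns True if the count reached zero, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_count() == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Bounding the wait with a timeout keeps a stuck fragment from hanging the whole test run.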

Change-Id: I5642e9cb32dd02afd74dde7e0d3b31bddbee3ccd
Reviewed-on: http://gerrit.cloudera.org:8080/10426
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 


> TestInsertQueries.test_insert_overwrite fails by hitting memory limit
> -
>
> Key: IMPALA-7023
> URL: https://issues.apache.org/jira/browse/IMPALA-7023
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
>
> This failure is seen on exhaustive builds on both master and 2.x:
> {noformat}
> Error Message
> ImpalaBeeswaxException: ImpalaBeeswaxException:  INNER EXCEPTION:  'beeswaxd.ttypes.BeeswaxException'>  MESSAGE: AnalysisException: Failed to 
> evaluate expr: 20 CAUSED BY: InternalException: Memory limit exceeded: Error 
> occurred on backend 
> impala-boost-static-burst-slave-el7-03d4.vpc.cloudera.com:22000 by fragment 
> 0:0 Memory left in process limit: -4.29 GB 
> Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0   : Total=0 
> Peak=0Process: memory limit exceeded. Limit=12.00 GB Total=16.29 GB 
> Peak=16.29 GB   Buffer Pool: Free Buffers: Total=160.00 KB   Buffer Pool: 
> Clean Pages: Total=0   Buffer Pool: Unused Reservation: Total=-328.00 KB   
> Data Stream Service Queue: Limit=614.40 MB Total=0 Peak=116.12 KB   Data 
> Stream Manager Early RPCs: Total=0 Peak=6.76 KB   TCMalloc Overhead: 
> Total=103.56 MB   RequestPool=fe-eval-exprs: Total=0 Peak=52.83 KB 
> Query(d24ea53242b4cedc:fc8e0885): Total=0 Peak=0   
> RequestPool=default-pool: Total=5.20 GB Peak=5.20 GB 
> Query(f4014f7bb49ea78:6b926b19): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=1.76 GB Total=1.76 GB 
> Peak=1.96 GB Query(2c44b65fbcb4e1ce:3d73badf): memory limit 
> exceeded. Limit=64.00 MB Reservation=0 ReservationLimit=32.00 MB 
> OtherMemory=1.04 GB Total=1.04 GB Peak=1.61 GB 
> Query(214cc23c1376176f:7844977b): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=1.23 GB Total=1.23 GB 
> Peak=1.23 GB Query(8949bdf792a32ad2:33a36c03): memory limit 
> exceeded. Limit=64.00 MB Reservation=0 ReservationLimit=32.00 MB 
> OtherMemory=642.20 MB Total=642.20 MB Peak=642.20 MB 
> Query(5412ff4e6065721:519d3e61): memory limit exceeded. Limit=64.00 
> MB Reservation=0 ReservationLimit=32.00 MB OtherMemory=556.49 MB Total=556.49 
> MB Peak=556.49 MB   Untracked Memory: Total=10.98 GB
> Stacktrace
> query_test/test_insert.py:132: in test_insert_overwrite
> multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1)
> common/impala_test_suite.py:405: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:620: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: AnalysisException: Failed to evaluate expr: 20
> E   

[jira] [Commented] (IMPALA-7017) TestMetadataReplicas.test_catalog_restart fails with exception

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479334#comment-16479334
 ] 

ASF subversion and git services commented on IMPALA-7017:
-

Commit 6af65697f291b859509d756b1d839176f664111d in impala's branch 
refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=6af6569 ]

IMPALA-7017: deflake/fix test_catalog_restart test

The custom_cluster/test_metadata_replicas.py:test_catalog_restart
test has been recently flaky/broken for two reasons:

1) Variable support for Hive and non-hdfs filesystems. Other tests that
depend on Hive have disabled tests for non-hdfs filesystems. Since the
functionality tested is not intended for all filesystems, this change
disables this test for all filesystems other than hdfs.

2) Several builds have been flaky when looking up catalogd's version.
This change adds a retry for obtaining the version.
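
A retry of this kind could look roughly like the following sketch. It assumes the version is read from a dictionary-like response that may temporarily lack the key; `get_with_retry` and `fetch` are hypothetical names, not the helpers used in the actual test.

```python
import time


def get_with_retry(fetch, key, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Call fetch() until the returned dict contains key, then return its value.

    Raises KeyError if key is still missing after the given number of attempts.
    """
    for attempt in range(attempts):
        result = fetch()
        if key in result:
            return result[key]
        if attempt < attempts - 1:
            sleep(delay_s)
    raise KeyError(key)
```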

Change-Id: Iab6edb01f0bd7f5408cfef28fd05fdc95fb78469
Reviewed-on: http://gerrit.cloudera.org:8080/10397
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> TestMetadataReplicas.test_catalog_restart fails with exception
> --
>
> Key: IMPALA-7017
> URL: https://issues.apache.org/jira/browse/IMPALA-7017
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.13.0
>Reporter: Joe McDonnell
>Assignee: Vuk Ercegovac
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> An exhaustive build with Thrift RPC on the 2.x branch encountered an error on 
> custom_cluster.test_metadata_replicas.TestMetadataReplicas.test_catalog_restart:
> {noformat}
> custom_cluster/test_metadata_replicas.py:71: in test_catalog_restart
> assert False, "Unexpected exception: " + str(e)
> E   AssertionError: Unexpected exception: 'version'
> E   assert False{noformat}
> This has happened once. I will attach more log information below.
>  






[jira] [Commented] (IMPALA-7043) Failure in HBase splitting should not fail dataload

2018-05-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479335#comment-16479335
 ] 

ASF subversion and git services commented on IMPALA-7043:
-

Commit 2e9f5c90ebbe0a9b63db92238787071b027b2c66 in impala's branch 
refs/heads/master from [~joemcdonnell]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=2e9f5c9 ]

IMPALA-7043: HBase split failure should not fail dataload

HBase splitting can fail due to changes in HBase code. It
is useful to still do tests even if HBase splitting failed.
As it is today, buildall.sh will abort if
create-load-data.sh's invocation of split-hbase.sh fails.
No tests run, even though the HBase splitting affects only
a small portion of our tests.

This changes create-load-data.sh to keep going with
dataload if HBase splitting fails. It outputs the same
errors to the log as it would before this change.
It adds a message to explain that it is ignoring
the failure and there may be related test failures.
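
The actual change is to a shell script. The Python sketch below only illustrates the control flow described above (run the step, keep its output in the log, warn, and continue); `run_optional_step` is a hypothetical helper, not part of the Impala build scripts.

```python
import subprocess
import sys


def run_optional_step(cmd, name, log=sys.stderr):
    """Run cmd; on failure, log a warning and return False instead of aborting."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    if result.returncode != 0:
        # Keep the step's own output in the log, as before.
        log.write(result.stdout.decode(errors="replace"))
        log.write(
            "WARNING: %s failed (exit code %d); ignoring and continuing. "
            "Related tests may fail.\n" % (name, result.returncode)
        )
        return False
    return True
```

The caller can record the returned flag and still proceed to the remaining dataload steps.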

Change-Id: I7497fe8c9f1655a34b2743462d8b7248eb94554e
Reviewed-on: http://gerrit.cloudera.org:8080/10437
Reviewed-by: Philip Zeyliger 
Tested-by: Impala Public Jenkins 


> Failure in HBase splitting should not fail dataload
> ---
>
> Key: IMPALA-7043
> URL: https://issues.apache.org/jira/browse/IMPALA-7043
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
>
> Dataload splits two of the HBase tables to provide consistent state for 
> frontend tests. However, sometimes HBase will change and the splitting code 
> will fail. Since this is happening during dataload, the whole invocation of 
> buildall.sh fails. This means that no tests run and any minor problem with 
> HBase can impact all testing, even of things that are not impacted by the 
> HBase splitting.
>  
> The HBase splitting should not fail dataload. Some tests may fail, but the 
> tests that are unrelated can run and pass.






[jira] [Created] (IMPALA-7045) CTAS doesn't allow NULL values

2018-05-17 Thread Pavel Grafkin (JIRA)
Pavel Grafkin created IMPALA-7045:
-

 Summary: CTAS doesn't allow NULL values
 Key: IMPALA-7045
 URL: https://issues.apache.org/jira/browse/IMPALA-7045
 Project: IMPALA
  Issue Type: Bug
 Environment: impalad version 2.7.0-cdh5.10.0 RELEASE (build 
785a073cd07e2540d521ecebb8b38161ccbd2aa2)
Built on Fri Jan 20 12:03:56 PST 2017
Reporter: Pavel Grafkin


When I run the following statement:

{{CREATE TABLE temp_test AS SELECT CAST(null AS bigint) as nullable_int }}

it fails with the error:

{{Unsupported type 'null_type' in column 'nullable_int' of table 'temp_test'}}

I assume the reason is the following. In INSERT statements, the data 
types of the result set's columns are retrieved from the resulting values:
[https://github.com/apache/impala/blob/789c5aac23480acc6e18c057b767b65fdd791c97/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java#L275]

In CTAS, by contrast, they are derived from the base table of each column (and 
there is no base table for NULL):
[https://github.com/apache/impala/blob/ba84ad03cb83d7f7aed8524fcfbb0e2cdc9fdd53/fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java#L171]

 






[jira] [Resolved] (IMPALA-7017) TestMetadataReplicas.test_catalog_restart fails with exception

2018-05-17 Thread Vuk Ercegovac (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vuk Ercegovac resolved IMPALA-7017.
---
   Resolution: Fixed
Fix Version/s: Impala 3.1.0
   Impala 2.13.0







[jira] [Assigned] (IMPALA-5843) Use page index in Parquet files to skip pages

2018-05-17 Thread Tim Armstrong (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-5843:
-

Assignee: Zoltán Borók-Nagy

> Use page index in Parquet files to skip pages
> -
>
> Key: IMPALA-5843
> URL: https://issues.apache.org/jira/browse/IMPALA-5843
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.10.0
>Reporter: Lars Volker
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: parquet, performance
>
> Once IMPALA-5842 has been resolved, we should skip pages based on the page 
> index in Parquet files.






[jira] [Assigned] (IMPALA-7039) Frontend HBase tests cannot tolerate HBase running on a different port

2018-05-17 Thread Adrian Ng (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Ng reassigned IMPALA-7039:
-

Assignee: Adrian Ng

> Frontend HBase tests cannot tolerate HBase running on a different port
> --
>
> Key: IMPALA-7039
> URL: https://issues.apache.org/jira/browse/IMPALA-7039
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Adrian Ng
>Priority: Blocker
>  Labels: broken-build
>
> When HBase doesn't get the same ports as usual, 
> org.apache.impala.planner.PlannerTest.testHbase and 
> org.apache.impala.planner.PlannerTest.testJoins fail with the following 
> errors:
> {noformat}
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.alltypessmall
> where id < 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16020 :3
> ^
>   HBASE KEYRANGE port=16022 3:7
>   HBASE KEYRANGE port=16023 7:
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16201 :3
>   HBASE KEYRANGE port=16202 3:7
>   HBASE KEYRANGE port=16203 7:
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id = '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 5:5\0
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 5:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id > '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 5\0:7
> ^
>   HBASE KEYRANGE port=16023 7:
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 5\0:7
>   HBASE KEYRANGE port=16203 7:
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id >= '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 5:7
> ^^^
>   HBASE KEYRANGE port=16023 7:
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 5:7
>   HBASE KEYRANGE port=16203 7:
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id < '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16020 :3
> ^
>   HBASE KEYRANGE port=16022 3:5
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16201 :3
>   HBASE KEYRANGE port=16202 3:5
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id > '4' and id < '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4\0:5
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4\0:5
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id >= '4' and id < '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4:5
> ^^^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4:5
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id > '4' and id <= '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4\0:5\0
> ^^^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4\0:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id >= '4' and id <= '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4:5\0
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where string_col = '4' and tinyint_col = 5 and id >= '4' and id <= '5'
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4:5\0
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where string_col = '4' and tinyint_col = 5
>   and id >= concat('', '4') and id <= concat('5', '')
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4:5\0
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.alltypesagg
> where bigint_col is not null and bool_col = true
> Actual does not match expected result:
>   HBASE KEYRANGE port=16020 :3
> ^
>   HBASE 

[jira] [Commented] (IMPALA-7039) Frontend HBase tests cannot tolerate HBase running on a different port

2018-05-17 Thread Adrian Ng (JIRA)

[ 
https://issues.apache.org/jira/browse/IMPALA-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16478641#comment-16478641
 ] 

Adrian Ng commented on IMPALA-7039:
---

[~tarasbob]  - could you please help with this one? 
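
For context, the mismatches in this issue differ only in the port number of each HBASE KEYRANGE line. One way such a comparison could be made port-insensitive is sketched below; this is an illustration under that assumption, not necessarily the fix adopted, and `mask_ports` and `same_key_ranges` are hypothetical names.

```python
import re

# Matches the port in lines such as "HBASE KEYRANGE port=16022 3:7".
_PORT_RE = re.compile(r"(HBASE KEYRANGE port=)\d+")


def mask_ports(lines):
    """Replace the port number in each HBASE KEYRANGE line with a placeholder."""
    return [_PORT_RE.sub(r"\1*", line) for line in lines]


def same_key_ranges(actual, expected):
    """Compare scan range locations while ignoring which port served each range."""
    return mask_ports(actual) == mask_ports(expected)
```

Masking only the port keeps the key ranges themselves under test, so a genuine change in the range boundaries would still be reported.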


[jira] [Assigned] (IMPALA-7039) Frontend HBase tests cannot tolerate HBase running on a different port

2018-05-17 Thread Adrian Ng (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Ng reassigned IMPALA-7039:
-

Assignee: Taras Bobrovytsky  (was: Adrian Ng)

> Frontend HBase tests cannot tolerate HBase running on a different port
> --
>
> Key: IMPALA-7039
> URL: https://issues.apache.org/jira/browse/IMPALA-7039
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Taras Bobrovytsky
>Priority: Blocker
>  Labels: broken-build
>
> When HBase doesn't get the same ports as usual, 
> org.apache.impala.planner.PlannerTest.testHbase and 
> org.apache.impala.planner.PlannerTest.testJoins fail with the following 
> errors:
> {noformat}
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.alltypessmall
> where id < 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16020 :3
> ^
>   HBASE KEYRANGE port=16022 3:7
>   HBASE KEYRANGE port=16023 7:
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16201 :3
>   HBASE KEYRANGE port=16202 3:7
>   HBASE KEYRANGE port=16203 7:
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id = '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 5:5\0
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 5:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id > '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 5\0:7
> ^
>   HBASE KEYRANGE port=16023 7:
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 5\0:7
>   HBASE KEYRANGE port=16203 7:
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id >= '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 5:7
> ^^^
>   HBASE KEYRANGE port=16023 7:
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 5:7
>   HBASE KEYRANGE port=16203 7:
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id < '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16020 :3
> ^
>   HBASE KEYRANGE port=16022 3:5
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16201 :3
>   HBASE KEYRANGE port=16202 3:5
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id > '4' and id < '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4\0:5
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4\0:5
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id >= '4' and id < '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4:5
> ^^^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4:5
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id > '4' and id <= '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4\0:5\0
> ^^^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4\0:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where id >= '4' and id <= '5'
> and tinyint_col = 5
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4:5\0
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where string_col = '4' and tinyint_col = 5 and id >= '4' and id <= '5'
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4:5\0
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.stringids
> where string_col = '4' and tinyint_col = 5
>   and id >= concat('', '4') and id <= concat('5', '')
> Actual does not match expected result:
>   HBASE KEYRANGE port=16022 4:5\0
> ^
> NODE 0:
> Expected:
>   HBASE KEYRANGE port=16202 4:5\0
> NODE 0:
> section SCANRANGELOCATIONS of query:
> select * from functional_hbase.alltypesagg
> where bigint_col is not null and bool_col = true
> Actual does not match expected result:
>   HBASE KEYRANGE port=16020 :3
>