[jira] [Created] (IMPALA-7923) DecimalValue should be marked as packed

2018-12-03 Thread Tim Armstrong (JIRA)
Tim Armstrong created IMPALA-7923:
-

 Summary: DecimalValue should be marked as packed
 Key: IMPALA-7923
 URL: https://issues.apache.org/jira/browse/IMPALA-7923
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Tim Armstrong


IMPALA-7473 was a symptom of a more general problem: DecimalValue is not 
guaranteed to be aligned by the Impala runtime, but the class is not marked as 
packed, so under some circumstances GCC will emit aligned-load instructions for 
value_ when value_ is an int128. 

Testing helps confirm that the compiler does not emit the problematic loads in 
practice, but it would be better to mark the struct as packed.
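
The gist of the proposed fix, as a minimal sketch (not Impala's actual 
DecimalValue, which is a template over several integer widths): marking the 
struct packed removes its alignment requirement, so GCC has to emit unaligned 
loads for the int128 member.

{code:cpp}
// Sketch of the idea, not Impala's real DecimalValue (which is a template
// over int32_t, int64_t and __int128). Packing the struct tells the compiler
// the object may live at any byte offset, so it must not assume 16-byte
// alignment when loading value_.
struct __attribute__((packed)) Decimal16ValueSketch {
  __int128 value_;
};

// Without the packed attribute, alignof(__int128) == 16 and GCC may emit
// aligned SSE loads (e.g. movdqa) that fault when the object is placed at an
// unaligned address inside a tuple buffer.
static_assert(alignof(Decimal16ValueSketch) == 1,
              "packed struct drops the 16-byte alignment requirement");
{code}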



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-2343) Capture operator timing information covering open/close & first/last batch close

2018-12-03 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-2343.
---
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Capture operator timing information covering open/close & first/last batch 
> close
> 
>
> Key: IMPALA-2343
> URL: https://issues.apache.org/jira/browse/IMPALA-2343
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.2.4
>Reporter: Mostafa Mokhtar
>Assignee: Tim Armstrong
>Priority: Minor
>  Labels: performance, supportability
> Fix For: Impala 3.2.0
>
> Attachments: 
> 0001-Add-start-and-end-time-to-a-bunch-of-plan-nodes.patch
>
>
> Currently Impala query profile doesn't cover operator level timeline, which 
> makes it difficult to understand the query timeline and fragment dependencies.
> Such information will allow us to provide query swim lanes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-6741) Profiles of running queries should tell last update time of counters

2018-12-03 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho resolved IMPALA-6741.

   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Profiles of running queries should tell last update time of counters
> 
>
> Key: IMPALA-6741
> URL: https://issues.apache.org/jira/browse/IMPALA-6741
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Balazs Jeszenszky
>Assignee: Michael Ho
>Priority: Major
>  Labels: ramp-up, supportability
> Fix For: Impala 3.2.0
>
>
> When looking at the profile of a running query, it's impossible to tell the 
> degree of accuracy. We've seen issues both with instances not checking in 
> with the coordinator for a long time, and with hung instances that never 
> update their counters. There are some specific issues as well, see 
> IMPALA-5200. This means that profiles taken off of running queries can't be 
> used perf troubleshooting with confidence.
> Ideally, Impala should guarantee counters to be written at a certain 
> interval, and warn for counters or instances that are out of sync for some 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-2990) Coordinator should timeout a connection for an unresponsive backend

2018-12-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708083#comment-16708083
 ] 

ASF subversion and git services commented on IMPALA-2990:
-

Commit df4ccf5ddf9340049d193d7e6244c8af88a2dd5c in impala's branch 
refs/heads/master from Michael Ho
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=df4ccf5 ]

IMPALA-6741: Add timestamp of fragment instance's status updates

Currently, the profile of a running query doesn't contain any
timestamps for the last updates from the fragment instances.
This makes it hard to differentiate between a fragment instance
that failed to send status reports to the coordinator for various
reasons (e.g. IMPALA-2990) and a truly stuck fragment instance.

This change adds a timestamp to a fragment instance's profile
to record the time when the coordinator last received a status
update from it. Note that there may be a delay between when the
status was created on the executor and when it arrived at the
coordinator. Given that the clocks are not necessarily synchronized
across all executors, the receiving time of the update at the
coordinator is easier to make sense of.
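
As a rough illustration of that idea (a hedged sketch with hypothetical names, 
not the actual coordinator code), the coordinator can simply stamp the 
wall-clock time whenever a report arrives and render it alongside the instance 
profile:

{code:cpp}
#include <chrono>
#include <ctime>
#include <string>

// Hypothetical sketch: record the coordinator-side arrival time of each
// status report and expose it as a profile string. Names are illustrative,
// not Impala's actual classes.
class InstanceReportTracker {
 public:
  void OnStatusReportReceived() {
    last_report_time_ = std::chrono::system_clock::now();
  }

  // Formats the last arrival time, e.g. "2018-11-27 16:57:30".
  std::string LastReportReceivedTime() const {
    std::time_t t = std::chrono::system_clock::to_time_t(last_report_time_);
    char buf[32];
    std::strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", std::localtime(&t));
    return buf;
  }

 private:
  std::chrono::system_clock::time_point last_report_time_;
};
{code}

The "Last report received time" line in the sample output below corresponds to 
this kind of coordinator-side timestamp.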

Sample output:

Fragment F01:
  Instance 494d948d3235441a:23eae1790001 (host=???):(Total: 15.099ms, 
non-child: 263.951us, % non-child: 1.75%)
Last report received time: 2018-11-27 16:57:30.014
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/1.58 KB
Fragment Instance Lifecycle Event Timeline: 15.622ms
   - Prepare Finished: 1.026ms (1.026ms)
   - Open Finished: 1.137ms (110.297us)
   - First Batch Produced: 15.010ms (13.873ms)
   - First Batch Sent: 15.080ms (70.715us)
   - ExecInternal Finished: 15.622ms (541.181us)

Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Reviewed-on: http://gerrit.cloudera.org:8080/12000
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Coordinator should timeout a connection for an unresponsive backend
> ---
>
> Key: IMPALA-2990
> URL: https://issues.apache.org/jira/browse/IMPALA-2990
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.3.0
>Reporter: Sailesh Mukil
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: hang, observability, supportability
>
> The coordinator currently waits indefinitely if it does not hear back from a 
> backend. This could cause a query to hang indefinitely in case of a network 
> error, etc.
> We should add logic for determining when a backend is unresponsive and kill 
> the query. The logic should mostly revolve around Coordinator::Wait() and 
> Coordinator::UpdateFragmentExecStatus() based on whether it receives periodic 
> updates from a backend (via FragmentExecState::ReportStatusCb()).
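
A minimal sketch of the kind of watchdog the description proposes (hypothetical 
names; Coordinator::Wait() and FragmentExecState::ReportStatusCb() are the real 
entry points mentioned above, but this code is illustrative only):

{code:cpp}
#include <chrono>
#include <string>
#include <unordered_map>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical watchdog state, illustrative only.
class BackendLivenessSketch {
 public:
  explicit BackendLivenessSketch(std::chrono::seconds timeout) : timeout_(timeout) {}

  // Called from the status-report path (cf. FragmentExecState::ReportStatusCb()).
  void OnReport(const std::string& backend) { last_report_[backend] = Clock::now(); }

  // Called periodically (cf. Coordinator::Wait()); returns backends that have
  // been silent for longer than the timeout and should trigger cancellation.
  std::vector<std::string> UnresponsiveBackends() const {
    std::vector<std::string> stale;
    auto now = Clock::now();
    for (const auto& entry : last_report_) {
      if (now - entry.second > timeout_) stale.push_back(entry.first);
    }
    return stale;
  }

 private:
  std::chrono::seconds timeout_;
  std::unordered_map<std::string, Clock::time_point> last_report_;
};
{code}

A periodic check like this would let the coordinator cancel the query instead 
of waiting indefinitely on a silent backend.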



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-2343) Capture operator timing information covering open/close & first/last batch close

2018-12-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-2343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708084#comment-16708084
 ] 

ASF subversion and git services commented on IMPALA-2343:
-

Commit c23c852bd0d019df670109fa7420e107a757e6dd in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c23c852 ]

IMPALA-2343: Add lifecycle timeline to plan nodes

Track the time when various significant events in the lifecycle
of an ExecNode occur in the timeline of fragment execution. The events
tracked are: 'Open Started', 'Open Finished', 'First Batch Fetched',
'First Batch Returned', 'Last Batch Returned', 'Closed'.

This uses the existing EventSequence infrastructure so that time will
correspond to the fragment instance lifecycle timelines. It's
implemented mostly using scoped objects that add the events when
entering or exiting Open() and GetNext().
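
As a rough sketch of that scoped-object pattern (hypothetical names; 
EventSequence exists in Impala but this helper is illustrative only), an RAII 
wrapper records one event on entry and one on exit, so early returns from 
Open() or GetNext() are still captured:

{code:cpp}
#include <chrono>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-in for Impala's EventSequence: records (label, time) pairs.
struct EventSequenceSketch {
  void MarkEvent(std::string label) {
    events.emplace_back(std::move(label), Now());
  }
  static int64_t Now() {
    using namespace std::chrono;
    return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
  }
  std::vector<std::pair<std::string, int64_t>> events;
};

// RAII helper: one event when the scope is entered, one when it is left,
// so every return path out of Open()/GetNext() is recorded.
class ScopedLifecycleEvent {
 public:
  ScopedLifecycleEvent(EventSequenceSketch* seq, std::string enter, std::string exit)
      : seq_(seq), exit_(std::move(exit)) {
    seq_->MarkEvent(std::move(enter));
  }
  ~ScopedLifecycleEvent() { seq_->MarkEvent(std::move(exit_)); }

 private:
  EventSequenceSketch* seq_;
  std::string exit_;
};

// Usage inside a hypothetical plan node:
//   Status Open() {
//     ScopedLifecycleEvent ev(&events_, "Open Started", "Open Finished");
//     ...  // any return path records "Open Finished"
//   }
{code}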

These times are not set inside subplans because it would be mostly
redundant with the counters in the containing subplan node.

Also fix MaterializeTupleTime for Kudu scan nodes to match the timing of
other scanners, where it measures the time spent in the scanner threads,
not the time in the main fragment execution thread.

Testing:
Added a basic test to verify that the event sequence is present.

Manually inspected some profiles.

Verified the Kudu timer by running a query with scanner parallelism and
checking that MaterializeTupleTime was > the wallclock time.

  set num_nodes=1;
  select * from tpch_kudu.lineitem
  where lower(l_comment) = 'foo';

Perf:
Ran TPC-H 10 locally. There was no significant perf change.

Ran TPC-H Nested locally. There was no significant perf change.

Change-Id: I15341bdb15022bad9814882689ce5cb2939f4653
Reviewed-on: http://gerrit.cloudera.org:8080/11992
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Capture operator timing information covering open/close & first/last batch 
> close
> 
>
> Key: IMPALA-2343
> URL: https://issues.apache.org/jira/browse/IMPALA-2343
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.2.4
>Reporter: Mostafa Mokhtar
>Assignee: Tim Armstrong
>Priority: Minor
>  Labels: performance, supportability
> Attachments: 
> 0001-Add-start-and-end-time-to-a-bunch-of-plan-nodes.patch
>
>
> Currently Impala query profile doesn't cover operator level timeline, which 
> makes it difficult to understand the query timeline and fragment dependencies.
> Such information will allow us to provide query swim lanes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6741) Profiles of running queries should tell last update time of counters

2018-12-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708082#comment-16708082
 ] 

ASF subversion and git services commented on IMPALA-6741:
-

Commit df4ccf5ddf9340049d193d7e6244c8af88a2dd5c in impala's branch 
refs/heads/master from Michael Ho
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=df4ccf5 ]

IMPALA-6741: Add timestamp of fragment instance's status updates

Currently, the profile of a running query doesn't contain any
timestamps for the last updates from the fragment instances.
This makes it hard to differentiate between a fragment instance
that failed to send status reports to the coordinator for various
reasons (e.g. IMPALA-2990) and a truly stuck fragment instance.

This change adds a timestamp to a fragment instance's profile
to record the time when the coordinator last received a status
update from it. Note that there may be a delay between when the
status was created on the executor and when it arrived at the
coordinator. Given that the clocks are not necessarily synchronized
across all executors, the receiving time of the update at the
coordinator is easier to make sense of.

Sample output:

Fragment F01:
  Instance 494d948d3235441a:23eae1790001 (host=???):(Total: 15.099ms, 
non-child: 263.951us, % non-child: 1.75%)
Last report received time: 2018-11-27 16:57:30.014
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/1.58 KB
Fragment Instance Lifecycle Event Timeline: 15.622ms
   - Prepare Finished: 1.026ms (1.026ms)
   - Open Finished: 1.137ms (110.297us)
   - First Batch Produced: 15.010ms (13.873ms)
   - First Batch Sent: 15.080ms (70.715us)
   - ExecInternal Finished: 15.622ms (541.181us)

Change-Id: Iae3dcddc292d694d7003d10ed0caccfceed7d8fa
Reviewed-on: http://gerrit.cloudera.org:8080/12000
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Profiles of running queries should tell last update time of counters
> 
>
> Key: IMPALA-6741
> URL: https://issues.apache.org/jira/browse/IMPALA-6741
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Balazs Jeszenszky
>Assignee: Michael Ho
>Priority: Major
>  Labels: ramp-up, supportability
>
> When looking at the profile of a running query, it's impossible to tell the 
> degree of accuracy. We've seen issues both with instances not checking in 
> with the coordinator for a long time, and with hung instances that never 
> update their counters. There are some specific issues as well, see 
> IMPALA-5200. This means that profiles taken off of running queries can't be 
> used perf troubleshooting with confidence.
> Ideally, Impala should guarantee counters to be written at a certain 
> interval, and warn for counters or instances that are out of sync for some 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7922) Impala 2.x build is broken due to update to Jackson dependencies

2018-12-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708081#comment-16708081
 ] 

ASF subversion and git services commented on IMPALA-7922:
-

Commit e288128ba244d073810c5cac0858187e64297d2c in impala's branch 
refs/heads/2.x from [~fredyw]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e288128 ]

IMPALA-7922: Fix 2.x broken build due to Jackson dependency update

This patch forces some Jackson dependencies to specific versions.

Testing:
- Ran ./buildall.sh -format -testdata -notests successfully

Change-Id: Icf03f5db5b61187376c98e89a4eb16b90572cdd4
Reviewed-on: http://gerrit.cloudera.org:8080/12027
Reviewed-by: Philip Zeyliger 
Tested-by: Fredy Wijaya 


> Impala 2.x build is broken due to update to Jackson dependencies
> 
>
> Key: IMPALA-7922
> URL: https://issues.apache.org/jira/browse/IMPALA-7922
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Blocker
>
> {noformat}
> *Scanning dependencies of target yarn-extras*
> 
> Running mvn  -B install -DskipTests
> Directory /mnt/volume1/impala-orc/incubator-impala/common/yarn-extras
> 
> [WARNING] Could not transfer metadata
> com.cloudera.cdh:cdh-root:5.16.0-SNAPSHOT/maven-metadata.xml from/to
> ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}): No connector available to
> access repository ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}) of type
> default using the available factories WagonRepositoryConnectorFactory
> [INFO] BUILD FAILURE
> [ERROR] Failed to execute goal on project yarn-extras: Could not resolve
> dependencies for project org.apache.impala:yarn-extras:jar:0.1-SNAPSHOT:
> Failed to collect dependencies for
> [org.apache.hadoop:hadoop-common:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
> org.apache.hadoop:hadoop-yarn-api:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
> org.apache.hadoop:hadoop-yarn-common:jar:2.6.0-cdh5.16.0-SNAPSHOT
> (compile)]: Failed to read artifact descriptor for
> org.codehaus.jackson:jackson-mapper-asl:jar:${cdh.jackson-mapper-asl.version}:
> Could not transfer artifact
> org.codehaus.jackson:jackson-mapper-asl:pom:${cdh.jackson-mapper-asl.version}
> from/to cdh.rcs.releases.repo (
> https://repository.cloudera.com/content/groups/cdh-releases-rcs): Illegal
> character in path at index 105:
> https://repository.cloudera.com/content/groups/cdh-releases-rcs/org/codehaus/jackson/jackson-mapper-asl/${cdh.jackson-mapper-asl.version}/jackson-mapper-asl-${cdh.jackson-mapper-asl.version}.pom
> -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions,
> please read the following articles:
> [ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-6754) Exchange node includes FirstBatchArrivalWaitTime in summary

2018-12-03 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho resolved IMPALA-6754.

   Resolution: Fixed
Fix Version/s: Impala 2.12.0
   Impala 3.1.0

The problem has been fixed (albeit not intentionally) with KRPC. Impala 2.12.0 
and beyond have KRPC enabled by default. In Impala 3.1.0 and upstream, KRPC is 
always on so there is not much we need to do.

> Exchange node includes FirstBatchArrivalWaitTime in summary
> ---
>
> Key: IMPALA-6754
> URL: https://issues.apache.org/jira/browse/IMPALA-6754
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Balazs Jeszenszky
>Assignee: Michael Ho
>Priority: Minor
>  Labels: observability
> Fix For: Impala 3.1.0, Impala 2.12.0
>
>
> In the following execution summary:
> {code:java}
> Operator  #Hosts   Avg Time   Max Time  #Rows  Est. #Rows   Peak Mem  
> Est. Peak Mem  Detail
> ---
> 15:AGGREGATE   1  141.556us  141.556us  1   1   20.00 KB  
>  10.00 MB  FINALIZE
> 14:EXCHANGE1  2h46m  2h46m  1   1  0  
> 0  UNPARTITIONED
> 09:AGGREGATE   1   16s442ms   16s442ms  1   1  768.00 KB  
>  10.00 MB
> 08:HASH JOIN   1  1h38m  1h38m  2.63B  -1  122.64 MB  
>   2.00 GB  LEFT OUTER JOIN, BROADCAST
> [...]
> {code}
> the timer for the EXCHANGE node is misleading. It's unlikely that sending a 
> single row across the network took so long, and individual counters confirm it:
> {code:java}
>   EXCHANGE_NODE (id=14):(Total: 2h46m, non-child: 2h46m, % non-child: 
> 100.00%)
>  - ConvertRowBatchTime: 901.000ns
>  - PeakMemoryUsage: 0
>  - RowsReturned: 1 (1)
>  - RowsReturnedRate: 0
> DataStreamReceiver:
>- BytesReceived: 16.00 B (16)
>- DeserializeRowBatchTimer: 4.965us
>- FirstBatchArrivalWaitTime: 2h46m
>- PeakMemoryUsage: 4.01 KB (4104)
>- SendersBlockedTimer: 0.000ns
>- SendersBlockedTotalTimer(*): 0.000ns
> {code}
> In this case, the underlying joins took long (which is correctly reported in 
> the rest of the profile). 
> Exchange timers should reflect the time it took to transfer rows across the 
> network.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6754) Exchange node includes FirstBatchArrivalWaitTime in summary

2018-12-03 Thread Michael Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708058#comment-16708058
 ] 

Michael Ho commented on IMPALA-6754:


Things have definitely improved with KRPC. Please find a sample excerpt below:
{noformat}
Operator            #Hosts   Avg Time   Max Time  #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------
09:AGGREGATE             1  249.568us  249.568us      1           1   16.00 KB       10.00 MB  FINALIZE
08:EXCHANGE              1   45.272us   45.272us      3           1   32.00 KB       16.00 KB  UNPARTITIONED
04:AGGREGATE             3  114.728us  140.725us      3           1   16.00 KB       10.00 MB
07:AGGREGATE             3    1.954ms    2.181ms     10          10    1.95 MB       10.00 MB
06:EXCHANGE              3   16.369us   17.213us     10          10   16.00 KB       16.00 KB  HASH(a.int_col)
03:AGGREGATE             3  998.326us    1.110ms     10          10    2.09 MB       10.00 MB  STREAMING
02:HASH JOIN             3    5s036ms    6s042ms    100       7.30K    1.99 MB        1.94 MB  INNER JOIN, BROADCAST
|--05:EXCHANGE           3   63.364us   81.605us    100         100   48.00 KB       16.00 KB  BROADCAST
|  01:SCAN HDFS          3    4.197ms    4.988ms    100         100   50.00 KB       32.00 MB  functional.alltypessmall b
00:SCAN HDFS             3   35.649ms   44.940ms  7.30K       7.30K  419.00 KB      128.00 MB  functional.alltypes a
{noformat}

The exchange node's detailed profile, however, still shows a much higher total 
time, mostly due to the long wait for the first batch to arrive:
{noformat}
EXCHANGE_NODE (id=6):(Total: 6s074ms, non-child: 16.886us, % non-child: 
0.00%)
   - ConvertRowBatchTime: 2.564us
   - PeakMemoryUsage: 16.00 KB (16384)
   - RowsReturned: 2 (2)
   - RowsReturnedRate: 0
  Buffer pool:
 - AllocTime: 2.118us
 - CumulativeAllocationBytes: 16.00 KB (16384)
 - CumulativeAllocations: 2 (2)
 - PeakReservation: 16.00 KB (16384)
 - PeakUnpinnedBytes: 0
 - PeakUsedReservation: 16.00 KB (16384)
 - ReadIoBytes: 0
 - ReadIoOps: 0 (0)
 - ReadIoWaitTime: 0.000ns
 - WriteIoBytes: 0
 - WriteIoOps: 0 (0)
 - WriteIoWaitTime: 0.000ns
  Dequeue:
BytesDequeued(500.000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
 - FirstBatchWaitTime: 6s074ms
 - TotalBytesDequeued: 26.00 B (26)
 - TotalGetBatchTime: 6s074ms
   - DataWaitTime: 6s074ms
{noformat}


> Exchange node includes FirstBatchArrivalWaitTime in summary
> ---
>
> Key: IMPALA-6754
> URL: https://issues.apache.org/jira/browse/IMPALA-6754
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.11.0
>Reporter: Balazs Jeszenszky
>Assignee: Michael Ho
>Priority: Minor
>  Labels: observability
>
> In the following execution summary:
> {code:java}
> Operator  #Hosts   Avg Time   Max Time  #Rows  Est. #Rows   Peak Mem  
> Est. Peak Mem  Detail
> ---
> 15:AGGREGATE   1  141.556us  141.556us  1   1   20.00 KB  
>  10.00 MB  FINALIZE
> 14:EXCHANGE1  2h46m  2h46m  1   1  0  
> 0  UNPARTITIONED
> 09:AGGREGATE   1   16s442ms   16s442ms  1   1  768.00 KB  
>  10.00 MB
> 08:HASH JOIN   1  1h38m  1h38m  2.63B  -1  122.64 MB  
>   2.00 GB  LEFT OUTER JOIN, BROADCAST
> [...]
> {code}
> the timer for the EXCHANGE node is misleading. It's unlikely that sending a 
> single row across the network took so long, and individual counters confirm it:
> {code:java}
>   EXCHANGE_NODE (id=14):(Total: 2h46m, non-child: 2h46m, % non-child: 
> 100.00%)
>  - ConvertRowBatchTime: 901.000ns
>  - PeakMemoryUsage: 0
>  - RowsReturned: 1 (1)
>  - RowsReturnedRate: 0
> DataStreamReceiver:
>- BytesReceived: 16.00 B (16)
>- DeserializeRowBatchTimer: 4.965us
>- FirstBatchArrivalWaitTime: 2h46m
>- PeakMemoryUsage: 4.01 KB (4104)
>- SendersBlockedTimer: 0.000ns
>- SendersBlockedTotalTimer(*): 0.000ns
> {code}
> In this case, the underlying joins took long (which is correctly reported in 
> the rest 

[jira] [Commented] (IMPALA-7326) test_kudu_partition_ddl failed with exception message: "Table already exists"

2018-12-03 Thread bharath v (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707986#comment-16707986
 ] 

bharath v commented on IMPALA-7326:
---

The stack trace looks like it is coming from a thread with a different tid 
(4057) from the one that reports the error. I'm not sure what is triggering it. 
Based on my understanding, HBaseClient has a ZooKeeper dependency and typically 
dumps these stacks (perhaps some HBase test was running?). I don't know whether 
KuduClient does something similar.

> test_kudu_partition_ddl failed with exception message: "Table already exists"
> -
>
> Key: IMPALA-7326
> URL: https://issues.apache.org/jira/browse/IMPALA-7326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build, flaky, kudu
>
> cc'ing [~twm378]. Does it look like some known issue ? Putting it in the 
> catalog category for now but please feel free to update the component as you 
> see fit.
> {noformat}
> query_test/test_kudu.py:96: in test_kudu_partition_ddl
> self.run_test_case('QueryTest/kudu_partition_ddl', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:397: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:612: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error creating Kudu table 
> 'impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range'
> E   CAUSED BY: NonRecoverableException: Table 
> impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range already exists 
> with id 3e81a4ceff27471cad9fcb3bc0b977c3
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7922) Impala 2.x build is broken due to update to Jackson dependencies

2018-12-03 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya resolved IMPALA-7922.
--
Resolution: Fixed

> Impala 2.x build is broken due to update to Jackson dependencies
> 
>
> Key: IMPALA-7922
> URL: https://issues.apache.org/jira/browse/IMPALA-7922
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Blocker
>
> {noformat}
> *Scanning dependencies of target yarn-extras*
> 
> Running mvn  -B install -DskipTests
> Directory /mnt/volume1/impala-orc/incubator-impala/common/yarn-extras
> 
> [WARNING] Could not transfer metadata
> com.cloudera.cdh:cdh-root:5.16.0-SNAPSHOT/maven-metadata.xml from/to
> ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}): No connector available to
> access repository ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}) of type
> default using the available factories WagonRepositoryConnectorFactory
> [INFO] BUILD FAILURE
> [ERROR] Failed to execute goal on project yarn-extras: Could not resolve
> dependencies for project org.apache.impala:yarn-extras:jar:0.1-SNAPSHOT:
> Failed to collect dependencies for
> [org.apache.hadoop:hadoop-common:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
> org.apache.hadoop:hadoop-yarn-api:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
> org.apache.hadoop:hadoop-yarn-common:jar:2.6.0-cdh5.16.0-SNAPSHOT
> (compile)]: Failed to read artifact descriptor for
> org.codehaus.jackson:jackson-mapper-asl:jar:${cdh.jackson-mapper-asl.version}:
> Could not transfer artifact
> org.codehaus.jackson:jackson-mapper-asl:pom:${cdh.jackson-mapper-asl.version}
> from/to cdh.rcs.releases.repo (
> https://repository.cloudera.com/content/groups/cdh-releases-rcs): Illegal
> character in path at index 105:
> https://repository.cloudera.com/content/groups/cdh-releases-rcs/org/codehaus/jackson/jackson-mapper-asl/${cdh.jackson-mapper-asl.version}/jackson-mapper-asl-${cdh.jackson-mapper-asl.version}.pom
> -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions,
> please read the following articles:
> [ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IMPALA-7326) test_kudu_partition_ddl failed with exception message: "Table already exists"

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-7326:

Affects Version/s: Impala 3.2.0

> test_kudu_partition_ddl failed with exception message: "Table already exists"
> -
>
> Key: IMPALA-7326
> URL: https://issues.apache.org/jira/browse/IMPALA-7326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build, flaky, kudu
>
> cc'ing [~twm378]. Does it look like some known issue ? Putting it in the 
> catalog category for now but please feel free to update the component as you 
> see fit.
> {noformat}
> query_test/test_kudu.py:96: in test_kudu_partition_ddl
> self.run_test_case('QueryTest/kudu_partition_ddl', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:397: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:612: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error creating Kudu table 
> 'impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range'
> E   CAUSED BY: NonRecoverableException: Table 
> impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range already exists 
> with id 3e81a4ceff27471cad9fcb3bc0b977c3
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-7326) test_kudu_partition_ddl failed with exception message: "Table already exists"

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707967#comment-16707967
 ] 

Lars Volker edited comment on IMPALA-7326 at 12/3/18 11:27 PM:
---

I think I've seen this again, although with a slightly different error message. 
[~twmarshall] - Can you please have a look?

{noformat}
06:45:54 raise HiveServer2Error(resp.status.errorMessage)
06:45:54 E   HiveServer2Error: ImpalaRuntimeException: Error creating Kudu 
table 'impala::test_large_strings_7779cca8.min_max_filter_large_strings2'
06:45:54 E   CAUSED BY: ImpalaRuntimeException: Table 
'impala::test_large_strings_7779cca8.min_max_filter_large_strings2' already 
exists in Kudu.
{noformat}

Around the time of the error I noticed a bunch of Zookeeper issues in the 
catalog log:

{noformat}
W1201 05:52:38.762055  4057 ClientCnxn.java:1147] Session 0x0 for server null, 
unexpected error, closing socket connection and attempting reconnect
Java exception follows:
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1126)
W1201 05:52:39.862617  4057 ClientCnxn.java:1147] Session 0x0 for server null, 
unexpected error, closing socket connection and attempting reconnect
Java exception follows:
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1126)
E1201 05:52:39.874665 14534 catalog-server.cc:107] ImpalaRuntimeException: 
Error creating Kudu table 
'impala::test_large_strings_7779cca8.min_max_filter_large_strings2'
CAUSED BY: ImpalaRuntimeException: Table 
'impala::test_large_strings_7779cca8.min_max_filter_large_strings2' already 
exists in Kudu.
W1201 05:52:39.963085  4057 ClientCnxn.java:1147] Session 0x0 for server null, 
unexpected error, closing socket connection and attempting reconnect
Java exception follows:
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1126)
{noformat}

Would Zookeeper unavailability explain why Kudu thinks a table already exists? 
The mismatch of timestamps is because Jenkins reported the error at the end of 
the test run.

[~bharathv], [~tianyiwang] - Have you seen these errors before?


was (Author: lv):
I think I've seen this again, although with a slightly different error message. 
[~twmarshall] - Can you please have a look?

{noformat}
06:45:54 raise HiveServer2Error(resp.status.errorMessage)
06:45:54 E   HiveServer2Error: ImpalaRuntimeException: Error creating Kudu 
table 'impala::test_large_strings_7779cca8.min_max_filter_large_strings2'
06:45:54 E   CAUSED BY: ImpalaRuntimeException: Table 
'impala::test_large_strings_7779cca8.min_max_filter_large_strings2' already 
exists in Kudu.
{noformat}

> test_kudu_partition_ddl failed with exception message: "Table already exists"
> -
>
> Key: IMPALA-7326
> URL: https://issues.apache.org/jira/browse/IMPALA-7326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build, flaky, kudu
>
> cc'ing [~twm378]. Does it look like some known issue ? Putting it in the 
> catalog category for now but please feel free to update the component as you 
> see fit.
> {noformat}
> query_test/test_kudu.py:96: in test_kudu_partition_ddl
> self.run_test_case('QueryTest/kudu_partition_ddl', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:397: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:612: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> 

[jira] [Updated] (IMPALA-7922) Impala 2.x build is broken due to update to Jackson dependencies

2018-12-03 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya updated IMPALA-7922:
-
Affects Version/s: (was: Impala 2.12.0)

> Impala 2.x build is broken due to update to Jackson dependencies
> 
>
> Key: IMPALA-7922
> URL: https://issues.apache.org/jira/browse/IMPALA-7922
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Blocker
>
> {noformat}
> *Scanning dependencies of target yarn-extras*
> 
> Running mvn  -B install -DskipTests
> Directory /mnt/volume1/impala-orc/incubator-impala/common/yarn-extras
> 
> [WARNING] Could not transfer metadata
> com.cloudera.cdh:cdh-root:5.16.0-SNAPSHOT/maven-metadata.xml from/to
> ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}): No connector available to
> access repository ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}) of type
> default using the available factories WagonRepositoryConnectorFactory
> [INFO] BUILD FAILURE
> [ERROR] Failed to execute goal on project yarn-extras: Could not resolve
> dependencies for project org.apache.impala:yarn-extras:jar:0.1-SNAPSHOT:
> Failed to collect dependencies for
> [org.apache.hadoop:hadoop-common:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
> org.apache.hadoop:hadoop-yarn-api:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
> org.apache.hadoop:hadoop-yarn-common:jar:2.6.0-cdh5.16.0-SNAPSHOT
> (compile)]: Failed to read artifact descriptor for
> org.codehaus.jackson:jackson-mapper-asl:jar:${cdh.jackson-mapper-asl.version}:
> Could not transfer artifact
> org.codehaus.jackson:jackson-mapper-asl:pom:${cdh.jackson-mapper-asl.version}
> from/to cdh.rcs.releases.repo (
> https://repository.cloudera.com/content/groups/cdh-releases-rcs): Illegal
> character in path at index 105:
> https://repository.cloudera.com/content/groups/cdh-releases-rcs/org/codehaus/jackson/jackson-mapper-asl/${cdh.jackson-mapper-asl.version}/jackson-mapper-asl-${cdh.jackson-mapper-asl.version}.pom
> -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions,
> please read the following articles:
> [ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7326) test_kudu_partition_ddl failed with exception message: "Table already exists"

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707967#comment-16707967
 ] 

Lars Volker commented on IMPALA-7326:
-

I think I've seen this again, although with a slightly different error message. 
[~twmarshall] - Can you please have a look?

{noformat}
06:45:54 raise HiveServer2Error(resp.status.errorMessage)
06:45:54 E   HiveServer2Error: ImpalaRuntimeException: Error creating Kudu 
table 'impala::test_large_strings_7779cca8.min_max_filter_large_strings2'
06:45:54 E   CAUSED BY: ImpalaRuntimeException: Table 
'impala::test_large_strings_7779cca8.min_max_filter_large_strings2' already 
exists in Kudu.
{noformat}

> test_kudu_partition_ddl failed with exception message: "Table already exists"
> -
>
> Key: IMPALA-7326
> URL: https://issues.apache.org/jira/browse/IMPALA-7326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build, flaky, kudu
>
> cc'ing [~twm378]. Does it look like some known issue ? Putting it in the 
> catalog category for now but please feel free to update the component as you 
> see fit.
> {noformat}
> query_test/test_kudu.py:96: in test_kudu_partition_ddl
> self.run_test_case('QueryTest/kudu_partition_ddl', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:397: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:612: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error creating Kudu table 
> 'impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range'
> E   CAUSED BY: NonRecoverableException: Table 
> impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range already exists 
> with id 3e81a4ceff27471cad9fcb3bc0b977c3
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Reopened] (IMPALA-7326) test_kudu_partition_ddl failed with exception message: "Table already exists"

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reopened IMPALA-7326:
-

> test_kudu_partition_ddl failed with exception message: "Table already exists"
> -
>
> Key: IMPALA-7326
> URL: https://issues.apache.org/jira/browse/IMPALA-7326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build, flaky, kudu
>
> cc'ing [~twm378]. Does it look like some known issue ? Putting it in the 
> catalog category for now but please feel free to update the component as you 
> see fit.
> {noformat}
> query_test/test_kudu.py:96: in test_kudu_partition_ddl
> self.run_test_case('QueryTest/kudu_partition_ddl', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:397: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:612: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error creating Kudu table 
> 'impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range'
> E   CAUSED BY: NonRecoverableException: Table 
> impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range already exists 
> with id 3e81a4ceff27471cad9fcb3bc0b977c3
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7326) test_kudu_partition_ddl failed with exception message: "Table already exists"

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-7326:

Labels: broken-build flaky kudu  (was: broken-build)

> test_kudu_partition_ddl failed with exception message: "Table already exists"
> -
>
> Key: IMPALA-7326
> URL: https://issues.apache.org/jira/browse/IMPALA-7326
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0, Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build, flaky, kudu
>
> cc'ing [~twm378]. Does it look like some known issue ? Putting it in the 
> catalog category for now but please feel free to update the component as you 
> see fit.
> {noformat}
> query_test/test_kudu.py:96: in test_kudu_partition_ddl
> self.run_test_case('QueryTest/kudu_partition_ddl', vector, 
> use_db=unique_database)
> common/impala_test_suite.py:397: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:612: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:160: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:173: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:339: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:335: in execute_query_async
> return self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:460: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EINNER EXCEPTION: 
> EMESSAGE: ImpalaRuntimeException: Error creating Kudu table 
> 'impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range'
> E   CAUSED BY: NonRecoverableException: Table 
> impala::test_kudu_partition_ddl_7e04e8f9.simple_hash_range already exists 
> with id 3e81a4ceff27471cad9fcb3bc0b977c3
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7881) Visualize AST for easier debugging

2018-12-03 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707959#comment-16707959
 ] 

ASF subversion and git services commented on IMPALA-7881:
-

Commit 971adb2f8f5614473d115ac1d775a8fc3ccee201 in impala's branch 
refs/heads/master from [~paul-rogers]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=971adb2 ]

IMPALA-7881 (Part 2): Visualize AST for easier debugging

The AST visualizer has turned out to be very handy for debugging
analyzer issues. This patch contains another set of enhancements to make
it easier to use, including an easy way to visualize a node (and a few
of its descendants) or an entire tree from a debug session in Eclipse.

Testing: This is a test-only feature; it is not used from any production
code.

Change-Id: I409cabad9ec8c4dcf16c7e863dada58754d5eac1
Reviewed-on: http://gerrit.cloudera.org:8080/12015
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Visualize AST for easier debugging
> --
>
> Key: IMPALA-7881
> URL: https://issues.apache.org/jira/browse/IMPALA-7881
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The parser creates a "raw" AST (abstract syntax tree), which is then 
> "decorated" by the analyzer. Often, when debugging the analyzer, one wants to 
> see the state of the tree. At present, doing so using an IDE's debugger is 
> tedious as one has to slowly navigate within the tree.
>  Provide a debug tool that visualizes the tree. For example:
> {noformat}
>  (SelectStmt): {
> . isExplain: false
> . analyzer: 
> . withClause: 
> . orderByElements: [
> . . 0 (OrderByElement): {
> . . . expr (SlotRef): {
> ...
> . selectList (SelectList): {
> . . planHints: []
> . . isDistinct: false
> . . items: [
> . . . 0 (SelectListItem): {
> . . . . expr (SlotRef): {
> ...
> . . . . . rawPath: [
> . . . . . . 0: "id"
> . . . . . ]
> . . . . . label: "id"
> {noformat}
> Many improvements can be made. (Format as JSON, export to a nice JSON 
> visualizer, etc.) The purpose here is to just get started.
> To avoid the need to write code for every AST node class (of which there are 
> many), use Java introspection to walk fields directly. The result may be 
> overly verbose, but it is a quick way to get started.
> The idea is to use the visualizer in conjunction with a unit test:
> {code:java}
>   @Test
>   public void test() {
> String stmt =
> "SELECT id, int_col + 10 AS c" +
> " FROM functional.alltypestiny" +
> " WHERE id > 10" +
> " ORDER BY c";
> ParseNode root = AnalyzesOk(stmt);
> AstPrinter.printTree(root);
>   }
> {code}
> When debugging an issue, create a test. If things are not working, 
> temporarily insert a call to the visualizer to see what's what. Remove the 
> call when done.
> Poking at the AST from outside a unit test (perhaps from the Impala shell) is 
> a larger project, beyond the scope of this ticket.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-7922) Impala 2.x build is broken due to update to Jackson dependencies

2018-12-03 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7922 started by Fredy Wijaya.

> Impala 2.x build is broken due to update to Jackson dependencies
> 
>
> Key: IMPALA-7922
> URL: https://issues.apache.org/jira/browse/IMPALA-7922
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.12.0
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Blocker
>
> {noformat}
> *Scanning dependencies of target yarn-extras*
> 
> Running mvn  -B install -DskipTests
> Directory /mnt/volume1/impala-orc/incubator-impala/common/yarn-extras
> 
> [WARNING] Could not transfer metadata
> com.cloudera.cdh:cdh-root:5.16.0-SNAPSHOT/maven-metadata.xml from/to
> ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}): No connector available to
> access repository ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}) of type
> default using the available factories WagonRepositoryConnectorFactory
> [INFO] BUILD FAILURE
> [ERROR] Failed to execute goal on project yarn-extras: Could not resolve
> dependencies for project org.apache.impala:yarn-extras:jar:0.1-SNAPSHOT:
> Failed to collect dependencies for
> [org.apache.hadoop:hadoop-common:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
> org.apache.hadoop:hadoop-yarn-api:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
> org.apache.hadoop:hadoop-yarn-common:jar:2.6.0-cdh5.16.0-SNAPSHOT
> (compile)]: Failed to read artifact descriptor for
> org.codehaus.jackson:jackson-mapper-asl:jar:${cdh.jackson-mapper-asl.version}:
> Could not transfer artifact
> org.codehaus.jackson:jackson-mapper-asl:pom:${cdh.jackson-mapper-asl.version}
> from/to cdh.rcs.releases.repo (
> https://repository.cloudera.com/content/groups/cdh-releases-rcs): Illegal
> character in path at index 105:
> https://repository.cloudera.com/content/groups/cdh-releases-rcs/org/codehaus/jackson/jackson-mapper-asl/${cdh.jackson-mapper-asl.version}/jackson-mapper-asl-${cdh.jackson-mapper-asl.version}.pom
> -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions,
> please read the following articles:
> [ERROR] [Help 1]
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7232) Display whether fragment instances' profile is complete

2018-12-03 Thread Michael Ho (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Ho updated IMPALA-7232:
---
Labels: supportability  (was: )

> Display whether fragment instances' profile is complete
> ---
>
> Key: IMPALA-7232
> URL: https://issues.apache.org/jira/browse/IMPALA-7232
> Project: IMPALA
>  Issue Type: Task
>  Components: Distributed Exec
>Affects Versions: Impala 3.1.0
>Reporter: Michael Ho
>Assignee: Michael Ho
>Priority: Major
>  Labels: supportability
>
> While working on IMPALA-7213, it was noticed that we can fail to serialize or 
> deserialize a profile for various reasons. This shouldn't be fatal: the 
> fragment instance status can still be presented to the coordinator to avoid 
> hitting IMPALA-2990. However, a missing profile in the ReportExecStatus() RPC 
> may result in an incomplete or stale profile being presented to the Impala 
> client. It would be helpful to mark in the profile output whether the profile 
> may be incomplete and/or final.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Closed] (IMPALA-7662) test_parquet reads bad_magic_number.parquet without an error

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker closed IMPALA-7662.
---
Resolution: Fixed

I observed this on a branch that did not have the fix.

> test_parquet reads bad_magic_number.parquet without an error
> 
>
> Key: IMPALA-7662
> URL: https://issues.apache.org/jira/browse/IMPALA-7662
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
> Environment: Impala ddef2cb9b14e7f8cf9a68a2a382e10a8e0f91c3d 
> exhaustive debug build
>Reporter: Tianyi Wang
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: correctness
> Fix For: Impala 3.1.0
>
>
> {noformat}
> 09:51:41 === FAILURES 
> ===
> 09:51:41  TestParquet.test_parquet[exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': 
> 'HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT:FAIL@0.5', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
> 09:51:41 [gw5] linux2 -- Python 2.7.5 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/../infra/python/env/bin/python
> 09:51:41 query_test/test_scanners.py:300: in test_parquet
> 09:51:41 self.run_test_case('QueryTest/parquet', vector)
> 09:51:41 common/impala_test_suite.py:423: in run_test_case
> 09:51:41 assert False, "Expected exception: %s" % expected_str
> 09:51:41 E   AssertionError: Expected exception: File 
> 'hdfs://localhost:20500/test-warehouse/bad_magic_number_parquet/bad_magic_number.parquet'
>  has an invalid version number: 
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Reopened] (IMPALA-7662) test_parquet reads bad_magic_number.parquet without an error

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reopened IMPALA-7662:
-

> test_parquet reads bad_magic_number.parquet without an error
> 
>
> Key: IMPALA-7662
> URL: https://issues.apache.org/jira/browse/IMPALA-7662
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
> Environment: Impala ddef2cb9b14e7f8cf9a68a2a382e10a8e0f91c3d 
> exhaustive debug build
>Reporter: Tianyi Wang
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: correctness
> Fix For: Impala 3.1.0
>
>
> {noformat}
> 09:51:41 === FAILURES 
> ===
> 09:51:41  TestParquet.test_parquet[exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': 
> 'HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT:FAIL@0.5', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
> 09:51:41 [gw5] linux2 -- Python 2.7.5 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/../infra/python/env/bin/python
> 09:51:41 query_test/test_scanners.py:300: in test_parquet
> 09:51:41 self.run_test_case('QueryTest/parquet', vector)
> 09:51:41 common/impala_test_suite.py:423: in run_test_case
> 09:51:41 assert False, "Expected exception: %s" % expected_str
> 09:51:41 E   AssertionError: Expected exception: File 
> 'hdfs://localhost:20500/test-warehouse/bad_magic_number_parquet/bad_magic_number.parquet'
>  has an invalid version number: 
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7662) test_parquet reads bad_magic_number.parquet without an error

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707826#comment-16707826
 ] 

Lars Volker commented on IMPALA-7662:
-

I think I've seen this again. [~tarmstrong], can you have another look?

{noformat}
15:37:00  TestParquet.test_parquet[protocol: beeswax | exec_option: 
{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': 
'-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5', 
'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
15:37:00 [gw0] linux2 -- Python 2.7.5 env/bin/python
15:37:00 query_test/test_scanners.py:299: in test_parquet
15:37:00 self.run_test_case('QueryTest/parquet', vector)
15:37:00 common/impala_test_suite.py:482: in run_test_case
15:37:00 assert False, "Expected exception: %s" % expected_str
15:37:00 E   AssertionError: Expected exception: Invalid metadata size in file 
footer
{noformat}
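
For context, both of the expected errors here ("invalid version number" and 
"Invalid metadata size in file footer") come from validating the fixed framing of 
a Parquet file: the 4-byte magic "PAR1" at both ends and a 4-byte little-endian 
footer length just before the trailing magic. A rough Python sketch of those 
checks (file names are placeholders; this is not Impala's actual implementation):

{code:python}
import os
import struct

PARQUET_MAGIC = b"PAR1"  # appears at both the start and the end of a Parquet file

def check_parquet_framing(path):
    """Rough sketch of the framing checks a Parquet reader performs."""
    file_len = os.path.getsize(path)
    if file_len < 12:  # 4 (magic) + 4 (footer length) + 4 (magic)
        return "file too small to be Parquet"
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(file_len - 8)
        footer_len = struct.unpack("<i", f.read(4))[0]  # little-endian footer size
        tail = f.read(4)
    if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
        return "invalid magic number / version"  # e.g. bad_magic_number.parquet
    if footer_len <= 0 or footer_len + 8 > file_len:
        return "invalid metadata size in file footer"
    return "framing looks ok"

# Usage (the path is a placeholder):
# print(check_parquet_framing("bad_magic_number.parquet"))
{code}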

> test_parquet reads bad_magic_number.parquet without an error
> 
>
> Key: IMPALA-7662
> URL: https://issues.apache.org/jira/browse/IMPALA-7662
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
> Environment: Impala ddef2cb9b14e7f8cf9a68a2a382e10a8e0f91c3d 
> exhaustive debug build
>Reporter: Tianyi Wang
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: correctness
> Fix For: Impala 3.1.0
>
>
> {noformat}
> 09:51:41 === FAILURES 
> ===
> 09:51:41  TestParquet.test_parquet[exec_option: {'batch_size': 0, 
> 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': 
> False, 'abort_on_error': 1, 'debug_action': 
> 'HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT:FAIL@0.5', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
> 09:51:41 [gw5] linux2 -- Python 2.7.5 
> /data/jenkins/workspace/impala-asf-master-exhaustive/repos/Impala/bin/../infra/python/env/bin/python
> 09:51:41 query_test/test_scanners.py:300: in test_parquet
> 09:51:41 self.run_test_case('QueryTest/parquet', vector)
> 09:51:41 common/impala_test_suite.py:423: in run_test_case
> 09:51:41 assert False, "Expected exception: %s" % expected_str
> 09:51:41 E   AssertionError: Expected exception: File 
> 'hdfs://localhost:20500/test-warehouse/bad_magic_number_parquet/bad_magic_number.parquet'
>  has an invalid version number: 
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7922) Impala 2.x build is broken due to update to Jackson dependencies

2018-12-03 Thread Fredy Wijaya (JIRA)
Fredy Wijaya created IMPALA-7922:


 Summary: Impala 2.x build is broken due to update to Jackson 
dependencies
 Key: IMPALA-7922
 URL: https://issues.apache.org/jira/browse/IMPALA-7922
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 2.12.0
Reporter: Fredy Wijaya


{noformat}
*Scanning dependencies of target yarn-extras*



Running mvn  -B install -DskipTests

Directory /mnt/volume1/impala-orc/incubator-impala/common/yarn-extras



[WARNING] Could not transfer metadata
com.cloudera.cdh:cdh-root:5.16.0-SNAPSHOT/maven-metadata.xml from/to
${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}): No connector available to
access repository ${distMgmtSnapshotsId} (${distMgmtSnapshotsUrl}) of type
default using the available factories WagonRepositoryConnectorFactory

[INFO] BUILD FAILURE

[ERROR] Failed to execute goal on project yarn-extras: Could not resolve
dependencies for project org.apache.impala:yarn-extras:jar:0.1-SNAPSHOT:
Failed to collect dependencies for
[org.apache.hadoop:hadoop-common:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
org.apache.hadoop:hadoop-yarn-api:jar:2.6.0-cdh5.16.0-SNAPSHOT (compile),
org.apache.hadoop:hadoop-yarn-common:jar:2.6.0-cdh5.16.0-SNAPSHOT
(compile)]: Failed to read artifact descriptor for
org.codehaus.jackson:jackson-mapper-asl:jar:${cdh.jackson-mapper-asl.version}:
Could not transfer artifact
org.codehaus.jackson:jackson-mapper-asl:pom:${cdh.jackson-mapper-asl.version}
from/to cdh.rcs.releases.repo (
https://repository.cloudera.com/content/groups/cdh-releases-rcs): Illegal
character in path at index 105:
https://repository.cloudera.com/content/groups/cdh-releases-rcs/org/codehaus/jackson/jackson-mapper-asl/${cdh.jackson-mapper-asl.version}/jackson-mapper-asl-${cdh.jackson-mapper-asl.version}.pom
-> [Help 1]

[ERROR]

[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.

[ERROR] Re-run Maven using the -X switch to enable full debug logging.

[ERROR]

[ERROR] For more information about the errors and possible solutions,
please read the following articles:

[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
{noformat}
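
The path is rejected because the Maven property cdh.jackson-mapper-asl.version is 
never resolved, so the literal ${...} placeholder ends up in the repository URL; 
the character at index 105 of that URL is the opening brace of the unresolved 
placeholder, which is what the URL parser rejects. A quick check of that claim 
(URL copied from the log above):

{code:python}
url = ("https://repository.cloudera.com/content/groups/cdh-releases-rcs/"
       "org/codehaus/jackson/jackson-mapper-asl/${cdh.jackson-mapper-asl.version}/"
       "jackson-mapper-asl-${cdh.jackson-mapper-asl.version}.pom")
print(url[105])        # '{' -- the unresolved Maven property placeholder
print(url.index("{"))  # 105, matching "Illegal character in path at index 105"
{code}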



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org




[jira] [Commented] (IMPALA-7864) TestLocalCatalogRetries::test_replan_limit is flaky

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707818#comment-16707818
 ] 

Lars Volker commented on IMPALA-7864:
-

Saw this again.

> TestLocalCatalogRetries::test_replan_limit is flaky
> ---
>
> Key: IMPALA-7864
> URL: https://issues.apache.org/jira/browse/IMPALA-7864
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
> Environment: Ubuntu 16.04
>Reporter: Jim Apple
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: broken-build, flaky
>
> In https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3605/, 
> TestLocalCatalogRetries::test_replan_limit failed on an unrelated patch. On 
> my development machine, the test passed. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7864) TestLocalCatalogRetries::test_replan_limit is flaky

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-7864:

Affects Version/s: Impala 2.12.0

> TestLocalCatalogRetries::test_replan_limit is flaky
> ---
>
> Key: IMPALA-7864
> URL: https://issues.apache.org/jira/browse/IMPALA-7864
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
> Environment: Ubuntu 16.04
>Reporter: Jim Apple
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: broken-build, flaky
>
> In https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3605/, 
> TestLocalCatalogRetries::test_replan_limit failed on an unrelated patch. On 
> my development machine, the test passed. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7825) Upgrade Thrift version to 0.11.0

2018-12-03 Thread Philip Zeyliger (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707816#comment-16707816
 ] 

Philip Zeyliger commented on IMPALA-7825:
-

We can revisit this as, "Let's provide Thrift 0.11 Python generated code as 
well." That shouldn't encounter as much resistance over the version change.

> Upgrade Thrift version to 0.11.0
> 
>
> Key: IMPALA-7825
> URL: https://issues.apache.org/jira/browse/IMPALA-7825
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: Lars Volker
>Assignee: Sahil Takiar
>Priority: Major
>  Labels: performance
>
> Thrift has added performance improvements to its Python deserialization code. 
> We should upgrade to 0.11.0 to make use of those.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7864) TestLocalCatalogRetries::test_replan_limit is flaky

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-7864:

Labels: broken-build flaky  (was: flaky)

> TestLocalCatalogRetries::test_replan_limit is flaky
> ---
>
> Key: IMPALA-7864
> URL: https://issues.apache.org/jira/browse/IMPALA-7864
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
> Environment: Ubuntu 16.04
>Reporter: Jim Apple
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: broken-build, flaky
>
> In https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3605/, 
> TestLocalCatalogRetries::test_replan_limit failed on an unrelated patch. On 
> my development machine, the test passed. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-2648) catalogd crashes when serialized messages are over 2 GB

2018-12-03 Thread Tianyi Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyi Wang resolved IMPALA-2648.
-
   Resolution: Fixed
Fix Version/s: Impala 2.12.0

This has already been fixed by IMPALA-5990

> catalogd crashes when serialized messages are over 2 GB
> ---
>
> Key: IMPALA-2648
> URL: https://issues.apache.org/jira/browse/IMPALA-2648
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.3.0
>Reporter: Silvius Rus
>Assignee: Tianyi Wang
>Priority: Critical
>  Labels: compute-stats, crash, downgraded
> Fix For: Impala 2.12.0
>
>
> We've seen a catalogd crash triggered by loading the metadata for a table 
> with about 20K partitions and 77 columns that has incremental stats.  It 
> looks like the serialized message is over 2 GB, which is the Java maximum 
> array size.  Ideally we should catch this exception and fail the query that 
> needs this table's metadata with an appropriate error message.
> {code}
> I1107 06:47:56.641507 30252 jni-util.cc:177] java.lang.OutOfMemoryError: 
> Requested array size exceeds VM limit
> at java.util.Arrays.copyOf(Arrays.java:2271)
> at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
> at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
> at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
> at 
> org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:187)
> at 
> com.cloudera.impala.thrift.THdfsPartition$THdfsPartitionStandardScheme.write(THdfsPartition.java:1831)
> at 
> com.cloudera.impala.thrift.THdfsPartition$THdfsPartitionStandardScheme.write(THdfsPartition.java:1543)
> at com.cloudera.impala.thrift.THdfsPartition.write(THdfsPartition.java:1389)
> at 
> com.cloudera.impala.thrift.THdfsTable$THdfsTableStandardScheme.write(THdfsTable.java:1123)
> at 
> com.cloudera.impala.thrift.THdfsTable$THdfsTableStandardScheme.write(THdfsTable.java:969)
> at com.cloudera.impala.thrift.THdfsTable.write(THdfsTable.java:848)
> at 
> com.cloudera.impala.thrift.TTable$TTableStandardScheme.write(TTable.java:1628)
> at 
> com.cloudera.impala.thrift.TTable$TTableStandardScheme.write(TTable.java:1395)
> at com.cloudera.impala.thrift.TTable.write(TTable.java:1209)
> at 
> com.cloudera.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1241)
> at 
> com.cloudera.impala.thrift.TCatalogObject$TCatalogObjectStandardScheme.write(TCatalogObject.java:1098)
> at com.cloudera.impala.thrift.TCatalogObject.write(TCatalogObject.java:938)
> at 
> com.cloudera.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:487)
> at 
> com.cloudera.impala.thrift.TGetAllCatalogObjectsResponse$TGetAllCatalogObjectsResponseStandardScheme.write(TGetAllCatalogObjectsResponse.java:421)
> at 
> com.cloudera.impala.thrift.TGetAllCatalogObjectsResponse.write(TGetAllCatalogObjectsResponse.java:365)
> at org.apache.thrift.TSerializer.serialize(TSerializer.java:79)
> at 
> com.cloudera.impala.service.JniCatalog.getCatalogObjects(JniCatalog.java:110)
> {code}
> You can identify this issue by looking at the metastore database.  Here is 
> how to see the size of the incremental stats for table id 12345.  The table 
> in this example, with 624 MB of incremental stats, led to the catalogd crash 
> shown above.
> {code}
> postgres=# select pg_size_pretty(sum(length("PARTITION_PARAMS"."PARAM_KEY") + 
> length("PARTITION_PARAMS"."PARAM_VALUE"))) from "PARTITIONS", 
> "PARTITION_PARAMS" where "PARTITIONS"."TBL_ID"=12345 and 
> "PARTITIONS"."PART_ID" = "PARTITION_PARAMS"."PART_ID"  and 
> "PARTITION_PARAMS"."PARAM_KEY" LIKE 'impala_intermediate%';
>  pg_size_pretty 
> 
>  624 MB
> (1 row)
> {code}
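> For context on the 2 GB figure: a Java array is indexed with a signed 32-bit 
> int, so a single byte[] (and therefore the Thrift serializer's output buffer) 
> cannot grow past roughly 2^31 bytes, which is why the serialization above dies 
> with "Requested array size exceeds VM limit". Roughly:
> {code:python}
> max_java_array_len = 2**31 - 1       # Integer.MAX_VALUE elements for a byte[]
> print(max_java_array_len / 2.0**30)  # ~2.0 GiB upper bound on a single array
> {code}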



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org




[jira] [Created] (IMPALA-7921) Hive JVM aborts during data load

2018-12-03 Thread Lars Volker (JIRA)
Lars Volker created IMPALA-7921:
---

 Summary: Hive JVM aborts during data load
 Key: IMPALA-7921
 URL: https://issues.apache.org/jira/browse/IMPALA-7921
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.0
Reporter: Lars Volker


During a core test run I observed Hive's JVM crashing. Here is the stack trace 
from the core file it left behind:

{noformat}

CORE: ./core.1543762524.3579.java
BINARY: /usr/java/jdk1.8.0_144/bin/java
Core was generated by `/usr/java/jdk1.8.0_144/bin/java -Dproc_jar -Xmx2048m 
-Dhive.log.file=hive-serve'.
Program terminated with signal SIGABRT, Aborted.
...
#0  0x7f5b72fc81f7 in raise () from /lib64/libc.so.6
#1  0x7f5b72fc98e8 in abort () from /lib64/libc.so.6
#2  0x7f5b728c5185 in os::abort(bool) () from 
/usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
#3  0x7f5b72a67593 in VMError::report_and_die() () from 
/usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
#4  0x7f5b728ca68f in JVM_handle_linux_signal () from 
/usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
#5  0x7f5b728c0be3 in signalHandler(int, siginfo*, void*) () from 
/usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
#6  
#7  0x080721b0 in ?? ()
#8  0x in ?? ()
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IMPALA-7921) Hive JVM aborts during data load

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707707#comment-16707707
 ] 

Lars Volker commented on IMPALA-7921:
-

[~tianyiwang] - I picked you randomly; feel free to find another person or 
assign back to me if you're swamped.


> Hive JVM aborts during data load
> 
>
> Key: IMPALA-7921
> URL: https://issues.apache.org/jira/browse/IMPALA-7921
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Lars Volker
>Assignee: Tianyi Wang
>Priority: Critical
>  Labels: broken-build, flaky
>
> During a core test run I observed Hive's JVM crashing. Here is the stack 
> trace from the core file it left behind:
> {noformat}
> CORE: ./core.1543762524.3579.java
> BINARY: /usr/java/jdk1.8.0_144/bin/java
> Core was generated by `/usr/java/jdk1.8.0_144/bin/java -Dproc_jar -Xmx2048m 
> -Dhive.log.file=hive-serve'.
> Program terminated with signal SIGABRT, Aborted.
> ...
> #0  0x7f5b72fc81f7 in raise () from /lib64/libc.so.6
> #1  0x7f5b72fc98e8 in abort () from /lib64/libc.so.6
> #2  0x7f5b728c5185 in os::abort(bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #3  0x7f5b72a67593 in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #4  0x7f5b728ca68f in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #5  0x7f5b728c0be3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #6  
> #7  0x080721b0 in ?? ()
> #8  0x in ?? ()
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-7921) Hive JVM aborts during data load

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7921:
---

Assignee: Tianyi Wang

> Hive JVM aborts during data load
> 
>
> Key: IMPALA-7921
> URL: https://issues.apache.org/jira/browse/IMPALA-7921
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0
>Reporter: Lars Volker
>Assignee: Tianyi Wang
>Priority: Critical
>  Labels: broken-build, flaky
>
> During a core test run I observed Hive's JVM crashing. Here is the stack 
> trace from the core file it left behind:
> {noformat}
> CORE: ./core.1543762524.3579.java
> BINARY: /usr/java/jdk1.8.0_144/bin/java
> Core was generated by `/usr/java/jdk1.8.0_144/bin/java -Dproc_jar -Xmx2048m 
> -Dhive.log.file=hive-serve'.
> Program terminated with signal SIGABRT, Aborted.
> ...
> #0  0x7f5b72fc81f7 in raise () from /lib64/libc.so.6
> #1  0x7f5b72fc98e8 in abort () from /lib64/libc.so.6
> #2  0x7f5b728c5185 in os::abort(bool) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #3  0x7f5b72a67593 in VMError::report_and_die() () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #4  0x7f5b728ca68f in JVM_handle_linux_signal () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #5  0x7f5b728c0be3 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
> #6  
> #7  0x080721b0 in ?? ()
> #8  0x in ?? ()
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org





[jira] [Assigned] (IMPALA-7871) Don't load Hive builtin jars for dataload

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7871:
---

Assignee: Joe McDonnell  (was: Lars Volker)

> Don't load Hive builtin jars for dataload
> -
>
> Key: IMPALA-7871
> URL: https://issues.apache.org/jira/browse/IMPALA-7871
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 3.2.0
>
>
> One step in dataload is "Loading Hive Builtins", which copies a large number 
> of jars into HDFS (or whatever storage is configured). This step takes a 
> couple of minutes on an HDFS dataload and 8 minutes on S3. Despite its name, 
> I can't find any indication that Hive or anything else uses these jars: 
> dataload and core tests run fine without it, and S3 can load data without it, 
> so there's no indication that this step is needed.
> Unless we find something using these jars, we should remove this step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-7871) Don't load Hive builtin jars for dataload

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7871:
---

Assignee: Lars Volker  (was: Joe McDonnell)

> Don't load Hive builtin jars for dataload
> -
>
> Key: IMPALA-7871
> URL: https://issues.apache.org/jira/browse/IMPALA-7871
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 3.1.0
>Reporter: Joe McDonnell
>Assignee: Lars Volker
>Priority: Major
> Fix For: Impala 3.2.0
>
>
> One step in dataload is "Loading Hive Builtins", which copies a large number 
> of jars into HDFS (or whatever storage is configured). This step takes a 
> couple of minutes on an HDFS dataload and 8 minutes on S3. Despite its name, 
> I can't find any indication that Hive or anything else uses these jars: 
> dataload and core tests run fine without it, and S3 can load data without it, 
> so there's no indication that this step is needed.
> Unless we find something using these jars, we should remove this step.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7920) Impala 3.2 Doc: Doc Levenshtein edit distance built-in function

2018-12-03 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-7920:

Description: levenshtein(string source, string target) returns int

> Impala 3.2 Doc: Doc Levenshtein edit distance built-in function
> ---
>
> Key: IMPALA-7920
> URL: https://issues.apache.org/jira/browse/IMPALA-7920
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Docs
>Affects Versions: Impala 3.2.0
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>  Labels: future_release_doc, in_32
>
> levenshtein(string source, string target) returns int
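> For the doc, a reference definition may help: levenshtein() returns the 
> minimum number of single-character insertions, deletions, and substitutions 
> needed to turn source into target. A plain-Python sketch of that definition 
> (for illustration only; not Impala's implementation):
> {code:python}
> def levenshtein(source, target):
>     """Minimum number of single-character edits (insert/delete/substitute)."""
>     prev = list(range(len(target) + 1))  # distances from "" to target prefixes
>     for i, s in enumerate(source, start=1):
>         curr = [i]  # distance from the current source prefix to ""
>         for j, t in enumerate(target, start=1):
>             cost = 0 if s == t else 1
>             curr.append(min(prev[j] + 1,          # delete s
>                             curr[j - 1] + 1,      # insert t
>                             prev[j - 1] + cost))  # substitute s -> t
>         prev = curr
>     return prev[-1]
> 
> print(levenshtein("kitten", "sitting"))  # 3
> {code}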



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-7920) Impala 3.2 Doc: Doc Levenshtein edit distance built-in function

2018-12-03 Thread Alex Rodoni (JIRA)
Alex Rodoni created IMPALA-7920:
---

 Summary: Impala 3.2 Doc: Doc Levenshtein edit distance built-in 
function
 Key: IMPALA-7920
 URL: https://issues.apache.org/jira/browse/IMPALA-7920
 Project: IMPALA
  Issue Type: Sub-task
  Components: Docs
Affects Versions: Impala 3.2.0
Reporter: Alex Rodoni
Assignee: Alex Rodoni






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Assigned] (IMPALA-7804) Various scanner tests intermittently failing on S3 on different runs

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7804:
---

Assignee: Joe McDonnell

> Various scanner tests intermittently failing on S3 on different runs
> 
>
> Key: IMPALA-7804
> URL: https://issues.apache.org/jira/browse/IMPALA-7804
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: David Knupp
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: S3, broken-build, flaky
>
> The failures have to do with getting AWS client credentials.
> *query_test/test_scanners.py:696: in test_decimal_encodings*
> _Stacktrace_
> {noformat}
> query_test/test_scanners.py:696: in test_decimal_encodings
> self.run_test_case('QueryTest/parquet-decimal-formats', vector, 
> unique_database)
> common/impala_test_suite.py:496: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:358: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:438: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:260: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E -255.00,-255.00,-255.00 == -255.00,-255.00,-255.00
> E -255.00,-255.00,-255.00 != -65535.00,-65535.00,-65535.00
> E -65535.00,-65535.00,-65535.00 != -999.99,-999.99,-999.99
> E -65535.00,-65535.00,-65535.00 != 
> 0.00,-.99,-.99
> E -999.99,-999.99,-999.99 != 0.00,0.00,0.00
> E -999.99,-999.99,-999.99 != 
> 0.00,.99,.99
> E 0.00,-.99,-.99 != 
> 255.00,255.00,255.00
> E 0.00,-.99,-.99 != 
> 65535.00,65535.00,65535.00
> E 0.00,0.00,0.00 != 999.99,999.99,999.99
> E 0.00,0.00,0.00 != None
> E 0.00,.99,.99 != None
> E 0.00,.99,.99 != None
> E 255.00,255.00,255.00 != None
> E 255.00,255.00,255.00 != None
> E 65535.00,65535.00,65535.00 != None
> E 65535.00,65535.00,65535.00 != None
> E 999.99,999.99,999.99 != None
> E 999.99,999.99,999.99 != None
> E Number of rows returned (expected vs actual): 18 != 9
> {noformat}
> _Standard Error_
> {noformat}
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_huge_num_rows_76a09ef1` CASCADE;
> -- 2018-11-01 09:42:41,140 INFO MainThread: Started query 
> 4c4bc0e7b69d7641:130ffe73
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_huge_num_rows_76a09ef1`;
> -- 2018-11-01 09:42:42,402 INFO MainThread: Started query 
> e34d714d6a62cba1:2a8544d0
> -- 2018-11-01 09:42:42,405 INFO MainThread: Created database 
> "test_huge_num_rows_76a09ef1" for test ID 
> "query_test/test_scanners.py::TestParquet::()::test_huge_num_rows[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'debug_action': 
> '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
> 18/11/01 09:42:43 DEBUG s3a.S3AFileSystem: Initializing S3AFileSystem for 
> impala-test-uswest2-1
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Propagating entries under 
> fs.s3a.bucket.impala-test-uswest2-1.
> 18/11/01 09:42:43 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot 
> period at 10 second(s).
> 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> started
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: For URI s3a://impala-test-uswest2-1/, 
> using credentials AWSCredentialProviderList: BasicAWSCredentialsProvider 
> EnvironmentVariableCredentialsProvider 
> com.amazonaws.auth.InstanceProfileCredentialsProvider@15bbf42f
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.maximum is 
> 1500
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.attempts.maximum is 20
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of 
> fs.s3a.connection.establish.timeout is 5000
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.timeout is 
> 20
> 18/11/01 

[jira] [Commented] (IMPALA-7804) Various scanner tests intermittently failing on S3 on different runs

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707605#comment-16707605
 ] 

Lars Volker commented on IMPALA-7804:
-

[~joemcdonnell] - Can we close this one?

> Various scanner tests intermittently failing on S3 on different runs
> 
>
> Key: IMPALA-7804
> URL: https://issues.apache.org/jira/browse/IMPALA-7804
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: David Knupp
>Assignee: Joe McDonnell
>Priority: Blocker
>  Labels: S3, broken-build, flaky
>
> The failures have to do with getting AWS client credentials.
> *query_test/test_scanners.py:696: in test_decimal_encodings*
> _Stacktrace_
> {noformat}
> query_test/test_scanners.py:696: in test_decimal_encodings
> self.run_test_case('QueryTest/parquet-decimal-formats', vector, 
> unique_database)
> common/impala_test_suite.py:496: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:358: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:438: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:260: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E -255.00,-255.00,-255.00 == -255.00,-255.00,-255.00
> E -255.00,-255.00,-255.00 != -65535.00,-65535.00,-65535.00
> E -65535.00,-65535.00,-65535.00 != -999.99,-999.99,-999.99
> E -65535.00,-65535.00,-65535.00 != 
> 0.00,-.99,-.99
> E -999.99,-999.99,-999.99 != 0.00,0.00,0.00
> E -999.99,-999.99,-999.99 != 
> 0.00,.99,.99
> E 0.00,-.99,-.99 != 
> 255.00,255.00,255.00
> E 0.00,-.99,-.99 != 
> 65535.00,65535.00,65535.00
> E 0.00,0.00,0.00 != 999.99,999.99,999.99
> E 0.00,0.00,0.00 != None
> E 0.00,.99,.99 != None
> E 0.00,.99,.99 != None
> E 255.00,255.00,255.00 != None
> E 255.00,255.00,255.00 != None
> E 65535.00,65535.00,65535.00 != None
> E 65535.00,65535.00,65535.00 != None
> E 999.99,999.99,999.99 != None
> E 999.99,999.99,999.99 != None
> E Number of rows returned (expected vs actual): 18 != 9
> {noformat}
> _Standard Error_
> {noformat}
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_huge_num_rows_76a09ef1` CASCADE;
> -- 2018-11-01 09:42:41,140 INFO MainThread: Started query 
> 4c4bc0e7b69d7641:130ffe73
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_huge_num_rows_76a09ef1`;
> -- 2018-11-01 09:42:42,402 INFO MainThread: Started query 
> e34d714d6a62cba1:2a8544d0
> -- 2018-11-01 09:42:42,405 INFO MainThread: Created database 
> "test_huge_num_rows_76a09ef1" for test ID 
> "query_test/test_scanners.py::TestParquet::()::test_huge_num_rows[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'debug_action': 
> '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
> 18/11/01 09:42:43 DEBUG s3a.S3AFileSystem: Initializing S3AFileSystem for 
> impala-test-uswest2-1
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Propagating entries under 
> fs.s3a.bucket.impala-test-uswest2-1.
> 18/11/01 09:42:43 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot 
> period at 10 second(s).
> 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> started
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: For URI s3a://impala-test-uswest2-1/, 
> using credentials AWSCredentialProviderList: BasicAWSCredentialsProvider 
> EnvironmentVariableCredentialsProvider 
> com.amazonaws.auth.InstanceProfileCredentialsProvider@15bbf42f
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.maximum is 
> 1500
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.attempts.maximum is 20
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of 
> fs.s3a.connection.establish.timeout is 5000
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of 

[jira] [Commented] (IMPALA-7919) Add predicates line in plan output for partition key predicates

2018-12-03 Thread Greg Rahn (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707600#comment-16707600
 ] 

Greg Rahn commented on IMPALA-7919:
---

I'm a believer in presenting all predicates in the node in a single section, 
but it also makes sense to add additional information where we can push those 
down into other areas.  For example, the basic or logical plan should be 
identical between a partitioned and non-partitioned table, but the more 
advanced or physical view can add additional details denoting which predicates 
were used to narrow down the list of partitions.
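
To illustrate with the query from the description below, a more physical-level 
view might add something like a "partition predicates" line to the scan node 
(purely a hypothetical sketch, not existing or proposed Impala output):

{noformat}
00:SCAN HDFS [default.t1]
   partitions=1/2 files=1 size=2B
   partition predicates: part_key = 42
{noformat}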

> Add predicates line in plan output for partition key predicates
> ---
>
> Key: IMPALA-7919
> URL: https://issues.apache.org/jira/browse/IMPALA-7919
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Greg Rahn
>Priority: Major
>  Labels: planner, ramp-up
>
> When there is a predicate on a partitioned table's partition key column, the 
> SCAN node does not print the "predicates" line as it would if the table were 
> not partitioned. IMO predicates should always be included in the nodes where 
> they are applied, regardless of partitioning, to make the plan clear.
> Query:
> {noformat}
> select * from t1 where part_key=42;
> {noformat}
> From a non-partitioned table:
> {noformat}
> 00:SCAN HDFS [default.t1]
>partitions=1/1 files=2 size=10B
>predicates: default.t1.part_key = 42
> {noformat}
> From a partitioned table:
> {noformat}
> 00:SCAN HDFS [default.t1]
>partitions=1/2 files=1 size=2B
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-3526) S3: Fix up S3PlannerTest

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707583#comment-16707583
 ] 

Lars Volker commented on IMPALA-3526:
-

Here's an abandoned code review for future reference: 
https://gerrit.cloudera.org/#/c/8890/

> S3: Fix up S3PlannerTest
> 
>
> Key: IMPALA-3526
> URL: https://issues.apache.org/jira/browse/IMPALA-3526
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, 
> Impala 2.10.0, Impala 2.11.0
>Reporter: Sailesh Mukil
>Assignee: Vuk Ercegovac
>Priority: Major
>  Labels: s3, test
>
> I just recently found out that our frontend S3PlannerTest has been broken 
> for quite some time, and we haven't been running it. Specifically, 
> testS3ScanRanges, testTpch and testJoinOrder are affected. The latter two 
> have different plans now, and the first test needs to use a regex to find and 
> substitute the file size, since the size of the test file keeps changing 
> between data loads.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-7804) Various scanner tests intermittently failing on S3 on different runs

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-7804:

Labels: S3 broken-build flaky  (was: S3)

> Various scanner tests intermittently failing on S3 on different runs
> 
>
> Key: IMPALA-7804
> URL: https://issues.apache.org/jira/browse/IMPALA-7804
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: David Knupp
>Priority: Blocker
>  Labels: S3, broken-build, flaky
>
> The failures have to do with getting AWS client credentials.
> *query_test/test_scanners.py:696: in test_decimal_encodings*
> _Stacktrace_
> {noformat}
> query_test/test_scanners.py:696: in test_decimal_encodings
> self.run_test_case('QueryTest/parquet-decimal-formats', vector, 
> unique_database)
> common/impala_test_suite.py:496: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:358: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:438: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:260: in verify_query_result_is_equal
> assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E -255.00,-255.00,-255.00 == -255.00,-255.00,-255.00
> E -255.00,-255.00,-255.00 != -65535.00,-65535.00,-65535.00
> E -65535.00,-65535.00,-65535.00 != -999.99,-999.99,-999.99
> E -65535.00,-65535.00,-65535.00 != 
> 0.00,-.99,-.99
> E -999.99,-999.99,-999.99 != 0.00,0.00,0.00
> E -999.99,-999.99,-999.99 != 
> 0.00,.99,.99
> E 0.00,-.99,-.99 != 
> 255.00,255.00,255.00
> E 0.00,-.99,-.99 != 
> 65535.00,65535.00,65535.00
> E 0.00,0.00,0.00 != 999.99,999.99,999.99
> E 0.00,0.00,0.00 != None
> E 0.00,.99,.99 != None
> E 0.00,.99,.99 != None
> E 255.00,255.00,255.00 != None
> E 255.00,255.00,255.00 != None
> E 65535.00,65535.00,65535.00 != None
> E 65535.00,65535.00,65535.00 != None
> E 999.99,999.99,999.99 != None
> E 999.99,999.99,999.99 != None
> E Number of rows returned (expected vs actual): 18 != 9
> {noformat}
> _Standard Error_
> {noformat}
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_huge_num_rows_76a09ef1` CASCADE;
> -- 2018-11-01 09:42:41,140 INFO MainThread: Started query 
> 4c4bc0e7b69d7641:130ffe73
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_huge_num_rows_76a09ef1`;
> -- 2018-11-01 09:42:42,402 INFO MainThread: Started query 
> e34d714d6a62cba1:2a8544d0
> -- 2018-11-01 09:42:42,405 INFO MainThread: Created database 
> "test_huge_num_rows_76a09ef1" for test ID 
> "query_test/test_scanners.py::TestParquet::()::test_huge_num_rows[protocol: 
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 
> 'abort_on_error': 1, 'debug_action': 
> '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
> 18/11/01 09:42:43 DEBUG s3a.S3AFileSystem: Initializing S3AFileSystem for 
> impala-test-uswest2-1
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Propagating entries under 
> fs.s3a.bucket.impala-test-uswest2-1.
> 18/11/01 09:42:43 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot 
> period at 10 second(s).
> 18/11/01 09:42:43 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> started
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: For URI s3a://impala-test-uswest2-1/, 
> using credentials AWSCredentialProviderList: BasicAWSCredentialsProvider 
> EnvironmentVariableCredentialsProvider 
> com.amazonaws.auth.InstanceProfileCredentialsProvider@15bbf42f
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.maximum is 
> 1500
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.attempts.maximum is 20
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of 
> fs.s3a.connection.establish.timeout is 5000
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: Value of fs.s3a.connection.timeout is 
> 20
> 18/11/01 09:42:43 DEBUG s3a.S3AUtils: 

[jira] [Commented] (IMPALA-3526) S3: Fix up S3PlannerTest

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707581#comment-16707581
 ] 

Lars Volker commented on IMPALA-3526:
-

[~vukercegovac] - Do you plan to continue working on this one? Otherwise I'll 
unassign it to reflect its status.

> S3: Fix up S3PlannerTest
> 
>
> Key: IMPALA-3526
> URL: https://issues.apache.org/jira/browse/IMPALA-3526
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, 
> Impala 2.10.0, Impala 2.11.0
>Reporter: Sailesh Mukil
>Assignee: Vuk Ercegovac
>Priority: Major
>  Labels: s3, test
>
> I just recently found out that our frontend S3PlannerTest has been broken 
> for quite some time, and we haven't been running it. Specifically, 
> testS3ScanRanges, testTpch and testJoinOrder are affected. The latter two 
> have different plans now, and the first test needs to use a regex to find and 
> substitute the file size, since the size of the test file keeps changing 
> between data loads.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6544) Lack of S3 consistency leads to rare test failures

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-6544:

Labels: S3 broken-build consistency flaky test-framework  (was: S3 
consistency test-framework)

> Lack of S3 consistency leads to rare test failures
> --
>
> Key: IMPALA-6544
> URL: https://issues.apache.org/jira/browse/IMPALA-6544
> Project: IMPALA
>  Issue Type: Task
>  Components: Frontend
>Affects Versions: Impala 2.8.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: S3, broken-build, consistency, flaky, test-framework
>
> Every now and then, we hit a flaky test on S3 runs due to files missing when 
> they should be present, and vice versa. We could consider running our tests 
> (or a subset of our tests) with S3Guard to avoid these problems, however rare 
> they are.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7070) Failed test: query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays on S3

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707562#comment-16707562
 ] 

Lars Volker commented on IMPALA-7070:
-

Unassigning since I currently don't have time to look into it.

> Failed test: 
> query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays
>  on S3
> -
>
> Key: IMPALA-7070
> URL: https://issues.apache.org/jira/browse/IMPALA-7070
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Dimitris Tsirogiannis
>Priority: Critical
>  Labels: broken-build, flaky, s3, test-failure
>
>  
> {code:java}
> Error Message
> query_test/test_nested_types.py:406: in test_thrift_array_of_arrays "col1 
> array>") query_test/test_nested_types.py:579: in 
> _create_test_table check_call(["hadoop", "fs", "-put", local_path, 
> location], shell=False) /usr/lib64/python2.6/subprocess.py:505: in check_call 
> raise CalledProcessError(retcode, cmd) E   CalledProcessError: Command 
> '['hadoop', 'fs', '-put', 
> '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
>  
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
>  returned non-zero exit status 1
> Stacktrace
> query_test/test_nested_types.py:406: in test_thrift_array_of_arrays
> "col1 array>")
> query_test/test_nested_types.py:579: in _create_test_table
> check_call(["hadoop", "fs", "-put", local_path, location], shell=False)
> /usr/lib64/python2.6/subprocess.py:505: in check_call
> raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command '['hadoop', 'fs', '-put', 
> '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
>  
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
>  returned non-zero exit status 1
> Standard Error
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_thrift_array_of_arrays_11da5fde` CASCADE;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_thrift_array_of_arrays_11da5fde`;
> MainThread: Created database "test_thrift_array_of_arrays_11da5fde" for test 
> ID 
> "query_test/test_nested_types.py::TestParquetArrayEncodings::()::test_thrift_array_of_arrays[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
> -- executing against localhost:21000
> create table test_thrift_array_of_arrays_11da5fde.ThriftArrayOfArrays (col1 
> array>) stored as parquet location 
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays';
> 18/05/20 18:31:03 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 
> 10 second(s).
> 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> started
> 18/05/20 18:31:06 INFO Configuration.deprecation: 
> fs.s3a.server-side-encryption-key is deprecated. Instead, use 
> fs.s3a.server-side-encryption.key
> put: rename 
> `s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet._COPYING_'
>  to 
> `s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet':
>  Input/output error
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: Stopping s3a-file-system 
> metrics system...
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> stopped.
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> shutdown complete.{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-7070) Failed test: query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays on S3

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reassigned IMPALA-7070:
---

Assignee: (was: Lars Volker)

> Failed test: 
> query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays
>  on S3
> -
>
> Key: IMPALA-7070
> URL: https://issues.apache.org/jira/browse/IMPALA-7070
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Dimitris Tsirogiannis
>Priority: Critical
>  Labels: broken-build, flaky, s3, test-failure
>
>  
> {code:java}
> Error Message
> query_test/test_nested_types.py:406: in test_thrift_array_of_arrays "col1 
> array>") query_test/test_nested_types.py:579: in 
> _create_test_table check_call(["hadoop", "fs", "-put", local_path, 
> location], shell=False) /usr/lib64/python2.6/subprocess.py:505: in check_call 
> raise CalledProcessError(retcode, cmd) E   CalledProcessError: Command 
> '['hadoop', 'fs', '-put', 
> '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
>  
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
>  returned non-zero exit status 1
> Stacktrace
> query_test/test_nested_types.py:406: in test_thrift_array_of_arrays
> "col1 array>")
> query_test/test_nested_types.py:579: in _create_test_table
> check_call(["hadoop", "fs", "-put", local_path, location], shell=False)
> /usr/lib64/python2.6/subprocess.py:505: in check_call
> raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command '['hadoop', 'fs', '-put', 
> '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
>  
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
>  returned non-zero exit status 1
> Standard Error
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_thrift_array_of_arrays_11da5fde` CASCADE;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_thrift_array_of_arrays_11da5fde`;
> MainThread: Created database "test_thrift_array_of_arrays_11da5fde" for test 
> ID 
> "query_test/test_nested_types.py::TestParquetArrayEncodings::()::test_thrift_array_of_arrays[exec_option:
>  {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
> -- executing against localhost:21000
> create table test_thrift_array_of_arrays_11da5fde.ThriftArrayOfArrays (col1 
> array>) stored as parquet location 
> 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays';
> 18/05/20 18:31:03 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
> 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 
> 10 second(s).
> 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> started
> 18/05/20 18:31:06 INFO Configuration.deprecation: 
> fs.s3a.server-side-encryption-key is deprecated. Instead, use 
> fs.s3a.server-side-encryption.key
> put: rename 
> `s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet._COPYING_'
>  to 
> `s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet':
>  Input/output error
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: Stopping s3a-file-system 
> metrics system...
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> stopped.
> 18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> shutdown complete.{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-6656) Metrics for time spent in BufferAllocator

2018-12-03 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-6656.
---
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Metrics for time spent in BufferAllocator
> -
>
> Key: IMPALA-6656
> URL: https://issues.apache.org/jira/browse/IMPALA-6656
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: observability, resource-management
> Fix For: Impala 3.2.0
>
>
> We should track the total time spent and the time spent in TCMalloc so we can 
> understand where time is going globally. 
> I think we should shard them by CurrentCore() to avoid contention and get 
> more granular metrics. We want a timer for the amount of time spent in 
> SystemAllocator. We probably also want counters for how many times we go down 
> each code path in BufferAllocator::AllocateInternal() (e.g. getting a hit 
> immediately in the local area, evicting a clean page, and so on, down to doing 
> a full locked scavenge).
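
Not the Impala backend code (BufferAllocator and CurrentCore() are C++), but as a 
rough, language-agnostic sketch of the sharding idea, assuming writers pick a shard 
by core/thread and readers sum all shards:

{code}
import os
import threading

NUM_SHARDS = os.cpu_count() or 1

class ShardedCounter:
    """Toy sharded counter: each writer picks a shard (here keyed off the
    thread id, standing in for CurrentCore()) so that concurrent increments
    mostly touch different slots; readers pay the cost of summing every shard."""

    def __init__(self):
        self._shards = [0] * NUM_SHARDS
        self._locks = [threading.Lock() for _ in range(NUM_SHARDS)]

    def add(self, value=1):
        shard = threading.get_ident() % NUM_SHARDS
        with self._locks[shard]:
            self._shards[shard] += value

    def value(self):
        # Reads are unlocked and therefore only approximate while writers are
        # active, which is usually acceptable for metrics.
        return sum(self._shards)

# One counter per code path in AllocateInternal(), for example:
clean_page_evictions = ShardedCounter()
locked_scavenges = ShardedCounter()
{code}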



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6955) Debug webpage request for unknown query ID crashes Impala in GetClientRequestState

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-6955:

Summary: Debug webpage request for unknown query ID crashes Impala in 
GetClientRequestState  (was: Timeout when starting test_query_expiration custom 
cluster)

> Debug webpage request for unknown query ID crashes Impala in 
> GetClientRequestState
> --
>
> Key: IMPALA-6955
> URL: https://issues.apache.org/jira/browse/IMPALA-6955
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Vuk Ercegovac
>Assignee: Lars Volker
>Priority: Critical
>  Labels: broken-build, flaky
>
> Ran into the following crash on a rhel test recently:
> {noformat}
> Error starting cluster: num_known_live_backends did not reach expected value 
> in time{noformat}
> Backtrace:
> {noformat}
> #0 0x7f92365185c9 in raise () from /lib64/libc.so.6
> #1 0x7f9236519cd8 in abort () from /lib64/libc.so.6
> #2 0x7f92393841a5 in os::abort(bool) () from 
> /opt/toolchain/sun-jdk-64bit-1.8.0.05/jre/lib/amd64/server/libjvm.so
> #3 0x7f9239514843 in VMError::report_and_die() () from 
> /opt/toolchain/sun-jdk-64bit-1.8.0.05/jre/lib/amd64/server/libjvm.so
> #4 0x7f9239389562 in JVM_handle_linux_signal () from 
> /opt/toolchain/sun-jdk-64bit-1.8.0.05/jre/lib/amd64/server/libjvm.so
> #5 0x7f92393804f3 in signalHandler(int, siginfo*, void*) () from 
> /opt/toolchain/sun-jdk-64bit-1.8.0.05/jre/lib/amd64/server/libjvm.so
> #6 
> #7 0x016fded0 in base::subtle::NoBarrier_CompareAndSwap (ptr=0x238, 
> old_value=0, new_value=1) at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/repos/Impala/be/src/gutil/atomicops-internals-x86.h:85
> #8 0x016fdf50 in base::subtle::Acquire_CompareAndSwap (ptr=0x238, 
> old_value=0, new_value=1) at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/repos/Impala/be/src/gutil/atomicops-internals-x86.h:138
> #9 0x016fe26c in base::SpinLock::Lock (this=0x238) at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/repos/Impala/be/src/gutil/spinlock.h:74
> #10 0x016fe2f6 in impala::SpinLock::lock (this=0x238) at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/repos/Impala/be/src/util/spinlock.h:34
> #11 0x01aa8c96 in 
> impala::ScopedShardedMapRef 
> >::ScopedShardedMapRef (this=0x7f91aa81eb90, query_id=..., sharded_map=0x1c0) 
> at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/repos/Impala/be/src/util/sharded-query-map-util.h:99
> #12 0x01a999e2 in impala::ImpalaServer::GetClientRequestState 
> (this=0xa569000, query_id=...) at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/repos/Impala/be/src/service/impala-server.cc:2123
> #13 0x01b3ace6 in impala::ImpalaHttpHandler::QuerySummaryHandler 
> (this=0x6f057a0, include_json_plan=true, include_summary=true, args=..., 
> document=0x7f91aa81f230) at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/repos/Impala/be/src/service/impala-http-handler.cc:755
> #14 0x01b3cc11 in impala::ImpalaHttpHandler:: auto:6*)>::operator(), 
> std::basic_string >, rapidjson::GenericDocument > 
> >(const std::map, 
> std::allocator >, std::basic_string, 
> std::allocator >, std::less std::char_traits, std::allocator > >, 
> std::allocator, 
> std::allocator > const, std::basic_string, 
> std::allocator > > > > &, 
> rapidjson::GenericDocument, 
> rapidjson::MemoryPoolAllocator > *) const 
> (__closure=0xd9884b8, args=..., doc=0x7f91aa81f230) at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/repos/Impala/be/src/service/impala-http-handler.cc:132
> #15 0x01b3cc46 in 
> boost::detail::function::void_function_obj_invoker2  auto:5&, auto:6*)>, void, const std::map std::char_traits, std::allocator >, std::basic_string std::char_traits, std::allocator >, 
> std::less, 
> std::allocator > >, std::allocator std::basic_string, std::allocator >, 
> std::basic_string, std::allocator > > > 
> >&, rapidjson::GenericDocument, 
> rapidjson::MemoryPoolAllocator 
> >*>::invoke(boost::detail::function::function_buffer &, const 
> std::map, std::allocator 
> >, std::basic_string, std::allocator >, 
> std::le\
> ss, std::allocator > >, 
> std::allocator, 
> std::allocator > const, std::basic_string, 
> std::allocator > > > > &, 
> rapidjson::GenericDocument, 
> rapidjson::MemoryPoolAllocator > *) 
> (function_obj_ptr=..., a0=..., a1=0x7f91aa81f230) at 
> /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
> #16 0x01c4f528 in boost::function2 std::string, std::less, std::allocator const, std::string> > > const&, 
> 

[jira] [Updated] (IMPALA-6591) TestClientSsl hung for a long time

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-6591:

Labels: broken-build flaky hang  (was: broken-build hang)

> TestClientSsl hung for a long time
> --
>
> Key: IMPALA-6591
> URL: https://issues.apache.org/jira/browse/IMPALA-6591
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: broken-build, flaky, hang
>
> {noformat}
> 18:49:13 
> custom_cluster/test_catalog_wait.py::TestCatalogWait::test_delayed_catalog 
> PASSED
> 18:49:53 
> custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] Build timed 
> out (after 1,440 minutes). Marking the build as failed.
> 12:20:15 Build was aborted
> 12:20:15 Archiving artifacts
> {noformat}
> I unfortunately wasn't able to get any logs...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6591) TestClientSsl hung for a long time

2018-12-03 Thread Lars Volker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707533#comment-16707533
 ] 

Lars Volker commented on IMPALA-6591:
-

I've seen this again. Here's the code loop with the failed assertion:

 
{code}
# In practice, sending SIGINT to the shell process doesn't always seem to get caught
# (and a search shows up some bugs in Python where SIGINT might be ignored). So retry
# for 30s until one signal takes.
while impalad.get_num_in_flight_queries() == 1:
  time.sleep(1)
  LOG.info("Sending signal...")
  os.kill(p.pid(), signal.SIGINT)
  num_tries += 1
  assert num_tries < 30, "SIGINT was not caught by shell within 30s"
{code}

{{p}} is an {{ImpalaShell}} object. It looks possible that the shell process has 
been terminated but the query is still registered. I think we should at least 
improve the code to log whether the shell process is still alive, so we can tell 
the two cases apart and know where to look next.
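
A minimal sketch of that logging change, reusing the names from the loop above 
({{impalad}}, {{p}}, {{LOG}}); the liveness probe via {{os.kill(pid, 0)}} is an 
assumption about how one might check, not existing test code:

{code}
import errno
import os
import signal
import time

def shell_is_alive(pid):
    """Best-effort liveness probe: signal 0 performs error checking only."""
    try:
        os.kill(pid, 0)
        return True
    except OSError as e:
        return e.errno != errno.ESRCH  # ESRCH means no such process.

num_tries = 0
while impalad.get_num_in_flight_queries() == 1:
    time.sleep(1)
    LOG.info("Sending signal... (shell still alive: %s)", shell_is_alive(p.pid()))
    os.kill(p.pid(), signal.SIGINT)
    num_tries += 1
    assert num_tries < 30, "SIGINT was not caught by shell within 30s"
{code}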

[~csringhofer] - I picked you randomly; feel free to find another person or 
assign back to me if you're swamped.

> TestClientSsl hung for a long time
> --
>
> Key: IMPALA-6591
> URL: https://issues.apache.org/jira/browse/IMPALA-6591
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Sailesh Mukil
>Priority: Critical
>  Labels: broken-build, flaky, hang
>
> {noformat}
> 18:49:13 
> custom_cluster/test_catalog_wait.py::TestCatalogWait::test_delayed_catalog 
> PASSED
> 18:49:53 
> custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] Build timed 
> out (after 1,440 minutes). Marking the build as failed.
> 12:20:15 Build was aborted
> 12:20:15 Archiving artifacts
> {noformat}
> I unfortunately wasn't able to get any logs...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Reopened] (IMPALA-6591) TestClientSsl hung for a long time

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker reopened IMPALA-6591:
-
  Assignee: Csaba Ringhofer  (was: Sailesh Mukil)

> TestClientSsl hung for a long time
> --
>
> Key: IMPALA-6591
> URL: https://issues.apache.org/jira/browse/IMPALA-6591
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: broken-build, flaky, hang
>
> {noformat}
> 18:49:13 
> custom_cluster/test_catalog_wait.py::TestCatalogWait::test_delayed_catalog 
> PASSED
> 18:49:53 
> custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] Build timed 
> out (after 1,440 minutes). Marking the build as failed.
> 12:20:15 Build was aborted
> 12:20:15 Archiving artifacts
> {noformat}
> I unfortunately wasn't able to get any logs...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6591) TestClientSsl hung for a long time

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-6591:

Target Version: Impala 3.2.0, Impala 2.13.0  (was: Impala 2.13.0)

> TestClientSsl hung for a long time
> --
>
> Key: IMPALA-6591
> URL: https://issues.apache.org/jira/browse/IMPALA-6591
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: broken-build, flaky, hang
>
> {noformat}
> 18:49:13 
> custom_cluster/test_catalog_wait.py::TestCatalogWait::test_delayed_catalog 
> PASSED
> 18:49:53 
> custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] Build timed 
> out (after 1,440 minutes). Marking the build as failed.
> 12:20:15 Build was aborted
> 12:20:15 Archiving artifacts
> {noformat}
> I unfortunately wasn't able to get any logs...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6591) TestClientSsl hung for a long time

2018-12-03 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-6591:

Affects Version/s: Impala 3.2.0
   Impala 3.1.0

> TestClientSsl hung for a long time
> --
>
> Key: IMPALA-6591
> URL: https://issues.apache.org/jira/browse/IMPALA-6591
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0
>Reporter: Tim Armstrong
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: broken-build, flaky, hang
>
> {noformat}
> 18:49:13 
> custom_cluster/test_catalog_wait.py::TestCatalogWait::test_delayed_catalog 
> PASSED
> 18:49:53 
> custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] Build timed 
> out (after 1,440 minutes). Marking the build as failed.
> 12:20:15 Build was aborted
> 12:20:15 Archiving artifacts
> {noformat}
> I unfortunately wasn't able to get any logs...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7910) COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore

2018-12-03 Thread Balazs Jeszenszky (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707374#comment-16707374
 ] 

Balazs Jeszenszky commented on IMPALA-7910:
---

IMPALA-6994 is a similar issue.

> COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
> 
>
> Key: IMPALA-7910
> URL: https://issues.apache.org/jira/browse/IMPALA-7910
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.9.0, Impala 2.11.0, Impala 2.12.0
>Reporter: Michael Brown
>Assignee: Tianyi Wang
>Priority: Critical
>
> COMPUTE STATS and possibly other DDL operations unnecessarily do the 
> equivalent of a REFRESH after writing to the Hive Metastore. This unnecessary 
> operation can be very expensive, so it should be avoided.
> The behavior can be confirmed from the catalogd logs:
> {code}
> compute stats functional_parquet.alltypes;
> +---+
> | summary   |
> +---+
> | Updated 24 partition(s) and 11 column(s). |
> +---+
> Relevant catalogd.INFO snippet
> I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table 
> metadata for: functional_parquet.alltypes
> I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.274348 27295 HdfsTable.java:555]