[
https://issues.apache.org/jira/browse/HADOOP-17224?focusedWorklogId=525284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-525284
]
ASF GitHub Bot logged work on HADOOP-17224:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 16/Dec/20 21:40
Start Date: 16/Dec/20 21:40
Worklog Time Spent: 10m
Work Description: amahussein commented on pull request #2537:
URL: https://github.com/apache/hadoop/pull/2537#issuecomment-747057646
> All OOMs are "unable to create new native thread" indicating ulimit or
> resource shortage to create LWP. The first OOM is in TestJvmMetrics in
> hadoop-common. If ISA-L is related, the cause should be in the code path of
> ErasureCodeNative#loadLibrary. I don't have clear insight yet. I think we have
> been familiar with test failures by "unable to create new native thread" for a
> long time.
@iwasakims, I am not fully confident that the `ErasureCodeNative#loadLibrary`
code path is a strong indication that ISA-L does not contribute to the OOM.
ISA-L is a native library; therefore, loading it means different memory
allocations and possibly some background threads.
For sure, we do not want to blame those pre-existing failures on ISA-L.
However, adding ISA-L could increase failures because of either the Hadoop code
or the native code.
I think there are two approaches:
1. Profile the memory. Then compare the two profiles with and without ISA-L.
If there is no Yetus hookup to do that, then it will have to be done on a local
machine for a sample of unit tests.
2. Add another commit that ignores the failures frequently reported in the QBT
report. In addition, I suggest adding "ignore" to
`TestDistributedShell#testDistributedShellWithResourcesWithLargeContainers`
and `TestDistributedShell#testDistributedShellWithResources`. Those two tests
leave two ApplicationMaster processes running in the background. After ignoring
the "every-day" failures, we can look at the remaining failures as possible
consequences of loading ISA-L.
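As a rough illustration of approach 1, the before/after comparison could be
sketched as below. This is only a sketch under stated assumptions:
`ResourceSnapshot` is a hypothetical helper, not an existing Hadoop or Yetus
class, and the OS-level thread count assumes a Linux host where
`/proc/self/task` is available. The JVM memory figures do not include memory
allocated directly by native code, so native allocations would still need an
OS-level tool.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical helper (not part of Hadoop): point-in-time resource usage of this JVM. */
final class ResourceSnapshot {
    final int javaThreads;       // live threads known to the JVM
    final long osThreads;        // entries in /proc/self/task (LWPs), or -1 if unavailable
    final long heapUsedBytes;    // JVM heap in use
    final long nonHeapUsedBytes; // JVM-managed non-heap only, not native malloc

    private ResourceSnapshot(int javaThreads, long osThreads,
                             long heapUsedBytes, long nonHeapUsedBytes) {
        this.javaThreads = javaThreads;
        this.osThreads = osThreads;
        this.heapUsedBytes = heapUsedBytes;
        this.nonHeapUsedBytes = nonHeapUsedBytes;
    }

    static ResourceSnapshot take() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return new ResourceSnapshot(
            threads.getThreadCount(),
            countOsThreads(),
            memory.getHeapMemoryUsage().getUsed(),
            memory.getNonHeapMemoryUsage().getUsed());
    }

    /** Counts the process's LWPs on Linux; returns -1 where /proc is not available. */
    private static long countOsThreads() {
        Path tasks = Paths.get("/proc/self/task");
        if (!Files.isDirectory(tasks)) {
            return -1L;
        }
        try (Stream<Path> entries = Files.list(tasks)) {
            return entries.count();
        } catch (IOException | UncheckedIOException e) {
            return -1L;
        }
    }

    /** Describes how this snapshot differs from an earlier one. */
    String diffFrom(ResourceSnapshot before) {
        return String.format(
            "javaThreads %+d, osThreads %+d, heapUsed %+d bytes, nonHeapUsed %+d bytes",
            javaThreads - before.javaThreads,
            osThreads - before.osThreads,
            heapUsedBytes - before.heapUsedBytes,
            nonHeapUsedBytes - before.nonHeapUsedBytes);
    }

    public static void main(String[] args) {
        ResourceSnapshot before = take();
        // Placeholder: run the code under test here, once with and once
        // without the native library available, and compare the two diffs.
        ResourceSnapshot after = take();
        System.out.println(after.diffFrom(before));
    }
}
```

Running the same sample of unit tests in both configurations and comparing the
printed deltas would show whether the thread count grows noticeably when the
library is loaded.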
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 525284)
Time Spent: 3h 10m (was: 3h)
> Install Intel ISA-L library in Dockerfile
> -----------------------------------------
>
> Key: HADOOP-17224
> URL: https://issues.apache.org/jira/browse/HADOOP-17224
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Takanobu Asanuma
> Assignee: Takanobu Asanuma
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> Currently, there is no ISA-L library in the Docker container, and Jenkins
> skips the native tests, TestNativeRSRawCoder and TestNativeXORRawCoder.