[
https://issues.apache.org/jira/browse/HDFS-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499170#comment-16499170
]
Zsolt Venczel commented on HDFS-13121:
--------------------------------------
To make sure the test is meaningful, I did the following:
1) Built the entire project with the native profile activated, because the added test
only runs when the native library is loaded:
{code:java}
mvn clean install -Pnative -DskipTests
{code}
2) Applied the patch containing only the test: [^test-only.patch]
3) Ran the test in the hadoop-hdfs-project/hadoop-hdfs module:
{code:java}
mvn test -Dtest=TestShortCircuitCache
{code}
The test fails with:
{code:java}
[INFO]
[INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-hdfs ---
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache
[ERROR] Tests run: 13, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.842 s <<< FAILURE! - in org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache
[ERROR] testRequestFileDescriptorsWhenULimit(org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache) Time elapsed: 0.355 s <<< ERROR!
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitReplica.<init>(ShortCircuitReplica.java:129)
    at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:620)
    at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:553)
    at org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testRequestFileDescriptorsWhenULimit(TestShortCircuitCache.java:903)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
    at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] TestShortCircuitCache.testRequestFileDescriptorsWhenULimit:903 » NullPointer
[INFO]
[ERROR] Tests run: 13, Failures: 0, Errors: 1, Skipped: 0
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 16.245 s
[INFO] Finished at: 2018-06-02T21:56:14+02:00
[INFO] Final Memory: 44M/711M
{code}
4) Reset my branch to HEAD and applied [^HDFS-13121.02.patch].
Running the test again results in:
{code:java}
[INFO]
[INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-hdfs ---
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.268 s - in org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0
[INFO]
{code}
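For readers following along, the kind of null check the issue description asks for could look roughly like the sketch below. This is not the content of HDFS-13121.02.patch: the class name FdGuard, the method requireAllPresent and the exception messages are hypothetical, and the sketch assumes only what the issue states, namely that the received stream array can contain null entries when the process is out of file descriptors.

```java
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical helper, not part of Hadoop: illustrates validating the
// received streams before they are handed to a consumer that would
// otherwise dereference them and fail with a NullPointerException.
final class FdGuard {
  private FdGuard() {}

  /**
   * Throws an IOException if the array itself or any of its entries is null,
   * so the caller sees a descriptive I/O failure instead of an NPE.
   */
  static void requireAllPresent(FileInputStream[] fis) throws IOException {
    if (fis == null) {
      throw new IOException("no file descriptors were received");
    }
    for (int i = 0; i < fis.length; i++) {
      if (fis[i] == null) {
        throw new IOException("file descriptor " + i + " of " + fis.length
            + " was not received; the process may have reached its"
            + " open-file limit");
      }
    }
  }
}
```

A caller would run such a check right after receiving the streams and before constructing anything from fis[0] and fis[1], so the failure surfaces as an IOException that existing error handling can deal with.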
> NPE when request file descriptors when SC read
> ----------------------------------------------
>
> Key: HDFS-13121
> URL: https://issues.apache.org/jira/browse/HDFS-13121
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 3.0.0
> Reporter: Gang Xie
> Assignee: Zsolt Venczel
> Priority: Minor
> Attachments: HDFS-13121.01.patch, HDFS-13121.02.patch, test-only.patch
>
>
> Recently, we hit an issue where the DFSClient throws an NPE. This happens when
> the app process exceeds its limit on the number of open files. In that case,
> libhadoop never throws an exception but returns null in response to the request
> for fds. requestFileDescriptors, however, uses the returned fds directly without
> any check, which leads to the NPE.
>
> We need to add a null sanity check here.
>
> private ShortCircuitReplicaInfo requestFileDescriptors(DomainPeer peer,
> Slot slot) throws IOException {
> ShortCircuitCache cache = clientContext.getShortCircuitCache();
> final DataOutputStream out =
> new DataOutputStream(new BufferedOutputStream(peer.getOutputStream()));
> SlotId slotId = slot == null ? null : slot.getSlotId();
> new Sender(out).requestShortCircuitFds(block, token, slotId, 1,
> failureInjector.getSupportsReceiptVerification());
> DataInputStream in = new DataInputStream(peer.getInputStream());
> BlockOpResponseProto resp = BlockOpResponseProto.parseFrom(
> PBHelperClient.vintPrefixed(in));
> DomainSocket sock = peer.getDomainSocket();
> failureInjector.injectRequestFileDescriptorsFailure();
> switch (resp.getStatus()) {
> case SUCCESS:
> byte buf[] = new byte[1];
> FileInputStream[] fis = new FileInputStream[2];
> {color:#d04437}sock.recvFileInputStreams(fis, buf, 0, buf.length);{color}
> ShortCircuitReplica replica = null;
> try {
> ExtendedBlockId key =
> new ExtendedBlockId(block.getBlockId(), block.getBlockPoolId());
> if (buf[0] == USE_RECEIPT_VERIFICATION.getNumber()) {
> LOG.trace("Sending receipt verification byte for slot {}", slot);
> sock.getOutputStream().write(0);
> }
> {color:#d04437}replica = new ShortCircuitReplica(key, fis[0], fis[1], cache,{color}
> {color:#d04437} Time.monotonicNow(), slot);{color}