[ 
https://issues.apache.org/jira/browse/HDFS-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499170#comment-16499170
 ] 

Zsolt Venczel commented on HDFS-13121:
--------------------------------------

To make sure the test is meaningful I did the following:

1) Built the entire project with the native profile activated, because the 
added test only runs when the native library is loaded:
{code:java}
mvn clean install -Pnative -DskipTests
{code}
2) Applied the patch that contains only the test: [^test-only.patch]

3) Ran the test in the hadoop-hdfs-project/hadoop-hdfs module:
{code:java}
mvn test -Dtest=TestShortCircuitCache
{code}
The test fails with
{code:java}
[INFO] 
[INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-hdfs ---
[INFO] 
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache
[ERROR] Tests run: 13, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.842 s <<< FAILURE! - in org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache
[ERROR] testRequestFileDescriptorsWhenULimit(org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache)  Time elapsed: 0.355 s  <<< ERROR!
java.lang.NullPointerException
        at org.apache.hadoop.hdfs.shortcircuit.ShortCircuitReplica.<init>(ShortCircuitReplica.java:129)
        at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.requestFileDescriptors(BlockReaderFactory.java:620)
        at org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:553)
        at org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testRequestFileDescriptorsWhenULimit(TestShortCircuitCache.java:903)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
        at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)

[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Errors: 
[ERROR]   TestShortCircuitCache.testRequestFileDescriptorsWhenULimit:903 » NullPointer
[INFO] 
[ERROR] Tests run: 13, Failures: 0, Errors: 1, Skipped: 0
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 16.245 s
[INFO] Finished at: 2018-06-02T21:56:14+02:00
[INFO] Final Memory: 44M/711M
{code}
4) Reset my branch to HEAD and applied [^HDFS-13121.02.patch].
 Running the test again gives:
{code:java}
[INFO] 
[INFO] --- maven-surefire-plugin:2.21.0:test (default-test) @ hadoop-hdfs ---
[INFO] 
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.268 s - in org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
{code}

> NPE when request file descriptors when SC read
> ----------------------------------------------
>
>                 Key: HDFS-13121
>                 URL: https://issues.apache.org/jira/browse/HDFS-13121
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Gang Xie
>            Assignee: Zsolt Venczel
>            Priority: Minor
>         Attachments: HDFS-13121.01.patch, HDFS-13121.02.patch, test-only.patch
>
>
> Recently, we hit an issue where the DFSClient throws an NPE. This happens 
> when the app process exceeds its limit on the maximum number of open files. 
> In that case, libhadoop never throws an exception but returns null in 
> response to the request for fds. requestFileDescriptors then uses the 
> returned fds directly without any check, which causes the NPE. 
>  
> We need to add a null-pointer sanity check here.
>  
> private ShortCircuitReplicaInfo requestFileDescriptors(DomainPeer peer,
>     Slot slot) throws IOException {
>   ShortCircuitCache cache = clientContext.getShortCircuitCache();
>   final DataOutputStream out =
>       new DataOutputStream(new BufferedOutputStream(peer.getOutputStream()));
>   SlotId slotId = slot == null ? null : slot.getSlotId();
>   new Sender(out).requestShortCircuitFds(block, token, slotId, 1,
>       failureInjector.getSupportsReceiptVerification());
>   DataInputStream in = new DataInputStream(peer.getInputStream());
>   BlockOpResponseProto resp = BlockOpResponseProto.parseFrom(
>       PBHelperClient.vintPrefixed(in));
>   DomainSocket sock = peer.getDomainSocket();
>   failureInjector.injectRequestFileDescriptorsFailure();
>   switch (resp.getStatus()) {
>   case SUCCESS:
>     byte buf[] = new byte[1];
>     FileInputStream[] fis = new FileInputStream[2];
>     {color:#d04437}sock.recvFileInputStreams(fis, buf, 0, buf.length);{color}
>     ShortCircuitReplica replica = null;
>     try {
>       ExtendedBlockId key =
>           new ExtendedBlockId(block.getBlockId(), block.getBlockPoolId());
>       if (buf[0] == USE_RECEIPT_VERIFICATION.getNumber()) {
>         LOG.trace("Sending receipt verification byte for slot {}", slot);
>         sock.getOutputStream().write(0);
>       }
>       {color:#d04437}replica = new ShortCircuitReplica(key, fis[0], fis[1], cache,{color}
>       {color:#d04437}    Time.monotonicNow(), slot);{color}
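
For illustration only, the kind of sanity check the description asks for could look roughly like the sketch below. It is not taken from either attached patch. The class NullFdCheckSketch and the methods receiveStreams and requireAllReceived are hypothetical names, and receiveStreams is only a stand-in that imitates a receive call leaving the array entries null when no descriptors arrive. The idea is to fail with an IOException before the streams are handed to a constructor that would dereference them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class NullFdCheckSketch {

  /**
   * Hypothetical stand-in for the receive call: fills the array only when
   * descriptors are available, otherwise leaves every entry null.
   */
  static void receiveStreams(InputStream[] streams, boolean descriptorsAvailable) {
    if (descriptorsAvailable) {
      for (int i = 0; i < streams.length; i++) {
        streams[i] = new ByteArrayInputStream(new byte[0]);
      }
    }
  }

  /** Throws an IOException if any expected stream was not received. */
  static void requireAllReceived(InputStream[] streams) throws IOException {
    for (int i = 0; i < streams.length; i++) {
      if (streams[i] == null) {
        throw new IOException("Stream " + i + " was not received; the process "
            + "may have reached its open file limit");
      }
    }
  }

  public static void main(String[] args) {
    InputStream[] fis = new InputStream[2];
    receiveStreams(fis, false); // simulate the failed descriptor transfer
    try {
      requireAllReceived(fis);
      System.out.println("all streams received");
    } catch (IOException e) {
      // The caller sees a descriptive IOException instead of an NPE later on.
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With a check of this shape the caller gets an IOException it can handle, for example by falling back to a non-short-circuit read, rather than a NullPointerException thrown from a constructor.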



