[
https://issues.apache.org/jira/browse/HDFS-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492556#comment-16492556
]
Gabor Bota commented on HDFS-13121:
-----------------------------------
+1 for the v2 patch; the failed tests on Yetus are unrelated.
[~xiegang112], I think we should put in the effort to test these changes.
Without unit tests for these failures, a future change could reintroduce the
same issue, and if we only describe how to check the failure manually, testing
will be far less reliable.
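As a rough illustration of the kind of check and test meant here, the following is a minimal standalone sketch. It is not the content of the attached patches. The class FdNullCheckSketch and the helpers checkReceivedStreams and rejects are hypothetical names, and the sketch assumes, as the issue description reports, that the stream array can come back with null entries when the process has run out of file descriptors.

```java
import java.io.FileInputStream;
import java.io.IOException;

public class FdNullCheckSketch {

  /**
   * Hypothetical helper: turns missing streams into an IOException so the
   * caller does not hit a NullPointerException later on.
   */
  static void checkReceivedStreams(FileInputStream[] fis) throws IOException {
    if (fis == null) {
      throw new IOException("No file descriptors were received.");
    }
    for (int i = 0; i < fis.length; i++) {
      if (fis[i] == null) {
        throw new IOException("File descriptor " + i + " of " + fis.length
            + " was not received; the process may have reached its open"
            + " file limit.");
      }
    }
  }

  /** Returns true if checkReceivedStreams rejects the array with an IOException. */
  static boolean rejects(FileInputStream[] fis) {
    try {
      checkReceivedStreams(fis);
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Simulates the reported failure: the array was allocated but never filled.
    FileInputStream[] unfilled = new FileInputStream[2];
    System.out.println("unfilled array rejected: " + rejects(unfilled));
  }
}
```

A unit test along these lines would only need to assert that the unfilled array produces an IOException rather than an NPE; in the real client the failing receive would presumably have to be simulated, for example through the existing failure injector.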
> NPE when request file descriptors when SC read
> ----------------------------------------------
>
> Key: HDFS-13121
> URL: https://issues.apache.org/jira/browse/HDFS-13121
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 3.0.0
> Reporter: Gang Xie
> Assignee: Zsolt Venczel
> Priority: Minor
> Attachments: HDFS-13121.01.patch, HDFS-13121.02.patch
>
>
> Recently, we hit an issue where the DFSClient throws an NPE. It happens when
> the app process exceeds its limit on the number of open files. In that case,
> libhadoop never throws an exception but returns null for the requested fds.
> requestFileDescriptors then uses the returned fds directly, without any
> check, which results in the NPE.
>
> We need to add a null-pointer sanity check here.
>
> private ShortCircuitReplicaInfo requestFileDescriptors(DomainPeer peer,
>     Slot slot) throws IOException {
>   ShortCircuitCache cache = clientContext.getShortCircuitCache();
>   final DataOutputStream out =
>       new DataOutputStream(new BufferedOutputStream(peer.getOutputStream()));
>   SlotId slotId = slot == null ? null : slot.getSlotId();
>   new Sender(out).requestShortCircuitFds(block, token, slotId, 1,
>       failureInjector.getSupportsReceiptVerification());
>   DataInputStream in = new DataInputStream(peer.getInputStream());
>   BlockOpResponseProto resp = BlockOpResponseProto.parseFrom(
>       PBHelperClient.vintPrefixed(in));
>   DomainSocket sock = peer.getDomainSocket();
>   failureInjector.injectRequestFileDescriptorsFailure();
>   switch (resp.getStatus()) {
>   case SUCCESS:
>     byte buf[] = new byte[1];
>     FileInputStream[] fis = new FileInputStream[2];
>     {color:#d04437}sock.recvFileInputStreams(fis, buf, 0, buf.length);{color}
>     ShortCircuitReplica replica = null;
>     try {
>       ExtendedBlockId key =
>           new ExtendedBlockId(block.getBlockId(), block.getBlockPoolId());
>       if (buf[0] == USE_RECEIPT_VERIFICATION.getNumber()) {
>         LOG.trace("Sending receipt verification byte for slot {}", slot);
>         sock.getOutputStream().write(0);
>       }
>       {color:#d04437}replica = new ShortCircuitReplica(key, fis[0], fis[1],
>           cache,{color}
>       {color:#d04437}    Time.monotonicNow(), slot);{color}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)