Please go ahead.

On Thu, Oct 26, 2017 at 6:12 PM, Xie Gang <xiegang...@gmail.com> wrote:

> Shall I create the jira directly?
>
> On Thu, Oct 26, 2017 at 12:34 PM, Xie Gang <xiegang...@gmail.com> wrote:
>
> > Hi,
> >
> > We use HDFS 2.4 and 2.6, and recently hit an issue where the DFSClient
> > domain socket is disabled when the datanode throws a block invalid
> > exception.
> >
> > The block was invalidated on the datanode for some reason, which is fine.
> > Then DFSClient tries to access this block on that datanode via the domain
> > socket. This triggers an IOException. On the DFSClient side, when it gets
> > an IOException and error code 'ERROR', it disables the domain socket and
> > falls back to TCP. Worst of all, it never seems to recover the socket.
> >
> > I think this is a defect. With a "block invalid" exception like this, we
> > should not disable the domain socket, because there is nothing wrong with
> > the domain socket service.
> >
> > Any thoughts?
> >
> > The code:
> >
> > private ShortCircuitReplicaInfo requestFileDescriptors(DomainPeer peer,
> >         Slot slot) throws IOException {
> >   ShortCircuitCache cache = clientContext.getShortCircuitCache();
> >   final DataOutputStream out =
> >       new DataOutputStream(
> >           new BufferedOutputStream(peer.getOutputStream()));
> >   SlotId slotId = slot == null ? null : slot.getSlotId();
> >   new Sender(out).requestShortCircuitFds(block, token, slotId, 1);
> >   DataInputStream in = new DataInputStream(peer.getInputStream());
> >   BlockOpResponseProto resp = BlockOpResponseProto.parseFrom(
> >       PBHelper.vintPrefixed(in));
> >   DomainSocket sock = peer.getDomainSocket();
> >   switch (resp.getStatus()) {
> >   case SUCCESS:
> >     byte buf[] = new byte[1];
> >     FileInputStream fis[] = new FileInputStream[2];
> >     sock.recvFileInputStreams(fis, buf, 0, buf.length);
> >     ShortCircuitReplica replica = null;
> >     try {
> >       ExtendedBlockId key =
> >           new ExtendedBlockId(block.getBlockId(),
> >               block.getBlockPoolId());
> >       replica = new ShortCircuitReplica(key, fis[0], fis[1], cache,
> >           Time.monotonicNow(), slot);
> >     } catch (IOException e) {
> >       // This indicates an error reading from disk, or a format error.
> >       // Since it's not a socket communication problem, we return null
> >       // rather than throwing an exception.
> >       LOG.warn(this + ": error creating ShortCircuitReplica.", e);
> >       return null;
> >     } finally {
> >       if (replica == null) {
> >         IOUtils.cleanup(DFSClient.LOG, fis[0], fis[1]);
> >       }
> >     }
> >     return new ShortCircuitReplicaInfo(replica);
> >   case ERROR_UNSUPPORTED:
> >     if (!resp.hasShortCircuitAccessVersion()) {
> >       LOG.warn("short-circuit read access is disabled for " +
> >           "DataNode " + datanode + ".  reason: " + resp.getMessage());
> >       clientContext.getDomainSocketFactory()
> >           .disableShortCircuitForPath(pathInfo.getPath());
> >     } else {
> >       LOG.warn("short-circuit read access for the file " +
> >           fileName + " is disabled for DataNode " + datanode +
> >           ".  reason: " + resp.getMessage());
> >     }
> >     return null;
> >   case ERROR_ACCESS_TOKEN:
> >     String msg = "access control error while " +
> >         "attempting to set up short-circuit access to " +
> >         fileName + resp.getMessage();
> >     if (LOG.isDebugEnabled()) {
> >       LOG.debug(this + ":" + msg);
> >     }
> >     return new ShortCircuitReplicaInfo(new InvalidToken(msg));
> >   default:
> >     LOG.warn(this + ": unknown response code " + resp.getStatus() +
> >         " while attempting to set up short-circuit access. " +
> >         resp.getMessage());
> >     clientContext.getDomainSocketFactory()
> >         .disableShortCircuitForPath(pathInfo.getPath());
> >     return null;
> >   }
> > }
> >
> >
> >
> > --
> > Xie Gang
> >
>
>
>
> --
> Xie Gang
>



-- 
John
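
A minimal sketch of the behavior proposed in this thread, for illustration only and not the HDFS implementation: a block-level failure leaves short-circuit reads enabled for the socket path, while any other unexpected status still disables it. The class name, the `Status` enum and the method names are hypothetical. In particular, `ERROR_BLOCK_INVALID` is an assumed status value; the thread says the datanode currently reports plain `ERROR` for this case, so a real change would first need some way to tell the two apart.

```java
import java.util.HashSet;
import java.util.Set;

class ShortCircuitDisableSketch {
    // Hypothetical status codes. ERROR_BLOCK_INVALID is an assumption made
    // for this sketch, not a value taken from the HDFS protocol.
    enum Status { SUCCESS, ERROR, ERROR_ACCESS_TOKEN, ERROR_BLOCK_INVALID }

    // Domain socket paths for which short-circuit reads have been disabled.
    private final Set<String> disabledPaths = new HashSet<>();

    /** Returns true only if the response yields a usable short-circuit replica. */
    boolean handleResponse(Status status, String socketPath) {
        switch (status) {
            case SUCCESS:
                return true;
            case ERROR_ACCESS_TOKEN:
            case ERROR_BLOCK_INVALID:
                // Block- or token-level failure: the socket service itself is
                // healthy, so the path stays enabled for other blocks.
                return false;
            default:
                // Anything else is treated as a problem with the socket path.
                disabledPaths.add(socketPath);
                return false;
        }
    }

    boolean isDisabled(String socketPath) {
        return disabledPaths.contains(socketPath);
    }

    public static void main(String[] args) {
        ShortCircuitDisableSketch sketch = new ShortCircuitDisableSketch();
        String path = "/var/run/hdfs/dn_socket"; // example path, hypothetical
        sketch.handleResponse(Status.ERROR_BLOCK_INVALID, path);
        System.out.println("disabled after block error: " + sketch.isDisabled(path));
        sketch.handleResponse(Status.ERROR, path);
        System.out.println("disabled after generic error: " + sketch.isDisabled(path));
    }
}
```

With this split, one invalid block only costs a single fallback to TCP for that read, and later reads of other blocks on the same datanode can still use the domain socket.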
