[
https://issues.apache.org/jira/browse/HDFS-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047962#comment-13047962
]
Jonathan Hsieh commented on HDFS-1787:
--------------------------------------
{quote}
> Text.readString can throw IOException. The InternalDataNodeException thrown
> on the next line is also a subclass of IOException. Behaviorwise it would
> essentially use the same error recovery path.
However, we will lose information such as socket addresses.
{quote}
I believe this is already an error path, but I'll look into this more.
{quote}
Some comments:
Please combine them into one message.
{code}
+ DFSClient.LOG.warn("Failed to connect to" + targetAddr +": "
+ + ex.getMessage());
+ DFSClient.LOG.warn(" Adding to deadNodes and continuing");
{code}
{quote}
My plan is to add \n's to the log message.
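A minimal sketch of the combined message, assuming a hypothetical helper class (`ConnectFailureMessage` is not part of the patch). It builds one string with an embedded `\n` so that a single `LOG.warn` call can replace the two calls quoted above, and it also adds the space that is missing after "Failed to connect to":

```java
import java.io.IOException;
import java.net.InetSocketAddress;

// Hypothetical helper, for illustration only; not part of the HDFS-1787 patch.
class ConnectFailureMessage {
    // Builds one multi-line message instead of two separate log statements.
    static String format(InetSocketAddress targetAddr, IOException ex) {
        return "Failed to connect to " + targetAddr + ": " + ex.getMessage()
            + "\n  Adding to deadNodes and continuing";
    }
}
```

The call site would then presumably reduce to a single `DFSClient.LOG.warn(ConnectFailureMessage.format(targetAddr, ex))`.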
{quote}
It is better to log the exception.
{code}
+ } catch (IOException e) {
+ // preserve previous semantics, eat the exception.
+ }
{code}
{quote}
Will add logging.
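A rough sketch of what logging instead of eating the exception could look like. The body of the `try` block is not shown in the quoted snippet, so closing a `Closeable` is an assumption here, as are the class name `QuietClose` and the use of `java.util.logging` as a stand-in for the client's own logger:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, for illustration only.
class QuietClose {
    // java.util.logging is a stand-in for whatever logger the real code uses.
    private static final Logger LOG = Logger.getLogger(QuietClose.class.getName());

    // Preserves the previous semantics (the exception does not propagate),
    // but records it with its stack trace instead of silently dropping it.
    static boolean closeQuietly(Closeable resource) {
        try {
            resource.close();
            return true;
        } catch (IOException e) {
            LOG.log(Level.WARNING, "Ignoring exception while closing " + resource, e);
            return false;
        }
    }
}
```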
{quote}
Do we really need internalDNErrors and getInternalDNErrorCount()? It is only
used in the tests.
{quote}
Can you suggest an alternate mechanism for (automated) testing of the changes
other than visual inspection of the logs?
This tests that the error messaging path was exercised and actually provides
some information that may be useful in troubleshooting. I believe there are
annotations in the works that semantically mean "public for testing but
otherwise private/package". I believe the comment I added would make this
reasonably easy to find when those annotations get integrated throughout.
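A minimal sketch of the counter-plus-accessor approach, assuming a locally defined marker annotation. The annotation declared here is hypothetical and only illustrates the "public for testing" idea (Guava ships a similar `@VisibleForTesting`); the class name `ErrorCountingClient` and the method `onInternalDataNodeError` are likewise invented for the example:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical marker annotation: documents intent only, enforces nothing.
@Retention(RetentionPolicy.SOURCE)
@Target({ElementType.METHOD, ElementType.FIELD})
@interface VisibleForTesting {}

// Hypothetical client-side class, for illustration only.
class ErrorCountingClient {
    private final AtomicInteger internalDNErrors = new AtomicInteger();

    // Called from the error path when the DataNode reports an internal error.
    void onInternalDataNodeError(String message) {
        internalDNErrors.incrementAndGet();
    }

    // Package-private and marked so that reviewers can find test-only accessors.
    @VisibleForTesting
    int getInternalDNErrorCount() {
        return internalDNErrors.get();
    }
}
```

A test can then assert on the count after provoking the error, rather than relying on visual inspection of the logs.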
> "Not enough xcievers" error should propagate to client
> ------------------------------------------------------
>
> Key: HDFS-1787
> URL: https://issues.apache.org/jira/browse/HDFS-1787
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node
> Affects Versions: 0.23.0
> Reporter: Todd Lipcon
> Assignee: Jonathan Hsieh
> Labels: newbie
> Fix For: 0.23.0
>
> Attachments: hdfs-1787.2.patch, hdfs-1787.3.patch, hdfs-1787.3.patch,
> hdfs-1787.5.patch, hdfs-1787.patch
>
>
> We find that users often run into the default transceiver limits in the DN.
> Putting aside the inherent issues with xceiver threads, it would be nice if
> the "xceiver limit exceeded" error propagated to the client. Currently,
> clients simply see an EOFException which is hard to interpret, and have to go
> slogging through DN logs to find the underlying issue.
> The data transfer protocol should be extended to either have a special error
> code for "not enough xceivers" or should have some error code for generic
> errors with which a string can be attached and propagated.
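The second option in the description (a generic error code with an attached string) could be sketched as follows. This is not the actual HDFS data transfer protocol: the class `StatusReply`, the status values, and the use of `writeUTF`/`readUTF` in place of Hadoop's own string encoding are all assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical wire format: a 2-byte status followed by a UTF string.
class StatusReply {
    static final short SUCCESS = 0; // assumed value
    static final short ERROR = 1;   // assumed value for "generic error"

    // Server side: send the status and a human-readable reason.
    static void write(DataOutputStream out, short status, String message)
            throws IOException {
        out.writeShort(status);
        out.writeUTF(message == null ? "" : message);
        out.flush();
    }

    // Client side: surface the server's reason instead of a bare EOFException.
    static void readOrThrow(DataInputStream in) throws IOException {
        short status = in.readShort();
        String message = in.readUTF();
        if (status != SUCCESS) {
            throw new IOException("DataNode reported error: " + message);
        }
    }
}
```

With a reply like this, a client that hits the xceiver limit would see the reason in the exception message rather than having to search the DN logs.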
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira