[
https://issues.apache.org/jira/browse/HBASE-25292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Kyle Purtell reassigned HBASE-25292:
-------------------------------------------
Assignee: Andrew Kyle Purtell
> Update InetSocketAddress usage discipline
> -----------------------------------------
>
> Key: HBASE-25292
> URL: https://issues.apache.org/jira/browse/HBASE-25292
> Project: HBase
> Issue Type: Bug
> Components: Client, HFile
> Reporter: Andrew Kyle Purtell
> Assignee: Andrew Kyle Purtell
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
>
> We sometimes cache InetSocketAddress in data structures in an attempt to
> optimize away potential nameservice (DNS) lookups. This is, in general, an
> anti-pattern, because once an InetSocketAddress is resolved, resolution is
> never attempted again. The ideal pattern for connect() is ISA instantiation
> just before the connect() call, with no reuse of the ISA instance. For bind()
> we presume the local identity won't change while the process is live so usage
> and caching can be relaxed in that case.
> If I can restate my proposal for a usage convention for InetSocketAddress, it
> would be this: Network identities should be bound late. This means addresses
> should be resolved at the last possible moment. Also, network identity
> mappings can change, so our code should not inappropriately cache them;
> otherwise we might miss a change and fail to operate normally.
> I have reviewed the code for InetSocketAddress usage and in my opinion
> sometimes we are caching ISA acceptably, and in other cases we are not.
> Correct cases:
> * We cache ISA for RPC connections, so we don't potentially do a lookup for
> every Call. However, we resolve the address earlier than we need to. The code
> can be improved by moving resolution to just before where we connect().
> Incorrect cases that can be fixed:
> * RPC stubs. Remote clients may be recycled and replaced with new instances
> where the network identities (DNS name to IP address mapping) have changed--.
> HBASE-14544 attempts to work around DNS instability in data centers of years
> past in a way that, in my opinion, is the wrong thing to do in the modern
> era. This is just a technical opinion and not critical to the rest of the
> proposal. That said, I intend to propose a revert of HBASE-14544. Reverting
> this simplifies some code a bit. (If this part of the proposal is
> controversial it can be dropped.) When looking up the IP address of the
> remote host when creating a stub key we also make a key even if the
> resolution fails. This is the wrong thing to do. If we can't resolve the
> remote address, we can't contact the server. Making a stub that can't
> communicate is pointless. Throw an exception instead.
> * Favored nodes. Although the HDFS API requires InetSocketAddress, we don't
> have to make up a list right away and cache them forever. We can use Address
> to record the list of favored nodes and convert from Address to
> InetSocketAddress on demand (when we go to create the HFile). This will allow
> us to resolve datanode hostnames just before they are needed. In public
> cloud, kubernetes, and or some private datacenter service deployment options,
> datanode servers may have their network identities (DNS name -> IP address
> mapping) changed over time. We can and should avoid inappropriate caching
> that may cause us to indefinitely use an incorrect address when contacting a
> favored node.
> * Sometimes we use ISA when Address is just as good. For example, the dead
> servers list. If we are going to pay some attention to ISA usage discipline,
> let's remove the cases where we use ISA as a host and port pair but do not
> need to do so. Address works just as well and doesn't present an opportunity
> for misuse. Another example would be the RPC client concurrentCounterCache.
> Incorrect cases that cannot be fixed:
> * hbase-external-blockcache: We have to resolve all of the memcached
> locations up front because the memcached client constructor requires ISA
> instances. So we have to hope that the network identities (DNS name -> IP
> address mapping) does not change for any in the list. This is beyond our
> control.
> While in this area it is trivial to add new client connect metrics for number
> of potential nameservice lookups (whenever we instantiate an ISA) and number
> of failed nameservice lookups (if the instantiated ISA is unresolved).
> While in this area I also noticed we often directly access a field in
> ConnectionId where there is also a getter, so good practice is to use the
> getter instead.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)