Basit Mustafa created ACCUMULO-1585:
---------------------------------------
Summary: Provide option for FQDN/verbatim data from config files
of servers to be stored in ZooKeeper rather than resolved IP
Key: ACCUMULO-1585
URL: https://issues.apache.org/jira/browse/ACCUMULO-1585
Project: Accumulo
Issue Type: Improvement
Components: tserver
Environment: All
Reporter: Basit Mustafa
Priority: Minor
There are some situations (especially in virtualized/cloud environments) where
"hardwiring" the resolved IP into ZooKeeper can create reachability issues.
Storing an FQDN instead (or, better/also, the verbatim string/line from the
relevant config file) would fix this problem.
For example, the hostname node1.company.com specified in the configuration
files resolves to an Amazon EC2 *internal* IP of 10.2.3.4 (internal to the
virtualized network). Externally (e.g. from your dev machine, your
offsite/non-VPN/non-VPC data center, or other client machines on different
networks/clouds), node1.company.com resolves to a public IP (e.g. an Amazon
Elastic IP) that is actually routable, like 54.55.56.57.
Accumulo currently stores 10.2.3.4 in ZooKeeper based on this resolution. If
you try to connect to Accumulo from a machine outside that cloud/virtualized/
non-routable network, where the same FQDN (node1.company.com) resolves to the
public address (54.55.56.57), you will not be able to connect, because the
Accumulo client will have pulled the already-resolved IP of 10.2.3.4 from
ZooKeeper, and that IP is unreachable from where the client sits.
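To make the scenario concrete, here is a minimal sketch in Java. It is not Accumulo code: the class and method names are made up for illustration, and the two maps simply stand in for the internal and external DNS views, using the addresses from the example above.

```java
import java.util.Map;

final class SplitHorizonDemo {
    // Two views of the same name, using the addresses from the example above.
    static final Map<String, String> INTERNAL_DNS = Map.of("node1.company.com", "10.2.3.4");
    static final Map<String, String> EXTERNAL_DNS = Map.of("node1.company.com", "54.55.56.57");

    /** Models today's behavior: publish the address as resolved inside the cloud network. */
    public static String publishResolved(String configuredHost) {
        return INTERNAL_DNS.get(configuredHost);
    }

    /** Models what an external client ends up dialing, given the published value. */
    public static String clientTarget(String published) {
        // A published name is resolved in the client's own DNS view;
        // a published IP literal has no entry and is used as-is.
        return EXTERNAL_DNS.getOrDefault(published, published);
    }

    public static void main(String[] args) {
        String host = "node1.company.com";
        // Published as a resolved IP: the client dials the internal address.
        System.out.println(clientTarget(publishResolved(host))); // 10.2.3.4
        // Published verbatim: the client resolves the name itself.
        System.out.println(clientTarget(host));                  // 54.55.56.57
    }
}
```

In the first case the external client is handed an address it cannot route to; in the second it resolves the name in its own network and reaches the public address.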
Using the FQDN (or allowing client-side name resolution/address translation in
some other way, although this seems kludgy) would fix this issue in a
relatively standard way. Ideally, this would not incur a performance penalty
beyond the first resolution, assuming name lookups are cached effectively
further down the stack (I assume they are).
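As a rough illustration of the "only the first resolution costs anything" assumption, the sketch below memoizes lookups on the client side. The class name and the injected lookup function are hypothetical; in a real client the lookup could be something like InetAddress.getByName, and the JVM already caches those lookups on its own (governed by the networkaddress.cache.ttl security property). This sketch has no expiry, which a real cache would need if addresses can change.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class CachingResolver {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> lookup;
    /** Counts how many times the underlying lookup was actually invoked. */
    public final AtomicInteger lookups = new AtomicInteger();

    public CachingResolver(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    /** Resolves a host name, calling the underlying lookup only on a cache miss. */
    public String resolve(String host) {
        return cache.computeIfAbsent(host, h -> {
            lookups.incrementAndGet();
            return lookup.apply(h);
        });
    }

    public static void main(String[] args) {
        // Stub lookup standing in for a real DNS query.
        CachingResolver resolver = new CachingResolver(h -> "54.55.56.57");
        resolver.resolve("node1.company.com");
        resolver.resolve("node1.company.com");
        System.out.println(resolver.lookups.get()); // 1
    }
}
```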
This doesn't really hurt/break anything if it is offered as an option in some
config, and taking the literal from the file lets you use whatever you want:
the ultimate in flexibility.
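One way such an option could look is sketched below. This is only a sketch under assumptions: the property key "example.advertise.verbatim" and the class are invented for illustration and are not existing Accumulo configuration. The default keeps the current resolve-to-IP behavior, so nothing changes unless the option is turned on.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Properties;

final class AdvertiseOption {
    /** Hypothetical property key, not a real Accumulo setting. */
    public static final String KEY = "example.advertise.verbatim";

    /** Returns the string that would be stored for a server, depending on the option. */
    public static String hostToStore(String configuredHost, Properties conf) {
        boolean verbatim = Boolean.parseBoolean(conf.getProperty(KEY, "false"));
        if (verbatim) {
            // Store exactly what the config file says; clients resolve it themselves.
            return configuredHost;
        }
        try {
            // Default: resolve locally and store the resulting IP.
            return InetAddress.getByName(configuredHost).getHostAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + configuredHost, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(KEY, "true");
        System.out.println(hostToStore("node1.company.com", conf)); // node1.company.com
    }
}
```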
See discussion
http://mail-archives.apache.org/mod_mbox/accumulo-user/201307.mbox/%3CCAGFNOZTMVz0R2e0meDj%3DKqPPPJP6f5baaMqh8%3D07V7NZ8vToJg%40mail.gmail.com%3E
for more details and others having the same issue.
I will look into creating a patch for this as soon as I have some time to
find and read the relevant code. I need to find where Accumulo makes these
writes to ZK, and whether FQDNs read back from ZK would need resolving, i.e.
whether the code further down the line expects strictly an IP or goes through
API calls that accept either a host or an IP. Any suggestions on where to
begin are appreciated. Otherwise, I'll try to submit a patch when I can.
I figured I'd open this issue to at least start a discussion on what more
experienced Accumulo devs and users think, and on what a solution that follows
the style/patterns accepted for Accumulo development/configuration would look
like. I can read the guidelines myself, of course, and will, but someone
suggested opening an issue, so I am...