[ 
https://issues.apache.org/jira/browse/ACCUMULO-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713334#comment-13713334
 ] 

Basit Mustafa commented on ACCUMULO-1585:
-----------------------------------------

Hi Eric, 

Thank you for volunteering to lead this. I have not made any changes to the 
Accumulo source and haven't really dug into the internals, although I have 
developed against Accumulo and deployed it in test and production. I'm an 
experienced Java developer; I just don't know my way around the Accumulo 
source yet, but I am willing to help however I can. 

To answer your question: the problem described in the discussion thread would 
likely be addressed by the pattern you mentioned, in which HDFS data nodes 
self-identify to the namenode using the namespaceID. I don't think it 
addresses the case I raised, though, which is network reachability when a 
host name resolves to more than one IP. I don't mean load balancing or DNS 
round-robin, which really should not be used between Accumulo nodes at all. I 
mean a name that resolves to an internal address for a process running inside 
a given environment and to a different, public IP for a process running 
outside it, although both addresses terminate at the same machine/instance.

I cannot envisage a standard, globally applicable solution to this situation 
other than FQDN, at least not one that doesn't reinvent FQDN-based 
resolution. The crux of the reachability issue is that the address everyone 
uses is the result of a resolution performed on the (arbitrary?) node that 
populates ZooKeeper on start/state change, rather than each node doing its 
own resolution and trusting DNS to give it the IP that reaches the host. 

In this situation, using an internal IP between machines inside an EC2 
availability zone, or a rack/hypervisor in your own cloud, is highly 
desirable compared with going out to a public IP. Internally you are on 
in-memory/backplane virtual interfaces that are very fast and zero cost. 
Hitting the public IP (absent any optimization by the stack, which is rare, 
and most certainly not done on EC2 or most hypervisors I know) routes the 
traffic to at least the closest edge device, which hurts performance and, on 
EC2, incurs regional data transfer charges. This "schizophrenia", where the 
right address depends on where the client, tablet server, etc. process sits 
relative to where the configured hosts were first resolved, is what creates 
the reachability issue. 

So, while an HDFS DN-NN style self-identification system might address the 
case of IPs simply changing, without any reachability or private/public 
"schizophrenia" involved, in my mental model it doesn't address this 
reachability case, which is actually quite common in the environments we 
work and deploy in. 
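
To make the reachability case concrete, here is a minimal sketch in Java. It 
does not use any Accumulo or ZooKeeper API; the class, the DnsView helper and 
the method names are all hypothetical, and DNS is simulated with a map so the 
two "views" (inside and outside the private network) can be compared side by 
side. The host name and addresses are the ones from the issue description. 

```java
import java.util.HashMap;
import java.util.Map;

public class AdvertisedAddressSketch {

    /** Hypothetical stand-in for DNS as seen from one network location. */
    static final class DnsView {
        private final Map<String, String> records = new HashMap<>();

        DnsView with(String host, String ip) {
            records.put(host, ip);
            return this;
        }

        /** Returns the mapped IP; an IP literal or unknown name is returned unchanged. */
        String resolve(String hostOrIp) {
            return records.getOrDefault(hostOrIp, hostOrIp);
        }
    }

    /** Behavior described in the issue: the registering node resolves, and the IP is stored. */
    static String advertiseResolved(String configured, DnsView registrarView) {
        return registrarView.resolve(configured);
    }

    /** Proposed behavior: the configured string is stored verbatim. */
    static String advertiseVerbatim(String configured) {
        return configured;
    }

    /** A reader resolves whatever was stored, using its own view of DNS. */
    static String connectTarget(String stored, DnsView readerView) {
        return readerView.resolve(stored);
    }

    public static void main(String[] args) {
        String configured = "node1.company.com";
        DnsView inside = new DnsView().with(configured, "10.2.3.4");
        DnsView outside = new DnsView().with(configured, "54.55.56.57");

        String storedIp = advertiseResolved(configured, inside);
        String storedName = advertiseVerbatim(configured);

        // Stored IP: an outside reader is handed the internal address.
        System.out.println("resolved, outside reader: " + connectTarget(storedIp, outside));
        // Stored name: each reader gets the address that is right for its location.
        System.out.println("verbatim, outside reader: " + connectTarget(storedName, outside));
        System.out.println("verbatim, inside reader:  " + connectTarget(storedName, inside));
    }
}
```

With the resolved IP stored, the outside reader ends up with 10.2.3.4, which 
it cannot reach. With the verbatim name stored, the outside reader gets 
54.55.56.57 and the inside reader still gets 10.2.3.4, so internal traffic 
stays on the private network. 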

I think making this the default behavior is best, i.e. the verbatim string 
from the config file (whatever the user put in the 
gc/monitor/tracer/masters/slaves file) is what gets entered into ZooKeeper. 
It is predictable. In my experience the current behavior is non-standard: 
the system further processes/resolves my config file input, without an 
obvious reason, and writes the resolved address to ZooKeeper. I am a newb to 
Accumulo, and there may well be a very good design reason for this. If so, I 
would happily listen and learn, and brainstorm other solutions, such as a 
flag/config option that enables non-resolution at the cost of whatever 
benefit that reason provides. With my current knowledge, though, a flag 
telling the system to "not touch" the entered network name makes less sense 
than a flag telling it to resolve the name. Again, this is in my 
Accumulo-ignorant state. 
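
As a rough illustration of the default-plus-flag idea, a sketch along these 
lines could decide what string gets written. The property name below is 
hypothetical and is not an existing Accumulo setting; the only real API used 
is java.net.InetAddress. Verbatim is the default here, and resolution is the 
opt-in. 

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Properties;

public class VerbatimHostOption {

    // Hypothetical property name, for illustration only.
    static final String STORE_VERBATIM_KEY = "example.zookeeper.store.verbatim.host";

    /**
     * Returns the string that would be stored for a configured host line.
     * Default: the trimmed line, untouched. If the flag is set to false,
     * the name is resolved locally and the IP is returned instead.
     */
    static String addressToStore(String configuredHost, Properties props) {
        String host = configuredHost.trim();
        boolean verbatim = Boolean.parseBoolean(props.getProperty(STORE_VERBATIM_KEY, "true"));
        if (verbatim) {
            return host;
        }
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve configured host: " + host, e);
        }
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        System.out.println(addressToStore("node1.company.com", defaults));

        Properties resolving = new Properties();
        resolving.setProperty(STORE_VERBATIM_KEY, "false");
        // An IP literal is used here so the example needs no DNS lookup.
        System.out.println(addressToStore("127.0.0.1", resolving));
    }
}
```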
                
> Provide option for FQDN/verbatim data from config files of servers to be 
> stored in ZooKeeper rather than resolved IP
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-1585
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1585
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>         Environment: All
>            Reporter: Basit Mustafa
>            Assignee: Eric Newton
>            Priority: Minor
>             Fix For: 1.6.0
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> There are some situations (esp in virtualized/cloud environments) where 
> "hardwiring" the IP into ZooKeeper can create reachability issues and an FQDN 
> (or, better/also, the verbatim string/line from the concerned config file) 
> would fix this problem. 
> For example, hostname node1.company.com specified in configuration files 
> resolves to an Amazon EC2 *internal* IP of 10.2.3.4 (internal on virtualized 
> network). Externally (e.g. from your dev machine, your 
> offsite/non-VPN/non-VPCed data center, other client machines on different 
> networks/clouds), node1.company.com will resolve to a public IP (e.g. Amazon 
> Elastic IP, etc) of something more routeable, like 54.55.56.57. 
> Accumulo currently stores 10.2.3.4 in ZooKeeper based on this resolution, 
> but, if you try to connect to Accumulo from outside these machines/machines 
> in the same cloud/virtualized network/non routeable network, and the same FQDN 
> (node1.company.com) resolves to the public address now (54.55.56.57), you 
> will not be able to connect, because the Accumulo client will have pulled the 
> resolved, and from here, unreachable, IP of 10.2.3.4.
> Using the FQDN (or in some other way allowing for client-side name 
> resolution/address translation, although this seems kludgy) would fix this 
> issue in a relatively standard way. Ideally, this would not incur a 
> performance issue beyond the first resolution assuming the TCP/IP stack is 
> doing its job and caching stuff effectively (I assume). 
> This doesn't really hurt/break things if you give an option in some config, 
> and, really, taking the literal from the file allows you to use whatever you 
> want, the ultimate in flexibility. 
> See discussion 
> http://mail-archives.apache.org/mod_mbox/accumulo-user/201307.mbox/%3CCAGFNOZTMVz0R2e0meDj%3DKqPPPJP6f5baaMqh8%3D07V7NZ8vToJg%40mail.gmail.com%3E
>  for more details and others having the same issue. 
> I will look into creating a patch for this as soon as I have some time to 
> find/look at relevant code portions (I need to find where accumulo is making 
> these writes to ZK and if the read FQDNs would need any resolution/their use 
> further down the line expects strictly IP or is in host or IP safe API calls, 
> etc). Any suggestions on where I can begin this are always appreciated. 
> Otherwise, I'll try and submit a patch when I can. 
> Figure I'd open this issue to at least provide a discussion on what more 
> experienced Accumulo devs and users think and what a solution based on the 
> style/patterns accepted for Accumulo development/configuration are. I can 
> read the guidelines myself, of course, and will, but someone suggested 
> opening an issue, so I am...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
