Re: Request double-check on Ambari config logic (ES network_host)

Matt Foley Tue, 02 May 2017 21:58:19 -0700

Hi Otto,
This event derives from this line of code:
https://github.com/elastic/elasticsearch/blob/2.3/core/src/main/java/org/elasticsearch/action/support/master/TransportMasterNodeAction.java#L148
which suggests that a cluster action has been requested on a local (loopback)
address. This is not surprising given what I’ve learned about the semantics of
network.host with a wildcard address. See next message, item C. Basically, while
the wildcard causes ES to “listen” on all IP addresses, it only *publishes* one,
and on a multi-homed server it can be the wrong one. I can’t be certain this
causes what you’re seeing, but it seems feasible.
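
A minimal Python sketch (plain sockets, not Elasticsearch code) of why listening and publishing are separate decisions. The helper name bind_wildcard is hypothetical and only for illustration. It shows that a socket bound to the wildcard reports the wildcard itself as its address, so a server that must advertise itself to peers still has to pick one concrete address.

```python
import socket


def bind_wildcard():
    """Bind a TCP socket to the IPv4 wildcard on an OS-chosen port.

    Hypothetical helper for illustration; this is not how Elasticsearch binds.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("0.0.0.0", 0))  # port 0 lets the OS choose a free port
    return sock


if __name__ == "__main__":
    sock = bind_wildcard()
    try:
        host, port = sock.getsockname()
        # The bound address is the wildcard itself. It accepts connections on
        # every local interface, but it is not an address a peer can dial, so
        # a separate "publish" address has to be chosen from the real ones.
        print("listening on", host, "port", port)
    finally:
        sock.close()
```

On a multi-homed host there are several candidate addresses to advertise, which is where a wrong pick becomes possible.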


From: Otto Fowler <[email protected]>
Date: Tuesday, May 2, 2017 at 8:30 PM
To: "[email protected]" <[email protected]>, Matt 
Foley <[email protected]>, "[email protected]" 
<[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Request double-check on Ambari config logic (ES network_host)

OK.
I tried it using this method, and master ( adding [] ).  In both cases, I can 
hit 9200 from other machines, but in both cases I’m getting ES master errors:

ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];]
at org.elasticsearch.cluster.block.ClusterBlocks.indexBlockedException(ClusterBlocks.java:174)
at org.elasticsearch.action.admin.indices.create.TransportCreateIndexAction.checkBlock(TransportCreateIndexAction.java:66)
at org.elasticsearch.action.admin.indices.create.TransportCreateIndexAction.checkBlock(TransportCreateIndexAction.java:41)
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.doStart(TransportMasterNodeAction.java:148)
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.start(TransportMasterNodeAction.java:140)
at org.elasticsearch.action.support.master.TransportMasterNodeAction.doExecute(TransportMasterNodeAction.java:107)
at org.elasticsearch.action.support.master.TransportMasterNodeAction.doExecute(TransportMasterNodeAction.java:51)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:137)
at org.elasticsearch.action.index.TransportIndexAction.doExecute(TransportIndexAction.java:98)
at org.elasticsearch.action.index.TransportIndexAction.doExecute(TransportIndexAction.java:66)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:137)
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:85)
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:58)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:359)
at org.elasticsearch.client.FilterClient.doExecute(FilterClient.java:52)
at org.elasticsearch.rest.BaseRestHandler$HeadersAndContextCopyClient.doExecute(BaseRestHandler.java:83)
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:359)
at org.elasticsearch.client.support.AbstractClient.index(AbstractClient.java:371)
at org.elasticsearch.rest.action.index.RestIndexAction.handleRequest(RestIndexAction.java:102)
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:54)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:205)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:166)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:128)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:86)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServ

and Kibana is not good.

Not sure what that error means.
I have 5 nodes, and put the ES master on #5, with #3 and #4 as data nodes.

Sorry, but I don’t think my setup is going to be much help at this point.




On May 2, 2017 at 17:19:43, Matt Foley ([email protected]) wrote:
The default will now be “0.0.0.0”, and not eth0. This will work if
suggestions from various community members and a suggestion in the old 1.x
documentation for ES are correct. The 2.x documentation (we specify ES 2.3)
doesn’t mention “0.0.0.0”. I think it’s likely to still work, but it needs
testing.

Thanks,
--Matt

From: Otto Fowler <[email protected]>
Date: Tuesday, May 2, 2017 at 11:27 AM
To: "[email protected]" <[email protected]>, Matt
Foley <[email protected]>, "[email protected]"
<[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Request double-check on Ambari config logic (ES network_host)

Are you saying that the defaults should work now?
Or they should work, but I still need to change the interface from eth0?




On May 2, 2017 at 13:36:11, Matt Foley ([email protected]) wrote:
Hi Otto,
The basic change to use “0.0.0.0” as the default binding, and put the square 
brackets in the template text instead of the parameter value, is now available 
in
https://github.com/mattf-horton/incubator-metron branch METRON-905 commit 
e879719a0c3fb

I’m having some trouble with my test env, so if you wanted to give it a try, 
that would be great.
If the “0.0.0.0” doesn’t work, then we should use
"_local_", "_site_"
those being the ES special values that mean approximately the same.

I’m going to have to do trial-and-error to determine the exact behavior of 
multi-item lists, and then write the python code to strip redundant square 
brackets if included in the parameter value.
Thanks,
--Matt
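
A rough Python sketch of the bracket-stripping step mentioned above, assuming the template itself supplies the square brackets. The function names strip_redundant_brackets and render_network_host are hypothetical, and the rendered line is only a stand-in for the real yaml.j2 template, which is not shown in this thread.

```python
def strip_redundant_brackets(value):
    """Remove outer square brackets from a network_host parameter value.

    Hypothetical helper. It assumes the template adds the brackets itself,
    and it only handles a flat list; values such as "[a], [b]" are out of
    scope for this sketch.
    """
    stripped = value.strip()
    while stripped.startswith("[") and stripped.endswith("]"):
        stripped = stripped[1:-1].strip()
    return stripped


def render_network_host(value):
    # Stand-in for a template line that puts the brackets in the template
    # text rather than in the parameter value.
    return "network.host: [ " + strip_redundant_brackets(value) + " ]"


if __name__ == "__main__":
    for raw in ("0.0.0.0", "[ 0.0.0.0 ]", '"_local_", "_site_"', '[ "_local_", "_site_" ]'):
        print(render_network_host(raw))
```

With this approach a value entered with or without brackets renders to the same line, so the exact behavior of multi-item lists still needs the trial-and-error described above.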


On 5/2/17, 6:44 AM, "Otto Fowler" <[email protected]> wrote:

I am working on a CentOS 7 cluster deploy for testing the steps.
I have this issue (along with the wrong interface name) and can test when
you have it.

An ETA would help.


On May 2, 2017 at 09:14:10, [email protected] ([email protected]) wrote:

Are you working on this one? The JIRA doesn't look like it's currently
assigned. Thanks,

Jon

On Mon, May 1, 2017 at 6:40 PM Matt Foley <[email protected]> wrote:

> Ah, I see I mis-read METRON-897, and Nick specifically says
> "lo:ipv4","eth0:ipv4" did not work for him, but ["_lo:ipv4_","_eth0:ipv4_"]
> did work.
>
> So I went back and dug a little deeper, and realized that in the
> environment where "lo:ipv4","eth0:ipv4" worked for me, I had modified the
> yaml.j2 template to include the square brackets.
>
> So the below theory is wrong. Back to the drawing board.
> Thanks,
> --Matt
>
> On 5/1/17, 3:08 PM, "Matt Foley" <[email protected]> wrote:
>
> Hi, there have been widely varying statements about what needs to be
> in the Elasticsearch config parameter “network_host”. I think I may have a
> rationale for what works and what doesn’t, but I’d like your input or
> correction.
>
> I am focusing on what worked in terms of punctuation (quotes and
> square brackets) with the old _lo:ip4_,_eth0:ip4_. I would like to ignore
> for the moment, please, whether eth0 was the correct name for a given env,
> and whether we can use 0.0.0.0. Instead, for systems where eth0 WAS the
> correct name, I’d like to understand what worked and why.
>
> It’s complicated because the value starts out in xml, is read into
> python, printed by jinja, then consumed by yaml.
>
> I think there were two constructs that actually worked for this
> param. Please say whether this is consistent or inconsistent with your
> experience:
>
> "_lo:ip4_","_eth0:ip4_"
> This worked for me. I think this was read from XML into python as a
> list of strings, then output in jinja ‘print statement‘
> {{ network_host }} as a python literal list with form:
> [ "_lo:ip4_", "_eth0:ip4_" ]
> In other words, the print statement for a python list object injected
> the needed square brackets.
>
> and
> "[ _lo:ip4_, _eth0:ip4_ ]"
> Nick and Anand, please confirm if this is the form that worked for
> you. I think this was read from XML into python as a single string, and
> output in the same jinja print statement as:
> [ _lo:ip4_, _eth0:ip4_ ]
> because the print statement for a python string object does not
> produce quote marks.
>
> In either case, yaml (the consumer of the jinja output) saw what it
> interprets as a list of strings (since quotes are optional for yaml
> strings).
>
> What didn’t work was:
>
> * "_lo:ip4_, _eth0:ip4_"
> This would be read in and output as a single string, and no square
> brackets would ever be introduced.
>
> * _lo:ip4_, _eth0:ip4_ or [ _lo:ip4_, _eth0:ip4_ ]
> (without quotes) I think the unquoted colons messed up the python parsing.
>
> Finally, I don’t know whether
> * [ "_lo:ip4_", "_eth0:ip4_" ]
> worked or not, I’m not sure anyone ever tried it. By the above logic
> it probably should work.
>
> Please give me your input if you have touched on these issues.
> Thanks,
> --Matt
>
>
>
>
>
>
> --

Jon
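
The quoting reasoning in the original message at the bottom of this thread can be checked in plain Python. This is a minimal sketch that uses Python 3's str() as a stand-in for the Jinja print of {{ network_host }}; the render function is hypothetical, and whether Ambari actually hands the template a list or a single string is not established here (a later message in the thread reports that the list theory did not hold in practice).

```python
def render(value):
    # Stand-in for the template expression {{ network_host }}:
    # the value is converted to text with str().
    return "network.host: " + str(value)


# The three shapes discussed in the original message.
as_list = ["_lo:ip4_", "_eth0:ip4_"]        # a Python list of strings
as_bracketed_string = "[ _lo:ip4_, _eth0:ip4_ ]"  # one string, brackets inside
as_plain_string = "_lo:ip4_, _eth0:ip4_"    # one string, no brackets

if __name__ == "__main__":
    # A list prints with brackets and quoted items.
    print(render(as_list))
    # A string prints as-is, without surrounding quote marks.
    print(render(as_bracketed_string))
    # No brackets are ever introduced for a plain string.
    print(render(as_plain_string))
```

Note that Python 3 prints list items with single quotes rather than the double quotes shown in the message, and Python 2 unicode strings would print with a u prefix, so the list case depends on the interpreter and on the type of the strings.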
