First, I can confirm the following bug in consul 1.0.5:
- start X instances of a service
- scale the service to X+Y (with Y > 1)
==> consul then crashes...
From time to time, I also saw HAProxy getting only 10 servers out of 20 for a
given service.

I'll revert to 1.0.2 for now.

The order of the returned SRV records is ignored by HAProxy.
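
To illustrate what "the order is ignored" means in practice, here is a minimal
sketch (not HAProxy's actual code; the helper name same_servers is made up for
this example). It compares the two A record answers from your first mail as
sets, so a reshuffled answer counts as the same list of servers:

```python
def same_servers(answer_a, answer_b):
    """Return True when two DNS answers list the same addresses in any order."""
    return set(answer_a) == set(answer_b)


# The two consecutive `dig +short` answers quoted further down in this thread.
first = ["10.182.161.239", "10.182.161.152", "10.182.161.240", "10.182.161.92"]
second = ["10.182.161.92", "10.182.161.152", "10.182.161.240", "10.182.161.239"]

print(same_servers(first, second))  # prints True
```
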
Can you confirm the number of servers associated with the service
'mfm-monitor-opentsdb' in consul?
On the HAProxy box, can you run the following command and send back the output
(obfuscating the IPs and other sensitive information):
  dig +notcp @127.0.0.1 -p 8600 -t SRV _mfm-monitor-opentsdb._tcp.service.consul
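
A side note for reading the SRV answer: the targets consul returns (for
example '0ab6a1d3.addr.dc1.mfmconsul' in your logs) carry the IPv4 address as
8 hex digits in the first label. Here is a small sketch to decode them so they
can be matched against the IPs HAProxy logs. The function name
consul_addr_to_ip is made up for this example, and it assumes IPv4 targets
only:

```python
import ipaddress


def consul_addr_to_ip(target):
    """Decode the hex first label of a '<hex>.addr.<dc>.<domain>' name to IPv4."""
    label = target.split(".", 1)[0]
    if len(label) != 8:
        # Assumption: only IPv4 targets (8 hex digits) are handled here.
        raise ValueError("expected 8 hex digits, got %r" % label)
    return str(ipaddress.IPv4Address(int(label, 16)))


# Hostnames taken from the HAProxy log lines quoted in this thread.
for name in ("0ab6a1d3.addr.dc1.mfmconsul",
             "0ab6a1df.addr.dc1.mfmconsul",
             "0ab6a162.addr.dc1.mfmconsul"):
    print(name, "->", consul_addr_to_ip(name))
```

With the names above this gives 10.182.161.211, 10.182.161.223 and
10.182.161.98, which are the addresses shown in the "changed its IP" lines.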

Baptiste



On Mon, Feb 12, 2018 at 8:27 AM, Чепайкин Михаил <mchepay...@gmail.com>
wrote:

> I'm on Consul 1.0.2.
>
> Why do you think this issue is about serving SRV over UDP, rather than
> about the different order of SRV or A records returned by Consul DNS on
> consecutive requests?
>
> On 11 February 2018 at 18:46, Baptiste <bed...@gmail.com> wrote:
>
>> Hi,
>>
>> What consul version are you using?
>> I'm facing the same issue in my consul lab. That said, it seems to be a
>> bug in consul, which is not able to serve many SRV records over UDP.
>> I even triggered a consul crash (using version 1.0.5).
>> I'm still investigating this issue and will come back to you as soon as I
>> have more reliable information.
>>
>> Note: please ensure the number of servers created by the server-template
>> directive (5 in your case) is above the expected number of servers available
>> in your service.
>>
>> Baptiste
>>
>>
>>
>> On Thu, Feb 8, 2018 at 12:32 AM, Чепайкин Михаил <mchepay...@gmail.com>
>> wrote:
>>
>>> Hi
>>>
>>> I’ve changed the configuration as you suggested:
>>>
>>> backend tsdb_backend_query
>>>   server-template tsdb_query 5 _mfm-monitor-opentsdb._tcp.service.mfmconsul:4242 check resolvers dns inter 1000
>>>
>>> The logs are somewhat different - backend servers now go UP and DOWN - but
>>> the behaviour seems the same: IP addresses keep changing in the same way:
>>>
>>> time="2018-02-08T02:12:53+03:00" level=info msg="[WARNING] 038/021253 
>>> (18208) : Server tsdb_backend_query/tsdb_query1 is going DOWN for 
>>> maintenance (No IP for server ). 2 active and 0 backup servers left. 0 
>>> sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy 
>>> pid=18208
>>> time="2018-02-08T02:12:53+03:00" level=info msg="[WARNING] 038/021253 
>>> (18208) : tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.223 
>>> to 10.182.161.211 by DNS cache." job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:12:53+03:00" level=info msg="[WARNING] 038/021253 
>>> (18208) : Server tsdb_backend_query/tsdb_query1 administratively READY 
>>> thanks to valid DNS answer." job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:12:53+03:00" level=info msg="[WARNING] 038/021253 
>>> (18208) : Server tsdb_backend_query/tsdb_query1 
>>> ('0ab6a1d3.addr.dc1.mfmconsul') is UP/READY (resolves again)." 
>>> job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:12:55+03:00" level=info msg="[WARNING] 038/021255 
>>> (18208) : Server tsdb_backend_query/tsdb_query3 is going DOWN for 
>>> maintenance (No IP for server ). 2 active and 0 backup servers left. 0 
>>> sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy 
>>> pid=18208
>>> time="2018-02-08T02:12:55+03:00" level=info msg="[WARNING] 038/021255 
>>> (18208) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.98 
>>> to 10.182.161.223 by DNS cache." job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:12:55+03:00" level=info msg="[WARNING] 038/021255 
>>> (18208) : Server tsdb_backend_query/tsdb_query3 administratively READY 
>>> thanks to valid DNS answer." job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:12:55+03:00" level=info msg="[WARNING] 038/021255 
>>> (18208) : Server tsdb_backend_query/tsdb_query3 
>>> ('0ab6a1df.addr.dc1.mfmconsul') is UP/READY (resolves again)." 
>>> job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:12:57+03:00" level=info msg="[WARNING] 038/021257 
>>> (18208) : Server tsdb_backend_query/tsdb_query3 is going DOWN for 
>>> maintenance (No IP for server ). 2 active and 0 backup servers left. 0 
>>> sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy 
>>> pid=18208
>>> time="2018-02-08T02:12:57+03:00" level=info msg="[WARNING] 038/021257 
>>> (18208) : tsdb_backend_query/tsdb_query3 changed its IP from 10.182.161.223 
>>> to 10.182.161.98 by DNS cache." job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:12:57+03:00" level=info msg="[WARNING] 038/021257 
>>> (18208) : Server tsdb_backend_query/tsdb_query3 administratively READY 
>>> thanks to valid DNS answer." job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:12:57+03:00" level=info msg="[WARNING] 038/021257 
>>> (18208) : Server tsdb_backend_query/tsdb_query3 
>>> ('0ab6a162.addr.dc1.mfmconsul') is UP/READY (resolves again)." 
>>> job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:13:01+03:00" level=info msg="[WARNING] 038/021301 
>>> (18208) : Server tsdb_backend_query/tsdb_query1 is going DOWN for 
>>> maintenance (No IP for server ). 2 active and 0 backup servers left. 0 
>>> sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy 
>>> pid=18208
>>> time="2018-02-08T02:13:01+03:00" level=info msg="[WARNING] 038/021301 
>>> (18208) : tsdb_backend_query/tsdb_query1 changed its IP from 10.182.161.211 
>>> to 10.182.161.223 by DNS cache." job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:13:01+03:00" level=info msg="[WARNING] 038/021301 
>>> (18208) : Server tsdb_backend_query/tsdb_query1 administratively READY 
>>> thanks to valid DNS answer." job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:13:01+03:00" level=info msg="[WARNING] 038/021301 
>>> (18208) : Server tsdb_backend_query/tsdb_query1 
>>> ('0ab6a1df.addr.dc1.mfmconsul') is UP/READY (resolves again)." 
>>> job=mfm-monitor-haproxy pid=18208
>>> time="2018-02-08T02:13:05+03:00" level=info msg="[WARNING] 038/021305 
>>> (18208) : Server tsdb_backend_query/tsdb_query2 is going DOWN for 
>>> maintenance (No IP for server ). 2 active and 0 backup servers left. 0 
>>> sessions active, 0 requeued, 0 remaining in queue." job=mfm-monitor-haproxy 
>>> pid=18208
>>> time="2018-02-08T02:13:05+03:00" level=info msg="[WARNING] 038/021305 
>>> (18208) : tsdb_backend_query/tsdb_query2 changed its IP from 10.182.161.163 
>>> to 10.182.161.211 by DNS cache." job=mfm-monitor-haproxy pid=18208
>>>
>>> Any thoughts?
>>>
>>> On 8 February 2018 at 01:25, Baptiste <bed...@gmail.com> wrote:
>>>
>>>> Hi
>>>>
>>>> You're not using SRV records and that may be the root cause of your
>>>> issue.
>>>> Please try something like this:
>>>>
>>>> backend tsdb_backend_query
>>>>   server-template tsdb_query 5 _mfm-monitor-opentsdb._tcp.service.mfmconsul:4242 check resolvers dns inter 1000
>>>>
>>>> if "mfm-monitor-opentsdb" is your service name in consul.
>>>>
>>>> Baptiste
>>>>
>>>>
>>>>
>>>> On Wed, Feb 7, 2018 at 2:52 PM, Чепайкин Михаил <mchepay...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> I have Consul as a service discovery tool and HAProxy as a load balancer.
>>>>>
>>>>> A service running on a number of servers is registered in Consul, and
>>>>> this service can be scaled by adding and removing nodes and by moving
>>>>> nodes from one server to another.
>>>>>
>>>>> Consul has a DNS service which randomizes responses for services like
>>>>> this:
>>>>>
>>>>> [bux] michep@bux:~$ dig +short mfm-monitor-opentsdb.service.mfmconsul
>>>>> 10.182.161.239
>>>>> 10.182.161.152
>>>>> 10.182.161.240
>>>>> 10.182.161.92
>>>>> [bux] michep@bux:~$ dig +short mfm-monitor-opentsdb.service.mfmconsul
>>>>> 10.182.161.92
>>>>> 10.182.161.152
>>>>> 10.182.161.240
>>>>> 10.182.161.239
>>>>>
>>>>> In HAProxy 1.8.3 I'm using a server-template configuration like this:
>>>>>
>>>>> resolvers dns
>>>>>   nameserver dns1 ${HAPROXY_NAMESERVER}
>>>>>   hold valid 2s
>>>>>
>>>>> backend tsdb_backend_query
>>>>>   server-template tsdb_query 5 mfm-monitor-opentsdb.service.mfmconsul:4242 check resolvers dns inter 1000
>>>>>
>>>>> And in that case I get a lot of warnings in the haproxy log:
>>>>>
>>>>> time="2018-02-02T15:44:32+03:00" level=info msg="[WARNING] 032/154432 
>>>>> (32983) : tsdb_backend_query/tsdb_query1 changed its IP from 
>>>>> 10.182.161.240 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:44:42+03:00" level=info msg="[WARNING] 032/154442 
>>>>> (32983) : tsdb_backend_query/tsdb_query1 changed its IP from 
>>>>> 10.182.161.239 to 10.182.161.240 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:44:46+03:00" level=info msg="[WARNING] 032/154446 
>>>>> (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 
>>>>> 10.182.161.152 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:44:50+03:00" level=info msg="[WARNING] 032/154450 
>>>>> (32983) : tsdb_backend_query/tsdb_query2 changed its IP from 
>>>>> 10.182.161.92 to 10.182.161.152 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:44:52+03:00" level=info msg="[WARNING] 032/154452 
>>>>> (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 
>>>>> 10.182.161.239 to 10.182.161.92 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:44:56+03:00" level=info msg="[WARNING] 032/154456 
>>>>> (32983) : tsdb_backend_query/tsdb_query1 changed its IP from 
>>>>> 10.182.161.240 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:45:00+03:00" level=info msg="[WARNING] 032/154500 
>>>>> (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 
>>>>> 10.182.161.92 to 10.182.161.240 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:45:02+03:00" level=info msg="[WARNING] 032/154502 
>>>>> (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 
>>>>> 10.182.161.240 to 10.182.161.92 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:45:04+03:00" level=info msg="[WARNING] 032/154504 
>>>>> (32983) : tsdb_backend_query/tsdb_query2 changed its IP from 
>>>>> 10.182.161.152 to 10.182.161.240 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:45:06+03:00" level=info msg="[WARNING] 032/154506 
>>>>> (32983) : tsdb_backend_query/tsdb_query1 changed its IP from 
>>>>> 10.182.161.239 to 10.182.161.152 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:45:10+03:00" level=info msg="[WARNING] 032/154510 
>>>>> (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 
>>>>> 10.182.161.92 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:45:18+03:00" level=info msg="[WARNING] 032/154518 
>>>>> (32983) : tsdb_backend_query/tsdb_query3 changed its IP from 
>>>>> 10.182.161.239 to 10.182.161.92 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>> time="2018-02-02T15:45:20+03:00" level=info msg="[WARNING] 032/154520 
>>>>> (32983) : tsdb_backend_query/tsdb_query2 changed its IP from 
>>>>> 10.182.161.240 to 10.182.161.239 by DNS cache." job=mfm-monitor-haproxy 
>>>>> pid=32983
>>>>>
>>>>> This doesn’t really break the service, but I think it is not quite
>>>>> normal.
>>>>>
>>>>> Any advice on how to resolve this issue?
>>>>>
>>>>
> --
> Mike Chepaykin
>
>
