Dusan,

Thanks. I seem to have misunderstood you before. That sounds like it, yes.

After reading through most of them, this might be _the_ issue:

https://github.com/moby/moby/issues/16720#issuecomment-435637740
https://github.com/moby/moby/issues/16720#issuecomment-444862701

Alessandro, can you try the suggested command once the container is in the failed state?

conntrack -D -p udp
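
If flushing every UDP entry on the host is too broad, a narrower variant should be possible. A minimal sketch, assuming conntrack-tools is installed on the Docker host, that it is run as root, and that 20013 (the nfacctd_port from your config) is the published port. The function names and the DRY_RUN switch are made up for illustration:

```shell
# Build the conntrack command for one action (-L lists, -D deletes)
# restricted to UDP flows whose original destination port is PORT.
conntrack_cmd() {
    action="$1"
    port="$2"
    echo "conntrack $action -p udp --dport $port"
}

# List the matching entries first, then delete them.
# With DRY_RUN=1 the commands are only printed, not executed.
flush_udp_port() {
    port="$1"
    for action in -L -D; do
        cmd=$(conntrack_cmd "$action" "$port")
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

# Usage (as root on the Docker host):
#   DRY_RUN=1 flush_udp_port 20013   # show what would run
#   flush_udp_port 20013             # list, then delete
```

Listing before deleting gives you a record of what the stale entries looked like, which would help confirm whether this is really the issue.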

Marc

On Wed, 9 June 2021 at 21:54, Dusan Pajin <dusan.pa...@gmail.com> wrote:
>
> Hi,
>
> Alessandro, do you use docker-compose or docker swarm (docker stack)?
>
> The behavior I am referring to is described in a number of issues on GitHub, 
> for example:
> https://github.com/moby/moby/issues/16720
> https://github.com/docker/for-linux/issues/182
> https://github.com/moby/moby/issues/18845
> https://github.com/moby/libnetwork/issues/1994
> https://github.com/robcowart/elastiflow/issues/414
> In some of those issues you will find links to other issues and so on.
>
> I don't have an explanation for why this works for you in some situations and 
> not in others.
> Since that is the case, you might try clearing the conntrack table, which is 
> described in some of the issues above.
> Using the host network is certainly not convenient, but it is doable.
>
> Kind regards,
> Dusan
>
>
>
> On Wed, Jun 9, 2021 at 7:37 PM Marc Sune <marcde...@gmail.com> wrote:
>>
>> Dusan, Alessandro,
>>
>> Let me answer Dusan first.
>>
>> On Wed, 9 June 2021 at 18:08, Dusan Pajin <dusan.pa...@gmail.com> wrote:
>> >
>> > Hi Alessandro,
>> >
>> > I would say that this is a "known" issue or behavior in Docker, which is 
>> > experienced by everyone who has ever wanted to receive syslog, NetFlow, 
>> > telemetry or any other similar UDP stream from network devices. When you 
>> > expose ports in your docker-compose file, Docker will create the iptables 
>> > rules to steer the traffic to your container in Docker's bridge network, 
>> > but unfortunately it will also translate the source IP address of the 
>> > packets. I am not sure what the reasoning behind such behavior is. If 
>> > you search for solutions to this issue, you will find some proposals, 
>> > but none of them worked in my case.
>>
>> That is not my understanding. I've also double checked with a devops
>> Docker guru in my organization.
>>
>> In Docker's default network mode, masquerading only happens for
>> egress traffic, not ingress.
>>
>> I actually tried it locally by running an httpd container (apache2)
>> and redirecting port 8080 on the "host" to port 80 in the container. The
>> container is in the Docker range, my laptop's LAN address is 192.168.1.36,
>> and .33 is another client in my LAN.
>>
>> root@d64c65384e87:/usr/local/apache2# tcpdump -l -n
>> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
>> listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
>> 17:21:49.546067 IP 192.168.1.33.46595 > 172.17.0.3.80: Flags [F.], seq
>> 2777556344, ack 4139714538, win 172, options [nop,nop,TS val 21290101
>> ecr 3311681356], length 0
>> 17:21:49.546379 IP 192.168.1.33.46591 > 172.17.0.3.80: Flags [F.], seq
>> 3001175791, ack 61192428, win 172, options [nop,nop,TS val 21290101
>> ecr 3311686360], length 0
>> 17:21:49.546402 IP 172.17.0.3.80 > 192.168.1.33.46591: Flags [.], ack
>> 1, win 236, options [nop,nop,TS val 3311689311 ecr 21290101], length 0
>> 17:21:49.546845 IP 172.17.0.3.80 > 192.168.1.33.46595: Flags [F.], seq
>> 1, ack 1, win 227, options [nop,nop,TS val 3311689311 ecr 21290101],
>> length 0
>> 17:21:49.550993 IP 192.168.1.33.46595 > 172.17.0.3.80: Flags [.], ack
>> 2, win 172, options [nop,nop,TS val 21290110 ecr 3311689311], length 0
>>
>> That works as expected, showing the real 1.33 address.
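
For the nfacctd case, the same check can be condensed into a per-source packet count. A rough sketch, assuming tcpdump is installed in the container, that eth0 is its interface and that 20013 is the listening port; count_sources is a hypothetical helper name:

```shell
# Count packets per source address from tcpdump text output on stdin.
# Expects lines of the form "TIME IP SRC.PORT > DST.PORT: ...".
count_sources() {
    awk '$2 == "IP" {
             src = $3
             sub(/\.[0-9]+$/, "", src)   # drop the trailing ".port"
             seen[src]++
         }
         END { for (s in seen) print seen[s], s }' | sort -rn
}

# Usage inside the container (interface and port are assumptions):
#   tcpdump -l -n -i eth0 -c 1000 udp port 20013 | count_sources
```

If the router addresses show up in the first column of addresses, the source is preserved; if only the bridge gateway address appears, the packets are being translated before they reach the container.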
>>
>> Mind that there is a lot of confusion, because firewall services in
>> the system's OS can interfere with the rules set by the docker daemon
>> itself:
>>
>> https://stackoverflow.com/a/47913950/9321563
>>
>> Alessandro,
>>
>> I need to analyse your rules in detail, but what is clear is that
>> "something" is modifying them (see the first two rules)... whether
>> these two lines in particular are causing the issue, I am not sure:
>>
>> Pre:
>>
>> Chain POSTROUTING (policy ACCEPT)
>> target     prot opt source               destination
>> MASQUERADE  all  --  192.168.200.0/24     anywhere
>> MASQUERADE  all  --  172.17.0.0/16        anywhere
>> MASQUERADE  tcp  --  192.168.200.3        192.168.200.3        tcp dpt:8086
>> MASQUERADE  tcp  --  192.168.200.5        192.168.200.5        tcp dpt:3000
>> MASQUERADE  udp  --  192.168.200.9        192.168.200.9        udp dpt:50000
>> MASQUERADE  tcp  --  192.168.200.11       192.168.200.11       tcp dpt:9092
>> MASQUERADE  udp  --  192.168.200.4        192.168.200.4        udp dpt:50005
>> MASQUERADE  udp  --  192.168.200.8        192.168.200.8        udp dpt:5600
>> MASQUERADE  tcp  --  192.168.200.8        192.168.200.8        tcp dpt:bgp
>> MASQUERADE  udp  --  192.168.200.2        192.168.200.2        udp dpt:20013
>>
>> Post:
>>
>> Chain POSTROUTING (policy ACCEPT 4799 packets, 1170K bytes)
>>  pkts bytes target      prot opt in   out               source            destination
>>   340 20392 MASQUERADE  all  --  any  !br-d662f1cf56fa  192.168.200.0/24  anywhere          <----------------------
>>   453 28712 MASQUERADE  all  --  any  !docker0          172.17.0.0/16     anywhere          <----------------------
>>     0     0 MASQUERADE  tcp  --  any  any               192.168.200.3     192.168.200.3     tcp dpt:8086
>>     0     0 MASQUERADE  tcp  --  any  any               192.168.200.5     192.168.200.5     tcp dpt:3000
>>     0     0 MASQUERADE  udp  --  any  any               192.168.200.9     192.168.200.9     udp dpt:50000
>>     0     0 MASQUERADE  tcp  --  any  any               192.168.200.11    192.168.200.11    tcp dpt:9092
>>     0     0 MASQUERADE  udp  --  any  any               192.168.200.4     192.168.200.4     udp dpt:50005
>>     0     0 MASQUERADE  udp  --  any  any               192.168.200.8     192.168.200.8     udp dpt:5600
>>     0     0 MASQUERADE  tcp  --  any  any               192.168.200.8     192.168.200.8     tcp dpt:bgp
>>     0     0 MASQUERADE  udp  --  any  any               192.168.200.2     192.168.200.2     udp dpt:20013
>>
>> Which OS are you using in the host?
>>
>> A bit of a long shot: when the problem occurs, can you try manually
>> (using iptables) removing the first two rules and setting them exactly as
>> in the PRE scenario? Use
>>
>> iptables -t nat -I POSTROUTING <rule_num> <rest of params>
>>
>> which allows you to add a rule at a specific position. I think the problem
>> might be somewhere else, though.
>>
>> marc
>>
>> >
>> > What definitely works is not to expose specific ports, but to configure 
>> > your container in docker-compose to be attached directly to the host 
>> > network. In that case, there will be no translation rules and no source 
>> > NAT, and the container will be directly connected to all of the host's 
>> > network interfaces.
>> > In that case, be aware that Docker DNS will not work, so to export 
>> > information from the pmacct container on to kafka, you would need to send 
>> > it to "localhost" (if the kafka container is running on the same host) 
>> > and not to "kafka". This shouldn't be a big problem in your setup.
>> >
>> > Btw, I am using docker swarm and not docker-compose. They both 
>> > use docker-compose files with similar syntax, and I don't think there is 
>> > any difference in their behavior.
>> >
>> > Hope this helps
>> >
>> > Kind regards,
>> > Dusan
>> >
>> > On Wed, Jun 9, 2021 at 3:29 PM Paolo Lucente <pa...@pmacct.net> wrote:
>> >>
>> >>
>> >> Hi Alessandro,
>> >>
>> >> (thanks for the kind words, first and foremost)
>> >>
>> >> Indeed, the test that Marc proposes is very sound, ie. check the actual
>> >> packets coming in "on the wire" with tcpdump: do they really change
>> >> sender IP address?
>> >>
>> >> Let me also confirm that what is used to populate peer_ip_src is the
>> >> sender IP address coming straight from the socket (Marc's question) and,
>> >> contrary to sFlow, there is typically no other way to infer
>> >> such info (Alessandro's question).
>> >>
>> >> Paolo
>> >>
>> >>
>> >> On 9/6/21 14:51, Marc Sune wrote:
>> >> > Alessandro,
>> >> >
>> >> > inline
>> >> >
>> >> > On Wed, 9 June 2021 at 10:12, Alessandro Montano | FIBERTELECOM
>> >> > <a.mont...@fibertelecom.it> wrote:
>> >> >>
>> >> >> Hi Paolo (and Marc),
>> >> >>
>> >> >> this is my first post here ... first of all THANKS FOR YOUR GREAT JOB :)
>> >> >>
>> >> >> I'm using the pmacct/nfacctd container from docker-hub 
>> >> >> (+kafka+telegraf+influxdb+grafana) and it's really a powerful tool
>> >> >>
>> >> >> The senders are JUNIPER MX204 routers, using j-flow (extended netflow)
>> >> >>
>> >> >> NFACCTD VERSION:
>> >> >> NetFlow Accounting Daemon, nfacctd 1.7.6-git [20201226-0 (7ad9d1b)]
>> >> >>   '--enable-mysql' '--enable-pgsql' '--enable-sqlite3' 
>> >> >> '--enable-kafka' '--enable-geoipv2' '--enable-jansson' 
>> >> >> '--enable-rabbitmq' '--enable-nflog' '--enable-ndpi' '--enable-zmq' 
>> >> >> '--enable-avro' '--enable-serdes' '--enable-redis' '--enable-gnutls' 
>> >> >> 'AVRO_CFLAGS=-I/usr/local/avro/include' 
>> >> >> 'AVRO_LIBS=-L/usr/local/avro/lib -lavro' '--enable-l2' 
>> >> >> '--enable-traffic-bins' '--enable-bgp-bins' '--enable-bmp-bins' 
>> >> >> '--enable-st-bins'
>> >> >>
>> >> >> SYSTEM:
>> >> >> Linux 76afde386f6f 5.4.0-73-generic #82-Ubuntu SMP Wed Apr 14 17:39:42 
>> >> >> UTC 2021 x86_64 GNU/Linux
>> >> >>
>> >> >> CONFIG:
>> >> >> debug: false
>> >> >> daemonize: false
>> >> >> pidfile: /var/run/nfacctd.pid
>> >> >> logfile: /var/log/pmacct/nfacctd.log
>> >> >> nfacctd_renormalize: true
>> >> >> nfacctd_port: 20013
>> >> >> aggregate[k]: peer_src_ip, peer_dst_ip, in_iface, out_iface, vlan, 
>> >> >> sampling_direction, etype, src_as, dst_as, as_path, proto, src_net, 
>> >> >> src_mask, dst_net, dst_mask, flows
>> >> >> nfacctd_time_new: true
>> >> >> plugins: kafka[k]
>> >> >> kafka_output[k]: json
>> >> >> kafka_topic[k]: nfacct
>> >> >> kafka_broker_host[k]: kafka
>> >> >> kafka_broker_port[k]: 9092
>> >> >> kafka_refresh_time[k]: 60
>> >> >> kafka_history[k]: 1m
>> >> >> kafka_history_roundoff[k]: m
>> >> >> kafka_max_writers[k]: 1
>> >> >> kafka_markers[k]: true
>> >> >> networks_file_no_lpm: true
>> >> >> use_ip_next_hop: true
>> >> >>
>> >> >> DOCKER-COMPOSE:
>> >> >> #Docker version 20.10.2, build 20.10.2-0ubuntu1~20.04.2
>> >> >> #docker-compose version 1.29.2, build 5becea4c
>> >> >> version: "3.9"
>> >> >> services:
>> >> >>    nfacct:
>> >> >>      networks:
>> >> >>        - ingress
>> >> >>      image: pmacct/nfacctd
>> >> >>      restart: on-failure
>> >> >>      ports:
>> >> >>        - "20013:20013/udp"
>> >> >>      volumes:
>> >> >>        - /etc/localtime:/etc/localtime
>> >> >>        - ./nfacct/etc:/etc/pmacct
>> >> >>        - ./nfacct/lib:/var/lib/pmacct
>> >> >>        - ./nfacct/log:/var/log/pmacct
>> >> >> networks:
>> >> >>    ingress:
>> >> >>      name: ingress
>> >> >>      ipam:
>> >> >>        config:
>> >> >>        - subnet: 192.168.200.0/24
>> >> >>
>> >> >> My problem is the value of the field PEER_IP_SRC ... at start everything 
>> >> >> is correct, and it works well for a (long) while ... hours ... days ...
>> >> >> I have ten routers, so "peer_ip_src": "151.157.228.xxx", where xxx can 
>> >> >> easily identify the sender. Perfect.
>> >> >>
>> >> >> Suddenly ... "peer_ip_src": "192.168.200.1" for all records (and I 
>> >> >> lose the sender info!!!) ...
>> >> >>
>> >> >> It seems that docker-proxy decides to do NAT/masquerading and translates 
>> >> >> the source IP for the UDP stream.
>> >> >> The only way for me to get the correct behavior again is to 
>> >> >> stop/start the container.
>> >> >>
>> >> >> How can I fix it? Or, is there an alternative way to obtain the same 
>> >> >> info (router IP) from inside the netflow stream, and not from the UDP 
>> >> >> packet?
>> >> >
>> >> > Paolo is definitely the right person to answer how "peer_ip_src" is 
>> >> > populated.
>> >> >
>> >> > However, there is something that I don't fully understand. To the best
>> >> > of my knowledge, even when binding ports, docker (actually the kernel,
>> >> > configured by docker) shouldn't masquerade traffic at all - if
>> >> > masquerade is truly what happens. And certainly that wouldn't happen
>> >> > "randomly" in the middle of the execution.
>> >> >
>> >> > My first thought would be that this is something related to pmacct
>> >> > itself, and that records are incorrectly generated but traffic is ok.
>> >> >
>> >> > I doubt the Linux kernel iptables rules would randomly change the way
>> >> > traffic is manipulated, unless of course, something else on that
>> >> > machine/server is reloading iptables, and the resulting ruleset is
>> >> > _slightly different_ for the traffic flowing towards the docker
>> >> > container, effectively modifying the streams that go to pmacct (e.g.
>> >> > rule priority reordering). That _could_ explain why restarting the
>> >> > daemon suddenly works, as order would be fixed.
>> >> >
>> >> > Some more info would be needed to rule out an iptables/docker issue:
>> >> >
>> >> > * Dump the iptables -L and iptables -t nat -L before and after the
>> >> > issue and compare.
>> >> > * Use iptables -vL and iptables -t nat -vL to monitor counters, before
>> >> > and after the issue, especially in the NAT table.
>> >> > * Get inside the running container
>> >> > (https://github.com/pmacct/pmacct/blob/master/docs/DOCKER.md#opening-a-shell-on-a-running-container),
>> >> > install tcpdump, and write the pcap to a file, before and after the
>> >> > incident.
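
The before/after comparison from the first bullet can be sketched roughly as below. It assumes iptables-save is available on the host and is run as root; normalize_rules, snapshot_nat and the file names are hypothetical. iptables-save prints interface matches (-i/-o) regardless of verbosity, whereas iptables -L shows the in/out columns only with -v, so two dumps taken this way are directly comparable.

```shell
# Strip what changes on every dump: comment lines carry timestamps, and
# the [packets:bytes] counters move with traffic.
normalize_rules() {
    sed -e '/^#/d' -e 's/\[[0-9]*:[0-9]*]/[0:0]/'
}

# Save a normalized copy of the nat table to the given file (needs root).
snapshot_nat() {
    iptables-save -t nat | normalize_rules > "$1"
}

# Usage:
#   snapshot_nat before.rules    # while peer_ip_src is still correct
#   snapshot_nat after.rules     # once 192.168.200.1 shows up
#   diff -u before.rules after.rules
```

An empty diff would suggest the ruleset itself did not change, which points away from a rule reload and towards connection tracking state.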
>> >> >
>> >> > Since these dumps might contain sensitive data, you can send them
>> >> > anonymized or in private.
>> >> >
>> >> > Hopefully with this info we will see if it's an iptables issue or we
>> >> > have to look somewhere else.
>> >> >
>> >> > Regards
>> >> > marc
>> >> >
>> >> >>
>> >> >> Thanks for your support.
>> >> >>
>> >> >> Cheers.
>> >> >>
>> >> >> --
>> >> >> AlexIT
>> >> >> --
>> >> >> docker-doctors mailing list
>> >> >> docker-doct...@pmacct.net
>> >> >> http://acaraje.pmacct.net/cgi-bin/mailman/listinfo/docker-doctors
>> >> >
>> >> > _______________________________________________
>> >> > pmacct-discussion mailing list
>> >> > http://www.pmacct.net/#mailinglists
>> >> >
>> >>
