Hi Stephen,

On Wed, Jan 24, 2024 at 04:16:27AM +0000, Stephen Farrell wrote:
> 
> Hiya,
> 
> On 24/01/2024 03:32, Willy Tarreau wrote:
> > Even worse, sometimes
> > you can discover by accident that you're having a trace caught in emergency
> > while trying to spot a big prod problem and that was stored on your USB
> > thumb drive, and when you (re)discover this, you're very happy to see
> > that the data were encrypted.
> 
> That's interesting - I think if there were a usable reference that
> describes such a situation, or multiple thereof, that'd be quite
> useful. Do you know of such?

Oh it's simple. You have almost completely lost connectivity between
two critical layers of the infrastructure. The only solution is to go
into the server room, where you use the locally provided sniffer
because, for various reasons, you cannot bring in your own tools. You
copy the capture to the only USB key you have handy, the one in your
pocket, then leave the server room and go to an office where you look
at the trace in a hurry, thinking about the production that's still
working really badly, until you figure out that a piece of equipment
is failing and needs to be first rebooted, then replaced. By this time
you've already locked your laptop and put the USB key back in your
pocket. Then one day you need to put something on it, and you
rediscover this 4 GB pcap file that you need to delete to save some
space. You suddenly remember where it comes from and think
"fortunately those data are not exploitable".

There's a huge gap between what people imagine network captures to
be and what they're really like in the field. It's important to
remember *why* people have to perform network captures. It's always
in order to debug something, and very often the problems that make you
capture 100% of the traffic on one interface are very vague yet cause
a lot of disruption, and require a lot of trial and error to narrow
down the scope of the investigation. This implies limited care for
certain operations, and even mistakes that occasionally make the
problem temporarily worse (e.g. rebooting the wrong equipment).

Network troubleshooting is not just analysing window sizes or trying
to optimize buffer usage with pacing; it's used a lot to fix real
breakage. For example, I once spotted a faulty DRAM chip on a network
interface that was causing havoc due to 3 stuck bits. Packets too
small to reach the stuck bits were OK, larger ones were corrupted. It
takes a long time before you start suspecting a NIC that seems to be
delivering traffic and passes all pings correctly... until you use
full-sized packets and see that exactly one in every 8 passes (due
to the ping pattern changing). All such stuff requires captures and
quick action. There's often no other solution than using what you
have handy, including the random USB key lying on a desk, which you
plug into a proprietary Windows-based sniffer product in the room so
that you can analyse the capture on your Linux laptop outside.
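
As a rough illustration of the stuck-bit case above, here is a minimal
Python sketch. Everything in it is an assumption made for the example:
the buffer offset (byte 1000), the stuck bit positions and values, and
the idea that the three stuck bits sit in a single byte. It only shows
why short packets can pass untouched while full-sized ones pass only
when the payload happens to match the stuck values, which for 3 bits
is 1 pattern out of 2**3 = 8.

```python
# Hypothetical model of a packet buffer with 3 stuck bits.
# Offsets and forced values are invented for illustration.
STUCK_BITS = {
    1000 * 8 + 1: 1,  # absolute bit index -> value the cell always returns
    1000 * 8 + 4: 0,
    1000 * 8 + 6: 1,
}


def through_faulty_buffer(packet: bytes) -> bytes:
    """Return the packet as it would be read back from the faulty buffer."""
    data = bytearray(packet)
    for bit_index, forced in STUCK_BITS.items():
        byte_index, bit = divmod(bit_index, 8)
        if byte_index >= len(data):
            continue  # packet too short to reach the faulty cell
        if forced:
            data[byte_index] |= 1 << bit
        else:
            data[byte_index] &= ~(1 << bit) & 0xFF
    return bytes(data)


def survives(packet: bytes) -> bool:
    """True if the packet comes out of the buffer unmodified."""
    return through_faulty_buffer(packet) == packet


# A small packet never touches byte 1000, so it is always intact.
small_ok = survives(bytes(64))

# 1500-byte packets: try every possible value for the byte at offset 1000.
passing = sum(
    survives(bytes(1000) + bytes([value]) + bytes(499))
    for value in range(256)
)
pass_ratio = passing / 256
```

With these assumed values, only payloads whose byte at the faulty
offset already has the three bits in the stuck state get through, so a
payload pattern that changes from one ping to the next will only match
now and then.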

I'm used to saying that inside the datacentre you can consider that
the network is trusted. That doesn't mean it's confidential; it
means that it's only observed by trusted people, who will be careful
about what they do with the data they may randomly collect during
debugging sessions, and who can sometimes make mistakes like anyone.
At least the environment is not supposed to be actively hostile, but
it's not completely secret either.

Willy
