To start with, I would take off the sum() to be sure you're not summing
over multiple instances or interfaces. That is, try this query instead:
rate(node_network_receive_bytes_total{instance=~"ip-10-XX-XXX-44.us-west-2.compute.internal",device="eth0"}[5m])
Does that give one result, or multiple? If it's multiple then you'll need
to investigate why. (For example: is instance=~ matching multiple
instances? Are you doing federation so you have multiple copies of the same
metric?)
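One quick way to check, as a rough sketch using the same selector as
above (and assuming the usual instance and job labels), is to count the
series and then break the count down by label:

```
# Number of series matched; 1 means a single instance/interface/copy
count(rate(node_network_receive_bytes_total{instance=~"ip-10-XX-XXX-44.us-west-2.compute.internal",device="eth0"}[5m]))

# One row per instance/job/device combination, to see which label differs
count by (instance, job, device) (rate(node_network_receive_bytes_total{instance=~"ip-10-XX-XXX-44.us-west-2.compute.internal",device="eth0"}[5m]))
```

If the second query returns a count above 1 on any row, the duplicates
differ in some other label, and the raw query output will show which.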
If it's only a single result, and it's still showing around 22.5MB/sec,
then you could always escalate to AWS support, to ask them why their
traffic metering doesn't match what is seen at the host. But since AWS
is recording traffic levels 5 times lower than what node_exporter sees, you
might want to keep it to yourself :-)
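To compare like with like, and assuming (as guessed earlier in the
thread) that the AWS figure is bytes per 5-minute period, something like
this should give the node_exporter equivalent:

```
# Bytes received on eth0 over the last 5 minutes
increase(node_network_receive_bytes_total{instance=~"ip-10-XX-XXX-44.us-west-2.compute.internal",device="eth0"}[5m])

# Sanity check of the conversion: 1.36 GByte per 300 s, in bytes/sec
1.36e9 / 300
```

At 22.5MB/sec the first query should return roughly 6.75GByte, against
the 1.36GByte that AWS reports. The second is just a scalar expression
and works out to about 4.5MByte/sec.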
The other things I'd do are:
- look at node_network_transmit_bytes_total, and the corresponding AWS
transmit metric. Maybe they consider "send" and "receive" the other way
round.
- treble-check that you're looking at the same instance
- look at what proportion of packets are broadcast/multicast
[compare node_network_receive_packets_total
and node_network_receive_multicast_total] - those probably don't count
towards ingress/egress totals
- look at what proportion of traffic is going between hosts on the same
LAN, e.g. using sniffnet.net. It could be that AWS are only counting
Internet ingress traffic, not local traffic.
Personally I'd trust node_exporter more than AWS: that AWS line is
suspiciously flat.
On Monday, 17 July 2023 at 18:59:02 UTC+1 Anoop Mohan wrote:
> Hi Brian,
>
> I changed the query based on your comments. I think the query now
> shows bytes/second instead of bits/second, and it takes only the
> external interface.
>
> sum(rate(node_network_receive_bytes_total{instance=~"ip-10-XX-XXX-44.us-west-2.compute.internal",device="eth0"}[5m])) by (instance)
>
> When I compare the values now, AWS is still showing 1.36GByte (i.e.
> 1.36GByte per 5 minutes => 4.5MByte/sec). However, Prometheus is
> showing around 22.5MByte/sec.
>
>
> [image: image.png]
>
> Could you please let me know if there is anything still missing?
>
>
> Thanks & Regards,
> Anoop
>
>
> On Sat, Jul 15, 2023 at 10:02 AM Brian Candler wrote:
>
>> Firstly, you multiplied by 8 to get bits/second, whereas AWS is showing
>> bytes.
>> Secondly, I *think* AWS shows total transferred in 5 minutes, not rate in
>> bytes per second. If I'm right, then
>>
>> AWS 1.34GByte per 5 minutes => 4.5MByte/sec => 36Mbit/sec
>>
>> Thirdly, you have summed over all interfaces, including virtual ones. Try
>> selecting just the external interface.
>>
>> On Saturday, 15 July 2023 at 00:12:19 UTC+1 Anoop Mohan wrote:
>>
>>> Thanks Ben for responding to my question.
>>>
>>> That means, if we write the query like below, I believe it will display
>>> the average network traffic received in the last 5 minute for the given
>>> node.
>>>
>>> sum(rate(node_network_receive_bytes_total{instance=~"ip-10-XX-XXX-44.us-west-2.compute.internal"}[5m])*8) by (instance)
>>>
>>> When I execute this query in prometheus, it is showing around 550M as
>>> the usage.
>>> [image: image.png]
>>>
>>> But, when I check the networkIn usage in AWS console for the same node,
>>> it is showing more than 1.34G usage.
>>> [image: image.png]
>>> So, can someone please explain why it is showing this discrepancy?
>>> Please let me know if I am doing something wrong or if the query is not
>>> correct.
>>>
>>>
>>> Thanks,
>>>
>>> On Thu, Jul 13, 2023 at 10:32 PM Ben Kochie wrote:
>>>
>>>> I think what you're looking for is node_exporter metrics if you want
>>>> host level data.
>>>>
>>>> For example, node_network_receive_bytes_total
>>>>
>>>> On Thu, Jul 13, 2023 at 10:56 PM Anoop wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am using the metric *aws_ec2_network_in_average* (exposed by YACE
>>>>> exporter) in Prometheus to display the average network traffic
>>>>> received by an EC2 instance. However, I am checking whether there are
>>>>> any cAdvisor metrics available to replace the CloudWatch metric.
>>>>>
>>>>> For example, how can I replace the query below:
>>>>> aws_ec2_network_in_average{instance="i-111b8ddf7cb4bf8d1"}
>>>>>
>>>>> with CAdvisor metric, something like this:
>>>>> avg(container_network_receive_bytes_total{kubernetes_io_hostname=~"ip-10-XX-XXX-44.us-west-2.compute.internal"}) by (kubernetes_io_hostname)
>>>>>
>>>>> Kindly share your suggestions on this.
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Prometheus Users" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/prometheus-users/a43a1b21-aff4-4c94-8c25-a3fd961162bbn%40googlegroups.com
>>>>>
>>>>> <https://groups.google.com/d/msgid/prometheus-users/a43a1b21-aff4-4c94-8c25-a3fd961162bbn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
--
You received this message because you are subscribed to the Google Groups
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/1c669ffb-9dc2-4288-acd6-337e4a102275n%40googlegroups.com.