Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2019-01-06 Thread Dave Barach (dbarach)
Good points. I've updated the spec. It will take a bit of time to propagate, so 
I've appended the current .md text below.

-----Original Message-----
From: Guy Harris  
Sent: Saturday, January 5, 2019 11:39 PM
To: Dave Barach (dbarach) 
Cc: tcpdump-workers 
Subject: Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

On Dec 29, 2018, at 4:50 AM, Dave Barach (dbarach)  wrote:

> The same packet - with [traced] metadata changes - will appear multiple times 
> as the packet traverses the vpp forwarding graph.

The description of the format should probably warn about that, because protocol 
analyzers that maintain state between packets might get confused if multiple 
instances of the same packet appear in a capture.

> Simple example: from the driver layer, an ip4 transit packet will visit 
> ethernet-input, ip4-input[-no-checksum], ip4-lookup, ip4-rewrite, 
> interface-output, and the device driver TX node. Each of those visits results 
> in a trace record. The dispatch framework traces vectors of packets, so one 
> sees N x trace records from ethernet-input, then N x trace records from 
> ip4-input, and so on. Folks typically filter by buffer-index in wireshark, to 
> see what happens to one packet in a convenient sequential view.

So an analyzer *could*, in theory, work around this by, for example, treating 
each node name(?) as a separate flow, with a copy of a packet that visited one 
node as not being related to packets that visited different nodes, so a 
dissector would treat all of the copies of the IPv4 transit packet listed above 
as separate packets rather than as, for example, retransmissions of the same 
packet, and so that a request at one layer isn't matched with all of the copies 
of a reply that show up.

>>> Limiting stateful analysis to one graph node - "ethernet-input" - ought to 
>>> "just work..." 

I suppose that you could also suppress all dissection past the IP or maybe 
transport layer, although if you see multiple instances of a TCP segment, the 
TCP dissector will interpret that as a retransmission unless it knows that 
they're just multiple appearances of the same packet.

The problem here is that a VPP trace is significantly different from a regular 
network capture, in that it seems to be mainly tracing the flow of a packet 
through the packet processing code on a single machine rather than tracing its 
flow on a network; packet analyzers are more oriented towards the latter.

You don't need to give details of *how* an analyzer should deal with this - 
different analyzers might choose to do so in different ways; just note that 
this is significantly different from the sort of network traces one might be 
used to.



Graph Dispatcher Pcap Tracing
-----------------------------

The vpp graph dispatcher knows how to capture vectors of packets in pcap
format as they're dispatched. The pcap captures are as follows:

```
VPP graph dispatch trace record description:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Major Version | Minor Version | NStrings      | ProtoHint     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer index (big endian)                                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   + VPP graph node name ... ...                   | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Metadata ... ...                       | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Opaque ... ...                         | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Opaque 2 ... ...                       | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | VPP ASCII packet trace (if NStrings > 4)      | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Packet data (up to 16K)                                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

Graph dispatch records comprise a version stamp, an indication of how
many NULL-terminated strings will follow the record header and precede
packet data, and a protocol hint.

The buffer index is an opaque 32-bit cookie which allows consumers of
these data to easily filter/track single packets as they traverse the
forwarding graph.

Multiple records per packet are normal, and to be expected. Packets
will appear multiple times as they traverse the vpp forwarding
graph. In this way, vpp graph dispatch traces are significantly
different from regular network packet cap

Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2019-01-05 Thread Guy Harris
On Dec 29, 2018, at 4:50 AM, Dave Barach (dbarach)  wrote:

> The same packet - with [traced] metadata changes - will appear multiple times 
> as the packet traverses the vpp forwarding graph.

The description of the format should probably warn about that, because protocol 
analyzers that maintain state between packets might get confused if multiple 
instances of the same packet appear in a capture.

> Simple example: from the driver layer, an ip4 transit packet will visit 
> ethernet-input, ip4-input[-no-checksum], ip4-lookup, ip4-rewrite, 
> interface-output, and the device driver TX node. Each of those visits results 
> in a trace record. The dispatch framework traces vectors of packets, so one 
> sees N x trace records from ethernet-input, then N x trace records from 
> ip4-input, and so on. Folks typically filter by buffer-index in wireshark, to 
> see what happens to one packet in a convenient sequential view.

So an analyzer *could*, in theory, work around this by, for example, treating 
each node name(?) as a separate flow, with a copy of a packet that visited one 
node as not being related to packets that visited different nodes, so a 
dissector would treat all of the copies of the IPv4 transit packet listed above 
as separate packets rather than as, for example, retransmissions of the same 
packet, and so that a request at one layer isn't matched with all of the copies 
of a reply that show up.
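
A minimal sketch of the per-node flow idea, assuming an analyzer already has some per-conversation key that it can extend. The function name vpp_node_flow_salt is hypothetical, and FNV-1a is just one convenient string hash; nothing in the trace format prescribes it:

```c
#include <stdint.h>

/*
 * Hypothetical helper: hash a graph node name (32-bit FNV-1a) so that
 * the result can be mixed into an analyzer's conversation key.
 * Records from different nodes then fall into different flows.
 */
uint32_t
vpp_node_flow_salt(const char *node_name)
{
    uint32_t h = 2166136261u;          /* FNV-1a 32-bit offset basis */
    const unsigned char *p;

    for (p = (const unsigned char *)node_name; *p != 0; p++) {
        h ^= *p;
        h *= 16777619u;                /* FNV-1a 32-bit prime */
    }
    return h;
}
```

With something like this, the copy of a packet recorded at ethernet-input and the copy recorded at ip4-input would get different keys, so a TCP dissector would not see the second as a retransmission of the first.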

I suppose that you could also suppress all dissection past the IP or maybe 
transport layer, although if you see multiple instances of a TCP segment, the 
TCP dissector will interpret that as a retransmission unless it knows that 
they're just multiple appearances of the same packet.

The problem here is that a VPP trace is significantly different from a regular 
network capture, in that it seems to be mainly tracing the flow of a packet 
through the packet processing code on a single machine rather than tracing its 
flow on a network; packet analyzers are more oriented towards the latter.

You don't need to give details of *how* an analyzer should deal with this - 
different analyzers might choose to do so in different ways; just note that 
this is significantly different from the sort of network traces one might be 
used to.
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-12-29 Thread Dave Barach (dbarach)
The same packet - with [traced] metadata changes - will appear multiple times 
as the packet traverses the vpp forwarding graph. 

Simple example: from the driver layer, an ip4 transit packet will visit 
ethernet-input, ip4-input[-no-checksum], ip4-lookup, ip4-rewrite, 
interface-output, and the device driver TX node. Each of those visits results 
in a trace record. The dispatch framework traces vectors of packets, so one 
sees N x trace records from ethernet-input, then N x trace records from 
ip4-input, and so on. Folks typically filter by buffer-index in wireshark, to 
see what happens to one packet in a convenient sequential view. 

In terms of metadata: at ethernet-input, b->current_data will be zero. At 
ip4-input, b->current_data will be 14 (or more, if the packet has 1 or 2 vlan 
tags). At interface-output, b->current_data is often [but not always] zero.

TBH we've been using the dispatch tracer + not-yet-upstreamed Wireshark 
dissector for a while. It's incredibly handy for chasing "new code" problems: 
broken L3 and/or L4 checksums, leaving b->current_data pointing to the wrong 
layer, forgetting to ask for hardware checksum offload insertion, and so on. 

Thanks... Dave

-----Original Message-----
From: Guy Harris  
Sent: Monday, December 24, 2018 6:47 PM
To: Dave Barach (dbarach) 
Cc: tcpdump-workers 
Subject: Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

On Nov 28, 2018, at 4:34 AM, Dave Barach (dbarach)  wrote:

> The buffer index is an opaque 32-bit cookie which allows consumers of these 
> data to easily filter/track single packets as they traverse the forwarding 
> graph. Multiple records per packet are normal, and to be expected.

In what form?

For example, might you see:

an Ethernet packet, containing an IP datagram, containing a TCP segment 
or UDP datagram;

an IP packet, containing the same IP datagram as the previous packet;

a TCP segment or UDP datagram, containing the same segment/datagram as 
the previous packet;

or might you see the same {Ethernet,IP,TCP,UDP} packet more than once, or both?



Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-12-24 Thread Guy Harris
On Nov 28, 2018, at 4:34 AM, Dave Barach (dbarach)  wrote:

> The buffer index is an opaque 32-bit cookie which allows consumers of these 
> data to easily filter/track single packets as they traverse the forwarding 
> graph. Multiple records per packet are normal, and to be expected.

In what form?

For example, might you see:

an Ethernet packet, containing an IP datagram, containing a TCP segment 
or UDP datagram;

an IP packet, containing the same IP datagram as the previous packet;

a TCP segment or UDP datagram, containing the same segment/datagram as 
the previous packet;

or might you see the same {Ethernet,IP,TCP,UDP} packet more than once, or both?



Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-12-24 Thread Guy Harris
On Nov 28, 2018, at 10:53 AM, Dave Barach (dbarach)  wrote:

> On Wednesday, November 28, 2018, at 1:40 PM, Guy Harris  
> wrote:
> 
>> And do 4 (VLIB_NODE_PROTO_HINT_TCP) and 5 (VLIB_NODE_PROTO_HINT_UDP) mean, 
>> respectively, "the payload is probably a TCP segment, beginning with a TCP 
>> header" and "the payload is probably a UDP segment, beginning with a UDP 
>> header"?  And, again, "probably" means that the hint should be inaccurate - 
>> potentially meaning it's something other than what's hinted?
> 
> s/should/could/, presumably.

Yes.

> When working with completed, tested vpp code, the hints will be accurate. The 
> UDP and TCP hints mean exactly what you think they would mean. Again, the 
> primary use case is for developers who need to see what's going on with new 
> code...

When working with completed, tested networking code, the Ethernet type field of 
an Ethernet packet will, modulo errors not detected by the CRC (or captures 
getting packets that failed the CRC check), mean exactly what you think it 
would mean.

Even when using a sniffer to see what's going on with new code, "wrong Ethernet 
type" is probably not the most likely error case.  Some sniffers (Wireshark, 
for example) do have a mechanism for overriding the normal interpretation of a 
given Ethernet type value ("Decode As..."), but that's rarely used for Ethernet 
types.

So, by analogy, is this a case where a sniffer should, by default, believe the 
hint, and, if it turns out to be necessary, offer a way to override that and 
force an interpretation of the payload other than what the hint suggests?


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-12-23 Thread Dave Barach (dbarach)
Dear Michael,

Thanks for the info. Apologies for not finding DLT_ALLOCATE_HOWTO.md. "I'm on 
it..." 

At least until the vpp project doc tree moves to a more sensible place, the URL 
below will track any changes. The file format [and by implication, the 
companion wireshark dissector] shouldn't need to change. 

https://fdio-vpp.readthedocs.io/en/latest/gettingstarted/developers/vnet.html?highlight=wireshark#graph-dispatcher-pcap-tracing

Thanks... Dave

-----Original Message-----
From: Michael Richardson  
Sent: Sunday, December 23, 2018 12:56 PM
To: Dave Barach (dbarach) 
Cc: tcpdump-workers@lists.tcpdump.org
Subject: Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

Dave Barach (dbarach)  wrote:
> Haven't heard anything in a while, what needs to happen in order to
> assign a LINKTYPE_/DLT_ type for the file format described below?

Generally, an email such as yours.
You can send a pull request against libpcap if you like, see:

https://github.com/the-tcpdump-group/libpcap/blob/master/doc/DLT_ALLOCATE_HOWTO.md

Is there a URL we can point to that might contain updates, or will your email 
be enough?

> Thanks... Dave

> VPP graph dispatch trace record description.

Can you explain a bit more about what collects these records, and what they are 
used for?

--
]   Never tell me the odds! | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works|IoT architect   [
] m...@sandelman.ca  http://www.sandelman.ca/|   ruby on rails[



Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-12-23 Thread Michael Richardson
Dave Barach (dbarach)  wrote:
> Haven't heard anything in a while, what needs to happen in order to
> assign a LINKTYPE_/DLT_ type for the file format described below?

Generally, an email such as yours.
You can send a pull request against libpcap if you like, see:

https://github.com/the-tcpdump-group/libpcap/blob/master/doc/DLT_ALLOCATE_HOWTO.md

Is there a URL we can point to that might contain updates, or will your email
be enough?

> Thanks... Dave

> VPP graph dispatch trace record description.

Can you explain a bit more about what collects these records, and
what they are used for?

--
]   Never tell me the odds! | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works|IoT architect   [
] m...@sandelman.ca  http://www.sandelman.ca/|   ruby on rails[



Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-12-23 Thread Dave Barach (dbarach)
Folks,

Haven't heard anything in a while, what needs to happen in order to assign a 
LINKTYPE_/DLT_ type for the file format described below?

Thanks... Dave

VPP graph dispatch trace record description. 

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Major Version | Minor Version | NStrings      | ProtoHint     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer index (big endian)                                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   + VPP graph node name ... ...                   | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Metadata ... ...                       | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Opaque ... ...                         | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Opaque 2 ... ...                       | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | VPP ASCII packet trace (if NStrings > 4)      | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Packet data (up to 16K)                                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Graph dispatch records comprise a version stamp, an indication of how many 
NULL-terminated strings will follow the record header and precede packet data, 
and a protocol hint.

The buffer index is an opaque 32-bit cookie which allows consumers of these 
data to easily filter/track single packets as they traverse the forwarding 
graph. Multiple records per packet are normal, and to be expected. 

As of this writing: major version = 1, minor version = 0. Nstrings SHOULD be 4 
or 5. Consumers SHOULD be wary of values less than 4 or greater than 5. They 
MAY attempt to display the claimed number of strings, or they MAY treat the 
condition as an error.
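
As a rough illustration of the fixed part of the record, here is a sketch that reads the first eight octets. It assumes each of the four leading fields is one octet wide, as the diagram suggests. The struct and function names are made up for this example and are not taken from vpp or Wireshark:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of the fixed 8-octet record header. */
typedef struct {
    uint8_t major_version;
    uint8_t minor_version;
    uint8_t nstrings;      /* number of NUL-terminated strings that follow */
    uint8_t proto_hint;    /* a vlib_node_proto_hint_t value */
    uint32_t buffer_index; /* opaque cookie: compare for equality only */
} vpp_dispatch_hdr_t;

/* Returns 0 on success, -1 if the record cannot hold the fixed header. */
int
vpp_dispatch_parse_hdr(const uint8_t *rec, size_t len, vpp_dispatch_hdr_t *out)
{
    if (rec == NULL || out == NULL || len < 8)
        return -1;

    out->major_version = rec[0];
    out->minor_version = rec[1];
    out->nstrings = rec[2];
    out->proto_hint = rec[3];
    /* Stored big endian; assembled byte by byte to stay host-independent. */
    out->buffer_index = ((uint32_t)rec[4] << 24) | ((uint32_t)rec[5] << 16) |
                        ((uint32_t)rec[6] << 8) | (uint32_t)rec[7];
    return 0;
}
```

A consumer would then decide for itself whether an nstrings value outside 4..5 is an error or merely something to display with care.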

Here is the current set of protocol hints:

typedef enum
  {
    VLIB_NODE_PROTO_HINT_NONE = 0,
    VLIB_NODE_PROTO_HINT_ETHERNET,
    VLIB_NODE_PROTO_HINT_IP4,
    VLIB_NODE_PROTO_HINT_IP6,
    VLIB_NODE_PROTO_HINT_TCP,
    VLIB_NODE_PROTO_HINT_UDP,
    VLIB_NODE_N_PROTO_HINTS,
  } vlib_node_proto_hint_t;

Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet data 
SHOULD be 0x60, and should begin an ipv6 packet header.

Downstream consumers of these data SHOULD pay attention to the protocol hint. 
They MUST tolerate inaccurate hints, which WILL occur from time to time.
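
One way a consumer might "pay attention but tolerate" is to cross-check the hint against the first octet before handing off. This is only a sketch: the VPP_HINT_* constants mirror the enum values above under made-up names, the function is hypothetical, and it checks only the IP version nibble because the low four bits of an IPv6 first octet belong to the traffic class:

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the hint values listed above (names are hypothetical). */
enum {
    VPP_HINT_NONE = 0,
    VPP_HINT_ETHERNET,
    VPP_HINT_IP4,
    VPP_HINT_IP6,
    VPP_HINT_TCP,
    VPP_HINT_UDP,
    VPP_N_HINTS
};

/*
 * Returns 1 if the first octet agrees with the hint, 0 if it
 * contradicts the hint, and -1 if no cheap check is available
 * (unknown hint, empty packet, or a protocol with no version field).
 */
int
vpp_hint_plausible(unsigned hint, const uint8_t *pkt, size_t len)
{
    if (hint >= VPP_N_HINTS || pkt == NULL || len == 0)
        return -1;

    switch (hint) {
    case VPP_HINT_IP4:
        return (pkt[0] >> 4) == 4;
    case VPP_HINT_IP6:
        return (pkt[0] >> 4) == 6;
    default:
        return -1;
    }
}
```

A return of 0 need not stop dissection; it could simply be surfaced as a warning, which is arguably the useful outcome when debugging new vpp code.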


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-28 Thread Dave Barach (dbarach)


-----Original Message-----
From: Guy Harris  
Sent: Wednesday, November 28, 2018 1:40 PM
To: Dave Barach (dbarach) 
Cc: tcpdump-workers 
Subject: Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

On Nov 28, 2018, at 4:34 AM, Dave Barach (dbarach)  wrote:

>    0                   1                   2                   3
>    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Major Version | Minor Version | NStrings      | ProtoHint     |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer index (big endian)                                     |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   + VPP graph node name ... ...                   | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

So are those strings counted - i.e., preceded by a length - and 
null-terminated, or are they just null-terminated?

>>> Just NULL terminated. I re-did the implementation from scratch... 
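
Since the strings carry no length prefix, a reader has to scan for each terminator and must not run past the end of the record. A sketch of that walk, with a hypothetical function name and the assumption that the strings start right after the 8-octet header:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Skips `nstrings` NUL-terminated strings starting at offset `off`.
 * Returns the offset of the first octet after the last terminator
 * (where packet data would begin), or (size_t)-1 if a string is not
 * terminated before the end of the record.
 */
size_t
vpp_skip_strings(const uint8_t *rec, size_t len, size_t off, unsigned nstrings)
{
    unsigned i;

    for (i = 0; i < nstrings; i++) {
        const uint8_t *nul;

        if (off >= len)
            return (size_t)-1;
        nul = memchr(rec + off, 0, len - off);
        if (nul == NULL)
            return (size_t)-1;
        off = (size_t)(nul - rec) + 1;
    }
    return off;
}
```

Called with off = 8 and the NStrings value from the header, the result would be the start of the packet data, or an error for a truncated record.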

> Downstream consumers of these data SHOULD pay attention to the protocol hint. 
> They MUST tolerate inaccurate hints, which WILL occur from time to time.

"Inaccurate" as in, for example, a packet might have a hint of 2 
(VLIB_NODE_PROTO_HINT_IP4), it might be an IPv6 packet, so both 2 and 3 
(VLIB_NODE_PROTO_HINT_IP6) should be interpreted as IP, and the v4 vs. v6 
decision should be based solely on the version field of the header?

>>> Someone trying to debug new vpp code will use this toolset. They may well 
>>> mis-specify the hint, or build bogus packets. Initial code development is 
>>> the primary use-case. Wireshark is wonderful in terms of giving feedback of 
>>> the form: "you forgot to fix the ip4 checksum in addition to changing the 
>>> ip4 length in that GRE template header you're using."

Or, worse, it might be an Ethernet packet?

>>> Not the most likely mistake, but not out of the question. All it would take 
>>> would be to roll back b->current_data to zero; instead of setting that 
>>> field to 14 for a non-vlan pkt, etc. 

And do 4 (VLIB_NODE_PROTO_HINT_TCP) and 5 (VLIB_NODE_PROTO_HINT_UDP) mean, 
respectively, "the payload is probably a TCP segment, beginning with a TCP 
header" and "the payload is probably a UDP segment, beginning with a UDP 
header"?  And, again, "probably" means that the hint should be inaccurate - 
potentially meaning it's something other than what's hinted?

>>> s/should/could/, presumably.

>>> When working with completed, tested vpp code, the hints will be accurate. 
>>> The UDP and TCP hints mean exactly what you think they would mean. Again, 
>>> the primary use case is for developers who need to see what's going on with 
>>> new code...


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-28 Thread Guy Harris
On Nov 28, 2018, at 4:34 AM, Dave Barach (dbarach)  wrote:

>    0                   1                   2                   3
>    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Major Version | Minor Version | NStrings      | ProtoHint     |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer index (big endian)                                     |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   + VPP graph node name ... ...                   | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

So are those strings counted - i.e., preceded by a length - and 
null-terminated, or are they just null-terminated?

> Downstream consumers of these data SHOULD pay attention to the protocol hint. 
> They MUST tolerate inaccurate hints, which WILL occur from time to time.

"Inaccurate" as in, for example, a packet might have a hint of 2 
(VLIB_NODE_PROTO_HINT_IP4), it might be an IPv6 packet, so both 2 and 3 
(VLIB_NODE_PROTO_HINT_IP6) should be interpreted as IP, and the v4 vs. v6 
decision should be based solely on the version field of the header?

Or, worse, it might be an Ethernet packet?

And do 4 (VLIB_NODE_PROTO_HINT_TCP) and 5 (VLIB_NODE_PROTO_HINT_UDP) mean, 
respectively, "the payload is probably a TCP segment, beginning with a TCP 
header" and "the payload is probably a UDP segment, beginning with a UDP 
header"?  And, again, "probably" means that the hint should be inaccurate - 
potentially meaning it's something other than what's hinted?


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-28 Thread Dave Barach (dbarach)
Dear Guy,

Here is a cleaned-up copy of the spec which incorporates your comments.

Thanks... Dave

VPP graph dispatch trace record description. 

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Major Version | Minor Version | NStrings      | ProtoHint     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer index (big endian)                                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   + VPP graph node name ... ...                   | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Metadata ... ...                       | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Opaque ... ...                         | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Buffer Opaque 2 ... ...                       | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | VPP ASCII packet trace (if NStrings > 4)      | NULL octet    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Packet data (up to 16K)                                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Graph dispatch records comprise a version stamp, an indication of how many 
NULL-terminated strings will follow the record header and precede
packet data, and a protocol hint.

The buffer index is an opaque 32-bit cookie which allows consumers of these 
data to easily filter/track single packets as they traverse the forwarding 
graph. Multiple records per packet are normal, and to be expected. 

As of this writing: major version = 1, minor version = 0. Nstrings SHOULD be 4 
or 5. Consumers SHOULD be wary of values less than 4 or greater than 5. They 
MAY attempt to display the claimed number of strings, or they MAY treat the 
condition as an error.

Here is the current set of protocol hints:

typedef enum
  {
    VLIB_NODE_PROTO_HINT_NONE = 0,
    VLIB_NODE_PROTO_HINT_ETHERNET,
    VLIB_NODE_PROTO_HINT_IP4,
    VLIB_NODE_PROTO_HINT_IP6,
    VLIB_NODE_PROTO_HINT_TCP,
    VLIB_NODE_PROTO_HINT_UDP,
    VLIB_NODE_N_PROTO_HINTS,
  } vlib_node_proto_hint_t;

Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet data 
SHOULD be 0x60, and should begin an ipv6 packet header.

Downstream consumers of these data SHOULD pay attention to the protocol hint. 
They MUST tolerate inaccurate hints, which WILL occur from time to time.


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-27 Thread Guy Harris
On Nov 27, 2018, at 3:11 PM, Dave Barach (dbarach)  wrote:

> On November 27, 2018, at 5:58 PM, Guy Harris  wrote:
> 
>> On Nov 27, 2018, at 1:50 PM, Dave Barach (dbarach)  wrote:
>> 
>>> The buffer index allows downstream consumers of these data to easily 
>>> filter/track single packets as they traverse the forwarding graph.
>> 
>> So does that mean that there might be multiple records in the file for the 
>> same packet, with all records for the same packet having the same buffer 
>> index?
> 
> There absolutely will be multiple records for every packet in the trace, 
> unless e.g. ethernet-input has a horrible bug.

So this needs to be noted in the specification.

Does that mean that you might see multiple instances of the same packet 
payload, e.g. more than one copy of a single request and more than one copy of 
the response to that request in some protocol?

>>> FWIW, the 32-bit buffer index is stored in big endian format.
>> 
>> If it's only to be used for matching purposes, presumably that means that
>> 
>>  1) its value has no numerical significance;
>> 
>>  2) the only comparisons done on that value are equality comparisons;
>> 
>> so it could be treated as a 4-byte opaque value, in which case the byte 
>> order doesn't matter.
> 
> Exactly so. I wrote the vpp code so it will always show up in big endian 
> order, but it truly doesn't matter.

OK, so we'll just specify it as an opaque 32-bit cookie to use to match up 
multiple records for the same packets.

>>> As of this writing, major version = 1, minor version = 0; Nstrings will be 
>>> either 4 or 5.
>> 
>> So smaller or larger values MAY (and probably SHOULD) be treated as errors.
> 
> Yes, that would probably mean that the capture file is damaged beyond repair.

Or, to be a bit more robust, treat it as a count of strings; if it's less than 
4, we display only the strings that are there and, if it's greater than 5, we 
just don't display the others, or display them as "unknown string".
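
One way to sketch that tolerant behaviour: map each string's position to a label and fall back to "unknown string" past the fifth. The label texts are taken from the record diagram; the function name is hypothetical:

```c
#include <stddef.h>

/*
 * Returns a display label for the string at position `index`
 * (0-based) in a dispatch trace record. Positions beyond the five
 * currently defined strings are labelled "unknown string" rather
 * than being treated as an error.
 */
const char *
vpp_string_label(unsigned index)
{
    static const char *const labels[] = {
        "VPP graph node name",
        "Buffer Metadata",
        "Buffer Opaque",
        "Buffer Opaque 2",
        "VPP ASCII packet trace"
    };

    if (index < sizeof labels / sizeof labels[0])
        return labels[index];
    return "unknown string";
}
```

A record claiming fewer than 4 strings then simply shows fewer labelled fields, and one claiming more than 5 shows the extras without guessing at their meaning.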

>>> Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet data 
>>> SHOULD be 0x60, and should begin an ipv6 packet header.
>> 
>> That's SHOULD, not MUST; is there anything else a dissector should do other 
>> than use the protocol hint for handoff?
> 
> Right. If someone screws up on the vpp side, the hint might not match 
> reality. Here's what my current dissector does with it:

That appears to assume that the hint is valid (although both Wireshark and 
tcpdump will, if handed a purportedly-IPv4 packet with a version of 6, or a 
purportedly-IPv6 packet with a version of 4, report it as an error).
___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-27 Thread Dave Barach (dbarach)
inline

-----Original Message-----
From: Guy Harris  
Sent: Tuesday, November 27, 2018 5:58 PM
To: Dave Barach (dbarach) 
Cc: tcpdump-workers 
Subject: Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

On Nov 27, 2018, at 1:50 PM, Dave Barach (dbarach)  wrote:

> After thinking about some of your feedback, I decided to move most of the 
> work back to the vpp side where it probably belonged in the first place.

...

> Anyhow, here's what I implemented. Take a look AYC and let me know what you 
> think.
> 
> VPP graph dispatch trace record description. 
> 
>    0                   1                   2                   3
>    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Major Version | Minor Version | NStrings      | ProtoHint     |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer index (big endian FWIW)                                |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   + VPP graph node name ... ...                   | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer Metadata ASCII string ... ...          | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer Opaque ASCII string ... ...            | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer Opaque 2 ASCII string ... ...          | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | VPP ASCII packet trace (if NStrings > 4) ...  | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Packet data (up to 16K)                                       |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 
> Graph dispatch records comprise a version stamp, an indication of how 
> many NULL-terminated strings will follow the record header, and a protocol 
> hint.
> 
> The buffer index allows downstream consumers of these data to easily 
> filter/track single packets as they traverse the forwarding graph.

So does that mean that there might be multiple records in the file for the same 
packet, with all records for the same packet having the same buffer index?

>>> There absolutely will be multiple records for every packet in the trace, 
>>> unless e.g. ethernet-input has a horrible bug. 

> FWIW, the 32-bit buffer index is stored in big endian format.

If it's only to be used for matching purposes, presumably that means that

1) its value has no numerical significance;

2) the only comparisons done on that value are equality comparisons;

so it could be treated as a 4-byte opaque value, in which case the byte order 
doesn't matter.

>>> Exactly so. I wrote the vpp code so it will always show up in big endian 
>>> order, but it truly doesn't matter.

> As of this writing, major version = 1, minor version = 0; Nstrings will be 
> either 4 or 5.

So smaller or larger values MAY (and probably SHOULD) be treated as errors.

>>> Yes, that would probably mean that the capture file is damaged beyond 
>>> repair.


> Here is the current set of protocol hints:
> 
> typedef enum
>   {
>     VLIB_NODE_PROTO_HINT_NONE = 0,
>     VLIB_NODE_PROTO_HINT_ETHERNET,
>     VLIB_NODE_PROTO_HINT_IP4,
>     VLIB_NODE_PROTO_HINT_IP6,
>     VLIB_NODE_PROTO_HINT_TCP,
>     VLIB_NODE_PROTO_HINT_UDP,
>     VLIB_NODE_N_PROTO_HINTS,
>   } vlib_node_proto_hint_t;

So a dissector MAY use that to indicate what the next protocol is.

>>> Yes.

> Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet data 
> SHOULD be 0x60, and should begin an ipv6 packet header.

That's SHOULD, not MUST; is there anything else a dissector should do other 
than use the protocol hint for handoff?

>>> Right. If someone screws up on the vpp side, the hint might not match 
>>> reality. Here's what my current dissector does with it:

/*
 * Delegate the rest of the packet dissection as directed. If there wasn't
 * a hint, take a guess.
 */
if (protocol_hint >= array_length(next_dissectors)) {
    ws_debug_printf ("protocol_hint %d out of range (max %d)",
                     (int) protocol_hint,
                     (int) array_length(next_dissectors));
    protocol_hint = 0;
}

/* note: next_dissectors[0] = eth_maybefcs_dissector_handle... */

use_this_dissector = next_dissectors [protocol_hint];
if (protocol_hint == 0) {
    maybe_protocol_id = tvb_get_guint8 (tvb, offset);

    switch (maybe_protocol_id) {
    case 0x

Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-27 Thread Guy Harris
On Nov 27, 2018, at 1:50 PM, Dave Barach (dbarach)  wrote:

> After thinking about some of your feedback, I decided to move most of the 
> work back to the vpp side where it probably belonged in the first place.

...

> Anyhow, here's what I implemented. Take a look AYC and let me know what you 
> think.
> 
> VPP graph dispatch trace record description. 
> 
>    0                   1                   2                   3
>    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Major Version | Minor Version |   NStrings    |   ProtoHint   |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer index (big endian FWIW)                                |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   + VPP graph node name ... ...                   | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer Metadata ASCII string ... ...          | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer Opaque ASCII string ... ...            | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Buffer Opaque 2 ASCII string ... ...          | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | VPP ASCII packet trace (if NStrings > 4) ...  | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Packet data (up to 16K)                                       |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 
> Graph dispatch records comprise a version stamp, an indication of how many 
> NULL-terminated strings will follow the record header, and a
> protocol hint.
> 
> The buffer index allows downstream consumers of these data to easily 
> filter/track single packets as they traverse the forwarding
> graph.

So does that mean that there might be multiple records in the file for the same 
packet, with all records for the same packet having the same buffer index?

> FWIW, the 32-bit buffer index is stored in big endian format.

If it's only to be used for matching purposes, presumably that means that

1) its value has no numerical significance;

2) the only comparisons done on that value are equality comparisons;

so it could be treated as a 4-byte opaque value, in which case the byte order 
doesn't matter.

> As of this writing, major version = 1, minor version = 0; Nstrings will be 
> either 4 or 5.

So smaller or larger values MAY (and probably SHOULD) be treated as errors.

> Here is the current set of protocol hints:
> 
> typedef enum
> {
>   VLIB_NODE_PROTO_HINT_NONE = 0,
>   VLIB_NODE_PROTO_HINT_ETHERNET,
>   VLIB_NODE_PROTO_HINT_IP4,
>   VLIB_NODE_PROTO_HINT_IP6,
>   VLIB_NODE_PROTO_HINT_TCP,
>   VLIB_NODE_PROTO_HINT_UDP,
>   VLIB_NODE_N_PROTO_HINTS,
> } vlib_node_proto_hint_t;

So a dissector MAY use that to indicate what the next protocol is.

> Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet data 
> SHOULD be 0x60, and should begin an ipv6 packet header.

That's SHOULD, not MUST; is there anything else a dissector should do other 
than use the protocol hint for handoff?
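
One possible answer, sketched under assumptions: before handing off, a dissector could cross-check the hint against the IP version nibble (the top four bits of the first octet, which are 4 for IPv4 and 6 for IPv6). The function name is hypothetical; the enum is the one quoted above. Note that this compares only the version nibble rather than the whole octet, since the low four bits of an IPv6 first octet belong to the traffic class:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum
{
    VLIB_NODE_PROTO_HINT_NONE = 0,
    VLIB_NODE_PROTO_HINT_ETHERNET,
    VLIB_NODE_PROTO_HINT_IP4,
    VLIB_NODE_PROTO_HINT_IP6,
    VLIB_NODE_PROTO_HINT_TCP,
    VLIB_NODE_PROTO_HINT_UDP,
    VLIB_NODE_N_PROTO_HINTS,
} vlib_node_proto_hint_t;

/* Hypothetical sanity check. Returns false for an out-of-range hint or
 * for an IP hint whose version nibble disagrees with the payload.
 * Hints with no cheap first-octet test are accepted as they are. */
static bool
vpp_hint_matches_payload(unsigned hint, const uint8_t *payload, size_t len)
{
    if (hint >= VLIB_NODE_N_PROTO_HINTS)
        return false;

    switch ((vlib_node_proto_hint_t) hint) {
    case VLIB_NODE_PROTO_HINT_IP4:
        return len > 0 && (payload[0] >> 4) == 4;
    case VLIB_NODE_PROTO_HINT_IP6:
        return len > 0 && (payload[0] >> 4) == 6;
    default:
        return true;
    }
}
```

On a mismatch a dissector could fall back to guessing, or flag the record, since the hint is only a SHOULD.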

___
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-27 Thread Guy Harris
On Nov 27, 2018, at 4:08 AM, Dave Barach (dbarach)  wrote:

> Opaque[10] is the primary metadata.

That's only 40 bytes.

Do you mean that

> /* Offset within data[] that we are currently processing.
>If negative current header points into predata area. */
> i16 current_data;  /**< signed offset in data[], pre_data[]
>   that we are currently processing.
>   If negative current header points into predata area.
>*/
> u16 current_length;  /**< Nbytes between current data and
> the end of this buffer.
>  */
> u32 flags; /**< buffer flags */
> u32 flow_id;  /**< Generic flow identifier */
> 
> 
> u32 next_buffer;   /**< Next buffer for this linked-list of buffers.
>   Only valid if VLIB_BUFFER_NEXT_PRESENT flag is set.
>*/
> 
> u32 current_config_index; /**< Used by feature subgraph arcs to
>  visit enabled feature nodes
>   */
> u16 error;/**< Error code for buffers to be enqueued
>  to error handler.
>   */
> u8 n_add_refs; /**< Number of additional references to this buffer. */
> 
> u8 buffer_pool_index; /**< index of buffer pool this buffer belongs. */
> 
> u32 opaque[10]; /**< Opaque data used by sub-graphs for their own purposes.
>  See above */

is the primary metadata?

> Opaque2[12] is the secondary metadata.

That's only 48 bytes; do you mean that

> u32 trace_index; /**< Specifies index into trace buffer
>                       if VLIB_PACKET_IS_TRACED flag is set. */
> u32 recycle_count; /**< Used by L2 path recycle code */
> 
> u32 total_length_not_including_first_buffer;
> /**< Only valid for first buffer in chain. Current length plus
>      total length given here give total number of bytes in buffer chain. */
> u8 free_list_index; /**< only used if VLIB_BUFFER_NON_DEFAULT_FREELIST
>                          flag is set */
> u8 align_pad[3]; /**< available */
> u32 opaque2[12]; /**< More opaque data, see ../vnet/vnet/buffer.h */
> 
> /* end of second cache line */
> u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE]; /**< Space for inserting data
>                                              before buffer start.
>                                              Packet rewrite string will be
>                                              rewritten backwards and may extend
>                                              back before buffer->data[0].
>                                              Must come directly before packet
>                                              data. */
> 

is the secondary metadata?

> BTW, I've decided to pull all of these opaque structure definitions out of 
> the WS dissector, and do most all of the formatting on the vpp side. Should 
> reduce maintenance to near-zero, eliminate endian swapping issues, and 
> generally make life less miserable for everyone involved.

So does "doing most [or] all of the formatting on the vpp side" change what's 
in the header?  Or does it just mean that most of the fields of the header 
should be marked as "opaque data"?


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-26 Thread Guy Harris
On Nov 26, 2018, at 12:43 PM, Dave Barach (dbarach)  wrote:

> On November 26, 2018, at 3:01 PM, Guy Harris  wrote:
> 
>> So which of those structures describes the primary metadata?
> 
> vlib_buffer_t. The key fields are flags, current_data, and current_length. 

So that's:

> /* VLIB buffer representation. */
> typedef struct
> {
>   /* Offset within data[] that we are currently processing.
>      If negative current header points into predata area. */
>   i16 current_data;  /**< signed offset in data[], pre_data[]
>                           that we are currently processing.
>                           If negative current header points into
>                           predata area. */
>   u16 current_length;  /**< Nbytes between current data and
>                             the end of this buffer. */
>   u32 flags;  /**< buffer flags */
>   u32 flow_id;  /**< Generic flow identifier */
> 
>   u32 next_buffer;  /**< Next buffer for this linked-list of buffers.
>                          Only valid if VLIB_BUFFER_NEXT_PRESENT flag
>                          is set. */
> 
>   u32 current_config_index;  /**< Used by feature subgraph arcs to
>                                   visit enabled feature nodes */
>   u16 error;  /**< Error code for buffers to be enqueued
>                    to error handler. */
>   u8 n_add_refs;  /**< Number of additional references to this buffer. */
> 
>   u8 buffer_pool_index;  /**< index of buffer pool this buffer belongs. */
> 
>   u32 opaque[10];  /**< Opaque data used by sub-graphs for their own
>                         purposes. See above */
>   u32 trace_index;  /**< Specifies index into trace buffer
>                          if VLIB_PACKET_IS_TRACED flag is set. */
>   u32 recycle_count;  /**< Used by L2 path recycle code */
> 
>   u32 total_length_not_including_first_buffer;
>   /**< Only valid for first buffer in chain. Current length plus
>        total length given here give total number of bytes in buffer chain. */
>   u8 free_list_index;  /**< only used if VLIB_BUFFER_NON_DEFAULT_FREELIST
>                             flag is set */
>   u8 align_pad[3];  /**< available */
>   u32 opaque2[12];  /**< More opaque data, see ../vnet/vnet/buffer.h */
> 
>   /* end of second cache line */
>   u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE];  /**< Space for inserting data
>                                                 before buffer start.
>                                                 Packet rewrite string will be
>                                                 rewritten backwards and may
>                                                 extend back before
>                                                 buffer->data[0].
>                                                 Must come directly before
>                                                 packet data. */
> 
>   u8 data[0];  /**< Packet data. Hardware DMA here */
> } vlib_buffer_t;  /* Must be a multiple of 64B. */

which is 128 bytes followed by VLIB_BUFFER_PRE_DATA_SIZE bytes of data.

Which 64 of those 128 bytes are the primary metadata?
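
The 128-byte figure can be checked with a stand-in structure. This is only a sketch: it swaps in <stdint.h> types for VPP's i16/u16/u32/u8, leaves out pre_data[] and data[], and assumes an ABI with natural alignment and no extra padding. The struct name is made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the fixed part of vlib_buffer_t as quoted above. */
struct vlib_buffer_fixed {
    /* first 64 bytes */
    int16_t  current_data;
    uint16_t current_length;
    uint32_t flags;
    uint32_t flow_id;
    uint32_t next_buffer;
    uint32_t current_config_index;
    uint16_t error;
    uint8_t  n_add_refs;
    uint8_t  buffer_pool_index;
    uint32_t opaque[10];
    /* second 64 bytes */
    uint32_t trace_index;
    uint32_t recycle_count;
    uint32_t total_length_not_including_first_buffer;
    uint8_t  free_list_index;
    uint8_t  align_pad[3];
    uint32_t opaque2[12];
};

/* 24 bytes of named fields plus 40 bytes of opaque[] fill the first 64. */
_Static_assert(offsetof(struct vlib_buffer_fixed, opaque) == 24,
               "opaque[] starts at byte 24");
_Static_assert(offsetof(struct vlib_buffer_fixed, trace_index) == 64,
               "second half starts at byte 64");
/* 16 bytes of named fields plus 48 bytes of opaque2[] fill the second 64. */
_Static_assert(offsetof(struct vlib_buffer_fixed, opaque2) == 80,
               "opaque2[] starts at byte 80");
_Static_assert(sizeof(struct vlib_buffer_fixed) == 128,
               "fixed part is 128 bytes");
```

Read this way, the two 64-byte halves split at trace_index, with opaque[] ending the first half and opaque2[] ending the second.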


Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-26 Thread Guy Harris
On Nov 26, 2018, at 12:43 PM, Dave Barach (dbarach)  wrote:

> On November 26, 2018, at 3:01 PM, Guy Harris  wrote:
> 
>> On Nov 26, 2018, at 6:03 AM, Dave Barach (dbarach)  wrote:
>> 
>>> VPP graph dispatch trace record description, in network byte order. 
>>> Integers wider than 8 bits are in little endian byte order.
>> 
>> "Byte order" doesn't apply to 8-bit fields; if all fields are in 
>> little-endian byte order, what, if anything, is in network byte order 
>> (big-endian)?
>> 
>> And is everything guaranteed to be in little-endian byte order *even if* the 
>> tracing code is running on, for example, a Power ISA processor running in 
>> big-endian mode, or on a z/Architecture processor (which *only* runs 
>> big-endian)?
> 
> Good point. It would be easy to trace the 1 x 32-bit and 1 x 16 bit 
> quantities in for-real network byte order. I'll just go do that. Frankly, we 
> haven't run the code base on a PowerPC or other big-endian processor in 
> years. I'm fairly sure that the dispatch trace code would be the least of 
> anyone's problems if/when we go there again.

So, in other words, you meant "Integers wider than 8 bits are in *the byte 
order of the host writing the trace*", not "...are in little-endian byte order".

Either big-endian or little-endian byte order would work easily, as long as 
it's standardized.  Host-endian can be made to work, *but* it means that any 
code that reads pcap or pcapng files has to byte-swap the VPP header if the 
byte order claimed by the pcap file header or the pcapng section header differs 
from the native byte order of the host reading the file; we have code to do 
that in libpcap and in Wireshark's libwiretap, but we'd really prefer not to 
have to introduce that here.
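
To illustrate why a fixed byte order is the easy case, here is a sketch of host-independent big-endian access. The helper names are made up; nothing here depends on the byte order of the machine running the code, so no swapping logic keyed to the pcap or pcapng header is needed:

```c
#include <stdint.h>

/* Read a 32-bit big-endian value from four octets, on any host. */
static uint32_t
read_be32(const uint8_t *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
           ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
}

/* Write a 32-bit value as four big-endian octets, on any host. */
static void
write_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t) (v >> 24);
    p[1] = (uint8_t) (v >> 16);
    p[2] = (uint8_t) (v >> 8);
    p[3] = (uint8_t) v;
}
```

A writer that always goes through something like write_be32() produces the same file on x86 and on a big-endian machine.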

>>> Notes: as of this writing, major version = 1, minor version = 0.
>> 
>> Presumably any code that can read major version M, minor version N will also 
>> be able to read major version M, minor version K, for all values of K <= N.
> 
> That's the goal, but since the paint is barely dry on v1.0 it would be 
> slightly rash of me to make that claim...

That shouldn't just be a goal, it should be a definition of how the major and 
minor versions work.

This is similar to, for example, SunOS 4.x's shared library version numbering - 
if you add new capabilities to a library, so that programs using the new 
capabilities won't work with older versions of the library, *but* the 
capabilities are added in a compatible fashion, so that programs using only the 
capabilities of older versions of the library will work with newer versions of 
the library, you increase the minor version, *but* if you make incompatible 
changes (removing routines, changing the signature of existing functions, 
etc.), you increase the major version.

So you should probably specify that's how the major and minor versions are used.
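
As a sketch of that rule (the function name is hypothetical): a reader written for major M, minor N accepts a file with the same major version and any minor version K <= N. What to do with K > N is a policy choice the thread does not settle; this sketch conservatively declines.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a reader written for reader_major.reader_minor is guaranteed,
 * under the rule described above, to understand a file stamped
 * file_major.file_minor. */
static bool
vpp_reader_can_read(uint8_t reader_major, uint8_t reader_minor,
                    uint8_t file_major, uint8_t file_minor)
{
    /* A different major version means an incompatible change. */
    if (file_major != reader_major)
        return false;
    /* Same major: every minor version up to the reader's own is readable. */
    return file_minor <= reader_minor;
}
```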

>> For the fields defined in that header:
>> 
>> What is the buffer index?
> 
> A 32-bit buffer handle which can be rapidly converted into either a virtual 
> address or a physical address. It's highly useful as a filter in WS, since 
> we trace e.g. 100 packets in ethernet-input, then 100 packets in ip4-input, 
> etc.

So "as a filter" means that if the handle value is equal to some particular 
value - either an arbitrary value or the same value as another packet - that's 
significant?

>> Does the node name length include the terminating NUL?  (Presumably anything 
>> writing those files MUST, in the RFC 2119 sense, null-terminate strings, and 
>> anything reading those files MUST NOT assume that the strings are 
>> null-terminated; a count *and* a terminating NUL is redundant.)
>> 
>> Does the ASCII trace length include the terminating NUL?  Is that just an 
>> opaque string to display to the user, or are there any ways in which an 
>> application can parse it?
> 
> Yes, the NULL is included in the count.

The spec should indicate that.

> Yes, it's slightly redundant. Yes, it keeps people from shooting themselves 
> in the foot when processing the data.

It doesn't prevent code that processes the data from having to check for a 
terminating NUL, unless you're in *so* tightly-controlled an environment that 
you can guarantee that you will *never* see maliciously-constructed files that 
don't have a terminating NUL.

Neither tcpdump nor Wireshark, for example, are always run in environments like 
that.
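
A minimal sketch of the kind of check a reader needs regardless of whether a length count is present. The helper names are made up; memchr() is bounded by the bytes actually remaining in the record, so a missing NUL is reported instead of being read past:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Length of the string at buf, excluding the NUL, or -1 if no NUL
 * occurs within the len bytes that remain in the record. */
static long
vpp_bounded_strlen(const uint8_t *buf, size_t len)
{
    const uint8_t *nul = memchr(buf, '\0', len);

    if (nul == NULL)
        return -1;
    return (long) (nul - buf);
}

/* Skip nstrings consecutive NUL-terminated strings. Returns the offset
 * of the first byte after the last NUL, or -1 if any string is
 * unterminated. */
static long
vpp_skip_strings(const uint8_t *buf, size_t len, unsigned nstrings)
{
    size_t off = 0;

    while (nstrings-- > 0) {
        long n = vpp_bounded_strlen(buf + off, len - off);

        if (n < 0)
            return -1;
        off += (size_t) n + 1;    /* step over the string and its NUL */
    }
    return (long) off;
}
```

With a maliciously constructed record, the -1 return gives the caller a place to stop and flag the record as malformed.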

>> In an earlier mail on another list you said:
>> 
>>> Packet data can be anything: L2, L3, L4 or above. The vpp dissector knows 
>>> from the node name what to expect. I have a [seriously incomplete as of 
>>> this writing] table of the form:
>>> 
>>> #define foreach_node_to_dissector_handle                \
>>> _("ip6-lookup", "ipv6", ip6_dissector_handle)           \
>>> _("ip4-input-no-checksum", "ip", ip4_dissector_handle)  \
>>> _("ip4-lookup", "ip", ip4_dissector_handle)             \
>>> _("ip4-local", "ip", ip4_dissector_handle)              \
>>> 

Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-26 Thread Dave Barach (dbarach)
Inline, see >>> Thanks... Dave

-Original Message-
From: Guy Harris  
Sent: Monday, November 26, 2018 3:01 PM
To: Dave Barach (dbarach) 
Cc: tcpdump-workers@lists.tcpdump.org
Subject: Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

On Nov 26, 2018, at 6:03 AM, Dave Barach (dbarach)  wrote:

> I've built a wireshark dissector for fd.io vpp graph dispatcher pcap traces. 
> Please see https://fdio-vpp.readthedocs.io/en/latest/ for a description of 
> the code base / project, etc. 
> 
> For development purposes, I borrowed one of the USERxxx encap types. Please 
> allocate a LINKTYPE_/DLT_ type for this file format, so I can upstream the 
> dissector.
> 
> Thanks... Dave Barach
> Fd.io vpp PTL
> 
> Trace Record format
> ---
> 
> VPP graph dispatch trace record description, in network byte order. Integers 
> wider than 8 bits are in little endian byte order.

"Byte order" doesn't apply to 8-bit fields; if all fields are in little-endian 
byte order, what, if anything, is in network byte order (big-endian)?

And is everything guaranteed to be in little-endian byte order *even if* the 
tracing code is running on, for example, a Power ISA processor running in 
big-endian mode, or on a z/Architecture processor (which *only* runs big-endian)?

>>> Good point. It would be easy to trace the 1 x 32-bit and 1 x 16 bit 
>>> quantities in for-real network byte order. I'll just go do that. Frankly, 
>>> we haven't run the code base on a PowerPC or other big-endian processor in 
>>> years. I'm fairly sure that the dispatch trace code would be the least of 
>>> anyone's problems if/when we go there again.  

>    0                   1                   2                   3
>    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   |Major Version  |Minor Version  |Buffer index high 16 bits      |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   |Buffer index low 16 bits       |Node Name Len  | Node name ... |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   + Node name cont'd... ...                       | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Primary buffer metadata (64 octets)                           |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | [Secondary buffer metadata (64 octets, major version > 1)]    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | ASCII trace length 16 bits    |  ASCII trace ...              |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | ASCII trace cont'd ......                     | NULL octet    |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>   | Packet data (up to 16K)                                       |
>   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Is there a page at any fd.io or VPP Web site that describes the header, to 
which we could point?

> Notes: as of this writing, major version = 1, minor version = 0.

Presumably any code that can read major version M, minor version N will also be 
able to read major version M, minor version K, for all values of K <= N.

>>> That's the goal, but since the paint is barely dry on v1.0 it would be 
>>> slightly rash of me to make that claim...

> See below for pro forma definitions of the primary buffer metadata and 
> primary opaque data. Please refer to fd.io vpp source code before you invest, 
> send money, or write code: "git clone https://gerrit.fd.io/r/vpp"
> 
> Trace records are generated by code in 
> .../src/vlib/main.c:dispatch_pcap_trace(...).
> 
> The secondary buffer metadata shown in the diagram above is NOT present in 
> version 1 traces.

So if some future version 2 of the trace is defined, an update will be sent to 
tcpdump-workers, describing the secondary buffer metadata?

For the fields defined in that header:

What is the buffer index?

>>> A 32-bit buffer handle which can be rapidly converted into either a virtual 
>>> address or a physical address. It's highly useful as a filter in WS, 
>>> since we trace e.g. 100 packets in ethernet-input, then 100 packets in 
>>> ip4-input, etc.  

Does the node name length include the terminating NUL?  (Presumably anything 
writing those files MUST, in the RFC 2119 sense, null-terminate strings, and 
anything reading those files MUST NOT assume that the strings are 
null-terminated; a count *and* a terminating NUL is redundant.)

Does the ASCII trace length include the terminating NUL?  Is that just an 
opaque string to display to the user, or are there any ways in w

Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

2018-11-26 Thread Guy Harris
On Nov 26, 2018, at 6:03 AM, Dave Barach (dbarach)  wrote:

> I've built a wireshark dissector for fd.io vpp graph dispatcher pcap traces. 
> Please see https://fdio-vpp.readthedocs.io/en/latest/ for a description of 
> the code base / project, etc. 
> 
> For development purposes, I borrowed one of the USERxxx encap types. Please 
> allocate a LINKTYPE_/DLT_ type for this file format, so I can upstream the 
> dissector.
> 
> Thanks... Dave Barach
> Fd.io vpp PTL
> 
> Trace Record format
> ---
> 
> VPP graph dispatch trace record description, in network byte order. Integers 
> wider than 8 bits are in little endian byte order.

"Byte order" doesn't apply to 8-bit fields; if all fields are in little-endian 
byte order, what, if anything, is in network byte order (big-endian)?

And is everything guaranteed to be in little-endian byte order *even if* the 
tracing code is running on, for example, a Power ISA processor running in 
big-endian mode, or on a z/Architecture processor (which *only* runs big-endian)?

>   0                   1                   2                   3
>   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  |Major Version  |Minor Version  |Buffer index high 16 bits      |
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  |Buffer index low 16 bits       |Node Name Len  | Node name ... |
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  + Node name cont'd... ...                       | NULL octet    |
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  | Primary buffer metadata (64 octets)                           |
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  | [Secondary buffer metadata (64 octets, major version > 1)]    |
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  | ASCII trace length 16 bits    |  ASCII trace ...              |
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  | ASCII trace cont'd ......                     | NULL octet    |
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  | Packet data (up to 16K)                                       |
>  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Is there a page at any fd.io or VPP Web site that describes the header, to 
which we could point?

> Notes: as of this writing, major version = 1, minor version = 0.

Presumably any code that can read major version M, minor version N will also be 
able to read major version M, minor version K, for all values of K <= N.

> See below for pro forma definitions of the primary buffer metadata and
> primary opaque data. Please refer to fd.io vpp source code before you invest, 
> send money, or write code: "git clone https://gerrit.fd.io/r/vpp"
> 
> Trace records are generated by code in 
> .../src/vlib/main.c:dispatch_pcap_trace(...).
> 
> The secondary buffer metadata shown in the diagram above is NOT present in 
> version 1 traces.

So if some future version 2 of the trace is defined, an update will be sent to 
tcpdump-workers, describing the secondary buffer metadata?

For the fields defined in that header:

What is the buffer index?

Does the node name length include the terminating NUL?  (Presumably anything 
writing those files MUST, in the RFC 2119 sense, null-terminate strings, and 
anything reading those files MUST NOT assume that the strings are 
null-terminated; a count *and* a terminating NUL is redundant.)

Does the ASCII trace length include the terminating NUL?  Is that just an 
opaque string to display to the user, or are there any ways in which an 
application can parse it?

In an earlier mail on another list you said:

> Packet data can be anything: L2, L3, L4 or above. The vpp dissector knows 
> from the node name what to expect. I have a [seriously incomplete as of this 
> writing] table of the form:
> 
> #define foreach_node_to_dissector_handle                \
> _("ip6-lookup", "ipv6", ip6_dissector_handle)           \
> _("ip4-input-no-checksum", "ip", ip4_dissector_handle)  \
> _("ip4-lookup", "ip", ip4_dissector_handle)             \
> _("ip4-local", "ip", ip4_dissector_handle)              \
> _("ip4-udp-lookup", "ip", udp_dissector_handle)         \
> _("ip4-icmp-error", "ip", ip4_dissector_handle)         \
> _("ip4-glean", "ip", ip4_dissector_handle)              \
> _("ethernet-input", "eth_maybefcs", eth_dissector_handle)

Presumably, once a node name is used for a particular type of payload, it will 
always indicate that particular payload type.
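
If that holds, the mapping is a fixed lookup. A sketch, using a few entries copied from the quoted table and plain strings in place of Wireshark dissector handles; the struct and function names are made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* One row per node name: which payload dissector the name implies. */
struct node_payload {
    const char *node_name;
    const char *dissector_name;
};

/* A subset of the quoted table, as an example only. */
static const struct node_payload node_payloads[] = {
    { "ip6-lookup",            "ipv6" },
    { "ip4-input-no-checksum", "ip" },
    { "ip4-lookup",            "ip" },
    { "ip4-local",             "ip" },
    { "ethernet-input",        "eth_maybefcs" },
};

/* Returns the dissector name for node_name, or NULL if the node name is
 * not in the table (for example, a node added after the table was built). */
static const char *
dissector_name_for_node(const char *node_name)
{
    size_t n = sizeof node_payloads / sizeof node_payloads[0];

    for (size_t i = 0; i < n; i++) {
        if (strcmp(node_payloads[i].node_name, node_name) == 0)
            return node_payloads[i].dissector_name;
    }
    return NULL;
}
```

The NULL return is what makes the question about new node names matter: a reader built against an older list has to decide what to do with a name it has never seen.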

Could new node names be added in the future?

Is there a page at any fd.io or VPP Web site that gives the current list of 
node names, showing what payload type each node name indicates?

> Pro forma structure definitions:

So which of those structures describes the primary