Attached is a JSON file from a Daffodil parse of PCAP data. If I place that
file in /tmp, I can query it with SQL using Apache Drill. To do so, install
Drill, then run the drill-embedded SQL shell from the bin directory of the
Drill installation.

This query then returns all distinct IP addresses (source and destination)
in the IPv4 packets in the PCAP data:

use dfs.tmp;

with pcapDoc as (select PCAP from `infoset.json`),
     packets as (select flatten(pcapDoc.PCAP.Packet) as packet
                 from pcapDoc),
     ipv4Headers as (select packets.packet.LinkLayer.Ethernet.NetworkLayer.IPv4.IPv4Header as hdr
                     from packets),
     ipsrcs as (select ipv4Headers.hdr.IPSrc.value as ip from ipv4Headers),
     ipdests as (select ipv4Headers.hdr.IPDest.value as ip from ipv4Headers),
     ips as (select ip from ipsrcs union select ip from ipdests)
select * from ips;

+-----------------+
|       ip        |
+-----------------+
| 192.168.158.139 |
| 174.137.42.77   |
+-----------------+
2 rows selected (0.244 seconds)
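
For anyone who wants to check what the query is doing without installing
Drill, here is a rough Python sketch of the same traversal. It is only an
illustration: the SAMPLE document below is a made-up miniature with the
nesting that the query paths imply (PCAP.Packet as a list, each packet
carrying LinkLayer.Ethernet.NetworkLayer.IPv4.IPv4Header), not the attached
infoset, and the helper names dig and ipv4_addresses are hypothetical. Unlike
the SQL, it simply skips packets that have no IPv4 header.

```python
import json

# Hypothetical miniature infoset, shaped after the paths used in the query.
# The real attached infoset.json carries many more fields per packet.
SAMPLE = {
    "PCAP": {
        "Packet": [
            {"LinkLayer": {"Ethernet": {"NetworkLayer": {"IPv4": {"IPv4Header": {
                "IPSrc": {"value": "192.168.158.139"},
                "IPDest": {"value": "174.137.42.77"},
            }}}}}},
            {"LinkLayer": {"Ethernet": {"NetworkLayer": {"IPv4": {"IPv4Header": {
                "IPSrc": {"value": "174.137.42.77"},
                "IPDest": {"value": "192.168.158.139"},
            }}}}}},
            # A packet with no IPv4 layer (assumed shape, for illustration).
            {"LinkLayer": {"Ethernet": {"NetworkLayer": {}}}},
        ]
    }
}


def dig(node, *keys):
    """Follow a chain of dict keys; return None if any step is missing."""
    for key in keys:
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    return node


def ipv4_addresses(infoset):
    """Collect the distinct IPSrc/IPDest values, like the UNION in the query."""
    ips = set()
    # Iterating over the list plays the role of flatten(pcapDoc.PCAP.Packet).
    for packet in dig(infoset, "PCAP", "Packet") or []:
        hdr = dig(packet, "LinkLayer", "Ethernet", "NetworkLayer",
                  "IPv4", "IPv4Header")
        for field in ("IPSrc", "IPDest"):
            ip = dig(hdr, field, "value")
            if ip is not None:
                ips.add(ip)  # the set removes duplicates, as UNION does
    return ips


def load_infoset(path):
    """Read a JSON infoset from disk, e.g. load_infoset('/tmp/infoset.json')."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    for ip in sorted(ipv4_addresses(SAMPLE)):
        print(ip)
```

To try it on the real file, pass the result of load_infoset("/tmp/infoset.json")
to ipv4_addresses, assuming the attachment follows the same nesting.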

This is a stepping stone on the path to integrating Daffodil into Apache
Drill so that one can directly query data given a DFDL schema of the data.
The conversion into JSON will then not be necessary.

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com

Attachment: infoset.json
Description: application/json
