Attached is a JSON file produced by a Daffodil parse of PCAP data. If I place that file in /tmp, I can query it with SQL using Apache Drill. To do so, install Drill, then run the drill-embedded SQL shell from the bin directory of the Drill install.
Then this query returns all IP addresses in the IPv4 packets in the PCAP data:

    use dfs.tmp;

    with pcapDoc as (select PCAP from `infoset.json`),
         packets as (select flatten(pcapDoc.PCAP.Packet) as packet from pcapDoc),
         ipv4Headers as (select packets.packet.LinkLayer.Ethernet.NetworkLayer.IPv4.IPv4Header as hdr from packets),
         ipsrcs as (select ipv4headers.hdr.IPSrc.value as ip from ipv4Headers),
         ipdests as (select ipv4headers.hdr.IPDest.value as ip from ipv4Headers),
         ips as (select ip from ipsrcs union select ip from ipdests)
    select * from ips;

    +-----------------+
    |       ip        |
    +-----------------+
    | 192.168.158.139 |
    | 174.137.42.77   |
    +-----------------+
    2 rows selected (0.244 seconds)

This is a stepping stone on the path to integrating Daffodil into Apache Drill so that one can directly query data given a DFDL schema of the data. The conversion into JSON will then not be necessary.

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com
infoset.json
Description: application/json