I remember looking at this some months back.

My recollection is that PCAP is a somewhat awkward format for
MapReduce, since it isn't splittable -- if you start at a random
offset, you can't find the record boundaries.
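
To make the problem concrete, here is a rough Python sketch of how a
classic (microsecond-timestamp) PCAP file has to be read.  It assumes
the standard libpcap savefile layout: one 24-byte global header, then
records that each carry a 16-byte header holding the captured length.
There is no sync marker between records, so the only way to reach
record N is to walk records 0..N-1 from the top.  The function names
are made up for illustration; this is not any existing library's API.

```python
import io
import struct

PCAP_GLOBAL_HEADER_LEN = 24   # magic, version, tz offset, sigfigs, snaplen, linktype
PCAP_RECORD_HEADER_LEN = 16   # ts_sec, ts_usec, incl_len, orig_len


def iter_pcap_records(stream):
    """Yield (ts_sec, ts_usec, payload) for each record in a classic pcap stream."""
    header = stream.read(PCAP_GLOBAL_HEADER_LEN)
    if len(header) < PCAP_GLOBAL_HEADER_LEN:
        raise ValueError("truncated pcap global header")

    # The magic number 0xa1b2c3d4 is written in the capturing host's
    # byte order, so its on-disk form tells us the file's endianness.
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")

    while True:
        rec = stream.read(PCAP_RECORD_HEADER_LEN)
        if len(rec) < PCAP_RECORD_HEADER_LEN:
            return  # clean end of file (or a truncated trailing header)
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", rec)
        payload = stream.read(incl_len)
        if len(payload) < incl_len:
            return  # truncated trailing record
        # The next record starts right here; nothing marks the boundary
        # except the incl_len we just read.
        yield ts_sec, ts_usec, payload


def read_pcap_bytes(data):
    """Convenience wrapper: parse an in-memory pcap image into a list."""
    return list(iter_pcap_records(io.BytesIO(data)))
```

A mapper handed a split that begins in the middle of the file has no
incl_len to start from, which is why the format doesn't split cleanly.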

You may want to do some sort of preprocessing to fix this before you
upload your logs to HDFS.  Irritatingly, the existing code I've seen
for processing PCAP files doesn't seem very friendly to parsing
arbitrary packet-trace data in memory.
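
One possible shape for that preprocessing step, as an untested-on-real-
traces sketch: rewrite each packet as one line of text, with the
timestamp and a base64 copy of the captured bytes.  Line-oriented
input is something Hadoop's text input handling can split on newline
boundaries.  The line layout and the function names below are my own
assumptions for illustration, not an established format, and the
parser only handles classic microsecond-timestamp PCAP files.

```python
import base64
import io
import struct


def pcap_to_lines(data):
    """Convert classic pcap bytes into 'sec.usec<TAB>base64(payload)' lines."""
    stream = io.BytesIO(data)
    header = stream.read(24)          # 24-byte global header
    if header[:4] == b"\xd4\xc3\xb2\xa1":
        endian = "<"                  # little-endian capture
    elif header[:4] == b"\xa1\xb2\xc3\xd4":
        endian = ">"                  # big-endian capture
    else:
        raise ValueError("not a classic microsecond pcap file")

    lines = []
    while True:
        rec = stream.read(16)         # 16-byte per-record header
        if len(rec) < 16:
            break
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(endian + "IIII", rec)
        payload = stream.read(incl_len)
        if len(payload) < incl_len:
            break                     # drop a truncated trailing record
        # b64encode never emits newlines, so one packet is exactly one line.
        encoded = base64.b64encode(payload).decode("ascii")
        lines.append("%d.%06d\t%s" % (ts_sec, ts_usec, encoded))
    return lines


def line_to_packet(line):
    """Inverse of one output line: return (timestamp_string, payload_bytes)."""
    timestamp, encoded = line.rstrip("\n").split("\t", 1)
    return timestamp, base64.b64decode(encoded)
```

The cost is roughly a third more bytes from the base64 encoding, and
the link-type field from the global header is lost unless you record
it somewhere else.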

--Ari

On Tue, Jul 28, 2009 at 8:31 AM, Wasim Bari<[email protected]> wrote:
>
> Hi,
>
>   I have data in PCAP file format (packet capture for network traffic). Is
> it possible to process this file in Hadoop in the same format? Or is there
> any supporting tool on top of Hadoop to analyze data from PCAP files?
>
> Bye
>
> Wasim
>



-- 
Ari Rabkin [email protected]
UC Berkeley Computer Science Department
