Thanks for the fast reply. I tried Avro, but it is not working as I need: I wanted each event to be written as a separate line, but right now with Avro everything is coming out on one line.
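To make the "one event per line" goal concrete, here is a minimal sketch of a pre-processing step that could run before Flume picks the data up. It assumes that events in the RADIUS detail file are separated by a blank line, that each event starts with a timestamp header line, and that attributes have the form `Name = value`. The function name `radius_events_to_json_lines` and the `log_time` key are made up for illustration; they are not part of Flume, Avro, or Hive.

```python
import json


def radius_events_to_json_lines(text):
    """Yield one JSON string per blank-line-separated RADIUS event.

    Assumption: a line without " = " is the event's timestamp header.
    """
    event = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the current event, if there is one.
            if event:
                yield json.dumps(event)
                event = {}
            continue
        if " = " in line:
            # Split on the first " = " only, so values such as
            # "OSC-Extended-Id=40682" stay intact.
            key, value = line.split(" = ", 1)
            # Drop surrounding quotes and padding from quoted values.
            # A repeated attribute (e.g. Timestamp) keeps its last value.
            event[key.strip()] = value.strip().strip('"').strip()
        else:
            event["log_time"] = line
    # Emit the final event when the input does not end with a blank line.
    if event:
        yield json.dumps(event)


if __name__ == "__main__":
    sample = (
        "Tue Aug 7 00:00:00 2012\n"
        '\tUser-Name = "alice"\n'
        "\tAcct-Status-Type = Start\n"
        "\n"
        "Tue Aug 7 00:00:00 2012\n"
        '\tUser-Name = "bob"\n'
        "\tAcct-Status-Type = Stop\n"
    )
    for json_line in radius_events_to_json_lines(sample):
        print(json_line)
```

Each printed line is one complete event, so a line-oriented source and a JSON-aware Hive table definition could then treat every line as one record. Whether this fits depends on how the file is fed into Flume.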

On Thu, Aug 30, 2012 at 3:55 PM, Alexander Lorenz <[email protected]> wrote:

> HI,
>
> You could use avro to get the records serialized, transfer over Flume's
> AVRO sink into HDFS and process the files with Hive. Since the log looks
> well formatted, it should be easy.
> http://flume.apache.org/FlumeDeveloperGuide.html => Avro RPC Client
>
> Example:
> http://flume.apache.org/FlumeUserGuide.html => search for Avro
>
> cheers
> - Alex
>
>
> On Aug 30, 2012, at 12:18 PM, Manu Moncy K <[email protected]> wrote:
>
> > Tue Aug 7 00:00:00 2012
> > User-Name = "xxxxxxxx"
> > NAS-Port = xxxxxxxx
> > NAS-IP-Address = xxxxxxxx
> > Framed-IP-Address = xxxxxxxx
> > Filter-Id = " xxxxxxxx "
> > Class = " xxxxxxxx "
> > NAS-Identifier = " xxxxxxxx "
> > Acct-Status-Type = xxxxxxxx
> > Acct-Delay-Time = 0
> > Acct-Session-Id = " xxxxxxxx "
> > Acct-Authentic = RADIUS
> > Event-Timestamp = 1344286800
> > NAS-Port-Type = Ethernet
> > Calling-Station-Id = " xxxxxxxx "
> > NAS-Port-Id = " xxxxxxxx "
> > Service-Type = Framed-User
> > Framed-Protocol = PPP
> > Acct-Link-Count = 0
> > RB-Agent-Circuit-Id = " xxxxxxxx "
> > DSLForum-Agent-Circuit-Id = " xxxxxxxx "
> > DSLForum-Access-Loop-Encapsulation = ""
> > Timestamp = 1344286800
> > OSC-Service-Identifier = "DSLUsers"
> > Proxy-State = OSC-Extended-Id=40682
> > Timestamp = 1344286800
> >
> > Tue Aug 7 00:00:00 2012
> > User-Name = " xxxxxxxx "
> > NAS-Port = xxxxxxxx
> > NAS-IP-Address = xxxxxxxx
> > Framed-IP-Address = xxxxxxxx
> > Class = "44620232:04:"
> > NAS-Identifier = " xxxxxxxx "
> > Acct-Status-Type = Stop
> > Acct-Delay-Time = 0
> > Acct-Input-Octets = 6021
> > Acct-Output-Octets = 323749
> > Acct-Session-Id = " xxxxxxxx "
> > Acct-Authentic = RADIUS
> > Acct-Session-Time = 1348
> > Acct-Input-Packets = 53
> > Acct-Output-Packets = 3187
> > Acct-Terminate-Cause = User-Request
> > Acct-Input-Gigawords = 0
> > Acct-Output-Gigawords = 0
> > Event-Timestamp = 1344286800
> > NAS-Port-Type = Ethernet
> > Calling-Station-Id = " xxxxxxxx "
> > NAS-Port-Id = " xxxxxxxx "
> > Service-Type = Framed-User
> > Framed-Protocol = PPP
> > Acct-Link-Count = 0
> > Timestamp = 1344286800
> > OSC-Service-Identifier = "DSLUsers"
> > Proxy-State = OSC-Extended-Id=24386
> > Timestamp = 1344286800
> >
> >
> > Above given log format (2 events given) is the RADIUS LOG I am working on,
> > I wanted to know if there is a way i can use flume and put this log into
> > hive in JSON format and take the required fields for each event.
> > --
> > Manu K Moncy
> > Data Scientist
> > Flutura Business Solutions Pvt. Ltd
> > Electronics and Communication Engineering(2008-2012)
> > Govt. Model Engineering College,
> > Cochin - 21
> > ☎: +91-9740245341
> > ☎: +91-9895163190
> > ✉: [email protected]
> > ✉: [email protected]
>
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
>

--
Manu K Moncy
Data Scientist
Flutura Business Solutions Pvt. Ltd
Electronics and Communication Engineering(2008-2012)
Govt. Model Engineering College,
Cochin - 21
☎: +91-9740245341
☎: +91-9895163190
✉: [email protected]
✉: [email protected]
