Thanks for the fast reply. I tried Avro, but it is not working the way I need.
I want each event to be written as its own line. Right now, with Avro,
everything comes through as one single line.
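
For reference, here is a minimal sketch of one way to get one event per line: pre-process the log before it reaches Flume, so that each blank-line-separated block becomes a single JSON line. It is not part of Flume or Avro. The function names and the "header" key are hypothetical, and it assumes that an unindented line starts a new event, that attribute lines are indented, and that a repeated attribute (such as Timestamp) may simply keep its last value.

```python
import json


def parse_events(lines):
    """Yield one dict per event from a RADIUS detail-style log."""
    event = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current event, if any.
            if event is not None:
                yield event
                event = None
            continue
        if not line[0].isspace():
            # An unindented line is treated as the event's timestamp header.
            if event is not None:
                yield event
            # "header" is a hypothetical key name chosen for this sketch.
            event = {"header": " ".join(line.split())}
            continue
        if event is None:
            continue  # attribute line without a header: skipped
        # Split on the first "=" only, so values like
        # "OSC-Extended-Id=40682" stay intact.
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue
        # Drop surrounding quotes and padding; a repeated key keeps the last value.
        event[key.strip()] = value.strip().strip('"').strip()
    if event is not None:
        yield event


def to_json_lines(lines):
    """Return a list of JSON strings, one per event, with no embedded newlines."""
    return [json.dumps(event, sort_keys=True) for event in parse_events(lines)]


if __name__ == "__main__":
    import sys

    # Usage: python radius_to_json.py < detail.log > events.json
    for json_line in to_json_lines(sys.stdin):
        print(json_line)
```

Each output line could then be shipped as one Flume event, and the fields picked out in Hive with a JSON SerDe or get_json_object.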

On Thu, Aug 30, 2012 at 3:55 PM, Alexander Lorenz <[email protected]> wrote:

> Hi,
>
> You could use Avro to serialize the records, transfer them through Flume's
> Avro sink into HDFS, and process the files with Hive. Since the log looks
> well formatted, it should be easy.
> http://flume.apache.org/FlumeDeveloperGuide.html => Avro RPC Client
>
> Example:
> http://flume.apache.org/FlumeUserGuide.html => search for Avro
>
> cheers
> - Alex
>
>
> On Aug 30, 2012, at 12:18 PM, Manu Moncy K <
> [email protected]> wrote:
>
> > Tue Aug  7 00:00:00 2012
> >        User-Name = "xxxxxxxx"
> >        NAS-Port = xxxxxxxx
> >        NAS-IP-Address = xxxxxxxx
> >        Framed-IP-Address = xxxxxxxx
> >        Filter-Id = " xxxxxxxx "
> >        Class = " xxxxxxxx "
> >        NAS-Identifier = " xxxxxxxx "
> >        Acct-Status-Type = xxxxxxxx
> >        Acct-Delay-Time = 0
> >        Acct-Session-Id = " xxxxxxxx "
> >        Acct-Authentic = RADIUS
> >        Event-Timestamp = 1344286800
> >        NAS-Port-Type = Ethernet
> >        Calling-Station-Id = " xxxxxxxx "
> >        NAS-Port-Id = " xxxxxxxx "
> >        Service-Type = Framed-User
> >        Framed-Protocol = PPP
> >        Acct-Link-Count = 0
> >        RB-Agent-Circuit-Id = " xxxxxxxx "
> >        DSLForum-Agent-Circuit-Id = " xxxxxxxx "
> >        DSLForum-Access-Loop-Encapsulation = ""
> >        Timestamp = 1344286800
> >        OSC-Service-Identifier = "DSLUsers"
> >        Proxy-State = OSC-Extended-Id=40682
> >        Timestamp = 1344286800
> >
> > Tue Aug  7 00:00:00 2012
> >        User-Name = " xxxxxxxx "
> >        NAS-Port = xxxxxxxx
> >        NAS-IP-Address = xxxxxxxx
> >        Framed-IP-Address = xxxxxxxx
> >        Class = "44620232:04:"
> >        NAS-Identifier = " xxxxxxxx "
> >        Acct-Status-Type = Stop
> >        Acct-Delay-Time = 0
> >        Acct-Input-Octets = 6021
> >        Acct-Output-Octets = 323749
> >        Acct-Session-Id = " xxxxxxxx "
> >        Acct-Authentic = RADIUS
> >        Acct-Session-Time = 1348
> >        Acct-Input-Packets = 53
> >        Acct-Output-Packets = 3187
> >        Acct-Terminate-Cause = User-Request
> >        Acct-Input-Gigawords = 0
> >        Acct-Output-Gigawords = 0
> >        Event-Timestamp = 1344286800
> >        NAS-Port-Type = Ethernet
> >        Calling-Station-Id = " xxxxxxxx "
> >        NAS-Port-Id = " xxxxxxxx "
> >        Service-Type = Framed-User
> >        Framed-Protocol = PPP
> >        Acct-Link-Count = 0
> >        Timestamp = 1344286800
> >        OSC-Service-Identifier = "DSLUsers"
> >        Proxy-State = OSC-Extended-Id=24386
> >        Timestamp = 1344286800
> >
> >
> > The log format above (two events shown) is the RADIUS log I am working on.
> > I wanted to know if there is a way I can use Flume to put this log into
> > Hive in JSON format and extract the required fields for each event.
> > --
> > Manu K Moncy
> > Data Scientist
> > Flutura Business Solutions Pvt. Ltd
> > Electronics and Communication Engineering(2008-2012)
> > Govt. Model Engineering College,
> > Cochin - 21
> > ☎: +91-9740245341
> > ☎: +91-9895163190
> > ✉: [email protected]
> > ✉: [email protected]
>
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
>


-- 
Manu K Moncy
Data Scientist
Flutura Business Solutions Pvt. Ltd
Electronics and Communication Engineering(2008-2012)
Govt. Model Engineering College,
Cochin - 21
☎: +91-9740245341
☎: +91-9895163190
✉: [email protected]
✉: [email protected]
