The big question is how the log file is formatted and how it needs to be
parsed.  I'd be inclined to write a UDF that takes a line of text and returns
a tuple of the values you'd be storing in HBase.

Then you could do other operations on the bag of tuples that get passed
back.

Alternatively, you could write a regular expression and use a built-in Pig
function like REGEX_EXTRACT or REGEX_EXTRACT_ALL.

I like the UDF approach in this case because then I can more easily write
unit tests around my log parser and get that testing out of the way before
actually spawning any jobs.
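
To make that concrete, here is a rough sketch of the parsing logic as a plain
Python function that can be tested locally with no cluster involved.  The log
layout (timestamp, level, message) and the names LOG_LINE and parse_log_line
are made up for illustration, since I don't know your actual format.  Wiring
it into Pig (for example as a Jython UDF, or ported to a Java EvalFunc) is
left out here.

```python
import re

# Assumed, hypothetical log layout:
#   "2014-02-26 00:22:10 ERROR disk full on node7"
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")


def parse_log_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    if line is None:
        return None
    match = LOG_LINE.match(line.strip())
    if match is None:
        # Malformed lines are signalled with None so the caller can filter them out.
        return None
    return match.groups()
```

Returning None for lines that don't match keeps the bad records easy to filter
out before anything is stored, and a few assertions against sample lines are
enough to pin the behaviour down before you run a job.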


On Wed, Feb 26, 2014 at 12:22 AM, Chhaya Vishwakarma <
[email protected]> wrote:

> hi,
>
> I have a log file in HDFS which needs to be parsed and put in a Hbase
> table.
>
> I want to do this using PIG .
>
> How can i go about it .Pig script should parse the logs and then put in
> Hbase?
>
>
> Regards,
> Chhaya Vishwakarma
>
>
