Re: Analyzing MySQL slow query logs using Pig + Hadoop

Ricardo Varela Mon, 11 Jan 2010 11:35:52 -0800

hey Chris,

If you find it hard to write UDFs, you can start with scripts written
in PHP if you feel more comfortable with it, or even shell scripts,
through the Pig streaming interface
(http://wiki.apache.org/pig/PigStreamingFunctionalSpec). Streaming won't
perform as well as proper UDFs, but it is handy for trying things out (I
often prototype with STREAM first and then write UDFs only if needed).
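
To give an idea of what a streaming script looks like, here is a rough
sketch in Python (the same shape works in PHP or shell: read lines from
stdin, write tab-separated fields to stdout). It assumes the usual
"# Query_time: ... Lock_time: ... Rows_sent: ... Rows_examined: ..."
header line in the slow query log; the script name parse_slow_log.py and
the function name are made up for illustration, and your log format may
differ depending on the MySQL version.

```python
#!/usr/bin/env python
"""Hypothetical parse_slow_log.py: turn slow query log entries into TSV rows."""
import re
import sys

# Assumed header format; Query_time/Lock_time may be integers or decimals.
HEADER = re.compile(
    r"^# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)


def parse_slow_log(lines):
    """Yield (query_time, lock_time, rows_sent, rows_examined, sql) tuples."""
    stats = None
    sql_parts = []
    for line in lines:
        line = line.rstrip("\n")
        match = HEADER.match(line)
        if match:
            # A new entry starts: emit the previous one, if it had any SQL.
            if stats is not None and sql_parts:
                yield stats + (" ".join(sql_parts),)
            stats = match.group(
                "query_time", "lock_time", "rows_sent", "rows_examined"
            )
            sql_parts = []
        elif line.startswith("#"):
            # Other comment lines (# Time, # User@Host) are ignored here.
            continue
        elif stats is not None and line.strip():
            # Collapse whitespace so the SQL never contains a tab or newline.
            sql_parts.append(" ".join(line.split()))
    if stats is not None and sql_parts:
        yield stats + (" ".join(sql_parts),)


def main():
    for record in parse_slow_log(sys.stdin):
        sys.stdout.write("\t".join(record) + "\n")


# Only stream when input is piped in, which is how Pig would run the script.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

On the Pig side you would then point STREAM ... THROUGH at the script
(shipping it to the cluster with DEFINE ... SHIP); the streaming spec
above has the exact syntax. One caveat: slow log entries span several
lines, so an entry that straddles an input split boundary may come out
truncated. For a first prototype that is probably acceptable.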


I found the examples in the docs and in the following article from
Dmitriy Ryaboy very useful for getting started:

http://www.cloudera.com/blog/2009/06/17/analyzing-apache-logs-with-pig/

Good luck with your tests, and I hope you like Pig and Hadoop!

Saludos!

---
ricardo

On Mon, Jan 11, 2010 at 6:29 PM, Chris Hartjes <[email protected]> wrote:
> My apologies if this is the wrong mailing list to ask this question.  I've
> started playing around with Pig and Hadoop, with the intention of using it
> to do some analysis of a collection of MySQL slow query log files.  I am not
> a Java programmer (been using PHP for a very long time, dabbled in other
> languages as required) so I am slightly intimidated by the documentation in
> Pig for writing your own UDF's.
>
> If anyone has done anything like this, I would appreciate some tips and some
> pointers on how to approach it.  Sure, I could hunker down and learn to use
> some CLI tools for analyzing the slow query log, but then I couldn't use Pig
> and Hadoop. ;)
>
> --
> Chris Hartjes
>



-- 
Ricardo Varela  -  http://phobeo.com  -  http://twitter.com/phobeo
"Though this be madness, yet there's method in 't"
