You can register it in the pig script (or, with a recent patch, even on the
command line), and it will get shipped and put on the classpath; or you can
prep your machines to have a local copy.  For something like JDBC drivers I
think it may be reasonable to let users decide rather than bundling it in by
default -- shipping jars from the client to the cluster does have some
overhead, and a lot of folks will probably have these installed on their
hadoop nodes anyway.
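
To make that concrete, registering in the script would look roughly like
this (just a sketch -- the jar paths, the table, and the exact DBStorage
argument list below are my assumptions about how the patch ends up being
packaged, so check the patch itself):

    -- ship the JDBC driver and the UDF jar to the cluster with the script
    REGISTER /path/to/mysql-connector-java.jar;
    REGISTER /path/to/dbstorage-udf.jar;

    agg = LOAD 'daily_aggregates' AS (day:chararray, hits:long);

    -- assumed arguments: driver class, JDBC url, user, password, insert
    -- statement; the INTO location is presumably ignored by a DB-backed storer
    STORE agg INTO 'ignored' USING DBStorage(
        'com.mysql.jdbc.Driver',
        'jdbc:mysql://dbhost/webapp',
        'user', 'password',
        'INSERT INTO daily_aggregates (day, hits) VALUES (?, ?)');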

Just imho (and I haven't actually tried using Ankur's patch yet).

On Thu, Feb 18, 2010 at 9:37 AM, zaki rahaman <[email protected]> wrote:

> Hey,
>
> First off, @Ankur, great work so far on the patch. This probably is not an
> efficient way of doing mass dumps to a DB (but why would you want to do
> that anyway when you have HDFS?), but it hits the sweet spot for my
> particular use case (storing aggregates to interface with a webapp). I was
> able to apply the patch cleanly and build. I had a question about actually
> using the DBStorage UDF, namely: where do I have to keep the JDBC driver?
> I was wondering if it would be possible to simply bundle it in the same jar
> as the UDF itself, but I know that Hadoop's DBOutputFormat requires a local
> copy of the driver on each machine. Any pointers?
>
> --
> Zaki Rahaman
>
