Hi Brad,
Your test, after editing for local host/file names, etc., worked. It
must be something else I'm doing wrong in my development setup. At
least I know it should work. I'll figure it out eventually. Thanks
again.
David
On Mon, Apr 28, 2014 at 10:22:57AM -0700, Brad Ruderman wrote:
> Hi David-
> Can you test the code? It is working for me. Make sure your jar is in HDFS
> and you are using the FQDN for referencing it.
>
> import pyhs2
>
> with pyhs2.connect(host='127.0.0.1',
>                    port=10000,
>                    authMechanism="PLAIN",
>                    user='root',
>                    password='test',
>                    database='default') as conn:
>     with conn.cursor() as cur:
>         cur.execute("ADD JAR hdfs://sandbox.hortonworks.com:8020/nexr-hive-udf-0.2-SNAPSHOT.jar")
>         cur.execute("CREATE TEMPORARY FUNCTION substr AS 'com.nexr.platform.hive.udf.UDFSubstrForOracle'")
>
>         # Execute query
>         cur.execute("select substr(description,2,4) from sample_07")
>
>         # Return column info from query
>         print cur.getSchema()
>
>         # Fetch table results
>         for i in cur.fetch():
>             print i
>
> Thanks,
> Brad
>
>
> On Mon, Apr 28, 2014 at 7:39 AM, David Engel <[email protected]> wrote:
>
> > Thanks for your response.
> >
> > We've essentially done your first suggestion in the past by copying or
> > symlinking our jar into Hive's lib directory. It works, but we'd like
> > a better way for different users to use different versions of our
> > jar during development. Perhaps that's not possible, though, without
> > running completely different instances of Hive.
> >
> > I don't think your second suggestion will work. The original problem
> > is that when "add jar file.jar" is run through pyhs2, the full
> > command gets passed to AddResourceProcessor.run(), yet
> > AddResourceProcessor.run() is written such that it only expects "jar
> > file.jar" to get passed to it. That's how it appears to work when
> > "add jar file.jar" is run from a stand-alone Hive CLI and from Beeline.
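> > For illustration, here is a rough sketch in Python of the stripping
> > described above (Hive's processor is Java, and the function name here
> > is hypothetical, not Hive source):

```python
# Rough sketch (illustrative only; names are hypothetical): the leading
# "add" token must be stripped somewhere before the remainder of the
# command reaches AddResourceProcessor.run(), which expects input of the
# form "jar file.jar".
def split_resource_command(command):
    """Split e.g. 'add jar /path/to/my.jar' into the command verb and
    the remainder the resource processor expects."""
    parts = command.strip().split(None, 1)
    verb = parts[0].lower()                    # "add"
    rest = parts[1] if len(parts) > 1 else ""  # "jar /path/to/my.jar"
    return verb, rest

# The CLI/Beeline path effectively performs this split; the HiveServer2
# path seemingly hands the whole unsplit string to the processor.
verb, rest = split_resource_command("add jar /path/to/my.jar")
# verb == "add", rest == "jar /path/to/my.jar"
```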
> >
> > David
> >
> > On Sat, Apr 26, 2014 at 12:14:53AM -0700, Brad Ruderman wrote:
> > > An easy solution would be to add the jar to the classpath or auxlibs
> > > therefore every instance of hive already has the jar and you just need to
> > > create the temporary function.
> > >
> > > Else you can put the JAR in HDFS and reference the add jar using the hdfs
> > > scheme. Example:
> > >
> > > import pyhs2
> > >
> > > with pyhs2.connect(host='127.0.0.1',
> > >                    port=10000,
> > >                    authMechanism="PLAIN",
> > >                    user='root',
> > >                    password='test',
> > >                    database='default') as conn:
> > >     with conn.cursor() as cur:
> > >         cur.execute("ADD JAR hdfs://sandbox.hortonworks.com:8020/nexr-hive-udf-0.2-SNAPSHOT.jar")
> > >         cur.execute("CREATE TEMPORARY FUNCTION substr AS 'com.nexr.platform.hive.udf.UDFSubstrForOracle'")
> > >
> > >         # Execute query
> > >         cur.execute("select substr(description,2,4) from sample_07")
> > >
> > >         # Return column info from query
> > >         print cur.getSchema()
> > >
> > >         # Fetch table results
> > >         for i in cur.fetch():
> > >             print i
> > >
> > >
> > > On Fri, Apr 25, 2014 at 7:54 AM, David Engel <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm trying to convert some of our Hive queries to use the pyhs2 Python
> > > > package (https://github.com/BradRuderman/pyhs2). Because we have our
> > > > own jar with some custom SerDes and UDFs, we need to use the "add jar
> > > > /path/to/my.jar" command to make them available to Hive. This works
> > > > fine using the Hive CLI directly and also with the Beeline client. It
> > > > doesn't work, however, with pyhs2.
> > > >
> > > > I naively tracked the problem down to a bug in
> > > > AddResourceProcessor.run(). See HIVE-6971 in Jira. My attempted fix
> > > > turned out not to be correct because it breaks the "add" command when
> > > > used from the CLI and Beeline. It seems the "add" part of any "add
> > > > file|jar|archive ..." command needs to get stripped off somewhere
> > > > before it gets passed to AddResourceProcessor.run(). Unfortunately, I
> > > > can't find that location when the command is received from pyhs2. Can
> > > > someone help?
> > > >
> > > > David
> > > > --
> > > > David Engel
> > > > [email protected]
> > > >
> >
> > --
> > David Engel
> > [email protected]
> >
--
David Engel
[email protected]