Actually it should be more like:

    public Text evaluate (String s) {
        if (t == null) {
            t = new Text("initialization");
        } else {
            // reuse the existing Text instead of allocating a new one
            t.set(s.getBytes());
        }
        return t;
    }

You're trying to avoid new if possible.

On Tue, Mar 25, 2014 at 9:09 PM, sky88088 <sky880883...@hotmail.com> wrote:

> It works!
> I really appreciate your help!
>
> Best Regards,
> ypg
>
> ------------------------------
> From: java8...@hotmail.com
> To: user@hive.apache.org
> Subject: RE: Does hive instantiate new udf object for each record
> Date: Tue, 25 Mar 2014 09:57:25 -0400
>
> The reason you saw that is that when you provided the evaluate() method,
> you didn't specify the type of column it can be used on. So Hive will just
> create a test instance again and again for every new row, as it doesn't
> know how or to which column to apply your UDF.
>
> I changed your code as below:
>
>     public class test extends UDF {
>         private Text t;
>
>         public Text evaluate (String s) {
>             if (t == null) {
>                 t = new Text("initialization");
>             } else {
>                 t = new Text("OK");
>             }
>             return t;
>         }
>
>         public Text evaluate () {
>             if (t == null) {
>                 t = new Text("initialization");
>             } else {
>                 t = new Text("OK");
>             }
>             return t;
>         }
>     }
>
> Now, if you invoke your UDF like this:
>
>     select test(colA) from AnyTable;
>
> You should see one "initialization" and the rest are "OK", make sense?
>
> Yong
>
> ------------------------------
> From: sky880883...@hotmail.com
> To: user@hive.apache.org
> Subject: RE: Does hive instantiate new udf object for each record
> Date: Tue, 25 Mar 2014 10:17:46 +0800
>
> I have implemented a simple udf for test.
>
>     public class test extends UDF {
>         private Text t;
>
>         public Text evaluate () {
>             if (t == null) {
>                 t = new Text("initialization");
>             } else {
>                 t = new Text("OK");
>             }
>             return t;
>         }
>     }
>
> And the test query:
>
>     select test() from AnyTable;
>
> I got
>
>     initialization
>     initialization
>     initialization
>     ...
>
> I have also implemented a similar GenericUDF, and got a similar result.
>
> What's wrong with my code?
>
> Best Regards,
> ypg
> ------------------------------
> From: java8...@hotmail.com
> To: user@hive.apache.org
> Subject: RE: Does hive instantiate new udf object for each record
> Date: Mon, 24 Mar 2014 16:58:49 -0400
>
> Your UDF object will only be initialized once per map or reducer.
>
> When you say your UDF object is being initialized for each row, why do
> you think so? Do you have a log that makes you think that way?
>
> If so, please provide more information, like your example code, logs
> etc., so we can help you.
>
> Yong
>
> ------------------------------
> Date: Tue, 25 Mar 2014 00:30:21 +0800
> From: sky880883...@hotmail.com
> To: user@hive.apache.org
> Subject: Does hive instantiate new udf object for each record
>
> Hi all,
>
> I'm trying to implement a udf which makes use of some data
> structures like a binary tree.
>
> However, it seems that hive instantiates a new udf object for each
> row in the table. Then the data structures would also be initialized again
> and again for each row.
>
> Whereas, in the book <Programming Hive>, a geoip function is taken
> as an example showing that a LookupService object "is saved in a
> reference so it only needs to be initialized once in the lifetime of a
> map or reduce task that initializes it". The code for this function can
> be found here ( https://github.com/edwardcapriolo/hive-geoip/).
>
> Could anyone give me some ideas on how to make the udf
> object initialize once in the lifetime of a map or reduce task?
>
> Best Regards,
> ypg
> ------------------------------
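
For anyone who wants to try the reuse pattern from the top of this thread without Hive or Hadoop on the classpath, here is a minimal sketch. MutableText is a hypothetical stand-in for org.apache.hadoop.io.Text, and ReuseUdf and ReuseDemo are made-up names; none of this is Hive code. It assumes, as Yong says above, that one UDF object handles many rows, and it only shows that the field is allocated once and then overwritten in place.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for org.apache.hadoop.io.Text (assumption: only
// the constructor, set(byte[]) and toString() matter for this sketch).
final class MutableText {
    static int allocations = 0;   // counts how many objects were created
    private byte[] bytes;

    MutableText(String s) {
        allocations++;
        bytes = s.getBytes(StandardCharsets.UTF_8);
    }

    // Overwrite the contents in place, no new MutableText is created.
    void set(byte[] b) {
        bytes = b.clone();
    }

    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

// Hypothetical stand-in for the UDF: one instance is called once per row.
final class ReuseUdf {
    private MutableText t;

    MutableText evaluate(String s) {
        if (t == null) {
            // first row seen by this instance: allocate once
            t = new MutableText("initialization");
        } else {
            // later rows: reuse the same object
            t.set(s.getBytes(StandardCharsets.UTF_8));
        }
        return t;
    }
}

class ReuseDemo {
    public static void main(String[] args) {
        ReuseUdf udf = new ReuseUdf();
        for (String row : new String[] {"a", "b", "c"}) {
            System.out.println(udf.evaluate(row));
        }
        System.out.println("allocations = " + MutableText.allocations);
    }
}
```

Running ReuseDemo prints "initialization", "b", "c" and then "allocations = 1": the first row gets the marker string, as in the code above, and every later row reuses the same object.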