Actually it should be more like:

    public Text evaluate(String s) {
        if (t == null) {
            t = new Text("initialization");
        } else {
            t.set(s.getBytes());
        }
        return t;
    }


You're trying to avoid new where possible: allocate the Text once and reuse it for every row.
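
To make the pattern concrete outside of Hive, here is a minimal, self-contained sketch. It is not Hive code: MutableText is a hypothetical stand-in for org.apache.hadoop.io.Text, LookupUdf is a hypothetical class that in Hive would extend UDF, and the TreeMap contents are made up. It only illustrates building the data structure once per object and reusing one output object across calls.

```java
import java.nio.charset.StandardCharsets;
import java.util.TreeMap;

public class LazyUdfSketch {

    // Hypothetical stand-in for org.apache.hadoop.io.Text: a mutable holder
    // whose contents can be replaced without allocating a new holder.
    static final class MutableText {
        private byte[] bytes = new byte[0];

        void set(byte[] b) {
            bytes = b;
        }

        @Override
        public String toString() {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    // Hypothetical UDF-like class (assumption: in Hive this would extend UDF).
    static final class LookupUdf {
        private final MutableText result = new MutableText(); // reused for every row
        private TreeMap<String, String> lookup;                // built lazily, once
        int buildCount = 0;                                    // counts how often the tree is built

        MutableText evaluate(String key) {
            if (lookup == null) {
                // Expensive setup happens only on the first call on this object.
                lookup = new TreeMap<>();
                lookup.put("a", "alpha");
                lookup.put("b", "beta");
                buildCount++;
            }
            String value = lookup.getOrDefault(key, "unknown");
            // Overwrite the existing holder instead of calling new.
            result.set(value.getBytes(StandardCharsets.UTF_8));
            return result;
        }
    }

    public static void main(String[] args) {
        LookupUdf udf = new LookupUdf();
        MutableText first = udf.evaluate("a");
        String firstValue = first.toString(); // copy before the holder is overwritten
        MutableText second = udf.evaluate("b");
        // prints: alpha beta builds=1 sameObject=true
        System.out.println(firstValue + " " + second + " builds=" + udf.buildCount
                + " sameObject=" + (first == second));
    }
}
```

Note that because the returned object is reused, a caller that wants to keep an earlier value has to copy it before the next call, as firstValue does above.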


On Tue, Mar 25, 2014 at 9:09 PM, sky88088 <sky880883...@hotmail.com> wrote:

> It works!
> I really appreciate your help!
>
>
> Best Regards,
> ypg
>
> ------------------------------
> From: java8...@hotmail.com
> To: user@hive.apache.org
> Subject: RE: Does hive instantiate new udf object for each record
> Date: Tue, 25 Mar 2014 09:57:25 -0400
>
> The reason you saw that is that when you provided the evaluate() method, you
> didn't specify the type of column it can be used on. So Hive will just
> create a test instance again and again for every new row, as it doesn't know
> how or to which column to apply your UDF.
>
> I changed your code as below:
>
>     public class test extends UDF {
>         private Text t;
>
>         public Text evaluate(String s) {
>             if (t == null) {
>                 t = new Text("initialization");
>             } else {
>                 t = new Text("OK");
>             }
>             return t;
>         }
>
>         public Text evaluate() {
>             if (t == null) {
>                 t = new Text("initialization");
>             } else {
>                 t = new Text("OK");
>             }
>             return t;
>         }
>     }
>
> Now, if you invoke your UDF like this:
>
>     select test(colA) from AnyTable;
>
> You should see one "initialization" and the rest "OK". Make sense?
>
> Yong
>
> ------------------------------
> From: sky880883...@hotmail.com
> To: user@hive.apache.org
> Subject: RE: Does hive instantiate new udf object for each record
> Date: Tue, 25 Mar 2014 10:17:46 +0800
>
> I have implemented a simple UDF as a test.
>
>     public class test extends UDF {
>         private Text t;
>
>         public Text evaluate() {
>             if (t == null) {
>                 t = new Text("initialization");
>             } else {
>                 t = new Text("OK");
>             }
>             return t;
>         }
>     }
>
> And the test query:
>
>     select test() from AnyTable;
>
> I got
>
>     initializationinitializationinitialization...
>
> I have also implemented a similar GenericUDF, and got a similar result.
>
> What's wrong with my code?
>
> Best Regards,
> ypg
> ------------------------------
> From: java8...@hotmail.com
> To: user@hive.apache.org
> Subject: RE: Does hive instantiate new udf object for each record
> Date: Mon, 24 Mar 2014 16:58:49 -0400
>
> Your UDF object will only be initialized once per mapper or reducer.
>
> When you say your UDF object is being initialized for each row, why do you
> think so? Do you have a log that makes you think that?
>
> If so, please provide more information, such as your example code and
> logs, so we can help you.
>
> Yong
>
> ------------------------------
> Date: Tue, 25 Mar 2014 00:30:21 +0800
> From: sky880883...@hotmail.com
> To: user@hive.apache.org
> Subject: Does hive instantiate new udf object for each record
>
> Hi all,
>
>         I'm trying to implement a UDF which makes use of some data
> structures such as a binary tree.
>
>         However, it seems that Hive instantiates a new UDF object for each
> row in the table, so the data structures are also initialized again
> and again for each row.
>
>         In the book <Programming Hive>, however, a geoip function is given
> as an example showing that a LookupService object "is saved in a
> reference so it only needs to be initialized once in the lifetime of a
> map or reduce task that initializes it". The code for this function can
> be found here (https://github.com/edwardcapriolo/hive-geoip/).
>
>         Could anyone give me some ideas on how to make the UDF
> object initialize once in the lifetime of a map or reduce task?
>
>
> Best Regards,
> ypg
> ------------------------------
>
>
