Re: RE: did hive support the udf now?

Zheng Shao Tue, 02 Dec 2008 20:04:28 -0800

Hi Paradisehi,

We are planning to add a way for users to attach files to the job.
Until then, you will have to specify the full path of map.awk and make
sure it is accessible on all machines. Our cluster has a single home mount
that allows users to do that.
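
As a rough sketch of that workaround: the script below assumes a hypothetical
directory (/tmp/hive_scripts by default, standing in for a shared mount) and a
hypothetical map.awk that prints the third "/"-separated field of each URL as
the domain. The real map.awk from this thread is not shown, so treat both as
placeholders. It only checks the script locally, by absolute path, on a couple
of sample URLs.

```shell
#!/bin/sh
# Hypothetical location. On a real cluster this must be a path that exists
# on every machine that runs tasks (for example a shared home mount).
SCRIPT_DIR="${SCRIPT_DIR:-/tmp/hive_scripts}"
mkdir -p "$SCRIPT_DIR"

# Hypothetical map.awk: split each input line on "/" and print field 3,
# which is the host part of a URL such as http://example.com/a/b.
cat > "$SCRIPT_DIR/map.awk" <<'EOF'
BEGIN { FS = "/" }
{ print $3 }
EOF

# Feed the script one URL per line on stdin, referencing it by full path.
extract_domains() {
    awk -f "$SCRIPT_DIR/map.awk"
}

# Local check; prints example.com and example.org, one per line.
printf '%s\n' 'http://example.com/a/b' 'http://example.org/' | extract_domains
```

If that works locally, the same absolute path would go into the query, for
example USING 'awk -f /tmp/hive_scripts/map.awk' instead of
USING 'awk -f map.awk'.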


We have never encountered the hanging problem. Most probably the JobTracker
is too busy to accept new jobs.
Can you submit a normal map-reduce job to the same JobTracker?

Zheng

On Tue, Dec 2, 2008 at 7:12 PM, 施兴 <[EMAIL PROTECTED]> wrote:

> yes, it can.
>
> But when I write my script to extract the domain, it hangs every time,
> and there is no job page in the job monitor. This is what appears in the
> CLI:
> *hive> FROM (FROM log_stg2 log SELECT TRANSFORM(url) USING 'awk -f
> map.awk' AS (domain) )tmap INSERT OVERWRITE TABLE test3 SELECT tmap.domain,
> COUNT(1) group by tmap.domain;
> Total MapReduce jobs = 2
> Number of reducers = 1
> In order to change numer of reducers use:
>   set mapred.reduce.tasks = <number>
> Starting Job = job_200812011231_0257, Tracking URL =
> http://sz-mapred000.sz01:50030/jobdetails.jsp?jobid=job_200812011231_0257
> Kill Command = ./../../bin/hadoop job
> -Dmapred.job.tracker=sz-mapred000.sz01:54311 -kill job_200812011231_0257
>  map = 0%,  reduce =0%
> *
>  and it hangs,
> with no running job shown in the JobTracker monitor:
> Running Jobs *none*
> ------------------------------
>
> Will the script be distributed to all the TaskTrackers? And is the path of
> the script right? I placed the script under the Hive directory.
> I followed the wiki: http://wiki.apache.org/hadoop/Hive/UserGuide
> 2.4. Running custom map/reduce jobs, 2.4.1. MovieLens User Ratings
> It also hangs when I enter just this statement:
> *hive> SELECT TRANSFORM(url) USING 'awk -f map.awk' AS (domain)  FROM
> log_stg2;
> Total MapReduce jobs = 1
> Starting Job = job_200812011231_0259, Tracking URL =
> http://sz-mapred000.sz01:50030/jobdetails.jsp?jobid=job_200812011231_0259
> Kill Command = ./../../bin/hadoop job
> -Dmapred.job.tracker=sz-mapred000.sz01:54311 -kill job_200812011231_0259
>  map = 0%,  reduce =0%
> *
>
> On 2008-12-02 19:17:51, "Ashish Thusoo" <[EMAIL PROTECTED]> wrote:
>
>
> >Paradisehi,
> >
> >you can perhaps use the regexp_replace udf to do this.
> >
> >Basically
> >
> >regexp_replace(a.url, '/.*$', '') should be able to replace everything from 
> >the first / onward with an empty string. The second argument is a Java 
> >regular expression.
>
> >
> >Ashish
> >________________________________________
> >From: paradisehi [EMAIL PROTECTED]
> >Sent: Tuesday, December 02, 2008 3:06 AM
>
> >To: [email protected]
> >Subject: did hive support the udf now?
> >
> >My table a contains just one field, url.
> >Now I want to compute the page views (pv) for each domain of the URLs and 
> >insert the output into a table b (domain, pv).
>
> >
> >I don't know whether Hive supports UDFs; maybe I can also use 
> >map_script to do this.
> >
>
>
>
> --
> Best wishes!
> My Friend~
>



-- 
Yours,
Zheng
