Yes, it can.

But when I use my own script to extract the domain, the query always
hangs, and no job page shows up in the job monitor. This is what the CLI
shows:
*hive> FROM (FROM log_stg2 log SELECT TRANSFORM(url) USING 'awk -f map.awk'
AS (domain) )tmap INSERT OVERWRITE TABLE test3 SELECT tmap.domain, COUNT(1)
group by tmap.domain;
Total MapReduce jobs = 2
Number of reducers = 1
In order to change numer of reducers use:
  set mapred.reduce.tasks = <number>
Starting Job = job_200812011231_0257, Tracking URL =
http://sz-mapred000.sz01:50030/jobdetails.jsp?jobid=job_200812011231_0257
Kill Command = ./../../bin/hadoop job
-Dmapred.job.tracker=sz-mapred000.sz01:54311 -kill job_200812011231_0257
 map = 0%,  reduce =0%
*
and it just hangs there, with no running job shown in the jobtracker
monitor:
Running Jobs *none*
------------------------------

Will the script be distributed to all the tasktrackers? And is the path of
the script right? I placed the script under the hive directory.
I followed the wiki: http://wiki.apache.org/hadoop/Hive/UserGuide
(2.4. Running custom map/reduce jobs, 2.4.1. MovieLens User Ratings).
It also hangs when I run just this statement:
*hive> SELECT TRANSFORM(url) USING 'awk -f map.awk' AS (domain)  FROM
log_stg2;
Total MapReduce jobs = 1
Starting Job = job_200812011231_0259, Tracking URL =
http://sz-mapred000.sz01:50030/jobdetails.jsp?jobid=job_200812011231_0259
Kill Command = ./../../bin/hadoop job
-Dmapred.job.tracker=sz-mapred000.sz01:54311 -kill job_200812011231_0259
 map = 0%,  reduce =0%
*
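
The map.awk script itself is not included in this message, so the version below is only a hypothetical stand-in. It is a minimal sketch of a domain extractor, assuming TRANSFORM passes each row to the script on stdin as tab-separated text (here a single url column) and reads one output line per row from stdout. Running it locally like this at least shows whether the script itself returns, independently of Hive:

```shell
# Hypothetical map.awk (not the original script): print the host part of each URL.
cat > map.awk <<'EOF'
BEGIN { FS = "\t" }                    # assumed: columns arrive tab-separated
{
    url = $1
    sub("^[a-zA-Z]+://", "", url)      # drop the scheme, e.g. "http://"
    sub("[/?#].*$", "", url)           # drop path, query string and fragment
    print url
}
EOF

# Feed two sample rows through the script, the same way the USING clause would.
printf 'http://example.com/a/b?x=1\nhttp://hive.example.org/\n' | awk -f map.awk
# prints:
# example.com
# hive.example.org
```

If this returns immediately on a sample of the real data, the script logic is probably not what is hanging, and the question of how the script reaches the tasktrackers remains open.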

On 2008-12-02 19:17:51, "Ashish Thusoo" <[EMAIL PROTECTED]> wrote:
>Paradisehi,
>
>you can perhaps use the regexp_replace udf to do this.
>
>Basically
>
>regexp_replace(a.url, '/*$', '') should be able to replace everything after the
>first / with an empty string. The second argument, which is a regular expression,
>is a Java regular expression.
>
>Ashish
>________________________________________
>From: paradisehi [EMAIL PROTECTED]
>Sent: Tuesday, December 02, 2008 3:06 AM
>To: [email protected]
>Subject: did hive support the udf now?
>
>My table a contains just one field, url.
>Now I want to compute the pv of each URL's domain and insert the output into a
>table b (domain, pv).
>
>I don't know whether Hive supports UDFs now; maybe I can also use
>map_script to do this.
>



-- 
Best wishes!
My Friend~
