A follow-up: if I first insert into a local directory and then load the resulting file into the table with a LOAD command, it works:
hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/hdfs_out'
      SELECT TRANSFORM(actor_id) USING '/Users/Josh/percentiles.rb' AS (actor_id, percentile, count)
      FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actor;
hive> LOAD DATA LOCAL INPATH '/tmp/hdfs_out/attempt_200901112100_0030_r_000000_0'
      OVERWRITE INTO TABLE percentiles
      PARTITION (account='cUU5T7y6DmdzMJFcFt3JDe', application='Test Application', dataset='Purchases', hour=342007, span=1);
Copying data from file:/tmp/hdfs_out/attempt_200901112100_0030_r_000000_0
Loading data to table percentiles partition {account=cUU5T7y6DmdzMJFcFt3JDe, application=Test Application, dataset=Purchases, hour=342007, span=1}
OK
hive> SELECT * FROM percentiles;
OK
00d5c3f0-1b29-4bd3-bc8c-e3f0a7ba5949 100 1 cUU5T7y6DmdzMJFcFt3JDe Test Application Purchases 342007 1
03e58605-7de7-48fb-9852-781700823a71 100 1 cUU5T7y6DmdzMJFcFt3JDe Test Application Purchases 342007 1
06986e3e-7c73-4466-b0dc-d92038e3f665 100 1 cUU5T7y6DmdzMJFcFt3JDe Test Application Purchases 342007 1
06af307c-0da6-4795-860a-6b26425bdbc8 100 1 cUU5T7y6DmdzMJFcFt3JDe Test Application Purchases 342007 1
0cca5b18-efe3-4903-9387-a8c7eb8432e4 100 1 cUU5T7y6DmdzMJFcFt3JDe Test Application Purchases 342007 1
0e20b565-59aa-4d57-8f68-ddcaaa75673b 100 1 cUU5T7y6DmdzMJFcFt3JDe Test Application Purchases 342007 1
...
So it works in two parts but not in one; I'm not sure why.
Josh F.
On Jan 12, 2009, at 9:57 PM, Zheng Shao wrote:
Did the map-reduce job produce any output? Can you check the mapred job page?
If so, then the "moveTask" that runs after the map-reduce job must be failing.
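(Another quick check, reusing the warehouse path from your earlier mail:

$ hadoop fs -ls /user/hive/warehouse/percentiles/

A 0-byte attempt_* file there suggests the script itself emitted nothing; data showing up in the job's output on the mapred page but not in the table directory would point at the move step instead.)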
Zheng
On Mon, Jan 12, 2009 at 9:48 PM, Josh Ferguson <j...@besquared.net>
wrote:
https://gist.github.com/3cb4be29625442c90140
Josh
On Jan 12, 2009, at 9:39 PM, Zheng Shao wrote:
It should be tab-separated.
If you run it without the INSERT, is there any data on the screen?
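For example, with AS (actor_id, percentile, count) the script should write one row per line to stdout, with the three columns joined by tab characters, roughly like this (tabs shown as \t; the values are made up):

some-actor-id\t95\t42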
Zheng
On Mon, Jan 12, 2009 at 9:32 PM, Josh Ferguson <j...@besquared.net>
wrote:
The only thing I could figure is that my output is incorrect. Is the output from the transform script supposed to be tab-separated, or separated by the delimiters of the table you're trying to insert into? It doesn't seem to make any difference (my table is still empty no matter which one I try), but I'd better make sure just in case.
Josh
On Jan 12, 2009, at 9:11 PM, Zheng Shao wrote:
Here are some examples:
[zs...@xxx /hive.root] find ./ql/src/test/queries/clientpositive -name '*.q' | xargs grep TRANSFORM
./ql/src/test/queries/clientpositive/input14_limit.q: SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input14_limit.q: SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input14.q: SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input14.q: SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input18.q: SELECT TRANSFORM(src.key, src.value, 1+2, 3+4)
./ql/src/test/queries/clientpositive/input18.q: SELECT TRANSFORM(src.key, src.value, 1+2, 3+4)
./ql/src/test/queries/clientpositive/scriptfile1.q: SELECT TRANSFORM(src.key, src.value)
./ql/src/test/queries/clientpositive/input5.q: SELECT TRANSFORM(src_thrift.lint, src_thrift.lintstring)
./ql/src/test/queries/clientpositive/input5.q: SELECT TRANSFORM(src_thrift.lint, src_thrift.lintstring)
./ql/src/test/queries/clientpositive/input17.q: SELECT TRANSFORM(src_thrift.aint + src_thrift.lint[0], src_thrift.lintstring[0])
./ql/src/test/queries/clientpositive/input17.q: SELECT TRANSFORM(src_thrift.aint + src_thrift.lint[0], src_thrift.lintstring[0])
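The shape those tests use is roughly the following (a sketch along the lines of input14.q; src, dest1, tkey and tvalue are names from that test, not your schema):

FROM (
  FROM src
  SELECT TRANSFORM(src.key, src.value)
  USING '/bin/cat' AS (tkey, tvalue)
  CLUSTER BY tkey
) tmap
INSERT OVERWRITE TABLE dest1
SELECT tmap.tkey, tmap.tvalue;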
On Mon, Jan 12, 2009 at 8:58 PM, Josh Ferguson
<j...@besquared.net> wrote:
Anyone have any word on why this might not work? Can someone give
me an example of a query they use to INSERT OVERWRITE a table from
a map and/or reduce job that I could use as a reference?
Josh F.
On Jan 11, 2009, at 9:48 PM, Josh Ferguson wrote:
I have a query that returns the proper results:
SELECT TRANSFORM(actor_id) USING '/my/script.rb' AS (actor_id, percentile, count)
FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actors;
But when I do
INSERT OVERWRITE TABLE percentiles
SELECT TRANSFORM(actor_id) USING '/my/script.rb' AS (actor_id, percentile, count)
FROM (SELECT actor_id FROM activities CLUSTER BY actor_id) actors;
It says it loads data into the percentiles table, but when I ask for data from that table I get:
hive> SELECT actor_id, percentile, count FROM percentiles;
FAILED: Error in semantic analysis: org.apache.hadoop.hive.ql.metadata.HiveException: Path /user/hive/warehouse/percentiles not a valid path
$ hadoop fs -ls /user/hive/warehouse/percentiles/
Found 1 items
-rw-r--r-- 1 Josh supergroup 0 2009-01-11 21:45 /user/hive/warehouse/percentiles/attempt_200901112100_0010_r_000000_0
It's nothing but an empty file.
Am I doing something wrong?
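For reference, the percentiles table was created along these lines (a sketch; the exact column types are from memory and may differ):

CREATE TABLE percentiles (actor_id STRING, percentile INT, count INT)
PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT, span INT);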
Josh Ferguson
--
Yours,
Zheng