Hello there,

 

I have two tables:

CREATE TABLE data(calling STRING COMMENT 'Calling number', 
volumn_download BIGINT COMMENT 'Volume download',
volumn_upload BIGINT COMMENT 'Volume upload')
PARTITIONED BY(ds STRING)
CLUSTERED BY (calling) INTO 100 BUCKETS;

CREATE TABLE sub(isdn STRING, sub_id STRING)
CLUSTERED BY (isdn) INTO 100 BUCKETS;

The data table has 15 million records, while the sub table has only 600k records.

The following query executed successfully:
select /*+ MAPJOIN(b) */ a.calling, b.sub_id from data a join sub b on
a.calling=b.isdn;

But when I enabled bucket map join with
set hive.optimize.bucketmapjoin = true;
the same query failed:
select /*+ MAPJOIN(b) */ a.calling, b.sub_id from data a join sub b on
a.calling=b.isdn;

hive> set hive.optimize.bucketmapjoin = true;
hive> select /*+ MAPJOIN(b) */ a.calling, b.sub_id from ggsn_bucket a join sub_bucket b on a.calling=b.isdn;
Total MapReduce jobs = 1
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/hduser/hduser_20120320080909_8e6a3419-4d2c-4148-a0c9-166d051c8274.log
2012-03-20 08:09:34 Starting to launch local task to process map join; maximum memory = 932118528
2012-03-20 08:09:34 End of local task; Time Taken: 0.072 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
org.apache.hadoop.util.Shell$ExitCodeException: bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.

at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:260)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:407)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception 'org.apache.hadoop.util.Shell$ExitCodeException(bash: line 0: cd: /u01/app/hduser/hadoop-0.20.203.0/tempdir/hduser/hive_2012-03-20_08-09-27_810_1393729636696443501/-local-10002/HashTable-Stage-1: No such file or directory
tar: Cowardly refusing to create an empty archive
Try `tar --help' or `tar --usage' for more information.
)'
java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
at org.apache.hadoop.fs.Path.<init>(Path.java:90)
at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:379)
at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:192)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:476)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

In hadoop-env.sh, I set:
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/u01/app/hduser/hadoop-0.20.203.0/tempdir"

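It looks like Hive could not create its temporary directory: the local task reports success, but the HashTable-Stage-1 directory that the job then tries to tar does not exist.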
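
To test that guess, something like the sketch below could confirm whether the java.io.tmpdir directory exists and whether the current user can create subdirectories in it. The helper name check_tmpdir is made up for illustration, and this only checks filesystem permissions; it says nothing about whether Hive actually wrote the hash table files.

```shell
# Hypothetical helper (not part of Hadoop or Hive): report whether a
# directory exists and whether a subdirectory can be created inside it.
check_tmpdir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  # Use the shell's PID to get a probe name unlikely to clash.
  probe="$dir/probe.$$"
  if ! mkdir "$probe" 2>/dev/null; then
    echo "not writable: $dir"
    return 1
  fi
  rmdir "$probe"
  echo "ok: $dir"
}

# Example call, run as the same user that starts the Hive CLI:
#   check_tmpdir /u01/app/hduser/hadoop-0.20.203.0/tempdir
```

If this prints "ok" for the tempdir above, then permissions on the directory are probably not the cause and the empty local task output would be the next thing to look at.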

 

 

Best regards,
