I have found a work-around for this bug. After you issue the ALTER
TABLE ... CONCATENATE command, follow it with:

ALTER TABLE T1 PARTITION (<partition spec>) SET LOCATION
".../apps/hive/warehouse/DB1/T1/<partition path>";

This will fix the metadata that CONCATENATE breaks.
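
For the bidtmp partition in the thread below, the fix would look
something like this (the corrected path assumes the on-disk layout
follows the declared key order log_type/dt/hour; the warehouse prefix
is elided here just as it is elsewhere in this thread):

ALTER TABLE bidtmp PARTITION (log_type='bidder', dt='2014-05-01', hour=11)
SET LOCATION
".../apps/hive/warehouse/analytics.db/bidtmp/log_type=bidder/dt=2014-05-01/hour=11";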


--
*Tim Ellis:* 510-761-6610


On Mon, Oct 13, 2014 at 10:37 PM, Time Less <timelessn...@gmail.com> wrote:

> Has anyone seen anything like this? Google searches turned up nothing, so
> I thought I'd ask here, then file a JIRA if no one thinks I'm doing it
> wrong.
>
> If I ALTER a particular table with three partitions once, it works. The
> second time it works too, but it reports that it is moving a directory to
> the Trash that doesn't exist (still, this doesn't kill it). The third
> time I ALTER the table, it crashes, because the directory structure has
> been modified into something invalid.
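>
> (One way to see the mismatch on the table below, using nothing beyond
> stock Hive: ask the metastore where it thinks the partition lives with
>
> DESCRIBE FORMATTED bidtmp PARTITION (log_type='bidder',
> dt='2014-05-01', hour=11);
>
> and compare the Location it reports against what actually exists in
> HDFS.)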
>
> Here's a nearly-full output of the 2nd and 3rd runs. The ALTER is exactly
> the same both times (I just press UP ARROW):
>
>
> *HQL, 2nd Run:*
> hive (analytics)> alter table bidtmp partition
> (log_type='bidder',dt='2014-05-01',hour=11) concatenate ;
>
>
> *Output:*
> Starting Job = job_1412894367814_0017, Tracking URL =
> ....application_1412894367814_0017/
> Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1412894367814_0017
> Hadoop job information for null: number of mappers: 97; number of
> reducers: 0
> 2014-10-13 20:28:23,143 null map = 0%,  reduce = 0%
> 2014-10-13 20:28:36,042 null map = 1%,  reduce = 0%, Cumulative CPU 49.69
> sec
> ...
> 2014-10-13 20:31:56,415 null map = 99%,  reduce = 0%, Cumulative CPU
> 812.65 sec
> 2014-10-13 20:31:57,458 null map = 100%,  reduce = 0%, Cumulative CPU
> 813.88 sec
> MapReduce Total cumulative CPU time: 13 minutes 33 seconds 880 msec
> Ended Job = job_1412894367814_0017
> Loading data to table analytics.bidtmp partition (log_type=bidder,
> dt=2014-05-01, hour=11)
> rmr: DEPRECATED: Please use 'rm -r' instead.
> Moved: '.../apps/hive/warehouse/analytics.db/bidtmp/
> *dt=2014-05-01/hour=11/log_type=bidder*' to trash at:
> .../user/hdfs/.Trash/Current
> *(note: the bold-faced path doesn't exist; the partition is specified as
> log_type first, then dt, then hour)*
> Partition analytics.bidtmp*{log_type=bidder, dt=2014-05-01, hour=11}*
> stats: [numFiles=0, numRows=0, totalSize=0, rawDataSize=0]
> *(here, the partition ordering is correct!)*
> MapReduce Jobs Launched:
> Job 0: Map: 97   Cumulative CPU: 813.88 sec   HDFS Read: 30298871932 HDFS
> Write: 28746848923 SUCCESS
> Total MapReduce CPU Time Spent: 13 minutes 33 seconds 880 msec
> OK
> Time taken: 224.128 seconds
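>
> To double-check what that run actually left on disk, a plain HDFS
> listing works (prefix elided as above, and assuming the directories
> really are laid out log_type/dt/hour, matching the partition spec):
>
> hdfs dfs -ls .../apps/hive/warehouse/analytics.db/bidtmp/log_type=bidder/dt=2014-05-01/
>
> If hour=11 is missing here while a dt=.../hour=.../log_type=... tree
> landed in the Trash, the Moved: line above used the wrong key order.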
>
>
> *HQL, 3rd Run:*
> hive (analytics)> alter table bidtmp partition
> (log_type='bidder',dt='2014-05-01',hour=11) concatenate ;
>
>
> *Output:*
> java.io.FileNotFoundException: File does not exist:
> .../apps/hive/warehouse/analytics.db/bidtmp/dt=2014-05-01/hour=11/log_type=bidder
> *(because it should be log_type=.../dt=.../hour=..., not this order)*
>         at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>         at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:419)
>         at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
>         at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
>         at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>         at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>         at org.apache.hadoop.hive.ql.io.rcfile.merge.BlockMergeTask.execute(BlockMergeTask.java:214)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.mergeFiles(DDLTask.java:511)
>         at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:458)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1508)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1275)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1093)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Job Submission failed with exception 'java.io.FileNotFoundException(File
> does not exist:
> .../apps/hive/warehouse/analytics.db/bidtmp/dt=2014-05-01/hour=11/log_type=bidder)'
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.DDLTask
>
> --
> *Tim Ellis:* 510-761-6610
>
>
