Below is the output of Hive for an INSERT-SELECT from one 'EXTERNAL' table to another. This is running on EC2, and the external tables have their partitions registered as path keys in S3. The final upload of data to S3 fails. It does not happen every time, but when it does the only option seems to be to re-run the whole job.
Is there another way? Perhaps a way to tell Hive to retry data uploads to S3 if a failure occurs?

- Neal

[snip]
map = 100%, reduce = 52%
map = 100%, reduce = 88%
map = 100%, reduce = 93%
map = 100%, reduce = 100%
map = 100%, reduce = 49%
map = 100%, reduce = 100%
map = 100%, reduce = 45%
map = 100%, reduce = 86%
map = 100%, reduce = 100%
map = 100%, reduce = 67%
map = 100%, reduce = 12%
map = 100%, reduce = 14%
map = 100%, reduce = 0%
map = 100%, reduce = 17%
map = 100%, reduce = 0%
map = 100%, reduce = 45%
Ended Job = job_200907281406_0315
Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: Encountered too many S3 Internal Server errors (6), aborting request.)'
FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.ExecDriver
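One detail that may be relevant: the "(6)" in the exception matches jets3t's default limit of 5 internal-error retries before aborting, so retries are already happening under the covers — there just aren't enough of them to outlast the S3 hiccup. jets3t reads its settings from a jets3t.properties file on the classpath, so raising that limit might be worth a try. A minimal sketch, assuming the jets3t embedded in Hadoop's S3 filesystem actually picks this file up from the Hive/Hadoop conf directory (I have not confirmed that it does, and I am also unsure whether Hadoop's own fs.s3.maxRetries setting covers this jets3t-level error):

    # jets3t.properties -- place on the Hadoop/Hive classpath, e.g. in conf/
    # Allow up to 20 retries on S3 "Internal Server" (HTTP 500) errors
    # instead of the default 5.
    s3service.internal-error-retry-max=20
    # Also raise the HTTP-level retry count for transient I/O errors.
    httpclient.retry-max=20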

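Failing that, the blunt workaround would be a retry loop around the whole job at the shell level. This sketch assumes the statement is idempotent (e.g. an INSERT OVERWRITE into the same partition, so a re-run cannot duplicate data); myquery.q is a placeholder for the actual script:

    #!/bin/sh
    # Re-run the Hive job up to 3 times, stopping on the first success.
    for i in 1 2 3; do
        hive -f myquery.q && break
        echo "attempt $i failed, retrying..." >&2
    done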