Below is the output of Hive for an INSERT-SELECT from one EXTERNAL table
to another.  This is running on EC2, and the external tables have partitions
registered as path keys in S3.  The final upload of data to S3 fails.  This
does not happen every time, but when it does the only option seems to be to
re-run the whole job.

Is there another way?  Perhaps a way to tell Hive to retry data uploads to
S3 when a failure occurs?  (Two tentative sketches are below, one after this
paragraph and one after the trace.) - Neal
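One knob that may be relevant, assuming the S3 access here really goes
through jets3t as the trace below suggests: jets3t reads a jets3t.properties
file from the classpath, and its s3service.internal-error-retry-max setting
(default 5) controls how many S3 "Internal Server" (HTTP 500) errors it
tolerates before aborting, which would line up with the "(6), aborting
request" in the trace.  A sketch of the change; the values 20 and 10 are
arbitrary:

# jets3t.properties -- place on the Hadoop/Hive classpath (e.g. in conf/)

# Retry more times on S3 Internal Server (HTTP 500) errors before
# jets3t gives up; the default is 5.  The value 20 is arbitrary.
s3service.internal-error-retry-max=20

# Possibly also raise the generic HTTP retry count (default 5).
httpclient.retry-max=10

This only stretches the existing retry loop; it does not make the job
commit itself retryable, so a long enough S3 outage would still fail it.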


[snip]
 map = 100%,  reduce =52%
 map = 100%,  reduce =88%
 map = 100%,  reduce =93%
 map = 100%,  reduce =100%
 map = 100%,  reduce =49%
 map = 100%,  reduce =100%
 map = 100%,  reduce =45%
 map = 100%,  reduce =86%
 map = 100%,  reduce =100%
 map = 100%,  reduce =67%
 map = 100%,  reduce =12%
 map = 100%,  reduce =14%
 map = 100%,  reduce =0%
 map = 100%,  reduce =17%
 map = 100%,  reduce =0%
 map = 100%,  reduce =45%
Ended Job = job_200907281406_0315
Job Commit failed with exception
'org.apache.hadoop.hive.ql.metadata.HiveException(org.apache.hadoop.fs.s3.S3Exception:
org.jets3t.service.S3ServiceException: Encountered too many S3 Internal
Server errors (6), aborting request.)'
FAILED: Execution Error, return code 3 from
org.apache.hadoop.hive.ql.exec.ExecDriver
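For completeness, the other workaround I have been considering (untested,
and all table and column names here are made up): stage the INSERT-SELECT
output into an HDFS-backed table first, so the expensive MapReduce work
survives an S3 hiccup, and retry only the final, cheap copy into the
S3-backed external table:

-- Hypothetical staging table on HDFS, not S3; schema is illustrative.
CREATE TABLE staging_hdfs (col1 STRING, col2 BIGINT)
PARTITIONED BY (dt STRING)
STORED AS SEQUENCEFILE;

-- Expensive step: once this succeeds, the computed data is safe on HDFS.
INSERT OVERWRITE TABLE staging_hdfs PARTITION (dt='2009-07-28')
SELECT col1, col2 FROM source_s3 WHERE dt='2009-07-28';

-- Cheap step: only this copy touches S3, so only this statement needs
-- re-running when the upload fails.
INSERT OVERWRITE TABLE dest_s3 PARTITION (dt='2009-07-28')
SELECT col1, col2 FROM staging_hdfs WHERE dt='2009-07-28';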
