Oh... I forgot that Crunch is only an abstraction over MapReduce pipelines.
But has anyone tried using it with S3 job output? It's strange — sorry,
it seems the job freezes after writing the _SUCCESS marker to S3. The
last lines in my job log file are below:

2016-09-22 10:05:37,194 INFO
org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob
(Thread-5): Job status available at:
http://ip-172-31-103-28.cn-north-1.compute.internal:20888/proxy/application_1472715051930_0002/

2016-09-22 10:12:13,692 INFO
com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream
(Thread-5): close closed:false
s3://mgtv-ott-data-archive/vodstat-output/ov/year=2016/month=09/day=21/_SUCCESS

2016-09-22 1:09 GMT+08:00 Josh Wills <josh.wi...@gmail.com>:
> I don't follow- Hadoop handles compression transparently for most of the
> commonly used input formats and compression schemes; you shouldn't have to
> do anything.
>
> On Wed, Sep 21, 2016 at 12:53 AM wu lihu <routermanwul...@gmail.com> wrote:
>>
>> Hi Everyone
>>   I want to ask a question about processing log files that end up as
>> compressed files. Is there any example of that?
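
For reference, the transparent handling described above works by codec
lookup on the file extension (what Hadoop's CompressionCodecFactory does
for `.gz`, `.bz2`, etc.), so the input format decompresses records without
the pipeline doing anything. A rough Python sketch of that dispatch idea —
the `open_log` helper and file names are made up for illustration, not
part of Hadoop or Crunch:

```python
import bz2
import gzip
import os

# Map file extensions to opener functions, roughly the way Hadoop's
# CompressionCodecFactory maps ".gz" -> GzipCodec, ".bz2" -> BZip2Codec.
CODECS = {".gz": gzip.open, ".bz2": bz2.open}

def open_log(path):
    """Open a log file, transparently decompressing based on its extension."""
    _, ext = os.path.splitext(path)
    opener = CODECS.get(ext, open)  # fall back to plain open for uncompressed files
    return opener(path, "rt")

# Write a gzipped log, then read it back without the call site
# knowing or caring that the file is compressed.
with gzip.open("access.log.gz", "wt") as f:
    f.write("GET /index.html 200\n")

with open_log("access.log.gz") as f:
    print(f.read().strip())  # GET /index.html 200
```

The point is that the caller's code path is identical for compressed and
plain files; Hadoop's input formats give you the same property, which is
why no extra handling is needed in the Crunch pipeline.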