Thank you!

I tried with "set hive.optimize.cp=false;", then it works!
However, it reduce the merit of RCFile which skip unnecessary data.
It may be better to use *SequenceFile in the present state of things?*
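
For reference, a minimal sketch of the two options, assuming a table named rcfile_logs with an "id" column as in my original query (the table, column, and partition names are guessed from the paths in the log, not the real schema):

  -- Option 1: keep RCFile, disable column pruning for the session (the workaround above)
  set hive.optimize.cp=false;
  select id from rcfile_logs;

  -- Option 2: fall back to SequenceFile until RCFile works on S3
  create table seqfile_logs (id string)
    partitioned by (dt string, controller string)
    stored as sequencefile;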

--
Shunsuke Mikami

2011/3/23 yongqiang he <heyongqiang...@gmail.com>

> I don't know the reason, but can you try with "set hive.optimize.cp=false;"?
> The problem seems to be that it reports an error when trying to skip some data.
>
> On Tue, Mar 22, 2011 at 7:27 AM, Shunsuke Mikami <shun0...@gmail.com>
> wrote:
> > Hi all,
> > I am testing RCFile on S3.
> > I can execute queries that don't specify columns, such as "select * from table",
> > but I cannot execute queries that do specify columns, such as "select id from table".
> > The job progresses to near the end of a map task, but cannot finish it, as the log below shows:
> > 2011-03-22 17:12:04,325 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50000365'
> > 2011-03-22 17:12:04,362 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50458664'
> > 2011-03-22 17:12:04,444 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50603753'
> > 2011-03-22 17:12:04,509 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50651845'
> > 2011-03-22 17:12:04,536 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50735249'
> > 2011-03-22 17:12:04,570 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50956751'
> > 2011-03-22 17:12:04,600 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '51025754'
> > 2011-03-22 17:12:04,633 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 9 finished. closing...
> > ...
> > 2011-03-22 17:12:05,167 WARN org.apache.hadoop.mapred.Child: Error running child
> > org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 GET failed for '/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><Message>The requested range is not satisfiable</Message><ActualObjectSize>51025754</ActualObjectSize><RequestId>***</RequestId><HostId>***</HostId><RangeRequested>bytes=51025754-</RangeRequested></Error>
> >     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:229)
> >     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:220)
> >     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:133)
> >     at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> >     at org.apache.hadoop.fs.s3native.$Proxy1.retrieve(Unknown Source)
> >     at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:150)
> >     at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:76)
> >     at org.apache.hadoop.fs.BufferedFSInputStream.skip(BufferedFSInputStream.java:56)
> >     at java.io.DataInputStream.skipBytes(DataInputStream.java:203)
> >     at org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer.readFields(RCFile.java:443)
> >     at org.apache.hadoop.hive.ql.io.RCFile$Reader.currentValueBuffer(RCFile.java:1304)
> >     at org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1425)
> >     at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:88)
> >     at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:39)
> >     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
> >     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
> >     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
> >     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
> >     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
> >     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:390)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:234)
> > Caused by: org.jets3t.service.S3ServiceException: S3 GET failed for '/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><Message>The requested range is not satisfiable</Message><ActualObjectSize>51025754</ActualObjectSize><RequestId>4E5BD7E6D94DBA1B</RequestId><HostId>l+oM6yDUt+MbQgDB4pzcGckUQ1E7pbaUGy26yuTqNE4Gn+FdiJIA6u4VvsQl2+aR</HostId><RangeRequested>bytes=51025754-</RangeRequested></Error>
> >     at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:424)
> >     at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestGet(RestS3Service.java:686)
> >     at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1558)
> >     at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1501)
> >     at org.jets3t.service.S3Service.getObject(S3Service.java:1876)
> >     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:129)
> >     ... 28 more
> > 2011-03-22 17:12:05,170 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> >
> > The client seems to request an invalid range, as this error shows:
> > S3 GET failed for '/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0'
> > XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error>
> > <Code>InvalidRange</Code>
> > <Message>The requested range is not satisfiable</Message>
> > <ActualObjectSize>51025754</ActualObjectSize>
> > <RequestId>***</RequestId>
> > <HostId>***</HostId>
> > <RangeRequested>bytes=51025754-</RangeRequested></Error>
> > The requested range starts at byte 51025754, which equals the object size, i.e. a read past the last byte of the file.
> > This error did not occur on HDFS, so I guess this is a bug.
> > Or has anyone been able to run queries using RCFile on S3?
> > Thanks,
> > --
> > Shunsuke Mikami
> > shun0...@gmail.com
>
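
P.S. For anyone else who wants to reproduce this, a minimal sketch is below. The bucket name and the "id" column are placeholders, and the partition layout (dt/controller) is only inferred from the paths in the log, so the real schema may differ:

  -- Hypothetical external table over the existing RCFile data on S3
  create external table rcfile_logs (id string)
    partitioned by (dt string, controller string)
    stored as rcfile
    location 's3n://YOUR-BUCKET/user/hive/warehouse/rcfile_logs';

  alter table rcfile_logs add partition (dt='20110312', controller='recipe');

  -- Works: all columns are read, so nothing is skipped
  select * from rcfile_logs;

  -- Fails with the InvalidRange error above (unless hive.optimize.cp=false is set)
  select id from rcfile_logs;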
