Thank you! I tried with "set hive.optimize.cp=false;", and it works! However, it reduces the merit of RCFile, which skips unnecessary data. Might it be better to use SequenceFile in the present state of things?
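For reference, a minimal sketch of the workaround session (the table name is taken from the warehouse path in the quoted log; the real schema may differ):

```sql
-- Disable column pruning for this session only. Hive then reads whole
-- rows from the RCFile instead of skipping unselected column data,
-- which avoids the failing skip/seek on S3, at the cost of reading
-- more bytes per query.
set hive.optimize.cp=false;

-- The column-projecting query that previously failed on S3:
select id from rcfile_logs;
```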
--
Shunsuke Mikami

2011/3/23 yongqiang he <heyongqiang...@gmail.com>:
> Don't know the reason, but can you try with "set hive.optimize.cp=false;"?
> The problem seems to be that it reports an error when trying to skip some data.
>
> On Tue, Mar 22, 2011 at 7:27 AM, Shunsuke Mikami <shun0...@gmail.com> wrote:
> > Hi all,
> >
> > I am testing RCFile on S3.
> > I could execute queries which don't specify columns, such as "select * from table".
> > But I could not execute queries which specify columns, such as "select id from table".
> > The job progresses to near the end of a map task, but cannot finish the task, as the log below shows.
> >
> > 2011-03-22 17:12:04,325 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50000365'
> > 2011-03-22 17:12:04,362 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50458664'
> > 2011-03-22 17:12:04,444 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50603753'
> > 2011-03-22 17:12:04,509 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50651845'
> > 2011-03-22 17:12:04,536 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50735249'
> > 2011-03-22 17:12:04,570 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '50956751'
> > 2011-03-22 17:12:04,600 INFO org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key 'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for reading at position '51025754'
> > 2011-03-22 17:12:04,633 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 9 finished. closing...
> > ...
> > 2011-03-22 17:12:05,167 WARN org.apache.hadoop.mapred.Child: Error running child
> > org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 GET failed for '/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><Message>The requested range is not satisfiable</Message><ActualObjectSize>51025754</ActualObjectSize><RequestId>***</RequestId><HostId>***</HostId><RangeRequested>bytes=51025754-</RangeRequested></Error>
> >     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:229)
> >     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:220)
> >     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:133)
> >     at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> >     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> >     at org.apache.hadoop.fs.s3native.$Proxy1.retrieve(Unknown Source)
> >     at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:150)
> >     at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:76)
> >     at org.apache.hadoop.fs.BufferedFSInputStream.skip(BufferedFSInputStream.java:56)
> >     at java.io.DataInputStream.skipBytes(DataInputStream.java:203)
> >     at org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer.readFields(RCFile.java:443)
> >     at org.apache.hadoop.hive.ql.io.RCFile$Reader.currentValueBuffer(RCFile.java:1304)
> >     at org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1425)
> >     at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:88)
> >     at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:39)
> >     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
> >     at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
> >     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
> >     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
> >     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
> >     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> >     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:390)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:234)
> > Caused by: org.jets3t.service.S3ServiceException: S3 GET failed for '/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><Message>The requested range is not satisfiable</Message><ActualObjectSize>51025754</ActualObjectSize><RequestId>4E5BD7E6D94DBA1B</RequestId><HostId>l+oM6yDUt+MbQgDB4pzcGckUQ1E7pbaUGy26yuTqNE4Gn+FdiJIA6u4VvsQl2+aR</HostId><RangeRequested>bytes=51025754-</RangeRequested></Error>
> >     at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:424)
> >     at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestGet(RestS3Service.java:686)
> >     at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1558)
> >     at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1501)
> >     at org.jets3t.service.S3Service.getObject(S3Service.java:1876)
> >     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:129)
> >     ... 28 more
> > 2011-03-22 17:12:05,170 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> >
> > The client seems to request an invalid range, as this error message shows:
> >
> > S3 GET failed for '/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0'
> > XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error>
> > <Code>InvalidRange</Code>
> > <Message>The requested range is not satisfiable</Message>
> > <ActualObjectSize>51025754</ActualObjectSize>
> > <RequestId>***</RequestId>
> > <HostId>***</HostId>
> > <RangeRequested>bytes=51025754-</RangeRequested></Error>
> >
> > This error did not occur on HDFS, so I guess this is a bug.
> > Or has anyone been able to run queries using RCFile on S3?
> >
> > Thanks,
> >
> > --
> > Shunsuke Mikami
> > shun0...@gmail.com
>
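The quoted trace makes the failure mode visible: RCFile's column skipping (DataInputStream.skipBytes -> BufferedFSInputStream.skip -> seek) lands exactly at end-of-file, and NativeS3FileSystem services the seek by re-opening the object with an open-ended HTTP range request, "bytes=51025754-", which S3 rejects because the start offset equals the 51025754-byte object size, while an HDFS seek to EOF simply succeeds. A minimal Python sketch of that arithmetic (a hypothetical model, not the Hadoop code):

```python
# Hypothetical model of the failing skip. Numbers come from the quoted
# error message (ActualObjectSize / RangeRequested); function names are
# illustrative, not Hadoop APIs.

OBJECT_SIZE = 51025754  # ActualObjectSize from the InvalidRange error


def range_get_satisfiable(start: int, size: int) -> bool:
    """An open-ended range "bytes=start-" is satisfiable only if start < size."""
    return 0 <= start < size


def skip_columns(pos: int, skip: int, size: int) -> int:
    """Model skipping `skip` bytes of unselected column data by seeking.

    When the skip target equals the object size (the skipped column runs
    to end-of-file), HDFS treats the seek as a no-op, but a naive S3
    range GET at that offset is unsatisfiable (HTTP 416 / InvalidRange).
    """
    target = pos + skip
    if not range_get_satisfiable(target, size):
        raise IOError(f"InvalidRange: bytes={target}- of {size}-byte object")
    return target


# The request from the log, RangeRequested=bytes=51025754-, starts at
# exactly the object size and so can never be satisfied:
print(range_get_satisfiable(OBJECT_SIZE, OBJECT_SIZE))  # False
```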