Can you take a look at
https://drill.apache.org/docs/s3-storage-plugin/#quering-parquet-format-files-on-s3
? It could be an issue with the S3 connection timing out.
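
"Timeout waiting for connection from pool" usually means the S3A client ran out of pooled HTTP connections. As a sketch (the property names are from the Hadoop S3A client; the values shown are assumptions, tune them for your cluster), you could try raising the pool size and socket timeout in Drill's conf/core-site.xml:

```xml
<configuration>
  <!-- S3A keeps a pool of HTTP connections; the default pool size (15 in
       Hadoop 2.7) can be exhausted by a wide CTAS with many parallel
       fragments. Value below is an assumption, not a recommendation. -->
  <property>
    <name>fs.s3a.connection.maximum</name>
    <value>100</value>
  </property>
  <!-- Socket timeout for S3 requests, in milliseconds. -->
  <property>
    <name>fs.s3a.connection.timeout</name>
    <value>200000</value>
  </property>
</configuration>
```

A restart of the drillbits would be needed for the change to take effect.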

On Fri, Apr 15, 2016 at 1:03 AM, Ashish Goel <[email protected]>
wrote:

> Hi,
>
> I am running a CTAS query to convert JSON data stored in S3 into Parquet
> and store it back into S3. Both the input and output are S3 locations.
> Some of the Parquet files are created in S3, but not all of them. I
> receive this error message after some time -
>
> *Error: DATA_READ ERROR: Failure reading JSON file - Unable to execute HTTP
> request: Timeout waiting for connection from pool*
>
> Input JSON Data Set - 93GB
>
> Number of Rows in input data set - ~131 Million
>
> From a Google search, this indicates some kind of resource leak while
> reading/writing data to S3, caused by not calling the close() method
> on an S3 object. As I am able to run select queries on the same JSON data
> set without any such issue, I suspect the leak, if there is one, is
> around S3 writes.
>
> Has anyone encountered a similar issue before?
>
> Also, I am able to create the table from the S3 data and store it in my
> local fs using the dfs storage plugin. But then queries against the dfs
> data return a partial view of the data (only what is stored locally on
> that node, not the entire cluster), which makes this option unviable for
> my use case.
>
>
> Detailed stack trace from one of the drillbits -
>
> at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) ~[drill-common-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:318) [drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:185) [drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:287) [drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.6.0.jar:1.6.0]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_99]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_99]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_99]
>
> Caused by: com.amazonaws.AmazonClientException: Unable to execute HTTP request: null
> at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:454) ~[aws-java-sdk-1.7.4.jar:na]
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232) ~[aws-java-sdk-1.7.4.jar:na]
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528) ~[aws-java-sdk-1.7.4.jar:na]
> at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:976) ~[aws-java-sdk-1.7.4.jar:na]
> at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:956) ~[aws-java-sdk-1.7.4.jar:na]
> at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:892) ~[hadoop-aws-2.7.1.jar:na]
> at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77) ~[hadoop-aws-2.7.1.jar:na]
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) ~[hadoop-common-2.7.1.jar:na]
> at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:400) ~[hadoop-aws-2.7.1.jar:na]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) ~[hadoop-common-2.7.1.jar:na]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) ~[hadoop-common-2.7.1.jar:na]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) ~[hadoop-common-2.7.1.jar:na]
> at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:228) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
> at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:183) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
> at org.apache.drill.exec.store.parquet.ParquetRecordWriter.endRecord(ParquetRecordWriter.java:364) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.store.EventBasedRecordWriter.write(EventBasedRecordWriter.java:65) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:106) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:92) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:257) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:251) ~[drill-java-exec-1.6.0.jar:1.6.0]
> at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_99]
> at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_99]
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:na]
> at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:251) [drill-java-exec-1.6.0.jar:1.6.0]
> ... 4 common frames omitted
>
> Caused by: java.io.InterruptedIOException: null
> at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:459) ~[httpclient-4.2.5.jar:4.2.5]
> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) ~[httpclient-4.2.5.jar:4.2.5]
> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) ~[httpclient-4.2.5.jar:4.2.5]
> at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:384) ~[aws-java-sdk-1.7.4.jar:na]
> ... 30 common frames omitted
>
> 2016-04-15 07:25:51,722 [28ef693e-604b-cb6e-6562-a18377d3b10c:frag:1:127]
> INFO  o.a.d.e.w.fragment.FragmentExecutor -
> 28ef693e-604b-cb6e-6562-a18377d3b10c:1:127: State change requested FAILED
> --> FINISHED
>
> Appreciate any response from the community.
>
> --
> Thanks,
> Ashish
>
