Thanks Abhishek. This helped.

On Fri, Apr 15, 2016 at 3:13 PM, Abhishek Girish <[email protected]>
wrote:

> Can you take a look at
>
> https://drill.apache.org/docs/s3-storage-plugin/#quering-parquet-format-files-on-s3
> ? It could be an issue with the connection to S3 timing out.
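>
> If it is the pool, the usual remedy is to raise fs.s3a.connection.maximum
> (the default in hadoop-aws 2.7 is only 15) in conf/core-site.xml. A minimal
> sketch, with an illustrative value; tune it to the query's parallelism:
>
>     <property>
>       <name>fs.s3a.connection.maximum</name>
>       <value>100</value>
>     </property>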
>
> On Fri, Apr 15, 2016 at 1:03 AM, Ashish Goel <[email protected]>
> wrote:
>
> > Hi,
> >
> > I am running a CTAS query to convert JSON data stored in S3 into Parquet
> > and store the result back in S3. Both the input and the output are S3
> > locations. Some of the Parquet files are created in S3, but not all of
> > them; after some time I receive this error message:
> >
> > *Error: DATA_READ ERROR: Failure reading JSON file - Unable to execute
> > HTTP request: Timeout waiting for connection from pool*
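> >
> > The CTAS has this shape (the plugin, workspace, and path names below are
> > placeholders, not my real ones):
> >
> >     ALTER SESSION SET `store.format` = 'parquet';
> >     CREATE TABLE s3.tmp.`events_parquet` AS
> >     SELECT * FROM s3.`input/events.json`;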
> >
> > Input JSON data set: 93 GB
> >
> > Number of rows in input data set: ~131 million
> >
> > From a Google search, this error indicates some kind of resource leak
> > while reading/writing data to S3, caused by not calling the close()
> > method on an S3 object. As I am able to run SELECT queries on the same
> > JSON data set without any such issues, I suspect the leak, if there is
> > one, to be around the S3 writes.
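> >
> > For reference, the leaky pattern those search results describe looks
> > like this with the AWS SDK (an illustrative sketch, not Drill's code;
> > the bucket and key are made up):
> >
> >     import java.io.InputStream;
> >     import com.amazonaws.services.s3.AmazonS3Client;
> >     import com.amazonaws.services.s3.model.S3Object;
> >
> >     AmazonS3Client s3 = new AmazonS3Client(); // default credential chain
> >     S3Object obj = s3.getObject("my-bucket", "input/events.json");
> >     InputStream in = obj.getObjectContent(); // borrows a pooled HTTP connection
> >     try {
> >         // ... read the stream ...
> >     } finally {
> >         in.close(); // skipping this leaks the connection; once the pool
> >                     // is empty, requests fail with the timeout above
> >     }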
> >
> > Has anyone encountered a similar issue before?
> >
> > Also, I am able to create a table from the S3 data and store it in my
> > local file system using the dfs storage plugin. But queries against the
> > dfs data then return only a partial view of the data (just what each
> > drillbit wrote to its own local disk, not the data across the entire
> > cluster), which makes this option unviable for my use case.
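> >
> > I assume dfs would only behave correctly here if it pointed at storage
> > shared by every drillbit (NFS, HDFS, or similar). Something like this
> > workspace in the dfs plugin config, where /shared/drill is a cluster-wide
> > mount (the path is hypothetical):
> >
> >     "workspaces": {
> >       "tmp": {
> >         "location": "/shared/drill/tmp",
> >         "writable": true,
> >         "defaultInputFormat": null
> >       }
> >     }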
> >
> >
> > Detailed stack trace from one of the drillbits:
> >
> > at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) ~[drill-common-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:318) [drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:185) [drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:287) [drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.6.0.jar:1.6.0]
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_99]
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_99]
> > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_99]
> > Caused by: com.amazonaws.AmazonClientException: Unable to execute HTTP request: null
> > at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:454) ~[aws-java-sdk-1.7.4.jar:na]
> > at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232) ~[aws-java-sdk-1.7.4.jar:na]
> > at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528) ~[aws-java-sdk-1.7.4.jar:na]
> > at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:976) ~[aws-java-sdk-1.7.4.jar:na]
> > at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:956) ~[aws-java-sdk-1.7.4.jar:na]
> > at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:892) ~[hadoop-aws-2.7.1.jar:na]
> > at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77) ~[hadoop-aws-2.7.1.jar:na]
> > at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) ~[hadoop-common-2.7.1.jar:na]
> > at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:400) ~[hadoop-aws-2.7.1.jar:na]
> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) ~[hadoop-common-2.7.1.jar:na]
> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) ~[hadoop-common-2.7.1.jar:na]
> > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) ~[hadoop-common-2.7.1.jar:na]
> > at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:228) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
> > at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:183) ~[parquet-hadoop-1.8.1-drill-r0.jar:1.8.1-drill-r0]
> > at org.apache.drill.exec.store.parquet.ParquetRecordWriter.endRecord(ParquetRecordWriter.java:364) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.store.EventBasedRecordWriter.write(EventBasedRecordWriter.java:65) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:106) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:92) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:257) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:251) ~[drill-java-exec-1.6.0.jar:1.6.0]
> > at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_99]
> > at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_99]
> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:na]
> > at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:251) [drill-java-exec-1.6.0.jar:1.6.0]
> > ... 4 common frames omitted
> > Caused by: java.io.InterruptedIOException: null
> > at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:459) ~[httpclient-4.2.5.jar:4.2.5]
> > at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) ~[httpclient-4.2.5.jar:4.2.5]
> > at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) ~[httpclient-4.2.5.jar:4.2.5]
> > at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:384) ~[aws-java-sdk-1.7.4.jar:na]
> > ... 30 common frames omitted
> >
> > 2016-04-15 07:25:51,722 [28ef693e-604b-cb6e-6562-a18377d3b10c:frag:1:127] INFO  o.a.d.e.w.fragment.FragmentExecutor - 28ef693e-604b-cb6e-6562-a18377d3b10c:1:127: State change requested FAILED --> FINISHED
> >
> > Appreciate any response from the community.
> >
> > --
> > Thanks,
> > Ashish
> >
>



-- 
Thanks,
Ashish
