Are you able to list the bucket with the AWS CLI (aws s3 ls)? It can be helpful to compare performance between NiFi and the AWS CLI, especially if you can run both from the same machine, with the same permissions, and with bucket and prefix settings as similar as you can manage.
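To make that comparison concrete, here is a minimal sketch (not taken from NiFi or the AWS tooling) that splits a combined "bucket/prefix" value into its two parts and prints the matching aws s3 ls command. The helper names split_bucket_and_prefix and aws_ls_command are hypothetical, and the example input is the value from the ListS3 screenshot discussed below; I am assuming the first path segment is the bucket and the rest is the prefix.

```python
import shlex


def split_bucket_and_prefix(value):
    """Split 'bucket/some/prefix' into ('bucket', 'some/prefix/').

    Assumption: the first path segment is the bucket name and everything
    after it is the key prefix.
    """
    bucket, _, prefix = value.strip("/").partition("/")
    # Add a trailing slash so the prefix is listed like a "folder".
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    return bucket, prefix


def aws_ls_command(value):
    """Build the 'aws s3 ls' command line for a combined bucket/prefix value."""
    bucket, prefix = split_bucket_and_prefix(value)
    return "aws s3 ls " + shlex.quote(f"s3://{bucket}/{prefix}")


if __name__ == "__main__":
    # Example input: the Bucket value as it appears in the screenshot.
    print(aws_ls_command("part-d-prescription-drug/unstructured"))
```

Running the printed command on the EC2 instance that hosts NiFi, with the same credentials, should show whether the listing works outside NiFi. The two parts returned by the first helper are what I would expect to go into separate Bucket and Prefix properties on ListS3.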
In the screenshot above, the bucket is shown as "part-d-prescription-drug/unstructured", which looks unusual to me. Is the bucket "part-d-prescription-drug" and the prefix "unstructured/"?

Thanks,

James

On Tue, Dec 12, 2017 at 7:34 AM, Aruna Sankaralingam <[email protected]> wrote:

> Joe,
>
> No, I don’t have anything in between AWS and NiFi.
> NiFi is installed in one of the EC2 instance in AWS – N.Virginia Region
> S3 is also in N.Virginia Region
>
> *From:* Joe Witt [mailto:[email protected]]
> *Sent:* Monday, December 11, 2017 1:28 PM
> *To:* [email protected]
> *Subject:* Re: ListS3 Processor Error
>
> The XML response is truncated for some reason as implied by the following. Do you have any devices/software/systems/proxies in between your NiFi and the amazon service? Are you able to manually issue the request and get the response you expect?
>
> 2017-12-11 18:01:02,875 ERROR [Timer-Driven Process Thread-6] org.apache.nifi.processors.aws.s3.ListS3 ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process session due to com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler: {}
> com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
>     at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseXmlInputStream(XmlResponsesSaxParser.java:156)
>     at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseListBucketObjectsResponse(XmlResponsesSaxParser.java:298)
>     at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsUnmarshaller.unmarshall(Unmarshallers.java:70)
>     at com.amazonaws.services.s3.model.transform.Unmarshallers$ListObjectsUnmarshaller.unmarshall(Unmarshallers.java:59)
>     at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:62)
>     at com.amazonaws.services.s3.internal.S3XmlResponseHandler.handle(S3XmlResponseHandler.java:31)
>     at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1444)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1151)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:964)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:676)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:650)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:633)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$300(AmazonHttpClient.java:601)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:583)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:447)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4137)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4079)
>     at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:819)
>     at org.apache.nifi.processors.aws.s3.ListS3$S3ObjectBucketLister.listVersions(ListS3.java:314)
>     at org.apache.nifi.processors.aws.s3.ListS3.onTrigger(ListS3.java:208)
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1119)
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:128)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.xml.sax.SAXParseException: Premature end of file.
>     at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
>     at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
>     at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
>     at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
>     at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1472)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1014)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
>     at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
>     at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
>     at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
>     at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
>     at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
>     at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.parseXmlInputStream(XmlResponsesSaxParser.java:142)
>     ... 32 common frames omitted
>
> On Mon, Dec 11, 2017 at 1:07 PM, Aruna Sankaralingam <[email protected]> wrote:
>
> Attached my nifi-app.log. Could you please let me know what went wrong?
>
> *From:* Joe Witt [mailto:[email protected]]
> *Sent:* Friday, December 08, 2017 4:04 PM
> *To:* [email protected]
> *Subject:* Re: ListS3 Processor Error
>
> Here is an example I found for another processor
>
> https://mail-archives.apache.org/mod_mbox/nifi-dev/201509.mbox/%3CCAFddr26AEVqnoQ=mWr7DSNDFVrr9NuYy9GCcXg=4fyycqab...@mail.gmail.com%3E
>
> Thanks
>
> On Fri, Dec 8, 2017 at 4:02 PM, Aruna Sankaralingam <[email protected]> wrote:
>
> Joe,
> Could you please let me know how to turn on the debug logging?
>
> *From:* Joe Witt [mailto:[email protected]]
> *Sent:* Friday, December 08, 2017 3:59 PM
> *To:* [email protected]
> *Subject:* Re: ListS3 Processor Error
>
> What version of NiFi?
>
> Looks like either a classpath/classloader issue OR the amazon client library cannot parse the response it is getting back...
>
> The logs/nifi-app.log should have the full stack trace. If not you can turn on debug logging for that processor and perhaps then it will.
>
> Thanks
>
> On Fri, Dec 8, 2017 at 3:56 PM, Aruna Sankaralingam <[email protected]> wrote:
>
> I am trying to get a pdf file from S3 and load to Elastic Search. The ListS3 processor is giving me this error. Could someone please let me know where I am going wrong?
>
> *20:52:25 UTC* *ERROR* *37d7226e-0160-1000-6049-d4c489cd32f3*
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process session due to com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
>
> *20:52:25 UTC* *WARNING* *37d7226e-0160-1000-6049-d4c489cd32f3*
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] Processor Administratively Yielded for 1 sec due to processing failure
>
> *20:52:26 UTC* *ERROR* *37d7226e-0160-1000-6049-d4c489cd32f3*
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process due to com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler; rolling back session: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
>
> *20:52:26 UTC* *ERROR* *37d7226e-0160-1000-6049-d4c489cd32f3*
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] failed to process session due to com.amazonaws.SdkClientException: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler: Failed to parse XML document with handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
>
> *20:52:26 UTC* *WARNING* *37d7226e-0160-1000-6049-d4c489cd32f3*
> ListS3[id=37d7226e-0160-1000-6049-d4c489cd32f3] Processor Administratively Yielded for 1 sec due to processing failure
