Make the AWS-credentials-related objects transient and initialize them in the
operator's setup() call.

The troubleshooting guide has more details about the Kryo exceptions:
http://docs.datatorrent.com/troubleshooting/
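A minimal sketch of the pattern, using plain JDK serialization in place of Kryo and a placeholder CredentialsProvider class (the real operator would hold something like com.amazonaws.internal.StaticCredentialsProvider and an AmazonS3 client; the class and field names below are hypothetical): objects without a no-arg constructor are marked transient so the serializer skips them, and are rebuilt in the operator's setup() callback after deserialization.

```java
import java.io.*;

// Stand-in for an AWS credentials object that has no no-arg constructor
// and therefore cannot be deserialized by Kryo.
class CredentialsProvider {
    private final String accessKey;
    CredentialsProvider(String accessKey) { this.accessKey = accessKey; }
}

// Sketch of an operator: the provider is transient, so serialization
// skips it; setup() recreates it in the deployed container.
class S3InputOperator implements Serializable {
    private String accessKey = "example-key";       // plain state: serialized
    private transient CredentialsProvider provider; // skipped by serializer

    void setup() { // called once per deploy, after deserialization
        provider = new CredentialsProvider(accessKey);
    }

    boolean ready() { return provider != null; }
}

public class TransientDemo {
    public static void main(String[] args) throws Exception {
        S3InputOperator op = new S3InputOperator();
        op.setup();

        // Round-trip through serialization, as the engine does on deploy.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(op);
        oos.flush();
        S3InputOperator copy = (S3InputOperator) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();

        System.out.println(copy.ready()); // transient field was not restored
        copy.setup();                     // re-initialize in setup()
        System.out.println(copy.ready());
    }
}
```

With the provider transient, the serializer never touches the class that lacks a no-arg constructor, which is what avoids the KryoException.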


On Thu, Jun 30, 2016 at 9:38 AM Doyle, Austin O. <
[email protected]> wrote:

> I have a 4-data-node Apex system using Hadoop distribution CDH-5.6.0 and
> Apex version incubator-apex-core-3.3.0-incubating.  I have an application
> that should stream a file from Amazon S3.  If the file (a tar.gz file
> containing a single file of newline-delimited JSON records) is small
> (maybe a thousand records), there are no issues.  If I try a 5 GB file,
> the stream works for a while and I am able to process maybe 200,000 of the
> 1,000,000 records (the amount changes every run, sometimes more processed,
> sometimes less), and then exceptions are thrown such as:
>
>
>
> -com.esotericsoftware.kryo.KryoException: Class cannot be created (missing
> no-arg constructor): com.amazonaws.internal.StaticCredentialsProvider
>
>
>
> -java.lang.IllegalStateException: Deploy request failed:
> [OperatorDeployInfo
>
>
>
> -WARN com.datatorrent.netlet.OptimizedEventLoop: Exception on unattached
> SelectionKey sun.nio.ch.SelectionKeyImpl@29369cb9
>
> java.io.IOException: Broken pipe
>
>
>
> Once these exceptions come up, the KryoExceptions continue and no more
> data is processed.  Is there something that needs to be done in the
> Apex operators to handle processing large files (streams)?
>
>
>
