[jira] [Created] (FLINK-9075) BucketingSink S3 does not work on local cluster
dejan miljkovic created FLINK-9075:
--------------------------------------

Summary: BucketingSink S3 does not work on local cluster
Key: FLINK-9075
URL: https://issues.apache.org/jira/browse/FLINK-9075
Project: Flink
Issue Type: Bug
Components: Streaming Connectors
Affects Versions: 1.4.2
Reporter: dejan miljkovic

I am trying to write to S3 using BucketingSink. The error below occurs when the code is executed on a local Flink 1.4.2 cluster. The same code works when run from IntelliJ. I followed the S3 setup procedure from the documentation (copied flink-s3-fs-hadoop-1.4.2.jar to lib). I have reported similar issues before. It looks like they were all related to class loading.

On [https://github.com/dmiljkovic/test-flink-bucketingsink-s3] I provided code that reproduces the error below. The pom.xml contains more than is needed; I just copied it from a project that needs to write to S3.

javax.xml.parsers.FactoryConfigurationError: Provider for class javax.xml.parsers.DocumentBuilderFactory cannot be created
    at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:311)
    at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267)
    at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:120)
    at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2567)
    at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2543)
    at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2426)
    at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.conf.Configuration.get(Configuration.java:1240)
    at org.apache.flink.fs.s3hadoop.S3FileSystemFactory.create(S3FileSystemFactory.java:98)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:397)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1126)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:258)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:748)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
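One way to narrow down a class loading problem like the one reported above is to check which DocumentBuilderFactory implementation the JVM resolves and which class loader defined it. The following is only a minimal sketch and not part of the original report: the class name XmlFactoryProbe and its describe() helper are hypothetical, and it relies solely on the standard JAXP and java.lang APIs. Comparing its output in the IDE with its output inside the job on the cluster might show whether a different provider is picked up there.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical diagnostic helper (not from the report or from Flink).
public class XmlFactoryProbe {

    /** Returns the resolved factory class and the class loader that defined it. */
    public static String describe() {
        // Same entry point as the failing frame in the trace:
        // DocumentBuilderFactory.newInstance().
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Class<?> impl = factory.getClass();
        ClassLoader loader = impl.getClassLoader();
        // A null loader means the class was defined by the JVM's bootstrap loader,
        // i.e. it is the JDK's built-in implementation.
        String loaderName = (loader == null) ? "bootstrap" : loader.toString();
        return impl.getName() + " loaded by " + loaderName;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the call itself throws the FactoryConfigurationError on the cluster, that by itself suggests the provider lookup is seeing a factory class it cannot instantiate from the current class loader.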
[jira] [Updated] (FLINK-9075) BucketingSink S3 does not work on local cluster
[ https://issues.apache.org/jira/browse/FLINK-9075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dejan miljkovic updated FLINK-9075:
-----------------------------------
Priority: Blocker (was: Major)

> BucketingSink S3 does not work on local cluster
> -----------------------------------------------
>
> Key: FLINK-9075
> URL: https://issues.apache.org/jira/browse/FLINK-9075
> Project: Flink
> Issue Type: Bug
> Components: Streaming Connectors
> Affects Versions: 1.4.2
> Reporter: dejan miljkovic
> Priority: Blocker
[jira] [Commented] (FLINK-8720) Logging exception with S3 connector and BucketingSink
[ https://issues.apache.org/jira/browse/FLINK-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379167#comment-16379167 ]

dejan miljkovic commented on FLINK-8720:
-----------------------------------------

I see from FLINK-8798 that the fix is going to be included in 1.4.2. Do you know when 1.4.2 is going to be released?

> Logging exception with S3 connector and BucketingSink
> -----------------------------------------------------
>
> Key: FLINK-8720
> URL: https://issues.apache.org/jira/browse/FLINK-8720
> Project: Flink
> Issue Type: Bug
> Components: Streaming Connectors
> Affects Versions: 1.4.1
> Reporter: dejan miljkovic
> Priority: Critical
[jira] [Commented] (FLINK-8720) Logging exception with S3 connector and BucketingSink
[ https://issues.apache.org/jira/browse/FLINK-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377835#comment-16377835 ]

dejan miljkovic commented on FLINK-8720:
-----------------------------------------

Thanks for the suggestions. Removing the Hadoop dependencies from the jar did solve the problem! The application is built on Flink 1.4.1. It looks like more work is needed on the class loading logic. I noticed that the application works in IntelliJ with some versions of the jars but does not work when submitted to the cluster. I tried the "parent-first" option, but that required adding more dependencies to pom.xml. I did not proceed with this because it would affect other applications that are deployed. Once more, thanks a lot for the response and the solution.

> Logging exception with S3 connector and BucketingSink
> -----------------------------------------------------
>
> Key: FLINK-8720
> URL: https://issues.apache.org/jira/browse/FLINK-8720
> Project: Flink
> Issue Type: Bug
> Components: Streaming Connectors
> Affects Versions: 1.4.1
> Reporter: dejan miljkovic
> Priority: Critical
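Since removing the Hadoop dependencies from the job jar resolved the error, the same class was presumably visible from more than one place (the job jar and the cluster's lib directory). A minimal sketch of how such duplicates could be listed follows. It is an illustration only, not something from the thread: the class DuplicateClassFinder and its locations() method are hypothetical names, and the code uses only the standard ClassLoader.getResources call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper (not from the thread or from Flink).
public class DuplicateClassFinder {

    /** Lists every location from which the given class loader can see the named class. */
    public static List<String> locations(String className, ClassLoader loader) {
        // Class files are looked up as resources, e.g. "org/example/Foo.class".
        String resource = className.replace('.', '/') + ".class";
        List<String> found = new ArrayList<>();
        try {
            for (URL url : Collections.list(loader.getResources(resource))) {
                found.add(url.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Default to the class named in the IllegalAccessError; pass another name as an argument.
        String name = args.length > 0 ? args[0] : "org.apache.commons.logging.impl.LogFactoryImpl";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        List<String> hits = locations(name, loader);
        System.out.println(name + " found in " + hits.size() + " location(s)");
        for (String hit : hits) {
            System.out.println("  " + hit);
        }
    }
}
```

More than one location for the same class name is a hint, not proof, of a conflict; which copy wins depends on the class loader's resolution order.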
[jira] [Updated] (FLINK-8720) Logging exception with S3 connector and BucketingSink
[ https://issues.apache.org/jira/browse/FLINK-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dejan miljkovic updated FLINK-8720:
-----------------------------------
Summary: Logging exception with S3 connector and BucketingSink (was: Logging exception )

> Logging exception with S3 connector and BucketingSink
> -----------------------------------------------------
>
> Key: FLINK-8720
> URL: https://issues.apache.org/jira/browse/FLINK-8720
> Project: Flink
> Issue Type: Bug
> Components: Streaming Connectors
> Affects Versions: 1.4.1
> Reporter: dejan miljkovic
> Priority: Critical
[jira] [Created] (FLINK-8720) Logging exception
dejan miljkovic created FLINK-8720:
--------------------------------------

Summary: Logging exception
Key: FLINK-8720
URL: https://issues.apache.org/jira/browse/FLINK-8720
Project: Flink
Issue Type: Bug
Components: Streaming Connectors
Affects Versions: 1.4.1
Reporter: dejan miljkovic

I am trying to stream data to S3. The code works from IntelliJ. When I submit the job through the UI on my machine (a single-node cluster started by the start-cluster.sh script), the stack trace below is produced.

Below is the link to a simple test app that streams data to S3.
[https://github.com/dmiljkovic/test-flink-bucketingsink-s3]
Its behavior is a bit different, but the same error is produced. The job works only once. If the job is submitted a second time, the stack trace below is produced. If I restart the cluster, the job works again, but only the first time.

org.apache.commons.logging.LogConfigurationException: java.lang.IllegalAccessError: org/apache/commons/logging/impl/LogFactoryImpl$3 (Caused by java.lang.IllegalAccessError: org/apache/commons/logging/impl/LogFactoryImpl$3)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:637)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
    at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
    at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:76)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:102)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:88)
    at org.apache.http.impl.conn.PoolingClientConnectionManager.<init>(PoolingClientConnectionManager.java:96)
    at com.amazonaws.http.ConnectionManagerFactory.createPoolingClientConnManager(ConnectionManagerFactory.java:26)
    at com.amazonaws.http.HttpClientFactory.createHttpClient(HttpClientFactory.java:96)
    at com.amazonaws.http.AmazonHttpClient.<init>(AmazonHttpClient.java:158)
    at com.amazonaws.AmazonWebServiceClient.<init>(AmazonWebServiceClient.java:119)
    at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:389)
    at com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:371)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:235)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1206)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:258)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalAccessError: org/apache/commons/logging/impl/LogFactoryImpl$3
    at org.apache.commons.logging.impl.LogFactoryImpl.getParentClassLoader(LogFactoryImpl.java:700)
    at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1187)
    at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:914)
    at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
    ... 26 more
[jira] [Commented] (FLINK-8628) BucketingSink does not work with S3
[ https://issues.apache.org/jira/browse/FLINK-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366112#comment-16366112 ]

dejan miljkovic commented on FLINK-8628:
-----------------------------------------

Sorry, I cannot reproduce the issue; I lost the pom.xml that was producing this problem. I am still not able to write to S3, but I am now getting a different error. Interestingly, the job works from IntelliJ but produces the error below when executed on a local cluster.

javax.xml.parsers.FactoryConfigurationError: Provider for class javax.xml.parsers.DocumentBuilderFactory cannot be created
    at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:311)
    at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267)
    at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:120)
    at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2567)
    at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2543)
    at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2426)
    at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.conf.Configuration.get(Configuration.java:1240)
    at org.apache.flink.fs.s3hadoop.S3FileSystemFactory.create(S3FileSystemFactory.java:98)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:397)
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1125)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:748)

> BucketingSink does not work with S3
> -----------------------------------
>
> Key: FLINK-8628
> URL: https://issues.apache.org/jira/browse/FLINK-8628
> Project: Flink
> Issue Type: Bug
> Components: FileSystem, Streaming
> Affects Versions: 1.4.0
> Reporter: dejan miljkovic
> Priority: Blocker
> Fix For: 1.5.0
[jira] [Commented] (FLINK-8628) BucketingSink does not work with S3
[ https://issues.apache.org/jira/browse/FLINK-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364513#comment-16364513 ]

dejan miljkovic commented on FLINK-8628:

Thanks for the comments. Do you know when 1.4.1 will be released? I hope 1.4.1 will include a test that verifies writing to S3 with BucketingSink.

> BucketingSink does not work with S3
> ---
>
> Key: FLINK-8628
> URL: https://issues.apache.org/jira/browse/FLINK-8628
> Project: Flink
> Issue Type: Bug
> Components: FileSystem, Streaming
> Affects Versions: 1.4.0
> Reporter: dejan miljkovic
> Priority: Blocker
> Fix For: 1.5.0
>
>
> BucketingSink does not work with S3. I followed the instructions provided at
> [https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/aws.html]
> but got the exception below. Several people have reported the same issue:
> [http://mail-archives.apache.org/mod_mbox/flink-user/201801.mbox/%3CCADAFrT9T6WQa25HXR1z1NaL=n8wP9s7aSXxZWxHy=hubo93...@mail.gmail.com%3E]
> [https://lists.apache.org/thread.html/%3CCADAFrT9T6WQa25HXR1z1NaL=n8wP9s7aSXxZWxHy=hubo93...@mail.gmail.com%3E]
> [http://mail-archives.apache.org/mod_mbox/flink-user/201801.mbox/%3CCADAFrT-i+vGe64e__=-dnu4pmpxhvyzvkfqzrhgxbeyhnwa...@mail.gmail.com%3E]
> I don't see any specific bug filed for this.
>
> java.lang.RuntimeException: Error while creating FileSystem when initializing the state of the BucketingSink.
>     at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:358)
>     at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
>     at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
>     at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Cannot instantiate file system for URI: hdfs://localhost:12345/
>     at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:187)
>     at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
>     at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1154)
>     at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
>     at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
>     ... 9 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.ipc.ProtobufRpcEngine cannot be cast to org.apache.hadoop.ipc.RpcEngine
>     at org.apache.hadoop.ipc.RPC.getProtocolEngine(RPC.java:211)
>     at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:583)
>     at org.apache.hadoop.hdfs.NameNodeProxiesClient.createNonHAProxyWithClientProtocol(NameNodeProxiesClient.java:343)
>     at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:131)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:343)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:156)
>     at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:159)
>     ... 13 more
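
A `ClassCastException` between `org.apache.hadoop.ipc.ProtobufRpcEngine` and `org.apache.hadoop.ipc.RpcEngine`, as in the trace above, is often a sign that the two types were defined by different class loaders or came from different JARs, although nothing in this report confirms that. The sketch below is one way to check from inside the JVM. `ClassOrigin`, `describe`, and `describeByName` are hypothetical names made up for illustration and are not part of Flink or Hadoop; only the two Hadoop class names are taken from the trace, and the sketch assumes it runs on the same classpath as the failing job.

```java
// Hypothetical diagnostic helper (an assumption for illustration, not Flink or Hadoop API).
final class ClassOrigin {

    /** Reports which class loader defined a class and where its bytes came from. */
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader(); // null means the bootstrap loader
        java.security.CodeSource source = type.getProtectionDomain().getCodeSource();
        String loaderName = (loader == null) ? "bootstrap" : loader.toString();
        String location = (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();
        return type.getName() + " loader=" + loaderName + " source=" + location;
    }

    /** Resolves a class by name through the thread context class loader, without initializing it. */
    static String describeByName(String className) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        try {
            return describe(Class.forName(className, false, context));
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Class names copied from the stack trace; without Hadoop on the
        // classpath these two lines simply report "not loadable".
        System.out.println(describeByName("org.apache.hadoop.ipc.RpcEngine"));
        System.out.println(describeByName("org.apache.hadoop.ipc.ProtobufRpcEngine"));
        System.out.println(describe(String.class));
    }
}
```

If the two Hadoop lines report different loaders or different JAR locations, that would support a class-loading explanation. If they match, this check is inconclusive, since it only shows what one class loader sees.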
[jira] [Updated] (FLINK-8628) BucketingSink does not work with S3
[ https://issues.apache.org/jira/browse/FLINK-8628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dejan miljkovic updated FLINK-8628:
---
Description:
BucketingSink does not work with S3. I followed the instructions provided at
[https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/aws.html]
but got the exception below. Several people have reported the same issue:
[http://mail-archives.apache.org/mod_mbox/flink-user/201801.mbox/%3CCADAFrT9T6WQa25HXR1z1NaL=n8wP9s7aSXxZWxHy=hubo93...@mail.gmail.com%3E]
[https://lists.apache.org/thread.html/%3CCADAFrT9T6WQa25HXR1z1NaL=n8wP9s7aSXxZWxHy=hubo93...@mail.gmail.com%3E]
[http://mail-archives.apache.org/mod_mbox/flink-user/201801.mbox/%3CCADAFrT-i+vGe64e__=-dnu4pmpxhvyzvkfqzrhgxbeyhnwa...@mail.gmail.com%3E]
I don't see any specific bug filed for this.

java.lang.RuntimeException: Error while creating FileSystem when initializing the state of the BucketingSink.
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:358)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Cannot instantiate file system for URI: hdfs://localhost:12345/
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:187)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1154)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
    ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.ipc.ProtobufRpcEngine cannot be cast to org.apache.hadoop.ipc.RpcEngine
    at org.apache.hadoop.ipc.RPC.getProtocolEngine(RPC.java:211)
    at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:583)
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createNonHAProxyWithClientProtocol(NameNodeProxiesClient.java:343)
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:131)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:343)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:156)
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:159)
    ... 13 more

was:
BucketingSink does not work with S3. I followed the instructions provided at
[https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/aws.html]
but got the exception below. Several people have reported the same issue:
[http://mail-archives.apache.org/mod_mbox/flink-user/201801.mbox/%3CCADAFrT9T6WQa25HXR1z1NaL=n8wP9s7aSXxZWxHy=hubo93...@mail.gmail.com%3E]
[https://lists.apache.org/thread.html/%3CCADAFrT9T6WQa25HXR1z1NaL=n8wP9s7aSXxZWxHy=hubo93...@mail.gmail.com%3E]
http://mail-archives.apache.org/mod_mbox/flink-user/201801.mbox/%3CCADAFrT-i+vGe64e__=-dnu4pmpxhvyzvkfqzrhgxbeyhnwa...@mail.gmail.com%3E
I don't see any specific bug filed for this.

java.lang.RuntimeException: Error while creating FileSystem when initializing the state of the RollingSink.
    at org.apache.flink.streaming.connectors.fs.RollingSink.initializeState(RollingSink.java:345)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
    at
[jira] [Created] (FLINK-8628) BucketingSink does not work with S3
dejan miljkovic created FLINK-8628:
--
Summary: BucketingSink does not work with S3
Key: FLINK-8628
URL: https://issues.apache.org/jira/browse/FLINK-8628
Project: Flink
Issue Type: Bug
Components: FileSystem, Streaming
Affects Versions: 1.4.0
Reporter: dejan miljkovic

BucketingSink does not work with S3. I followed the instructions provided at
[https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/aws.html]
but got the exception below. Several people have reported the same issue:
[http://mail-archives.apache.org/mod_mbox/flink-user/201801.mbox/%3CCADAFrT9T6WQa25HXR1z1NaL=n8wP9s7aSXxZWxHy=hubo93...@mail.gmail.com%3E]
[https://lists.apache.org/thread.html/%3CCADAFrT9T6WQa25HXR1z1NaL=n8wP9s7aSXxZWxHy=hubo93...@mail.gmail.com%3E]
http://mail-archives.apache.org/mod_mbox/flink-user/201801.mbox/%3CCADAFrT-i+vGe64e__=-dnu4pmpxhvyzvkfqzrhgxbeyhnwa...@mail.gmail.com%3E
I don't see any specific bug filed for this.

java.lang.RuntimeException: Error while creating FileSystem when initializing the state of the RollingSink.
    at org.apache.flink.streaming.connectors.fs.RollingSink.initializeState(RollingSink.java:345)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Cannot instantiate file system for URI: hdfs://localhost:12345/
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:187)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
    at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1154)
    at org.apache.flink.streaming.connectors.fs.RollingSink.initFileSystem(RollingSink.java:389)
    at org.apache.flink.streaming.connectors.fs.RollingSink.initializeState(RollingSink.java:342)
    ... 9 more
Caused by: java.lang.ClassCastException: org.apache.hadoop.ipc.ProtobufRpcEngine cannot be cast to org.apache.hadoop.ipc.RpcEngine
    at org.apache.hadoop.ipc.RPC.getProtocolEngine(RPC.java:211)
    at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:583)
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createNonHAProxyWithClientProtocol(NameNodeProxiesClient.java:343)
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:131)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:343)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:287)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:156)
    at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:159)