[jira] [Commented] (FLINK-2584) ASM dependency is not shaded away
[ https://issues.apache.org/jira/browse/FLINK-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721488#comment-14721488 ] ASF GitHub Bot commented on FLINK-2584: --- GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/1076 [FLINK-2584] Check for unshaded classes in fat jar and shade curator This PR is an addition for FLINK-2584, removing the transitive guava dependencies from the fat jar introduced by Apache Curator. It also adds a check ensuring that shaded classes (guava, asm) do not show up in the fat jar. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink flink2584 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1076.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1076 commit 7ae0c9d956ffcdb77d55edc70f1588dc507a9c39 Author: Robert Metzger rmetz...@apache.org Date: 2015-08-27T16:13:08Z [FLINK-2584] Check for unshaded classes in fat jar and shade curator ASM dependency is not shaded away - Key: FLINK-2584 URL: https://issues.apache.org/jira/browse/FLINK-2584 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.9, master Reporter: Ufuk Celebi Assignee: Stephan Ewen Fix For: 0.10, 0.9.1 ASM is not correctly shaded away. If you build the quickstart against the snapshot version, you will see the following dependencies. Robert is fixing this.
{code}
[INFO] +- org.apache.flink:flink-java:jar:0.9.1:compile
[INFO] |  +- org.apache.flink:flink-core:jar:0.9.1:compile
[INFO] |  |  \- commons-collections:commons-collections:jar:3.2.1:compile
[INFO] |  +- org.apache.flink:flink-shaded-include-yarn:jar:0.9.1:compile
[INFO] |  +- org.apache.avro:avro:jar:1.7.6:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile
[INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] |  |  +- org.xerial.snappy:snappy-java:jar:1.0.5:compile
[INFO] |  |  \- org.apache.commons:commons-compress:jar:1.4.1:compile
[INFO] |  |     \- org.tukaani:xz:jar:1.0:compile
[INFO] |  +- com.esotericsoftware.kryo:kryo:jar:2.24.0:compile
[INFO] |  |  +- com.esotericsoftware.minlog:minlog:jar:1.2:compile
[INFO] |  |  \- org.objenesis:objenesis:jar:2.1:compile
[INFO] |  +- com.twitter:chill_2.10:jar:0.5.2:compile
[INFO] |  |  +- org.scala-lang:scala-library:jar:2.10.4:compile
[INFO] |  |  \- com.twitter:chill-java:jar:0.5.2:compile
[INFO] |  +- com.twitter:chill-avro_2.10:jar:0.5.2:compile
[INFO] |  |  +- com.twitter:chill-bijection_2.10:jar:0.5.2:compile
[INFO] |  |  |  \- com.twitter:bijection-core_2.10:jar:0.7.2:compile
[INFO] |  |  \- com.twitter:bijection-avro_2.10:jar:0.7.2:compile
[INFO] |  +- de.javakaffee:kryo-serializers:jar:0.36:compile
[INFO] |  |  +- com.esotericsoftware:kryo:jar:3.0.3:compile
[INFO] |  |  |  +- com.esotericsoftware:reflectasm:jar:1.10.1:compile
[INFO] |  |  |  |  \- org.ow2.asm:asm:jar:5.0.3:compile
[INFO] |  |  |  \- com.esotericsoftware:minlog:jar:1.3.0:compile
[INFO] |  |  \- com.google.protobuf:protobuf-java:jar:2.6.1:compile
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
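The kind of verification step the PR describes can be sketched in a few lines. The class and prefix list below are hypothetical (this is not the PR's actual code): it shows how a fat jar might be scanned for class files under packages that should have been relocated.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical sketch of a shading check, not the PR's actual implementation:
// scan a fat jar for class files under packages that should have been relocated.
public class UnshadedClassCheck {

    // unrelocated Guava and ASM packages; a properly relocated class would live
    // under e.g. org/apache/flink/shaded/... and not match these prefixes
    static final String[] FORBIDDEN_PREFIXES = {
        "com/google/common/", "org/objectweb/asm/"
    };

    static List<String> findUnshadedClasses(File fatJar) throws IOException {
        List<String> offenders = new ArrayList<>();
        try (JarFile jar = new JarFile(fatJar)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                for (String prefix : FORBIDDEN_PREFIXES) {
                    if (name.startsWith(prefix) && name.endsWith(".class")) {
                        offenders.add(name);
                    }
                }
            }
        }
        return offenders;
    }
}
```

A build could be made to fail whenever this list is non-empty, which is the effect the PR's added check aims for.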
[jira] [Commented] (FLINK-2584) ASM dependency is not shaded away
[ https://issues.apache.org/jira/browse/FLINK-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721503#comment-14721503 ] ASF GitHub Bot commented on FLINK-2584: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/1076#issuecomment-136131254 Nice work :) This needs to go into the milestone branch as well. What do you mean with "We can easily integrate curator's netty into the jar file"?
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2584] Check for unshaded classes in fat...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1076#issuecomment-136131254 Nice work :) This needs to go into the milestone branch as well. What do you mean with "We can easily integrate curator's netty into the jar file"? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2590] fixing DataSetUtils.zipWithUnique...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/1075#issuecomment-136129581 Thanks a lot for the contribution. Can you add a test case for the method to make sure the issue is not re-introduced when somebody changes the code?
[jira] [Commented] (FLINK-2590) DataSetUtils.zipWithUniqueID creates duplicate IDs
[ https://issues.apache.org/jira/browse/FLINK-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721489#comment-14721489 ] ASF GitHub Bot commented on FLINK-2590: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/1075#issuecomment-136129581 Thanks a lot for the contribution. Can you add a test case for the method to make sure the issue is not re-introduced when somebody changes the code? DataSetUtils.zipWithUniqueID creates duplicate IDs -- Key: FLINK-2590 URL: https://issues.apache.org/jira/browse/FLINK-2590 Project: Flink Issue Type: Bug Components: Java API, Scala API Affects Versions: 0.10, master Reporter: Martin Junghanns Assignee: Martin Junghanns Priority: Minor The function creates IDs using the following code:
{code:java}
shifter = log2(numberOfParallelSubtasks)
id = counter << shifter + taskId;
{code}
As the binary function + is executed before the bitshift <<, this results in cases where different tasks create the same ID. It essentially calculates
{code}
counter * 2^(shifter + taskId)
{code}
which is 0 for counter = 0 and all values of shifter and taskId. Consider the following example: numberOfParallelSubtasks = 8, shifter = log2(8) = 4 (maybe rename the function?) produces:
{code}
start: 1, shifter: 4 taskId: 4 label: 256
start: 2, shifter: 4 taskId: 3 label: 256
start: 4, shifter: 4 taskId: 2 label: 256
{code}
I would suggest the following:
{code}
counter * 2^(shifter) + taskId
{code}
which in code is equivalent to
{code}
shifter = log2(numberOfParallelSubtasks);
id = (counter << shifter) + taskId;
{code}
and for our example produces:
{code}
start: 1, shifter: 4 taskId: 4 label: 20
start: 2, shifter: 4 taskId: 3 label: 35
start: 4, shifter: 4 taskId: 2 label: 66
{code}
So we move the counter to the left and add the task id. As there is space for 2^shifter numbers, this prevents collisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
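The precedence bug can be reproduced directly. The class below is an illustrative sketch (the names are mine, not Flink's actual DataSetUtils code) contrasting the buggy expression with the proposed fix:

```java
// Illustrative sketch of the FLINK-2590 precedence bug; not Flink's actual code.
public class ZipWithUniqueIdDemo {

    // Buggy variant: '+' binds tighter than '<<', so this computes
    // counter << (shifter + taskId), i.e. counter * 2^(shifter + taskId).
    static long buggyId(long counter, long shifter, long taskId) {
        return counter << shifter + taskId;
    }

    // Proposed fix: shift first, then add the task id into the low bits,
    // i.e. counter * 2^shifter + taskId.
    static long fixedId(long counter, long shifter, long taskId) {
        return (counter << shifter) + taskId;
    }
}
```

With shifter = 4 the buggy variant yields 256 for (counter=1, taskId=4), (counter=2, taskId=3), and (counter=4, taskId=2), reproducing the collisions from the issue, while the fixed variant yields the distinct labels 20, 35, and 66.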
[GitHub] flink pull request: [FLINK-2125][streaming] Delimiter change from ...
GitHub user ogokal opened a pull request: https://github.com/apache/flink/pull/1077 [FLINK-2125][streaming] Delimiter change from char to string I tried to change based on the previous comments. I hope it is sufficient enough. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ogokal/flink master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1077.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1077 commit c78b6d726c60b1e197bf5ee513e081c852362919 Author: ogokal ogo...@gmail.com Date: 2015-08-30T17:42:56Z delimiter change from char to string commit a51c486370e3e168912cbb71bde325701112d14b Author: ogokal ogo...@gmail.com Date: 2015-08-30T18:01:57Z [FLINK-2125][streaming] Delimiter change from char to string
[jira] [Commented] (FLINK-2543) State handling does not support deserializing classes through the UserCodeClassloader
[ https://issues.apache.org/jira/browse/FLINK-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721659#comment-14721659 ] ASF GitHub Bot commented on FLINK-2543: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1048 State handling does not support deserializing classes through the UserCodeClassloader - Key: FLINK-2543 URL: https://issues.apache.org/jira/browse/FLINK-2543 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9, 0.10 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Blocker Fix For: 0.10 The current implementation of the state checkpointing does not support custom classes, because the UserCodeClassLoader is not used to deserialize the state.
{code}
Error: java.lang.RuntimeException: Failed to deserialize state handle and setup initial operator state.
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:544)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: com.ottogroup.bi.searchlab.searchsessionizer.OperatorState
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:626)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
    at org.apache.flink.runtime.state.ByteStreamStateHandle.getState(ByteStreamStateHandle.java:63)
    at org.apache.flink.runtime.state.ByteStreamStateHandle.getState(ByteStreamStateHandle.java:33)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.restoreInitialState(AbstractUdfStreamOperator.java:83)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.setInitialState(StreamTask.java:276)
    at org.apache.flink.runtime.state.StateUtils.setOperatorState(StateUtils.java:51)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:541)
{code}
The issue has been reported by a user: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Custom-Class-for-state-checkpointing-td2415.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
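The general pattern for fixing such a ClassNotFoundException is to resolve classes through the user-code class loader during deserialization. The class below is a minimal sketch of that pattern, not Flink's actual fix:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

// Minimal sketch of classloader-aware deserialization; not Flink's actual fix.
// Classes are resolved through a supplied loader (e.g. the UserCodeClassLoader)
// instead of the system class loader that produced the stack trace above.
public class ClassLoaderObjectInputStream extends ObjectInputStream {

    private final ClassLoader classLoader;

    public ClassLoaderObjectInputStream(InputStream in, ClassLoader classLoader)
            throws IOException {
        super(in);
        this.classLoader = classLoader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // look the class up through the user-code class loader first
            return Class.forName(desc.getName(), false, classLoader);
        } catch (ClassNotFoundException e) {
            // fall back for primitives and bootstrap classes
            return super.resolveClass(desc);
        }
    }
}
```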
[GitHub] flink pull request: [FLINK-2543] Fix user object deserialization f...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1048
[jira] [Resolved] (FLINK-2543) State handling does not support deserializing classes through the UserCodeClassloader
[ https://issues.apache.org/jira/browse/FLINK-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2543. - Resolution: Fixed Fixed via bf8c8e54094151348caedd3120931516f76c3cf3 and 0ba53558f9b56b1e17c84ab8e4ee639ca09b9133 State handling does not support deserializing classes through the UserCodeClassloader - Key: FLINK-2543 URL: https://issues.apache.org/jira/browse/FLINK-2543 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9, 0.10 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Blocker Fix For: 0.10 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2543) State handling does not support deserializing classes through the UserCodeClassloader
[ https://issues.apache.org/jira/browse/FLINK-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2543. --- State handling does not support deserializing classes through the UserCodeClassloader - Key: FLINK-2543 URL: https://issues.apache.org/jira/browse/FLINK-2543 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9, 0.10 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Blocker Fix For: 0.10 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: FLINK-1737: Kronecker product
GitHub user daniel-pape opened a pull request: https://github.com/apache/flink/pull/1078 FLINK-1737: Kronecker product This is preparatory work related to FLINK-1737: Adds an implementation of the outer/Kronecker product which can subsequently be used to compute the sample covariance matrix. You can merge this pull request into a Git repository by running: $ git pull https://github.com/daniel-pape/flink FLINK-0 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1078.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1078
commit 627a0e9776a3c39e985b30b508521e4869309767 Author: daniel-pape dgp...@web.de Date: 2015-08-18T18:29:06Z Work in progress: Test cases and implementation for outer product of vectors.
commit 21aee8d0e3aeea1b027bb70c71c5ea1aa66b Author: daniel-pape dgp...@web.de Date: 2015-08-21T12:50:26Z Implementation of outer product for sparse vectors.
commit 0e9a608feb305ef254d896e9f39f58f98e236dba Author: daniel-pape dgp...@web.de Date: 2015-08-21T12:51:40Z Test cases for outer product computation, for dense as well as sparse vectors. More tests are to come.
commit d0eb80102ae4856236fce0b98c4e396183d86f3f Author: daniel-pape dgp...@web.de Date: 2015-08-21T19:38:05Z Added test case.
commit 97dd4f050e7d3abf7c419d904913979406abac05 Author: Daniel Pape dgp...@web.de Date: 2015-08-30T20:11:53Z Added method documentation for outer product methods.
commit 4dde9f86b300cd7c64c7f62feb11984267f45913 Author: daniel-pape dgp...@web.de Date: 2015-08-18T18:29:06Z Work in progress: Test cases and implementation for outer product of vectors.
commit 9ea41fc721bb6983cd91ca102342ef31c4cd0732 Author: daniel-pape dgp...@web.de Date: 2015-08-21T12:50:26Z Implementation of outer product for sparse vectors.
commit b021b1f4d6a31626cf5b1cfac7c9dbf025ff00a1 Author: daniel-pape dgp...@web.de Date: 2015-08-21T12:51:40Z Test cases for outer product computation, for dense as well as sparse vectors. More tests are to come.
commit f70f5e0be5851d98cbbb4d0572abfb8294af3b0f Author: daniel-pape dgp...@web.de Date: 2015-08-21T19:38:05Z Added test case.
commit 503e4c04416c436da31f9340448420198b495d7b Author: Daniel Pape dgp...@web.de Date: 2015-08-30T20:11:53Z Added method documentation for outer product methods.
commit 31b25266924e89412cafa13f8801d8eff9fcb84c Author: Daniel Pape dgp...@web.de Date: 2015-08-30T20:18:56Z Merge branch 'FLINK-0' of https://www.github.com/daniel-pape/flink into FLINK-0
commit 9f337f3d117d025e26578a96fafde2cdd7b2df72 Author: Daniel Pape dgp...@web.de Date: 2015-08-30T20:46:11Z Removed marker comments from test suites and also added the missing test to the SparseVector suite that corresponds to the one from the DenseVector suite.
[jira] [Commented] (FLINK-1737) Add statistical whitening transformation to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721710#comment-14721710 ] ASF GitHub Bot commented on FLINK-1737: --- GitHub user daniel-pape opened a pull request: https://github.com/apache/flink/pull/1078 FLINK-1737: Kronecker product This is preparatory work related to FLINK-1737: Adds an implementation of the outer/Kronecker product which can subsequently be used to compute the sample covariance matrix. Add statistical whitening transformation to machine learning library Key: FLINK-1737 URL: https://issues.apache.org/jira/browse/FLINK-1737 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Daniel Pape Labels: ML, Starter The statistical whitening transformation [1] is a preprocessing step for different ML algorithms. It decorrelates the individual dimensions and sets their variance to 1. Statistical whitening should be implemented as a {{Transformer}}. Resources: [1] [http://en.wikipedia.org/wiki/Whitening_transformation] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2111) Add stop signal to cleanly shutdown streaming jobs
[ https://issues.apache.org/jira/browse/FLINK-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721481#comment-14721481 ] ASF GitHub Bot commented on FLINK-2111: --- Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r38270665
--- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala ---
@@ -411,6 +411,23 @@ class TaskManager(
         log.debug(s"Cannot find task to fail for execution ${executionID}"))
     }

+    // stops a task
+    case StopTask(executionID) =>
+      val task = runningTasks.get(executionID)
+      if (task != null) {
+        try {
+          task.stopExecution()
--- End diff --
I assume that the stops should be idempotent. But I agree, if we document and check that all `cancel` calls are non-blocking, then it should work as well. Add stop signal to cleanly shutdown streaming jobs Key: FLINK-2111 URL: https://issues.apache.org/jira/browse/FLINK-2111 Project: Flink Issue Type: Improvement Components: Distributed Runtime, JobManager, Local Runtime, Streaming, TaskManager, Webfrontend Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Minor Currently, streaming jobs can only be stopped using the cancel command, which is a hard stop with no clean shutdown. The newly introduced stop signal will only affect streaming source tasks such that the sources can stop emitting data and shut down cleanly, resulting in a clean shutdown of the whole streaming job. This feature is a pre-requirement for https://issues.apache.org/jira/browse/FLINK-1929 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2111] Add stop signal to cleanly shut...
Github user tillrohrmann commented on a diff in the pull request: https://github.com/apache/flink/pull/750#discussion_r38270665
--- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala ---
@@ -411,6 +411,23 @@ class TaskManager(
         log.debug(s"Cannot find task to fail for execution ${executionID}"))
     }

+    // stops a task
+    case StopTask(executionID) =>
+      val task = runningTasks.get(executionID)
+      if (task != null) {
+        try {
+          task.stopExecution()
--- End diff --
I assume that the stops should be idempotent. But I agree, if we document and check that all `cancel` calls are non-blocking, then it should work as well.
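The idempotence assumed in the comment above can be sketched as follows; the class and method names are hypothetical, not Flink's actual Task implementation:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of an idempotent stop, not Flink's actual Task code:
// repeated StopTask messages must be harmless, so only the first call acts.
public class StoppableTask {

    private final AtomicBoolean stopped = new AtomicBoolean(false);

    /** Returns true only for the first call; later calls are no-ops. */
    public boolean stopExecution() {
        if (!stopped.compareAndSet(false, true)) {
            return false; // already stopped: nothing to do
        }
        // here the task would signal its source function, without blocking,
        // to stop emitting data
        return true;
    }

    public boolean isStopped() {
        return stopped.get();
    }
}
```

The compare-and-set makes the stop path safe to call from multiple actor messages concurrently, which matches the reviewer's assumption.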
[jira] [Commented] (FLINK-1681) Remove the old Record API
[ https://issues.apache.org/jira/browse/FLINK-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721746#comment-14721746 ] Stephan Ewen commented on FLINK-1681: - Migrated a set of tests from the deprecated API to the current API in a7a57ebea6d8f60abba4fe2559af05d316112ca4 Remove the old Record API - Key: FLINK-1681 URL: https://issues.apache.org/jira/browse/FLINK-1681 Project: Flink Issue Type: Task Affects Versions: 0.8.1 Reporter: Henry Saputra Assignee: Henry Saputra Per the discussion in the dev@ list from the FLINK-1106 issue, we would like to remove the old APIs since we already deprecated them in the 0.8.x release. This would help make the code base cleaner and easier for new contributors to navigate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2125) String delimiter for SocketTextStream
[ https://issues.apache.org/jira/browse/FLINK-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721638#comment-14721638 ] ASF GitHub Bot commented on FLINK-2125: --- GitHub user ogokal opened a pull request: https://github.com/apache/flink/pull/1077 [FLINK-2125][streaming] Delimiter change from char to string I tried to change based on the previous comments. I hope it is sufficient. String delimiter for SocketTextStream - Key: FLINK-2125 URL: https://issues.apache.org/jira/browse/FLINK-2125 Project: Flink Issue Type: Improvement Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Priority: Minor Labels: starter The SocketTextStreamFunction uses a character delimiter, despite other parts of the API using String delimiters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
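Moving from a char to a String delimiter amounts to buffering input until the delimiter appears. The helper below is a hypothetical sketch of that logic, not the PR's actual SocketTextStreamFunction change:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of multi-character delimiter splitting;
// not the actual SocketTextStreamFunction change from the PR.
public class DelimiterSplitter {

    /** Collects records terminated by delimiter; a trailing partial record is discarded. */
    static List<String> split(String input, String delimiter) {
        List<String> records = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        for (char c : input.toCharArray()) {
            buffer.append(c);
            int end = buffer.length() - delimiter.length();
            // a record is complete once the buffer ends with the delimiter
            if (end >= 0 && buffer.indexOf(delimiter, end) == end) {
                records.add(buffer.substring(0, end));
                buffer.setLength(0);
            }
        }
        return records;
    }
}
```

Unlike a single-char delimiter, this handles multi-character terminators such as "\r\n" in one pass over the stream.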
[GitHub] flink pull request:
Github user rmetzger commented on the pull request: https://github.com/apache/flink/commit/554b77bcd9ed66d57d99d4990774a43f35f6a835#commitcomment-12965859 Currently, flink-runtime has a dependency on Hadoop, so I can assume it's always available. Even for a binary Flink release without built-in Hadoop dependencies, we would assume Hadoop to be present (on the classpath). For a Flink release without any Hadoop, we can either remove this again or use some reflection / fake hadoop class magic (added via maven) if needed. But for now, I would like to have this in the code base because it helps debugging user issues.
[jira] [Commented] (FLINK-2584) ASM dependency is not shaded away
[ https://issues.apache.org/jira/browse/FLINK-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721504#comment-14721504 ] ASF GitHub Bot commented on FLINK-2584: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/1076#issuecomment-136131482 I saw that curator has a netty dependency. If we run into netty conflicts because of curator's netty dependency, it's very easy to shade curator's netty into curator as well. My change creates a new Apache Curator jar for us, where guava is located in `org.apache.curator.shaded.com.google`. We can do exactly the same for netty if needed. ASM dependency is not shaded away - Key: FLINK-2584 URL: https://issues.apache.org/jira/browse/FLINK-2584 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.9, master Reporter: Ufuk Celebi Assignee: Stephan Ewen Fix For: 0.10, 0.9.1 ASM is not correctly shaded away. If you build the quickstart against the snapshot version, you will see the following dependencies. Robert is fixing this.
{code}
[INFO] +- org.apache.flink:flink-java:jar:0.9.1:compile
[INFO] |  +- org.apache.flink:flink-core:jar:0.9.1:compile
[INFO] |  |  \- commons-collections:commons-collections:jar:3.2.1:compile
[INFO] |  +- org.apache.flink:flink-shaded-include-yarn:jar:0.9.1:compile
[INFO] |  +- org.apache.avro:avro:jar:1.7.6:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.13:compile
[INFO] |  |  +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13:compile
[INFO] |  |  +- com.thoughtworks.paranamer:paranamer:jar:2.3:compile
[INFO] |  |  +- org.xerial.snappy:snappy-java:jar:1.0.5:compile
[INFO] |  |  \- org.apache.commons:commons-compress:jar:1.4.1:compile
[INFO] |  |     \- org.tukaani:xz:jar:1.0:compile
[INFO] |  +- com.esotericsoftware.kryo:kryo:jar:2.24.0:compile
[INFO] |  |  +- com.esotericsoftware.minlog:minlog:jar:1.2:compile
[INFO] |  |  \- org.objenesis:objenesis:jar:2.1:compile
[INFO] |  +- com.twitter:chill_2.10:jar:0.5.2:compile
[INFO] |  |  +- org.scala-lang:scala-library:jar:2.10.4:compile
[INFO] |  |  \- com.twitter:chill-java:jar:0.5.2:compile
[INFO] |  +- com.twitter:chill-avro_2.10:jar:0.5.2:compile
[INFO] |  |  +- com.twitter:chill-bijection_2.10:jar:0.5.2:compile
[INFO] |  |  |  \- com.twitter:bijection-core_2.10:jar:0.7.2:compile
[INFO] |  |  \- com.twitter:bijection-avro_2.10:jar:0.7.2:compile
[INFO] |  +- de.javakaffee:kryo-serializers:jar:0.36:compile
[INFO] |  |  +- com.esotericsoftware:kryo:jar:3.0.3:compile
[INFO] |  |  |  +- com.esotericsoftware:reflectasm:jar:1.10.1:compile
[INFO] |  |  |  |  \- org.ow2.asm:asm:jar:5.0.3:compile
[INFO] |  |  |  \- com.esotericsoftware:minlog:jar:1.3.0:compile
[INFO] |  |  \- com.google.protobuf:protobuf-java:jar:2.6.1:compile
{code}
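The unshaded-class check described in the PR can be sketched as follows. This is a hypothetical illustration, not Flink's actual check; it assumes that properly relocated classes (e.g. guava under `org.apache.curator.shaded.com.google`) no longer match their original package prefixes, so any match on those prefixes means shading failed:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical sketch: fail if guava/asm classes appear in a fat jar
// under their original (unrelocated) package names.
public class ShadingCheck {

    // Entry-name prefixes that must only appear in relocated form.
    static final String[] FORBIDDEN = {"com/google/common/", "org/objectweb/asm/"};

    // Pure check over jar entry names, kept separate so it is easy to test.
    static List<String> findUnshaded(Collection<String> entryNames) {
        List<String> hits = new ArrayList<>();
        for (String name : entryNames) {
            for (String prefix : FORBIDDEN) {
                if (name.startsWith(prefix)) {
                    hits.add(name);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(args[0])) {
            jar.stream().forEach(e -> names.add(e.getName()));
        }
        List<String> hits = findUnshaded(names);
        if (!hits.isEmpty()) {
            throw new IllegalStateException("Unshaded classes in fat jar: " + hits);
        }
    }
}
```

Note that a relocated entry such as `org/apache/curator/shaded/com/google/common/base/Joiner.class` does not start with `com/google/common/`, so only genuinely unshaded classes trip the check.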
[GitHub] flink pull request:
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/commit/554b77bcd9ed66d57d99d4990774a43f35f6a835#commitcomment-12965831 At some point there were thoughts about a Hadoop-free version. How would this play together?
[GitHub] flink pull request:
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/commit/554b77bcd9ed66d57d99d4990774a43f35f6a835#commitcomment-12965891 I think this is a good addition. In the future (Hadoop not present), we may have to go for reflection, true.
[jira] [Created] (FLINK-2598) NPE when arguments are missing for a -m yarn-cluster job
Robert Metzger created FLINK-2598: - Summary: NPE when arguments are missing for a -m yarn-cluster job Key: FLINK-2598 URL: https://issues.apache.org/jira/browse/FLINK-2598 Project: Flink Issue Type: Bug Affects Versions: 0.9.1 Reporter: Robert Metzger Priority: Minor Fix For: 0.9.2 Flink is properly reporting that the argument is missing. It's just showing an ugly NPE, but this does not limit the functionality in any way. The error does not occur in version 0.10-SNAPSHOT, rev: 6e1de98. I'm adding this bug in case we are going to release 0.9.2.
{code}
robert@cdh544-master:~/release091-rc1/flink-0.9.1/build-target$ ./bin/flink run -m yarn-cluster ./examples/flink-java-examples-0.9.1-WordCount.jar hdfs:///user/robert/kmeans/points hdfs:///user/robert/garbage
YARN cluster mode detected. Switching Log4j output to console
13:05:50,432 INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
13:05:50,605 ERROR org.apache.flink.client.FlinkYarnSessionCli - Missing required argument yn
Usage:
   Required
     -yn,--yarncontainer <arg>            Number of YARN container to allocate (=Number of Task Managers)
   Optional
     -yd,--yarndetached                   Start detached
     -yD <arg>                            Dynamic properties
     -yjm,--yarnjobManagerMemory <arg>    Memory for JobManager Container [in MB]
     -ynm,--yarnname <arg>                Set a custom name for the application on YARN
     -yq,--yarnquery                      Display available YARN resources (memory, cores)
     -yqu,--yarnqueue <arg>               Specify YARN queue.
     -ys,--yarnslots <arg>                Number of slots per TaskManager
     -yst,--yarnstreaming                 Start Flink in streaming mode
     -ytm,--yarntaskManagerMemory <arg>   Memory per TaskManager Container [in MB]
java.lang.NullPointerException
    at org.apache.flink.client.CliFrontend.getClient(CliFrontend.java:735)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:271)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:878)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:920)
The exception above occurred while trying to run your command.
{code}
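A minimal sketch of the guard pattern that avoids this kind of NPE. All names here are illustrative, not Flink's actual `CliFrontend` API: when argument parsing fails, the caller should bail out with an error code instead of dereferencing the client that was never created:

```java
// Illustrative only: mirrors the failure mode, not Flink's real classes.
public class ArgumentGuard {

    // Returns null when required arguments are missing, as the real
    // getClient() effectively did after the CLI reported the missing -yn.
    static String getClient(boolean requiredArgsPresent) {
        return requiredArgsPresent ? "yarn-client" : null;
    }

    // Guarded caller: fail fast with an exit code instead of an NPE.
    static int run(boolean requiredArgsPresent) {
        String client = getClient(requiredArgsPresent);
        if (client == null) {
            return 1; // the usage message was already printed by the parser
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // 1: missing arguments, no NPE
        System.out.println(run(true));  // 0
    }
}
```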
[GitHub] flink pull request: [FLINK-2475] Rename Flink Client log file
Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/1074#issuecomment-136152037 Test fails in the unstable yarn-test. Should be ready to get merged.
[jira] [Commented] (FLINK-2475) Rename Flink Client log file
[ https://issues.apache.org/jira/browse/FLINK-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721542#comment-14721542 ] ASF GitHub Bot commented on FLINK-2475: --- Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/1074#issuecomment-136152037 Test fails in the unstable yarn-test. Should be ready to get merged. Rename Flink Client log file Key: FLINK-2475 URL: https://issues.apache.org/jira/browse/FLINK-2475 Project: Flink Issue Type: Improvement Components: Start-Stop Scripts Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Trivial Currently, JobManager and TaskManager log/out files are named as follows: - flink-mjsax-jobmanager-log - flink-mjsax-jobmanager-out - flink-mjsax-taskmanager-log - flink-mjsax-taskmanager-out However, the CLI log file is named differently: - flink-mjsax-flink-client-log This should be `client` only, and not `flink-client`, for consistency.
[GitHub] flink pull request: [FLINK-2525]Add configuration support in Storm...
Github user ffbin commented on the pull request: https://github.com/apache/flink/pull/1046#issuecomment-136241503 @mjsax @StephanEwen I have finished the code changes. 1. serialize the Storm Config as a byte[] into the Flink configuration 2. extend ExclamationTopology such that the number of added '!' in ExclamationBolt and ExclamationWithStormSpout.ExclamationMap is configurable, and adapt the tests 3. extend FiniteStormFileSpout and its base class with an empty constructor and configure the file to be opened via the Storm configuration Map I have run the flink-storm-compatibility tests successfully on my local machine and do not know why CI failed. Can you have a look at my code? Thank you very much.
[GitHub] flink pull request: [FLINK-2545] add bucket member count verificat...
Github user ChengXiangLi commented on the pull request: https://github.com/apache/flink/pull/1067#issuecomment-136243437 Nice job, @greghogan, you pointed out the root cause and the solution. I added the logic to skip the latest buckets as @StephanEwen suggested, and added a related unit test for this issue.
[GitHub] flink pull request: [FLINK-1984] Integrate Flink with Apache Mesos
Github user ankurcha commented on a diff in the pull request: https://github.com/apache/flink/pull/948#discussion_r38281174 --- Diff: flink-mesos/src/main/scala/org/apache/flink/mesos/scheduler/SchedulerUtils.scala --- @@ -0,0 +1,348 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.mesos.scheduler + +import java.util.{List => JList} + +import scala.collection.JavaConversions._ +import scala.util.{Failure, Success, Try} +import scala.util.control.NonFatal + +import com.google.common.base.Splitter +import com.google.protobuf.{ByteString, GeneratedMessage} +import org.apache.flink.configuration.{ConfigConstants, Configuration, GlobalConfiguration} +import org.apache.flink.configuration.ConfigConstants._ +import org.apache.flink.mesos._ +import org.apache.flink.mesos.scheduler.FlinkScheduler._ +import org.apache.flink.runtime.StreamingMode +import org.apache.flink.runtime.jobmanager.{JobManager, JobManagerMode} +import org.apache.flink.runtime.util.EnvironmentInformation +import org.apache.mesos.{MesosSchedulerDriver, Scheduler, SchedulerDriver} +import org.apache.mesos.Protos._ +import org.apache.mesos.Protos.CommandInfo.URI +import org.apache.mesos.Protos.Value.Ranges +import org.apache.mesos.Protos.Value.Type._ + +trait SchedulerUtils { --- End diff -- I have addressed this in the latest set of changes.
[jira] [Commented] (FLINK-1984) Integrate Flink with Apache Mesos
[ https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721881#comment-14721881 ] ASF GitHub Bot commented on FLINK-1984: --- Github user ankurcha commented on a diff in the pull request: https://github.com/apache/flink/pull/948#discussion_r38281174 (same diff and comment as quoted above) Integrate Flink with Apache Mesos - Key: FLINK-1984 URL: https://issues.apache.org/jira/browse/FLINK-1984 Project: Flink Issue Type: New Feature Components: New Components Reporter: Robert Metzger Priority: Minor Attachments: 251.patch There are some users asking for an integration of Flink into Mesos. There is also a pending pull request for adding Mesos support to Flink: https://github.com/apache/flink/pull/251 But the PR is insufficiently tested. I'll add the code of the pull request to this JIRA in case somebody wants to pick it up in the future.
[GitHub] flink pull request: [FLINK-1984] Integrate Flink with Apache Mesos
Github user ankurcha commented on the pull request: https://github.com/apache/flink/pull/948#issuecomment-136236719 @rmetzger I have finally got some time to work on this again. Let me address your questions one by one: Why did you decide to start the JobManager alongside the Scheduler? This is basically an easy first-step way of getting things running, the way it was done in a whole bunch of projects. The easiest way to run a single-master + multiple-worker application is to make the scheduler run the master process and have another meta-framework such as Marathon submit the whole framework as a task to the Mesos server. In the absence of Marathon, Aurora, etc., mesos-submit (an app that ships with Mesos) can be used to submit the scheduler as a task. This means the JobManager + scheduler would be running in the Mesos cluster, submitted as an app (just like in YARN). My eventual goal is to make the scheduler support a completely standalone mode of operation, but that requires coordination to ensure that only one scheduler instance exists at a time - this may have some hooks that can be part of the HA JobManager initiative. Tests: I am working on some Docker- and Vagrant-based scripts that can make the setup part of the tests more palatable.
[jira] [Commented] (FLINK-1984) Integrate Flink with Apache Mesos
[ https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721880#comment-14721880 ] ASF GitHub Bot commented on FLINK-1984: --- Github user ankurcha commented on the pull request: https://github.com/apache/flink/pull/948#issuecomment-136236719 (same comment as quoted above) Integrate Flink with Apache Mesos - Key: FLINK-1984 URL: https://issues.apache.org/jira/browse/FLINK-1984 Project: Flink Issue Type: New Feature Components: New Components Reporter: Robert Metzger Priority: Minor Attachments: 251.patch There are some users asking for an integration of Flink into Mesos. There is also a pending pull request for adding Mesos support to Flink: https://github.com/apache/flink/pull/251 But the PR is insufficiently tested. I'll add the code of the pull request to this JIRA in case somebody wants to pick it up in the future.
[GitHub] flink pull request: [FLINK-2125][streaming] Delimiter change from ...
Github user HuangWHWHW commented on a diff in the pull request: https://github.com/apache/flink/pull/1077#discussion_r38282466 --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketTextStreamFunctionTest.java --- @@ -0,0 +1,237 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the License); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.api.functions; + +import org.apache.flink.streaming.api.functions.source.SocketTextStreamFunction; +import org.junit.Test; + +import java.io.ByteArrayInputStream; +import java.lang.reflect.Field; +import java.net.Socket; +import java.util.ArrayList; +import java.util.List; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertTrue; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + + +public class SocketTextStreamFunctionTest { +//Actual text +/* +Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras sagittis nisl non euismod fermentum. Curabitur lacinia vehicula enim quis tristique. Suspendisse imperdiet arcu sed bibendum vulputate. Sed vitae nisl vitae turpis dapibus lacinia in id elit. Integer lorem dolor, porttitor ut nisi in, tincidunt sodales leo. 
Aliquam tristique dui sit amet odio bibendum, et rutrum turpis auctor. Morbi sit amet mollis augue, ac rutrum velit. Vestibulum suscipit finibus sapien, et congue enim laoreet consequat. + +Integer aliquam metus iaculis risus hendrerit maximus. Suspendisse vestibulum nibh ac mauris cursus molestie sit amet vel turpis. Nulla et posuere orci. Aliquam dui quam, posuere vitae erat vitae, finibus commodo ipsum. Aliquam eu dui quis arcu porttitor sollicitudin. Integer sodales finibus ullamcorper. Praesent et felis tempor, laoreet libero eget, consequat nisl. Aenean molestie rutrum lorem, ac cursus nisl dapibus vitae. + +Quisque sodales dui et sem bibendum semper. Pellentesque luctus leo nec lacus euismod pellentesque. Phasellus a metus dignissim risus auctor lacinia. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. Aenean consectetur bibendum imperdiet. Etiam dignissim rutrum enim, non volutpat nisi condimentum sed. Quisque condimentum ultrices est sit amet facilisis. + +Ut vitae volutpat odio. Sed eget vestibulum libero, eu tincidunt lorem. Nam pretium nulla nisl. Maecenas fringilla nunc ut turpis consectetur, et fringilla sem placerat. Etiam nec scelerisque nisi, at sodales ligula. Aliquam euismod faucibus egestas. Curabitur eget enim quam. Praesent convallis mattis lobortis. Pellentesque a consectetur nisl. Duis molestie diam est. Nam a malesuada augue. Vivamus enim massa, luctus ac elit ut, vestibulum laoreet nulla. Curabitur pellentesque vel mi eget tempus. Donec cursus et leo quis viverra. + +In ac imperdiet ex, nec aliquet erat. Nullam sit amet enim in dolor finibus convallis id eu nibh. Fusce aliquam convallis orci aliquam. +*/ +// Generated 5 paragraphs, 290 words, 2000 bytes of Lorem Ipsum + +private static final String content = Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras sagittis nisl non euismod fermentum. + +Curabitur lacinia vehicula enim quis tristique. 
Suspendisse imperdiet arcu sed bibendum vulputate. + +Sed vitae nisl vitae turpis dapibus lacinia in id elit. Integer lorem dolor, + +porttitor ut nisi in, tincidunt sodales leo. Aliquam tristique dui sit amet odio bibendum, + +et rutrum turpis auctor. Morbi sit amet mollis augue, ac rutrum velit. + +Vestibulum suscipit finibus sapien, et congue enim laoreet consequat.\r\nInteger + +aliquam metus iaculis risus hendrerit maximus. Suspendisse vestibulum nibh ac + +mauris cursus molestie sit amet vel turpis. Nulla et posuere orci. + +Aliquam dui quam, posuere vitae erat vitae, finibus commodo ipsum. Aliquam + +
[jira] [Commented] (FLINK-2125) String delimiter for SocketTextStream
[ https://issues.apache.org/jira/browse/FLINK-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721923#comment-14721923 ] ASF GitHub Bot commented on FLINK-2125: --- Github user HuangWHWHW commented on a diff in the pull request: https://github.com/apache/flink/pull/1077#discussion_r38282466 (same diff and comment as quoted above)
[jira] [Commented] (FLINK-2125) String delimiter for SocketTextStream
[ https://issues.apache.org/jira/browse/FLINK-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721932#comment-14721932 ] ASF GitHub Bot commented on FLINK-2125: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/1077#issuecomment-136243644 Hi, generally it is a good idea to receive the whole buffer at once instead of reading one char at a time. You can see my PR: https://github.com/apache/flink/pull/992. There will be some changes in SocketTextStreamFunctionTest.java. These changes are tentative - I am not sure, since that PR has not been merged yet. Just providing this for your information. String delimiter for SocketTextStream - Key: FLINK-2125 URL: https://issues.apache.org/jira/browse/FLINK-2125 Project: Flink Issue Type: Improvement Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Priority: Minor Labels: starter The SocketTextStreamFunction uses a character delimiter, despite other parts of the API using a String delimiter.
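The buffered, string-delimiter approach discussed here can be sketched as follows. This is a minimal illustration, not Flink's actual `SocketTextStreamFunction`, and it glosses over multi-byte characters split across chunk boundaries. Reading chunks and splitting on a multi-character delimiter is what the char-to-String change enables (e.g. `"\r\n"` framing):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Minimal sketch: read a stream in chunks and split records on a
// multi-character string delimiter.
public class DelimitedReader {

    static List<String> readRecords(InputStream in, String delimiter) {
        List<String> records = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        byte[] chunk = new byte[1024];
        try {
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.append(new String(chunk, 0, read, StandardCharsets.UTF_8));
                int pos;
                // Emit every complete record currently sitting in the buffer.
                while ((pos = buffer.indexOf(delimiter)) != -1) {
                    records.add(buffer.substring(0, pos));
                    buffer.delete(0, pos + delimiter.length());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (buffer.length() > 0) {
            records.add(buffer.toString()); // trailing record without delimiter
        }
        return records;
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
                "a\r\nb\r\nc".getBytes(StandardCharsets.UTF_8));
        System.out.println(readRecords(in, "\r\n")); // [a, b, c]
    }
}
```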
[jira] [Commented] (FLINK-2590) DataSetUtils.zipWithUniqueID creates duplicate IDs
[ https://issues.apache.org/jira/browse/FLINK-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721935#comment-14721935 ] ASF GitHub Bot commented on FLINK-2590: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/1075#issuecomment-136245681 @rmetzger +1. I think adding a test is helpful. Also, can you give us information that proves `id = (counter << shifter) + taskId;` will never generate the same id in different tasks? And a minor thing in your issue description: is log2(8) = 3, not 4? DataSetUtils.zipWithUniqueID creates duplicate IDs -- Key: FLINK-2590 URL: https://issues.apache.org/jira/browse/FLINK-2590 Project: Flink Issue Type: Bug Components: Java API, Scala API Affects Versions: 0.10, master Reporter: Martin Junghanns Assignee: Martin Junghanns Priority: Minor The function creates IDs using the following code: {code:java} shifter = log2(numberOfParallelSubtasks); id = counter << shifter + taskId; {code} As the binary + is evaluated before the bitshift <<, this results in cases where different tasks create the same ID. It essentially calculates {code} counter*2^(shifter+taskId) {code} which is 0 for counter = 0 and all values of shifter and taskId. Consider the following example: numberOfParallelSubtasks = 8, shifter = log2(8) = 4 (maybe rename the function?) produces: {code} start: 1, shifter: 4 taskId: 4 label: 256 start: 2, shifter: 4 taskId: 3 label: 256 start: 4, shifter: 4 taskId: 2 label: 256 {code} I would suggest the following: {code} counter*2^(shifter) + taskId {code} which in code is equivalent to {code} shifter = log2(numberOfParallelSubtasks); id = (counter << shifter) + taskId; {code} and for our example produces: {code} start: 1, shifter: 4 taskId: 4 label: 20 start: 2, shifter: 4 taskId: 3 label: 35 start: 4, shifter: 4 taskId: 2 label: 66 {code} So we move the counter to the left and add the task id. As there is space for 2^shifter numbers, this prevents collisions.
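The precedence issue in the report can be reproduced directly: in Java, `+` binds tighter than `<<`, so the unparenthesized expression shifts by `shifter + taskId` and collides exactly as the example shows:

```java
// Demonstrates the precedence bug from FLINK-2590: '+' binds tighter
// than '<<', so the unparenthesized form shifts by (shifter + taskId)
// instead of adding taskId after the shift.
public class ZipWithUniqueIdDemo {

    static long buggyId(long counter, int shifter, int taskId) {
        return counter << shifter + taskId;   // == counter << (shifter + taskId)
    }

    static long fixedId(long counter, int shifter, int taskId) {
        return (counter << shifter) + taskId; // shift first, then add the task id
    }

    public static void main(String[] args) {
        // Values from the issue: 8 parallel subtasks, shifter = 4.
        System.out.println(buggyId(1, 4, 4)); // 256 -- collides with the next two
        System.out.println(buggyId(2, 4, 3)); // 256
        System.out.println(buggyId(4, 4, 2)); // 256
        System.out.println(fixedId(1, 4, 4)); // 20
        System.out.println(fixedId(2, 4, 3)); // 35
        System.out.println(fixedId(4, 4, 2)); // 66
    }
}
```

With the fix, the task id occupies the low `shifter` bits and the counter occupies the bits above them, so ids from different tasks can never coincide as long as taskId < 2^shifter.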
[jira] [Commented] (FLINK-2525) Add configuration support in Storm-compatibility
[ https://issues.apache.org/jira/browse/FLINK-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721918#comment-14721918 ] ASF GitHub Bot commented on FLINK-2525: --- Github user ffbin commented on the pull request: https://github.com/apache/flink/pull/1046#issuecomment-136241503 (same comment as quoted above) Add configuration support in Storm-compatibility Key: FLINK-2525 URL: https://issues.apache.org/jira/browse/FLINK-2525 Project: Flink Issue Type: New Feature Components: Storm Compatibility Reporter: fangfengbin Assignee: fangfengbin Spouts and Bolts are initialized by a call to `Spout.open(...)` and `Bolt.prepare()`, respectively. Both methods have a config `Map` as first parameter. This map is currently not populated, thus Spouts and Bolts cannot be configured with user-defined parameters. In order to support this feature, the spout and bolt wrapper classes need to be extended to create a proper `Map` object. Furthermore, the clients need to be extended to take a `Map` and translate it into a Flink `Configuration` that is forwarded to the wrappers for proper initialization of the map.
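The first step described in the comment (serializing the Storm config `Map` as a `byte[]` so it can be carried inside a Flink `Configuration`) can be sketched with plain Java serialization. This is an illustration, not the actual PR code; class and method names are made up, and it assumes the map's values are `Serializable`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: round-trip a Storm-style config map through a
// byte[] so it can be stored in a Flink Configuration and restored
// inside the spout/bolt wrapper.
public class StormConfigCodec {

    static byte[] serialize(Map<String, Object> stormConf) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new HashMap<>(stormConf)); // copy into a serializable map
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    @SuppressWarnings("unchecked")
    static Map<String, Object> deserialize(byte[] bytes) {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Map<String, Object>) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("exclamation.count", 3); // hypothetical user parameter
        Map<String, Object> restored = deserialize(serialize(conf));
        System.out.println(restored.get("exclamation.count")); // 3
    }
}
```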
[jira] [Assigned] (FLINK-2596) Failing Test: RandomSamplerTest
[ https://issues.apache.org/jira/browse/FLINK-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengxiang Li reassigned FLINK-2596:
------------------------------------

    Assignee: Chengxiang Li

> Failing Test: RandomSamplerTest
> -------------------------------
>
> Key: FLINK-2596
> URL: https://issues.apache.org/jira/browse/FLINK-2596
> Project: Flink
> Issue Type: Bug
> Components: Tests
> Reporter: Matthias J. Sax
> Assignee: Chengxiang Li
> Priority: Critical
> Labels: test-stability
>
> {noformat}
> Tests run: 17, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 14.925 sec <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest
> testReservoirSamplerWithMultiSourcePartitions2(org.apache.flink.api.java.sampling.RandomSamplerTest) Time elapsed: 0.444 sec <<< ERROR!
> java.lang.IllegalArgumentException: Comparison method violates its general contract!
> 	at java.util.TimSort.mergeLo(TimSort.java:747)
> 	at java.util.TimSort.mergeAt(TimSort.java:483)
> 	at java.util.TimSort.mergeCollapse(TimSort.java:410)
> 	at java.util.TimSort.sort(TimSort.java:214)
> 	at java.util.TimSort.sort(TimSort.java:173)
> 	at java.util.Arrays.sort(Arrays.java:659)
> 	at java.util.Collections.sort(Collections.java:217)
> 	at org.apache.flink.api.java.sampling.RandomSamplerTest.transferFromListToArrayWithOrder(RandomSamplerTest.java:375)
> 	at org.apache.flink.api.java.sampling.RandomSamplerTest.getSampledOutput(RandomSamplerTest.java:367)
> 	at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:338)
> 	at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330)
> 	at org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:290)
> 	at org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithMultiSourcePartitions2(RandomSamplerTest.java:212)
> Results :
> Tests in error:
> RandomSamplerTest.testReservoirSamplerWithMultiSourcePartitions2:212->verifyReservoirSamplerWithReplacement:290->verifyRandomSamplerWithSampleSize:330->verifyKSTest:338->getSampledOutput:367->transferFromListToArrayWithOrder:375 » IllegalArgument
> {noformat}
> https://travis-ci.org/apache/flink/jobs/77750329

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
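The "Comparison method violates its general contract!" error above is thrown by TimSort when a `Comparator` breaks the required total order. A common way this happens (a hypothetical sketch, not necessarily the exact cause in RandomSamplerTest) is a comparator that never returns 0 for equal elements:

```java
import java.util.Comparator;

public class ComparatorContractDemo {

    // Broken: for a[0] == b[0], compare(a, b) == 1 AND compare(b, a) == 1,
    // violating sgn(compare(x, y)) == -sgn(compare(y, x)). TimSort may detect
    // this and throw IllegalArgumentException during sorting.
    static final Comparator<double[]> BROKEN =
        (a, b) -> a[0] >= b[0] ? 1 : -1;

    // Correct: Double.compare returns 0 for equal values, restoring the contract.
    static final Comparator<double[]> FIXED =
        (a, b) -> Double.compare(a[0], b[0]);

    public static void main(String[] args) {
        double[] x = {0.5}, y = {0.5};
        System.out.println(BROKEN.compare(x, y)); // 1
        System.out.println(BROKEN.compare(y, x)); // 1 -- contract violated
        System.out.println(FIXED.compare(x, y)); // 0
    }
}
```

Since TimSort only detects the violation on certain input orderings, such a bug surfaces intermittently, which matches the test-stability label on this issue.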