[GitHub] flink pull request: [FLINK-1587][gelly] Added additional check for...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/440#issuecomment-76914047 The failure in TaskManagerFailsITCase should be unrelated to your PR. This should be fixed once the proper shading is in place. So don't worry and go ahead merging it. On Mon, Mar 2, 2015 at 8:55 PM, Vasia Kalavri notificati...@github.com wrote: There is a failure in TaskManagerFailsITCase for Hadoop 2.0.0-alpha on travis. Is this fixed or something I can ignore and go ahead and merge this one? :)) Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/440#issuecomment-76799370. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-999] Configurability of LocalExecutor
GitHub user jkirsch opened a pull request: https://github.com/apache/flink/pull/448 [FLINK-999] Configurability of LocalExecutor Add also the global configuration to the minicluster. This is an attempt to bring in the ability to pass configuration parameters to the embedded mini-cluster - this happened as with more slots, the machine ran out of network buffers. The config could then be initialized using for example `GlobalConfiguration.loadConfiguration(conf);` You can merge this pull request into a Git repository by running: $ git pull https://github.com/jkirsch/incubator-flink configuration Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/448.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #448 commit b37ca76cbfe5b3ede1a130736f4b39d78c980928 Author: Johannes jkirschn...@gmail.com Date: 2015-03-03T15:17:47Z [FLINK-999] Configurability of LocalExecutor Add also the global configuration to the minicluster
[GitHub] flink pull request: Streaming cancellation + exception handling re...
GitHub user gyfora opened a pull request: https://github.com/apache/flink/pull/449 Streaming cancellation + exception handling rework This PR reworks the way runtime exceptions are handled in the streaming runtime. User code and other types of exceptions thrown during the invocation are now properly propagated. This PR also introduces proper cancellation for StreamInvokables by extending the Source and SinkFunction interfaces with a cancel method. Some sources which maintain connections are also reworked to close the connections in any case. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1625 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/449.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #449 commit e3b26bebe4c9c42790d8f1a30573cbfaf493a45c Author: Gyula Fora gyf...@apache.org Date: 2015-03-03T10:49:37Z [FLINK-1625] [streaming] Refactored StreamVertex and subclasses to clean up after invoke and properly log and propagate exceptions commit 93fb55670523bc52891395c4a32d3bbadf38811f Author: Gyula Fora gyf...@apache.org Date: 2015-03-03T13:43:33Z [FLINK-1625] [streaming] Added proper cancellation to StreamInvokables + Sink- and SourceFunction interfaces extended with cancel method
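The cancellation contract described in this PR can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: `CancellableSource` and `CounterSource` are hypothetical stand-ins for a source interface extended with a cancel method, and the volatile-flag pattern is one common way such a `cancel` could be implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a source interface extended with a cancel method.
interface CancellableSource<T> {
    void run(Consumer<T> collector) throws Exception;
    void cancel();
}

// Emits increasing integers until it is cancelled or the limit is reached.
class CounterSource implements CancellableSource<Integer> {
    // volatile so that a cancel() issued from another thread becomes visible to run()
    private volatile boolean running = true;
    private final int limit;

    CounterSource(int limit) {
        this.limit = limit;
    }

    @Override
    public void run(Consumer<Integer> collector) {
        try {
            for (int i = 0; running && i < limit; i++) {
                collector.accept(i);
            }
        } finally {
            // A source that maintains a connection would close it here,
            // so that cleanup happens whether run() ends normally, by cancel, or by exception.
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}

class CancelSketch {
    // Runs the source and cancels it from inside the collector after three elements.
    static List<Integer> runAndCancelAfterThree() {
        List<Integer> out = new ArrayList<>();
        CounterSource source = new CounterSource(1000);
        source.run(value -> {
            out.add(value);
            if (out.size() == 3) {
                source.cancel();
            }
        });
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runAndCancelAfterThree()); // prints [0, 1, 2]
    }
}
```

The flag is checked once per emitted element, so a source that blocks inside a long I/O call would additionally need an interrupt or a connection close to react promptly.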
[GitHub] flink pull request: [FLINK-1350][FLINK-1359][Distributed runtime] ...
Github user uce closed the pull request at: https://github.com/apache/flink/pull/356
[GitHub] flink pull request: [FLINK-377] [FLINK-671] Generic Interface / PA...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-76973194 Yes, this sounds good? Another thing: it has probably already come up but I just want to make sure, you implement CoGroup and Reduce the way you do because of performance, correct? That is, you don't do any work in the user code of a ReduceOperator but you do it in a chained MapPartition because there you get all the elements which makes communication with the python process more efficient. Same with CoGroup, where you implement your own grouping logic in python from the raw input streams. Overall I like the architecture, the communication between the host and the guest language is well abstracted and I can see this being reused for other languages. Could you rename the CoGroupPython* classes to something more generic? Because they really are a part of the generic language binding stuff and not specific to python, correct?
[jira] [Commented] (FLINK-1637) Flink uberjar does not work with Java 6
[ https://issues.apache.org/jira/browse/FLINK-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344927#comment-14344927 ] Max Michels commented on FLINK-1637: There are tools like ProGuard which can reduce the number of class files by removing unused code. https://developer.android.com/tools/help/proguard.html Flink uberjar does not work with Java 6 --- Key: FLINK-1637 URL: https://issues.apache.org/jira/browse/FLINK-1637 Project: Flink Issue Type: Bug Affects Versions: 0.9 Environment: Java 6 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Critical Apparently the uberjar created by maven shade does not work with java 6 {code} /jre1.6.0_45/bin/java -classpath flink-0.9-SNAPSHOT/lib/flink-dist-0.9-SNAPSHOT-yarn-uberjar.jar org.apache.flink.client.CliFrontend Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/client/CliFrontend Caused by: java.lang.ClassNotFoundException: org.apache.flink.client.CliFrontend at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) Could not find the main class: org.apache.flink.client.CliFrontend. Program will exit. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-1636) Misleading exception during concurrent partition release and remote request
Ufuk Celebi created FLINK-1636: -- Summary: Misleading exception during concurrent partition release and remote request Key: FLINK-1636 URL: https://issues.apache.org/jira/browse/FLINK-1636 Project: Flink Issue Type: Improvement Components: Distributed Runtime Reporter: Ufuk Celebi Priority: Minor When a result partition is released concurrently with a remote partition request, the request might come in late and result in an exception at the receiving task saying: {code} 16:04:22,499 INFO org.apache.flink.runtime.taskmanager.Task - CHAIN Partition - Map (Map at testRestartMultipleTimes(SimpleRecoveryITCase.java:200)) (1/4) switched to FAILED : java.io.IOException: org.apache.flink.runtime.io.network.partition.queue.IllegalQueueIteratorRequestException at remote input channel: Intermediate result partition has already been released.]. at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.checkIoError(RemoteInputChannel.java:223) at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.getNextBuffer(RemoteInputChannel.java:103) at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:310) at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:75) at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34) at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:59) at org.apache.flink.runtime.operators.NoOpDriver.run(NoOpDriver.java:91) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205) at java.lang.Thread.run(Thread.java:745) {code}
[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/410#issuecomment-76974155 Ping ...
[GitHub] flink pull request: [FLINK-1555] Add serializer hierarchy debug ut...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/415
[GitHub] flink pull request: Add auto-parallelism to Jobs (0.8 branch)
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/410#issuecomment-76978687 @rmetzger I don't see a reason why this should not go to master as well. After all, it's optional and quite useful if you want to run a job on the full cluster with as many available slots as possible.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345284#comment-14345284 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-76979731 yes they are implemented as they are for performance reasons. the python cogroup grouping logic is actually a direct port of the SortMergeCoGroupIterator. it also makes things a bit simpler since you can work on the assumption that an operator's function is not called more than once. if, on the java side, hasNext() returns false we know that we processed all input data, something you usually can only say when close() was called. the coGroupPython* stuff is generic, will try to come up with a more suitable name. Create a general purpose framework for language bindings Key: FLINK-377 URL: https://issues.apache.org/jira/browse/FLINK-377 Project: Flink Issue Type: Improvement Reporter: GitHub Import Assignee: Chesnay Schepler Labels: github-import Fix For: pre-apache A general purpose API to run operators with arbitrary binaries. This will allow to run Stratosphere programs written in Python, JavaScript, Ruby, Go or whatever you like. We suggest using Google Protocol Buffers for data serialization. This is the list of languages that currently support ProtoBuf: https://code.google.com/p/protobuf/wiki/ThirdPartyAddOns Very early prototype with python: https://github.com/rmetzger/scratch/tree/learn-protobuf (basically testing protobuf) For Ruby: https://github.com/infochimps-labs/wukong Two new students working at Stratosphere (@skunert and @filiphaase) are working on this. The reference binding language will be for Python, but other bindings are very welcome. The best name for this so far is stratosphere-lang-bindings.
I created this issue to track the progress (and give everybody a chance to comment on this) Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/377 Created by: [rmetzger|https://github.com/rmetzger] Labels: enhancement, Assignee: [filiphaase|https://github.com/filiphaase] Created at: Tue Jan 07 19:47:20 CET 2014 State: open
[jira] [Commented] (FLINK-1577) Misleading error messages when cancelling tasks
[ https://issues.apache.org/jira/browse/FLINK-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345300#comment-14345300 ] Robert Metzger commented on FLINK-1577: --- Thank you! Misleading error messages when cancelling tasks --- Key: FLINK-1577 URL: https://issues.apache.org/jira/browse/FLINK-1577 Project: Flink Issue Type: Improvement Components: Distributed Runtime Affects Versions: master Reporter: Ufuk Celebi A user running a Flink version before bec9c4d ran into a job manager failure (fixed in bec9c4d), which led to restarting the JM and cancelling/clearing all tasks on the TMs. The logs of the TMs were inconclusive. I think part of that has been fixed by now, e.g. there is a log message when cancelAndClearEverything is called, but the task thread (RuntimeEnvironment) always logs an error when interrupted during the run method -- even if the task gets cancelled. I think these error messages are misleading and only the root cause is important (i.e. non-failed tasks should be silently cancelled).
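The behaviour the issue asks for (log an error only for a real failure, stay quiet when the task was cancelled) could look roughly like the sketch below. It is an assumption-laden illustration, not the fix from the referenced commit: `QuietCancelSketch`, `Work`, `Outcome` and the in-memory `log` list are hypothetical names, and Flink's RuntimeEnvironment is not used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

class QuietCancelSketch {
    enum Outcome { FINISHED, CANCELED, FAILED }

    // Hypothetical unit of work that may throw, e.g. InterruptedException on cancel.
    interface Work {
        void run() throws Exception;
    }

    // Runs the work and decides how loudly to report an exception:
    // if the task was cancelled, the exception is a consequence and not the root cause.
    static Outcome execute(Work work, AtomicBoolean canceled, List<String> log) {
        try {
            work.run();
            return Outcome.FINISHED;
        } catch (Exception e) {
            if (canceled.get()) {
                log.add("DEBUG task was canceled");
                return Outcome.CANCELED;
            }
            log.add("ERROR task failed: " + e);
            return Outcome.FAILED;
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Cancelled task interrupted during run: no ERROR line is written.
        execute(() -> { throw new InterruptedException(); }, new AtomicBoolean(true), log);
        // Task that fails on its own: the root cause is logged as an error.
        execute(() -> { throw new IllegalStateException("boom"); }, new AtomicBoolean(false), log);
        System.out.println(log);
    }
}
```

Checking the cancel flag in the catch block keeps the decision in one place, at the cost of treating a genuine failure that happens to coincide with a cancel as a cancellation.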
[jira] [Commented] (FLINK-1555) Add utility to log the serializers of composite types
[ https://issues.apache.org/jira/browse/FLINK-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345255#comment-14345255 ] ASF GitHub Bot commented on FLINK-1555: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/415 Add utility to log the serializers of composite types - Key: FLINK-1555 URL: https://issues.apache.org/jira/browse/FLINK-1555 Project: Flink Issue Type: Improvement Reporter: Robert Metzger Assignee: Robert Metzger Priority: Minor Fix For: 0.9 Users affected by poor performance might want to understand how Flink is serializing their data. Therefore, it would be cool to have a tool utility which logs the serializers like this: {{SerializerUtils.getSerializers(TypeInformationPOJO t);}} to get {code} PojoSerializer TupleSerializer IntSer DateSer GenericTypeSer(java.sql.Date) PojoSerializer GenericTypeSer(HashMap) {code}
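A utility of the kind requested here would essentially walk a tree of serializers and print one line per node. The sketch below shows that idea only; the `Node` class is a hypothetical stand-in for a serializer with nested field serializers, it does not touch Flink's TypeInformation or the utility that was eventually merged, and the nesting in `main` is an assumed example because the indentation of the output quoted in the issue did not survive.

```java
import java.util.Arrays;
import java.util.List;

class SerializerTreeSketch {
    // Hypothetical node: a serializer name plus the serializers of its fields.
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Renders the tree depth-first, indenting each level by two spaces.
    static String render(Node root) {
        StringBuilder sb = new StringBuilder();
        append(root, 0, sb);
        return sb.toString();
    }

    private static void append(Node node, int depth, StringBuilder sb) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(node.name).append('\n');
        for (Node child : node.children) {
            append(child, depth + 1, sb);
        }
    }

    public static void main(String[] args) {
        // Assumed nesting, for illustration only.
        Node tree = new Node("PojoSerializer",
                new Node("TupleSerializer", new Node("IntSer"), new Node("DateSer")),
                new Node("GenericTypeSer(java.sql.Date)"));
        System.out.print(render(tree));
    }
}
```

Generic (fallback) serializers showing up deep in such a tree are the nodes a user with performance problems would look for first.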
[jira] [Resolved] (FLINK-1577) Misleading error messages when cancelling tasks
[ https://issues.apache.org/jira/browse/FLINK-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi resolved FLINK-1577. Resolution: Fixed Fixed in 9255594fb3b9b7c00d9088c3b630af9ecbdf22f4.
[GitHub] flink pull request: [FLINK-377] [FLINK-671] Generic Interface / PA...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-76981299 You could call it CoGroupRaw, just an idea... Once that and the split into the python and generic part is done I vote for merging this. The API looks good and other stuff, such as getting rid of the type annotations can be worked on afterwards. I think it would be good to get people that are interested to try it out. Also, the code is very well commented and documented. :smile_cat:
[jira] [Resolved] (FLINK-1555) Add utility to log the serializers of composite types
[ https://issues.apache.org/jira/browse/FLINK-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1555. --- Resolution: Fixed Fix Version/s: 0.9 This has been resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/226e9058.
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345296#comment-14345296 ] ASF GitHub Bot commented on FLINK-377: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-76981299 You could call it CoGroupRaw, just an idea... Once that and the split into the python and generic part is done I vote for merging this. The API looks good and other stuff, such as getting rid of the type annotations can be worked on afterwards. I think it would be good to get people that are interested to try it out. Also, the code is very well commented and documented. :smile_cat:
[jira] [Commented] (FLINK-1628) Strange behavior of where function during a join
[ https://issues.apache.org/jira/browse/FLINK-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345350#comment-14345350 ] Aljoscha Krettek commented on FLINK-1628: - [~fhueske] and I found the bug in the optimizer. We still have to run some tests, though, to be sure. Strange behavior of where function during a join -- Key: FLINK-1628 URL: https://issues.apache.org/jira/browse/FLINK-1628 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Daniel Bali Labels: batch Hello! If I use the `where` function with a field list during a join, it exhibits strange behavior. Here is the sample code that triggers the error: https://gist.github.com/balidani/d9789b713e559d867d5c This example joins a DataSet with itself, then counts the number of rows. If I use `.where(0, 1)` the result is (22), which is not correct. If I use `EdgeKeySelector`, I get the correct result (101). When I pass a field list to the `equalTo` function (but not `where`), everything works again. If I don't include the `groupBy` and `reduceGroup` parts, everything works. Also, when working with large DataSets, passing a field list to `where` makes it incredibly slow, even though I don't see any exceptions in the log (in DEBUG mode). Does anybody know what might cause this problem? Thanks!
[jira] [Commented] (FLINK-1628) Strange behavior of where function during a join
[ https://issues.apache.org/jira/browse/FLINK-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345353#comment-14345353 ] Aljoscha Krettek commented on FLINK-1628: - This is the fix, line 128 in AbstractJoinDescriptor:
{code}
@Override
public boolean areCompatible(RequestedGlobalProperties requested1, RequestedGlobalProperties requested2,
        GlobalProperties produced1, GlobalProperties produced2) {
    // Only when both inputs request key partitioning do the produced properties have to match.
    if (requested1.getPartitioning().isPartitionedOnKey() && requested2.getPartitioning().isPartitionedOnKey()) {
        return produced1.getPartitioning() == produced2.getPartitioning()
            && produced1.getPartitioningFields().equals(produced2.getPartitioningFields())
            && (produced1.getCustomPartitioner() == null
                ? produced2.getCustomPartitioner() == null
                : produced1.getCustomPartitioner().equals(produced2.getCustomPartitioner()));
    } else {
        return true;
    }
}
{code}
[jira] [Commented] (FLINK-1637) Flink uberjar does not work with Java 6
[ https://issues.apache.org/jira/browse/FLINK-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345404#comment-14345404 ] ASF GitHub Bot commented on FLINK-1637: --- GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/450 [FLINK-1637] Reduce number of files in uberjar for java 6 It seems that we've recently surpassed the magic number of 65536 files in our YARN uberjar. Java 6 is not able to read jar files with so many files. Therefore, I added a check to our travis tests which verifies that the number of files in the jar is lower than 65k. To solve the problem for now, I've removed the `flink-streaming-connectors` from the distribution. If users want to use them, they can just add them as a dependency and they should be fine (assuming our classloading is working properly). I've also removed `flink-hbase` which is also a module users can easily get through mvn central. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink flink1637 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/450.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #450 commit f3c80c6cab02e9f049cdefceff424f4d431c37e1 Author: Robert Metzger rmetz...@apache.org Date: 2015-03-03T11:44:21Z [FLINK-1637] Reduce number of files in uberjar for java 6
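The kind of check mentioned in the pull request (fail the build when the jar holds too many files) can be sketched in plain Java as below. This is not the Travis check from the PR, whose implementation is not shown in this thread; `JarEntryCountCheck` and its methods are hypothetical names, and the limit of 65536 is taken from the PR description. Only `java.util.zip` classes from the JDK are used.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

class JarEntryCountCheck {
    // Limit quoted in the pull request description.
    static final int MAX_ENTRIES = 65536;

    // Returns the number of entries (files and directories) in the given jar/zip file.
    static int countEntries(String jarPath) {
        try (ZipFile zip = new ZipFile(jarPath)) {
            return zip.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static boolean isBelowLimit(int entryCount) {
        return entryCount < MAX_ENTRIES;
    }

    // Demo helper: writes a temporary jar with n empty entries (n must be at least 1).
    static File createJarWithEntries(int n) {
        try {
            File jar = File.createTempFile("sketch", ".jar");
            jar.deleteOnExit();
            try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(jar))) {
                for (int i = 0; i < n; i++) {
                    out.putNextEntry(new ZipEntry("entry-" + i + ".class"));
                    out.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // With no argument, a small temporary jar is checked instead of a real uberjar.
        String path = args.length > 0 ? args[0] : createJarWithEntries(3).getPath();
        int count = countEntries(path);
        System.out.println(count + " entries, below limit: " + isBelowLimit(count));
        if (!isBelowLimit(count)) {
            System.exit(1); // non-zero exit code lets a CI script fail the build
        }
    }
}
```

Passing the path of the built uberjar as the first argument would turn this into a simple build gate.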
[jira] [Commented] (FLINK-1616) Action list -r gives IOException when there are running jobs
[ https://issues.apache.org/jira/browse/FLINK-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14345457#comment-14345457 ] Stephan Ewen commented on FLINK-1616: - There are a few issues with the CLI frontend. I am doing a major cleanup today. Action list -r gives IOException when there are running jobs -- Key: FLINK-1616 URL: https://issues.apache.org/jira/browse/FLINK-1616 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Vasia Kalavri Priority: Minor Here's the full exception: java.io.IOException: Could not retrieve running jobs from job manager. at org.apache.flink.client.CliFrontend.list(CliFrontend.java:528) at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1089) at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1114) Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://flink@10.20.0.25:13245/user/jobmanager#-1517424081]] after [10 ms] at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:333) at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117) at akka.actor.LightArrayRevolverScheduler$TaskHolder.run(Scheduler.scala:476) at akka.actor.LightArrayRevolverScheduler$$anonfun$close$1.apply(Scheduler.scala:282) at akka.actor.LightArrayRevolverScheduler$$anonfun$close$1.apply(Scheduler.scala:281) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at akka.actor.LightArrayRevolverScheduler.close(Scheduler.scala:280) at akka.actor.ActorSystemImpl.stopScheduler(ActorSystem.scala:688) at akka.actor.ActorSystemImpl$$anonfun$liftedTree2$1$1.apply$mcV$sp(ActorSystem.scala:617) at akka.actor.ActorSystemImpl$$anonfun$liftedTree2$1$1.apply(ActorSystem.scala:617) at 
akka.actor.ActorSystemImpl$$anonfun$liftedTree2$1$1.apply(ActorSystem.scala:617) at akka.actor.ActorSystemImpl$$anon$3.run(ActorSystem.scala:641) at akka.actor.ActorSystemImpl$TerminationCallbacks$$anonfun$run$1.runNext$1(ActorSystem.scala:808) at akka.actor.ActorSystemImpl$TerminationCallbacks$$anonfun$run$1.apply$mcV$sp(ActorSystem.scala:811) at akka.actor.ActorSystemImpl$TerminationCallbacks$$anonfun$run$1.apply(ActorSystem.scala:804) at akka.actor.ActorSystemImpl$TerminationCallbacks$$anonfun$run$1.apply(ActorSystem.scala:804) at akka.util.ReentrantGuard.withGuard(LockUtil.scala:15) at akka.actor.ActorSystemImpl$TerminationCallbacks.run(ActorSystem.scala:804) at akka.actor.ActorSystemImpl$$anonfun$terminationCallbacks$1.apply(ActorSystem.scala:638) at akka.actor.ActorSystemImpl$$anonfun$terminationCallbacks$1.apply(ActorSystem.scala:638) at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59) at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59) at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72) at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58) at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) If there are no running jobs, no exception is thrown.
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345545#comment-14345545 ] ASF GitHub Bot commented on FLINK-1522: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/441#issuecomment-77013031 Hi @balidani! You're right, there is no way to check this without modifying the library method. You can ignore my last bullet point for now :) Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well.
[GitHub] flink pull request: [FLINK-1522][FLINK-1576] Updated LabelPropagat...
Github user balidani commented on the pull request: https://github.com/apache/flink/pull/441#issuecomment-76998835 Hi @vasia! Thanks for the ideas! I tried to add more test cases that reflect them. However, I'm not sure about the last bullet-point. Do you think my last test case matches this? Cheers!
[jira] [Comment Edited] (FLINK-1639) Document the Flink deployment scripts to make sure others know how to make release
[ https://issues.apache.org/jira/browse/FLINK-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345637#comment-14345637 ] Henry Saputra edited comment on FLINK-1639 at 3/3/15 8:04 PM: -- Cool! Thanks, Marton, no good goes unpunished =P was (Author: hsaputra): Cool! Thanks [~ mbalassi], no good goes unpunished =P Document the Flink deployment scripts to make sure others know how to make release -- Key: FLINK-1639 URL: https://issues.apache.org/jira/browse/FLINK-1639 Project: Flink Issue Type: Task Components: release Reporter: Henry Saputra Assignee: Márton Balassi Currently, Robert knows the details of the Flink deployment and release scripts that support both Hadoop versions. We need to document the black magic used in the scripts so that others know how the flow works, in case we need to push a release and Robert is not available.
[jira] [Updated] (FLINK-1637) Flink uberjar does not work with Java 6
[ https://issues.apache.org/jira/browse/FLINK-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1637: -- Component/s: Build System Flink uberjar does not work with Java 6 --- Key: FLINK-1637 URL: https://issues.apache.org/jira/browse/FLINK-1637 Project: Flink Issue Type: Bug Components: Build System Affects Versions: 0.9 Environment: Java 6 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Critical Apparently the uberjar created by maven shade does not work with java 6 {code} /jre1.6.0_45/bin/java -classpath flink-0.9-SNAPSHOT/lib/flink-dist-0.9-SNAPSHOT-yarn-uberjar.jar org.apache.flink.client.CliFrontend Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/client/CliFrontend Caused by: java.lang.ClassNotFoundException: org.apache.flink.client.CliFrontend at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) Could not find the main class: org.apache.flink.client.CliFrontend. Program will exit. {code}
[jira] [Assigned] (FLINK-1633) Add getTriplets() Gelly method
[ https://issues.apache.org/jira/browse/FLINK-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andra Lungu reassigned FLINK-1633: -- Assignee: Andra Lungu Add getTriplets() Gelly method -- Key: FLINK-1633 URL: https://issues.apache.org/jira/browse/FLINK-1633 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Andra Lungu Priority: Minor Labels: starter In some graph algorithms, it is required to access the graph edges together with the vertex values of the source and target vertices. For example, several graph weighting schemes compute some kind of similarity weights for edges, based on the attributes of the source and target vertices. This issue proposes adding a convenience Gelly method that generates a DataSet of srcVertex, Edge, TrgVertex triplets from the input graph.
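For illustration only, a minimal plain-Java sketch of what such a triplet view contains: each edge joined with the values of its source and target vertices. It does not use the Gelly API. The names `TripletSketch`, `Edge`, `Triplet` and `triplets` are hypothetical, and the signature of the proposed `getTriplets()` method may well differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT Gelly classes.
final class TripletSketch {

    static final class Edge {
        final long src;
        final long trg;
        final double value;

        Edge(long src, long trg, double value) {
            this.src = src;
            this.trg = trg;
            this.value = value;
        }
    }

    static final class Triplet {
        final long src;
        final String srcValue;
        final long trg;
        final String trgValue;
        final double edgeValue;

        Triplet(long src, String srcValue, long trg, String trgValue, double edgeValue) {
            this.src = src;
            this.srcValue = srcValue;
            this.trg = trg;
            this.trgValue = trgValue;
            this.edgeValue = edgeValue;
        }

        @Override
        public String toString() {
            return "(" + src + ":" + srcValue + ") -[" + edgeValue + "]-> (" + trg + ":" + trgValue + ")";
        }
    }

    // Joins every edge with the values of its source and target vertices.
    static List<Triplet> triplets(Map<Long, String> vertexValues, List<Edge> edges) {
        List<Triplet> result = new ArrayList<>();
        for (Edge e : edges) {
            String srcValue = vertexValues.get(e.src);
            String trgValue = vertexValues.get(e.trg);
            if (srcValue == null || trgValue == null) {
                continue; // skip edges whose endpoints are not in the vertex set
            }
            result.add(new Triplet(e.src, srcValue, e.trg, trgValue, e.value));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, String> vertices = new HashMap<>();
        vertices.put(1L, "a");
        vertices.put(2L, "b");
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(1L, 2L, 0.5));
        for (Triplet t : triplets(vertices, edges)) {
            System.out.println(t);
        }
    }
}
```

In a distributed setting the same result would come from two joins (edges with vertices on the source id, then on the target id); the in-memory map here only stands in for that.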
[jira] [Commented] (FLINK-1639) Document the Flink deployment scripts to make sure others know how to make release
[ https://issues.apache.org/jira/browse/FLINK-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345634#comment-14345634 ] Márton Balassi commented on FLINK-1639: --- I do agree that documentation is needed on the release process. It is worth noting, however, that I also know how to use the scripts; I did a reasonable update for the 0.8.0 release. :) Ufuk also did a release a while ago. All things considered, I hope that I can do this while preparing the next release. Document the Flink deployment scripts to make sure others know how to make release -- Key: FLINK-1639 URL: https://issues.apache.org/jira/browse/FLINK-1639 Project: Flink Issue Type: Task Components: release Reporter: Henry Saputra Currently, Robert knows the details of the Flink deployment and release scripts that support both Hadoop versions. We need to document the black magic used in the scripts so that others know how the flow works, in case we need to push a release and Robert is not available.
[GitHub] flink pull request: [FLINK-1522][FLINK-1576] Updated LabelPropagat...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/441#issuecomment-77013031 Hi @balidani! You're right, there is no way to check this without modifying the library method. You can ignore my last bullet point for now :)
[jira] [Commented] (FLINK-1522) Add tests for the library methods and examples
[ https://issues.apache.org/jira/browse/FLINK-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345412#comment-14345412 ] ASF GitHub Bot commented on FLINK-1522: --- Github user balidani commented on the pull request: https://github.com/apache/flink/pull/441#issuecomment-76998835 Hi @vasia! Thanks for the ideas! I tried to add more test cases that reflect them. However, I'm not sure about the last bullet-point. Do you think my last test case matches this? Cheers! Add tests for the library methods and examples -- Key: FLINK-1522 URL: https://issues.apache.org/jira/browse/FLINK-1522 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Daniel Bali Labels: easyfix, test The current tests in gelly test one method at a time. We should have some tests for complete applications. As a start, we could add one test case per example and this way also make sure that our graph library methods actually give correct results. I'm assigning this to [~andralungu] because she has already implemented the test for SSSP, but I will help as well.
[jira] [Commented] (FLINK-1639) Document the Flink deployment scripts to make sure others know how to make release
[ https://issues.apache.org/jira/browse/FLINK-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345637#comment-14345637 ] Henry Saputra commented on FLINK-1639: -- Cool! Thanks [~ mbalassi], no good goes unpunished =P Document the Flink deployment scripts to make sure others know how to make release -- Key: FLINK-1639 URL: https://issues.apache.org/jira/browse/FLINK-1639 Project: Flink Issue Type: Task Components: release Reporter: Henry Saputra Assignee: Márton Balassi Currently, Robert knows the details of the Flink deployment and release scripts that support both Hadoop versions. We need to document the black magic used in the scripts so that others know how the flow works, in case we need to push a release and Robert is not available.
[GitHub] flink pull request: [FLINK-1587][gelly] Added additional check for...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/440#issuecomment-77004964 perfect, thanks!
[jira] [Commented] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()
[ https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345455#comment-14345455 ] ASF GitHub Bot commented on FLINK-1587: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/440#issuecomment-77004964 perfect, thanks! coGroup throws NoSuchElementException on iterator.next() Key: FLINK-1587 URL: https://issues.apache.org/jira/browse/FLINK-1587 Project: Flink Issue Type: Bug Components: Gelly Environment: flink-0.8.0-SNAPSHOT Reporter: Carsten Brandt Assignee: Andra Lungu I am receiving the following exception when running a simple job that extracts outdegree from a graph using Gelly. It is currently only failing on the cluster and I am not able to reproduce it locally. I will try that in the next few days. {noformat} 02/20/2015 02:27:02: CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) switched to FAILED java.util.NoSuchElementException at java.util.Collections$EmptyIterator.next(Collections.java:3006) at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665) at org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:745) 02/20/2015 02:27:02: Job execution switched to status FAILING ... {noformat} The error occurs in Gelly's Graph.java at this line: https://github.com/apache/flink/blob/a51c02f6e8be948d71a00c492808115d622379a7/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L636 Is there any valid case where a coGroup Iterator may be empty? As far as I can see, there is a bug somewhere. I'd like to write a test case for this to reproduce the issue. Where can I put such a test?
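The trace above shows `next()` being called on an empty iterator. In a coGroup, one side is handed an empty group whenever a key occurs in only one of the two inputs, so an empty iterator is a valid case. As a hedged illustration of the defensive pattern (plain Java, not the Gelly source; `EmptyGroupSketch` and `countNeighbors` are hypothetical names), the function can check `hasNext()` before reading:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

// Illustration only; this is NOT the code in Graph.java.
final class EmptyGroupSketch {

    // Counts the edges grouped under one vertex key. Either group may be empty
    // when the key is present in only one of the two inputs.
    static long countNeighbors(Iterable<Long> vertexGroup, Iterable<Long> edgeGroup) {
        Iterator<Long> vertexIt = vertexGroup.iterator();
        if (!vertexIt.hasNext()) {
            // Calling next() here would throw java.util.NoSuchElementException.
            throw new IllegalArgumentException(
                    "edge refers to a vertex id that is not in the vertex set");
        }
        vertexIt.next();

        long count = 0;
        for (Long ignored : edgeGroup) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countNeighbors(Arrays.asList(1L), Arrays.asList(2L, 3L)));
        try {
            countNeighbors(Collections.<Long>emptyList(), Arrays.asList(2L));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether to fail with a clear message, as in this sketch, or to emit nothing for such keys is a design choice.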
[GitHub] flink pull request: [FLINK-1633][gelly] Added getTriplets() method...
GitHub user andralungu opened a pull request: https://github.com/apache/flink/pull/452 [FLINK-1633][gelly] Added getTriplets() method and test A convenience Gelly method that generates a DataSet of srcVertexId, trgVertexId, srcVertexVal, trgVertexVal, edgeValue from the input graph. You can merge this pull request into a Git repository by running: $ git pull https://github.com/andralungu/flink triplets Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/452.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #452 commit a14d27586783e9fce6996851b45f30fcbd9e4782 Author: andralungu lungu.an...@gmail.com Date: 2015-03-03T22:03:43Z [FLINK-1633][gelly] Added getTriplets() method and test
[jira] [Resolved] (FLINK-1631) Port collisions in ProcessReaping tests
[ https://issues.apache.org/jira/browse/FLINK-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-1631. - Resolution: Fixed Fixed via 94a66d570e4bb40824813911a4f1bb47a8bf8b90 Port collisions in ProcessReaping tests --- Key: FLINK-1631 URL: https://issues.apache.org/jira/browse/FLINK-1631 Project: Flink Issue Type: Bug Components: Distributed Runtime Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 The process reaping tests for the JobManager spawn a process that starts a webserver on the default port. It may happen that this port is not available, due to another concurrently running task. To prevent this, I suggest adding an option to not start the webserver, by setting the webserver port to {{-1}}.
[jira] [Created] (FLINK-1640) FileOutputFormat writes to wrong path if path ends with '/'
Fabian Hueske created FLINK-1640: Summary: FileOutputFormat writes to wrong path if path ends with '/' Key: FLINK-1640 URL: https://issues.apache.org/jira/browse/FLINK-1640 Project: Flink Issue Type: Bug Components: Java API, Scala API Affects Versions: 0.8.1, 0.9 Reporter: Fabian Hueske The FileOutputFormat duplicates the last directory of a path, if the path ends with a slash '/'. For example, if the output path is specified as {{/home/myuser/outputPath/}} the output is written to {{/home/myuser/outputPath/outputPath/}}. This bug was introduced by commit 8fc04e4da8a36866e10564205c3f900894f4f6e0
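One general way to make path handling insensitive to a trailing slash is to normalize the path before deriving a parent or a last component from it. The sketch below uses plain string handling rather than Flink's `Path` class, and the helper names are hypothetical; it illustrates the idea and is not the fix that was applied.

```java
// Illustration only; does not use Flink's Path class.
final class TrailingSlashSketch {

    // Removes trailing '/' characters but keeps a lone "/" (the root) intact.
    static String stripTrailingSlashes(String path) {
        int end = path.length();
        while (end > 1 && path.charAt(end - 1) == '/') {
            end--;
        }
        return path.substring(0, end);
    }

    // Returns the last path component, ignoring any trailing slashes.
    static String lastComponent(String path) {
        String normalized = stripTrailingSlashes(path);
        return normalized.substring(normalized.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingSlashes("/home/myuser/outputPath/"));
        System.out.println(lastComponent("/home/myuser/outputPath/"));
    }
}
```

With the normalization in place, `/home/myuser/outputPath` and `/home/myuser/outputPath/` are treated as the same directory.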
[jira] [Commented] (FLINK-377) Create a general purpose framework for language bindings
[ https://issues.apache.org/jira/browse/FLINK-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345656#comment-14345656 ] ASF GitHub Bot commented on FLINK-377: -- Github user zentol commented on the pull request: https://github.com/apache/flink/pull/202#issuecomment-77025149 renamed, rebased, re...structured! Create a general purpose framework for language bindings Key: FLINK-377 URL: https://issues.apache.org/jira/browse/FLINK-377 Project: Flink Issue Type: Improvement Reporter: GitHub Import Assignee: Chesnay Schepler Labels: github-import Fix For: pre-apache A general purpose API to run operators with arbitrary binaries. This will allow running Stratosphere programs written in Python, JavaScript, Ruby, Go or whatever you like. We suggest using Google Protocol Buffers for data serialization. This is the list of languages that currently support ProtoBuf: https://code.google.com/p/protobuf/wiki/ThirdPartyAddOns Very early prototype with python: https://github.com/rmetzger/scratch/tree/learn-protobuf (basically testing protobuf) For Ruby: https://github.com/infochimps-labs/wukong Two new students working at Stratosphere (@skunert and @filiphaase) are working on this. The reference binding language will be for Python, but other bindings are very welcome. The best name for this so far is stratosphere-lang-bindings. I created this issue to track the progress (and give everybody a chance to comment on this) Imported from GitHub Url: https://github.com/stratosphere/stratosphere/issues/377 Created by: [rmetzger|https://github.com/rmetzger] Labels: enhancement, Assignee: [filiphaase|https://github.com/filiphaase] Created at: Tue Jan 07 19:47:20 CET 2014 State: open
[GitHub] flink pull request: [FLINK-1616] [client] Overhaul of the client.
GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/451 [FLINK-1616] [client] Overhaul of the client. - Fix bugs with non-serializable messages - Separate parser and action logic - Clean up tests - Vastly improve logging in CLI client - Additional tests for parsing / config setup in the command line client You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/451.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #451 commit 23190a553f3af015494e72d93a6d9615972c9b2a Author: Stephan Ewen se...@apache.org Date: 2015-03-03T20:49:37Z [FLINK-1631] [client] Overhaul of the client. - Fix bugs with non-serializable messages - Separate parser and action logic - Clean up tests - Vastly improve logging in CLI client - Additional tests for parsing / config setup in the command line client
[jira] [Assigned] (FLINK-1640) FileOutputFormat writes to wrong path if path ends with '/'
[ https://issues.apache.org/jira/browse/FLINK-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske reassigned FLINK-1640: Assignee: Fabian Hueske FileOutputFormat writes to wrong path if path ends with '/' --- Key: FLINK-1640 URL: https://issues.apache.org/jira/browse/FLINK-1640 Project: Flink Issue Type: Bug Components: Java API, Scala API Affects Versions: 0.9, 0.8.1 Reporter: Fabian Hueske Assignee: Fabian Hueske The FileOutputFormat duplicates the last directory of a path, if the path ends with a slash '/'. For example, if the output path is specified as {{/home/myuser/outputPath/}} the output is written to {{/home/myuser/outputPath/outputPath/}}. This bug was introduced by commit 8fc04e4da8a36866e10564205c3f900894f4f6e0
[GitHub] flink pull request: [FLINK-1587][gelly] Added additional check for...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/440#issuecomment-77068115 Hi, I was about to merge this one, but I have two doubts: - the tests print all execution output, including the exceptions, in standard output. Is this OK or should we avoid it? - apart from the expected exception, the last 3 tests also produce the following: ``` java.lang.Exception: The data preparation for task 'Reduce(SUM(1), at getDegrees(Graph.java:664) (981992a0fbf442e4039eba434b173362)' , caused an error: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: Bug in input gate/channel logic: input gate gotnotified by channel about available data, but none was available. at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:471) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: Bug in input gate/channel logic: input gate gotnotified by channel about available data, but none was available. at org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:607) at org.apache.flink.runtime.operators.RegularPactTask.getInput(RegularPactTask.java:1131) at org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:94) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:466) ... 3 more Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated due to an exception: Bug in input gate/channel logic: input gate gotnotified by channel about available data, but none was available. 
at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:784) Caused by: java.lang.IllegalStateException: Bug in input gate/channel logic: input gate gotnotified by channel about available data, but none was available. at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:314) at org.apache.flink.runtime.io.network.partition.consumer.UnionInputGate.getNextBufferOrEvent(UnionInputGate.java:141) at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:75) at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34) at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:59) at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ReadingThread.go(UnilateralSortMerger.java:958) at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:781) ``` I'm not sure where this is coming from and whether it's a problem in this case. Any ideas? Thanks ^^ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1587) coGroup throws NoSuchElementException on iterator.next()
[ https://issues.apache.org/jira/browse/FLINK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346050#comment-14346050 ] ASF GitHub Bot commented on FLINK-1587: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/440#issuecomment-77068115 Hi, I was about to merge this one, but I have two doubts: - the tests print all execution output, including the exceptions, in standard output. Is this OK or should we avoid it? - apart from the expected exception, the last 3 tests also produce the following: ``` java.lang.Exception: The data preparation for task 'Reduce(SUM(1), at getDegrees(Graph.java:664) (981992a0fbf442e4039eba434b173362)' , caused an error: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: Bug in input gate/channel logic: input gate gotnotified by channel about available data, but none was available. at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:471) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: Bug in input gate/channel logic: input gate gotnotified by channel about available data, but none was available. at org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:607) at org.apache.flink.runtime.operators.RegularPactTask.getInput(RegularPactTask.java:1131) at org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:94) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:466) ... 
3 more Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated due to an exception: Bug in input gate/channel logic: input gate gotnotified by channel about available data, but none was available. at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:784) Caused by: java.lang.IllegalStateException: Bug in input gate/channel logic: input gate gotnotified by channel about available data, but none was available. at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:314) at org.apache.flink.runtime.io.network.partition.consumer.UnionInputGate.getNextBufferOrEvent(UnionInputGate.java:141) at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:75) at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34) at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:59) at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ReadingThread.go(UnilateralSortMerger.java:958) at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:781) ``` I'm not sure where this is coming from and whether it's a problem in this case. Any ideas? Thanks ^^ coGroup throws NoSuchElementException on iterator.next() Key: FLINK-1587 URL: https://issues.apache.org/jira/browse/FLINK-1587 Project: Flink Issue Type: Bug Components: Gelly Environment: flink-0.8.0-SNAPSHOT Reporter: Carsten Brandt Assignee: Andra Lungu I am receiving the following exception when running a simple job that extracts outdegree from a graph using Gelly. It is currently only failing on the cluster and I am not able to reproduce it locally. Will try that the next days. 
{noformat} 02/20/2015 02:27:02: CoGroup (CoGroup at inDegrees(Graph.java:675)) (5/64) switched to FAILED java.util.NoSuchElementException at java.util.Collections$EmptyIterator.next(Collections.java:3006) at flink.graphs.Graph$CountNeighborsCoGroup.coGroup(Graph.java:665) at org.apache.flink.runtime.operators.CoGroupDriver.run(CoGroupDriver.java:130) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:493) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:257) at java.lang.Thread.run(Thread.java:745) 02/20/2015 02:27:02: Job execution switched to status FAILING ... {noformat} The error occurs in Gellys
[jira] [Commented] (FLINK-1627) Netty channel connect deadlock
[ https://issues.apache.org/jira/browse/FLINK-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344837#comment-14344837 ] Ufuk Celebi commented on FLINK-1627: I couldn't yet reproduce the problem (I did 10 builds on Travis, which all passed), but I've discovered some related issues, which I've addressed in 256b2ee and 6da093a. In 256b2ee I've addressed a swallowed exception, which could lead to a deadlock when handing in a channel failed. With 6da093a, interrupted exceptions during a wait-for-channel call are propagated up to the reader API when requesting a result partition. I will keep this open and trigger some further Travis builds in the hope of reproducing the error. Netty channel connect deadlock --- Key: FLINK-1627 URL: https://issues.apache.org/jira/browse/FLINK-1627 Project: Flink Issue Type: Bug Reporter: Ufuk Celebi [~StephanEwen] reports the following deadlock (https://travis-ci.org/StephanEwen/incubator-flink/jobs/52755844, logs: https://s3.amazonaws.com/flink.a.o.uce.east/travis-artifacts/StephanEwen/incubator-flink/477/477.2.tar.gz).
{code} CHAIN Partition - Map (Map at testRestartMultipleTimes(SimpleRecoveryITCase.java:200)) (2/4) daemon prio=10 tid=0x7f5fdc008800 nid=0xe230 in Object.wait() [0x7f5fca8f2000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0xf2a13530 (a java.lang.Object) at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:179) - locked 0xf2a13530 (a java.lang.Object) at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:125) at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:64) at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:53) at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestIntermediateResultPartition(RemoteInputChannel.java:92) at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:287) - locked 0xf29dbcd8 (a java.lang.Object) at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:306) at org.apache.flink.runtime.io.network.api.reader.AbstractRecordReader.getNextRecord(AbstractRecordReader.java:75) at org.apache.flink.runtime.io.network.api.reader.MutableRecordReader.next(MutableRecordReader.java:34) at org.apache.flink.runtime.operators.util.ReaderIterator.next(ReaderIterator.java:59) at org.apache.flink.runtime.operators.NoOpDriver.run(NoOpDriver.java:91) at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205) at 
java.lang.Thread.run(Thread.java:745) {code} {code} CHAIN Partition - Map (Map at testRestartMultipleTimes(SimpleRecoveryITCase.java:200)) (3/4) daemon prio=10 tid=0x7f5fdc005000 nid=0xe22f in Object.wait() [0x7f5fca9f3000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0xf2a13530 (a java.lang.Object) at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.waitForChannel(PartitionRequestClientFactory.java:179) - locked 0xf2a13530 (a java.lang.Object) at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory$ConnectingChannel.access$000(PartitionRequestClientFactory.java:125) at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:79) at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:53) at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestIntermediateResultPartition(RemoteInputChannel.java:92) at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:287) - locked 0xf2896f88 (a java.lang.Object) at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:306)
[jira] [Created] (FLINK-1637) Flink uberjar does not work with Java 6
Robert Metzger created FLINK-1637: - Summary: Flink uberjar does not work with Java 6 Key: FLINK-1637 URL: https://issues.apache.org/jira/browse/FLINK-1637 Project: Flink Issue Type: Bug Affects Versions: 0.9 Environment: Java 6 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Critical Apparently the uberjar created by maven shade does not work with java 6 {code} /jre1.6.0_45/bin/java -classpath flink-0.9-SNAPSHOT/lib/flink-dist-0.9-SNAPSHOT-yarn-uberjar.jar org.apache.flink.client.CliFrontend Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/client/CliFrontend Caused by: java.lang.ClassNotFoundException: org.apache.flink.client.CliFrontend at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) Could not find the main class: org.apache.flink.client.CliFrontend. Program will exit. {code}
[jira] [Commented] (FLINK-1637) Flink uberjar does not work with Java 6
[ https://issues.apache.org/jira/browse/FLINK-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344918#comment-14344918 ] Robert Metzger commented on FLINK-1637: --- The issue is that Java 6 cannot load jar files with more than 65536 files. http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7191282 It seems that we've recently surpassed that number for our uberjar. It contains 65868 files. Flink uberjar does not work with Java 6 --- Key: FLINK-1637 URL: https://issues.apache.org/jira/browse/FLINK-1637 Project: Flink Issue Type: Bug Affects Versions: 0.9 Environment: Java 6 Reporter: Robert Metzger Assignee: Robert Metzger Priority: Critical Apparently the uberjar created by maven shade does not work with java 6 {code} /jre1.6.0_45/bin/java -classpath flink-0.9-SNAPSHOT/lib/flink-dist-0.9-SNAPSHOT-yarn-uberjar.jar org.apache.flink.client.CliFrontend Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/client/CliFrontend Caused by: java.lang.ClassNotFoundException: org.apache.flink.client.CliFrontend at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) Could not find the main class: org.apache.flink.client.CliFrontend. Program will exit. {code}
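To check how close a given jar is to that limit, its entries can be counted with the JDK's `java.util.zip.ZipFile`. A minimal sketch follows; the class name `JarEntryCount` is hypothetical, and the threshold constant is simply the number quoted in the comment above, not something taken from Flink's build.

```java
import java.io.IOException;
import java.util.zip.ZipFile;

// Illustration only; run on Java 7 or newer (try-with-resources).
final class JarEntryCount {

    // Threshold as quoted in the JIRA comment; treat it as an assumption.
    static final int QUOTED_LIMIT = 65536;

    // Returns the number of entries (files and directories) in the archive.
    static int countEntries(String jarPath) throws IOException {
        try (ZipFile zip = new ZipFile(jarPath)) {
            return zip.size();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarEntryCount <path-to-jar>");
            return;
        }
        int entries = countEntries(args[0]);
        String note = entries > QUOTED_LIMIT ? " (above the quoted limit)" : "";
        System.out.println(entries + " entries" + note);
    }
}
```

Running it against the uberjar path from the report would show whether a build has crossed the threshold.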
[jira] [Commented] (FLINK-1635) Remove Apache Thrift dependency from Flink
[ https://issues.apache.org/jira/browse/FLINK-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344841#comment-14344841 ]

Stephan Ewen commented on FLINK-1635:

I agree, let's remove this. The way it is currently supported out of the box ties it to a specific version, which is bad for both protobuf and thrift.

Remove Apache Thrift dependency from Flink

Key: FLINK-1635
URL: https://issues.apache.org/jira/browse/FLINK-1635
Project: Flink
Issue Type: Bug
Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger

I've added Thrift and Protobuf to Flink to support them out of the box with Kryo. However, after trying to access an HCatalog/Hive table yesterday using Flink, I found that there is a dependency conflict between Flink and Hive (on Thrift). Maybe it makes more sense to properly document our serialization framework and provide a copy-paste solution on how to get thrift/protobuf et al. to work with Flink. Please chime in if you are against removing the out-of-the-box support for protobuf and thrift.
[jira] [Commented] (FLINK-1577) Misleading error messages when cancelling tasks
[ https://issues.apache.org/jira/browse/FLINK-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345169#comment-14345169 ]

Robert Metzger commented on FLINK-1577:

Another user was affected by this. It would be good to resolve this, because the messages are indeed misleading.

Misleading error messages when cancelling tasks

Key: FLINK-1577
URL: https://issues.apache.org/jira/browse/FLINK-1577
Project: Flink
Issue Type: Improvement
Components: Distributed Runtime
Affects Versions: master
Reporter: Ufuk Celebi

A user running a Flink version before bec9c4d ran into a job manager failure (fixed in bec9c4d), which led to restarting the JM and cancelling/clearing all tasks on the TMs. The logs of the TMs were inconclusive. I think part of that has been fixed by now, e.g. there is a log message when cancelAndClearEverything is called, but the task thread (RuntimeEnvironment) always logs an error when interrupted during the run method, even if the task gets cancelled. I think these error messages are misleading and only the root cause is important (i.e. non-failed tasks should be silently cancelled).
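The behaviour proposed in the issue (log an error only for genuine failures and treat an interruption caused by cancellation as expected) could look roughly like the sketch below. This is an illustration only, not Flink's RuntimeEnvironment: the `Task` class, its state names, and the logger name are all hypothetical.

```python
import logging
import threading

log = logging.getLogger("task")


class Task:
    """Hypothetical task wrapper used only to illustrate the logging policy."""

    def __init__(self, name, body):
        self.name = name
        self._body = body                      # callable that receives the task
        self._cancelled = threading.Event()    # set once cancellation is requested
        self.state = "CREATED"

    def cancel(self):
        self._cancelled.set()

    def is_cancelled(self):
        return self._cancelled.is_set()

    def run(self):
        self.state = "RUNNING"
        try:
            self._body(self)
            self.state = "CANCELED" if self.is_cancelled() else "FINISHED"
        except Exception:
            if self.is_cancelled():
                # The exception is a consequence of cancellation, not a root
                # cause, so it is recorded quietly instead of as an error.
                log.debug("Task %s was cancelled", self.name)
                self.state = "CANCELED"
            else:
                # A genuine failure: this is the message worth surfacing.
                log.exception("Task %s failed", self.name)
                self.state = "FAILED"
```

The design choice here is to check the cancellation flag inside the exception handler, so that the same exception type is logged at different levels depending on whether a cancel request preceded it.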