[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets
[ https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509139#comment-14509139 ] ASF GitHub Bot commented on FLINK-1615: --- Github user Elbehery commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-95603365 @aljoscha you should be able to see it from the PR .. anyway this is mine: https://github.com/Elbehery/flink

Introduces a new InputFormat for Tweets --- Key: FLINK-1615 URL: https://issues.apache.org/jira/browse/FLINK-1615 Project: Flink Issue Type: New Feature Components: flink-contrib Affects Versions: 0.8.1 Reporter: mustafa elbehery Priority: Minor

An event-driven parser for Tweets into Java POJOs. It parses all the important parts of the tweet into Java objects. Tested on a cluster, and the performance is pretty good.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1848) Paths containing a Windows drive letter cannot be used in FileOutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509282#comment-14509282 ] Alexander Alexandrov commented on FLINK-1848: - I think this should be fixed (with the PR) and can be closed now, right?

Paths containing a Windows drive letter cannot be used in FileOutputFormats --- Key: FLINK-1848 URL: https://issues.apache.org/jira/browse/FLINK-1848 Project: Flink Issue Type: Bug Affects Versions: 0.9 Environment: Windows (Cygwin and native) Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Critical Fix For: 0.9

Paths that contain a Windows drive letter such as {{file:///c:/my/directory}} cannot be used as output path for {{FileOutputFormat}}. If done, the following exception is thrown:
{code}
Caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:c:
	at org.apache.flink.core.fs.Path.initialize(Path.java:242)
	at org.apache.flink.core.fs.Path.<init>(Path.java:225)
	at org.apache.flink.core.fs.Path.<init>(Path.java:138)
	at org.apache.flink.core.fs.local.LocalFileSystem.pathToFile(LocalFileSystem.java:147)
	at org.apache.flink.core.fs.local.LocalFileSystem.mkdirs(LocalFileSystem.java:232)
	at org.apache.flink.core.fs.local.LocalFileSystem.mkdirs(LocalFileSystem.java:233)
	at org.apache.flink.core.fs.local.LocalFileSystem.mkdirs(LocalFileSystem.java:233)
	at org.apache.flink.core.fs.local.LocalFileSystem.mkdirs(LocalFileSystem.java:233)
	at org.apache.flink.core.fs.local.LocalFileSystem.mkdirs(LocalFileSystem.java:233)
	at org.apache.flink.core.fs.FileSystem.initOutPathLocalFS(FileSystem.java:603)
	at org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:233)
	at org.apache.flink.api.java.io.CsvOutputFormat.open(CsvOutputFormat.java:158)
	at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:183)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
	at java.lang.Thread.run(Unknown Source)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:c:
	at java.net.URI.checkPath(Unknown Source)
	at java.net.URI.<init>(Unknown Source)
	at org.apache.flink.core.fs.Path.initialize(Path.java:240)
	... 14 more
{code}
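The root cause above is that once the {{file://}} scheme is stripped, a path like {{c:/my/directory}} is handed to {{java.net.URI}}, which treats the drive letter as the start of a relative path inside an absolute URI. A minimal sketch of one possible normalization, with an illustrative helper name (this is an assumption about the shape of a fix, not Flink's actual patch):

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative helper: prefix a bare Windows drive-letter path with "/"
// so java.net.URI accepts it as an absolute path component.
public class WindowsPathNormalizer {

    static String normalize(String path) {
        // "c:/my/directory" -> "/c:/my/directory"; other paths pass through.
        if (path.length() >= 2 && Character.isLetter(path.charAt(0)) && path.charAt(1) == ':') {
            return "/" + path;
        }
        return path;
    }

    public static void main(String[] args) throws URISyntaxException {
        // Without normalize(), constructing the URI from "c:/my/directory"
        // fails with "Relative path in absolute URI", as in the trace above.
        URI uri = new URI("file", null, normalize("c:/my/directory"), null);
        System.out.println(uri);
    }
}
```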
[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat
Github user Elbehery commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-95603365 @aljoscha you should be able to see it from the PR .. anyway this is mine https://github.com/Elbehery/flink --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1828) Impossible to output data to an HBase table
[ https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509134#comment-14509134 ] ASF GitHub Bot commented on FLINK-1828: --- Github user fpompermaier commented on a diff in the pull request: https://github.com/apache/flink/pull/571#discussion_r28966088 --- Diff: flink-staging/flink-hbase/pom.xml --- @@ -112,6 +112,12 @@ under the License. </exclusion> </exclusions> </dependency> + <dependency> --- End diff -- Could you do that?

Impossible to output data to an HBase table --- Key: FLINK-1828 URL: https://issues.apache.org/jira/browse/FLINK-1828 Project: Flink Issue Type: Bug Components: Hadoop Compatibility Affects Versions: 0.9 Reporter: Flavio Pompermaier Labels: hadoop, hbase Fix For: 0.9

Right now it is not possible to use HBase TableOutputFormat as an output format, because Configurable.setConf is not called in the configure() method of HadoopOutputFormatBase.
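The fix being discussed amounts to one missing call: when the wrapped Hadoop output format implements Configurable, the wrapper's configure() must forward the configuration via setConf(). A self-contained sketch of that pattern, using stand-in interfaces and class names (not Hadoop's or Flink's actual types):

```java
// Stand-in types illustrating the missing check-and-call described in the issue.
public class ConfigurableDemo {

    interface Configurable {
        void setConf(String conf);
    }

    // Plays the role of HBase's TableOutputFormat, which needs its config pushed in.
    static class TableOutputFormatStub implements Configurable {
        String conf;
        @Override
        public void setConf(String conf) { this.conf = conf; }
    }

    // Plays the role of HadoopOutputFormatBase wrapping an arbitrary format.
    static class HadoopOutputFormatWrapper {
        final Object wrappedFormat;
        HadoopOutputFormatWrapper(Object wrappedFormat) { this.wrappedFormat = wrappedFormat; }

        void configure(String conf) {
            // The reported bug: without this instanceof check and setConf call,
            // a wrapped TableOutputFormat never receives its configuration.
            if (wrappedFormat instanceof Configurable) {
                ((Configurable) wrappedFormat).setConf(conf);
            }
        }
    }

    static String demo() {
        TableOutputFormatStub format = new TableOutputFormatStub();
        new HadoopOutputFormatWrapper(format).configure("hbase-site settings");
        return format.conf;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```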
[jira] [Commented] (FLINK-1693) Deprecate the Spargel API
[ https://issues.apache.org/jira/browse/FLINK-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509442#comment-14509442 ] ASF GitHub Bot commented on FLINK-1693: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/618#issuecomment-95668944 Thank you @hsaputra! Looks good :-)

Deprecate the Spargel API - Key: FLINK-1693 URL: https://issues.apache.org/jira/browse/FLINK-1693 Project: Flink Issue Type: Task Components: Spargel Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Henry Saputra

For the upcoming 0.9 release, we should mark all user-facing methods from the Spargel API as deprecated, with a warning that we are going to remove it at some point. We should also add a comment in the docs and point people to Gelly.
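The deprecation itself is mechanical: annotate each user-facing method with @Deprecated and point users to Gelly in the Javadoc. A sketch with a made-up method name (not an actual Spargel signature):

```java
// Illustration of the deprecation pattern; the method below is a stand-in,
// not a real Spargel API method.
public class SpargelDeprecationExample {

    /**
     * @deprecated The Spargel API is deprecated and will be removed;
     *             please use Gelly instead.
     */
    @Deprecated
    public static String runVertexCentricIteration() {
        return "deprecated path still works";
    }

    public static void main(String[] args) {
        // Callers still compile and run, but get a deprecation warning.
        System.out.println(runVertexCentricIteration());
    }
}
```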
[GitHub] flink pull request: FLINK-1693 Add DEPRECATED annotations in Sparg...
Github user vasia commented on the pull request: https://github.com/apache/flink/pull/618#issuecomment-95668944 Thank you @hsaputra! Looks good :-)
[GitHub] flink pull request: Fixed Configurable HadoopOutputFormat (FLINK-1...
Github user fpompermaier commented on a diff in the pull request: https://github.com/apache/flink/pull/571#discussion_r28966088 --- Diff: flink-staging/flink-hbase/pom.xml --- @@ -112,6 +112,12 @@ under the License. </exclusion> </exclusions> </dependency> + <dependency> --- End diff -- Could you do that?
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509090#comment-14509090 ] ASF GitHub Bot commented on FLINK-1930: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/619#issuecomment-95593489 Manually merged in 4dbf030a6b0415832862c3fd0c3fe7403878a998

NullPointerException in vertex-centric iteration Key: FLINK-1930 URL: https://issues.apache.org/jira/browse/FLINK-1930 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Vasia Kalavri

Hello to my Squirrels, I came across this exception when having a vertex-centric iteration output followed by a group-by. I'm not sure what is causing it, since I saw this error in a rather large pipeline, but I managed to reproduce it with [this code example|https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309] and a sufficiently large dataset, e.g. [this one|http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally). It seems like a null Buffer in RecordWriter. The exception message is the following:
{code}
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at org.apache.flink.runtime.jobmanager.JobManager$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
	at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
	at org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:37)
	at org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:30)
	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
	at org.apache.flink.runtime.ActorLogMessages$anon$1.applyOrElse(ActorLogMessages.scala:30)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
	at akka.dispatch.Mailbox.run(Mailbox.scala:221)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
	at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
	at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
	at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:745)
{code}
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509091#comment-14509091 ] ASF GitHub Bot commented on FLINK-1930: --- Github user StephanEwen closed the pull request at: https://github.com/apache/flink/pull/619
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509097#comment-14509097 ] Stephan Ewen commented on FLINK-1930: - Should give better error messages now. Please post if this happens again...
[GitHub] flink pull request: [FLINK-1930] [runtime] Improve exception when ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/619#issuecomment-95593489 Manually merged in 4dbf030a6b0415832862c3fd0c3fe7403878a998
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509124#comment-14509124 ] Vasia Kalavri commented on FLINK-1930: -- Ok, so now the program fails with this:
{code}
Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
	at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:144)
	at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
	at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:91)
	at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405)
	at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365)
	at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:745)
{code}
[jira] [Assigned] (FLINK-1865) Unstable test KafkaITCase
[ https://issues.apache.org/jira/browse/FLINK-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-1865: - Assignee: Robert Metzger (was: Márton Balassi)

Unstable test KafkaITCase - Key: FLINK-1865 URL: https://issues.apache.org/jira/browse/FLINK-1865 Project: Flink Issue Type: Bug Components: Streaming, Tests Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Robert Metzger

{code}
Running org.apache.flink.streaming.connectors.kafka.KafkaITCase
04/10/2015 13:46:53 Job execution switched to status RUNNING.
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to SCHEDULED
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to DEPLOYING
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to SCHEDULED
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to DEPLOYING
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to RUNNING
04/10/2015 13:46:53 Custom Source -> Stream Sink(1/1) switched to RUNNING
04/10/2015 13:47:04 Custom Source -> Stream Sink(1/1) switched to FAILED
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
	at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
	at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
	at org.apache.flink.streaming.api.invokable.ChainableInvokable.collect(ChainableInvokable.java:54)
	at org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
	at org.apache.flink.streaming.connectors.kafka.api.KafkaSource.run(KafkaSource.java:196)
	at org.apache.flink.streaming.api.invokable.SourceInvokable.callUserFunction(SourceInvokable.java:42)
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
	... 4 more
Caused by: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:166)
	at org.apache.flink.streaming.connectors.kafka.KafkaITCase$1.invoke(KafkaITCase.java:141)
	at org.apache.flink.streaming.api.invokable.SinkInvokable.callUserFunction(SinkInvokable.java:41)
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:139)
	... 9 more
04/10/2015 13:47:04 Job execution switched to status FAILING.
04/10/2015 13:47:04 Custom Source -> Stream Sink(1/1) switched to CANCELING
04/10/2015 13:47:04 Custom Source -> Stream Sink(1/1) switched to CANCELED
04/10/2015 13:47:04 Job execution switched to status FAILED.
04/10/2015 13:47:05 Job execution switched to status RUNNING.
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to SCHEDULED
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to DEPLOYING
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to SCHEDULED
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to DEPLOYING
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to RUNNING
04/10/2015 13:47:05 Custom Source -> Stream Sink(1/1) switched to RUNNING
04/10/2015 13:47:15 Custom Source -> Stream Sink(1/1) switched to FAILED
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
	at org.apache.flink.streaming.api.invokable.SourceInvokable.invoke(SourceInvokable.java:37)
	at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:168)
	at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221)
	at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.RuntimeException: org.apache.flink.streaming.connectors.kafka.KafkaITCase$SuccessException
	at org.apache.flink.streaming.api.invokable.StreamInvokable.callUserFunctionAndLogException(StreamInvokable.java:145)
	at
{code}
[jira] [Assigned] (FLINK-1921) Rework parallelism/slots handling for per-job YARN sessions
[ https://issues.apache.org/jira/browse/FLINK-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-1921: - Assignee: Robert Metzger

Rework parallelism/slots handling for per-job YARN sessions --- Key: FLINK-1921 URL: https://issues.apache.org/jira/browse/FLINK-1921 Project: Flink Issue Type: Bug Components: YARN Client Reporter: Robert Metzger Assignee: Robert Metzger

Right now, the -p argument is overwriting the -ys argument for per-job YARN sessions. Also, the priorities for parallelism should be documented (low to high):
1. flink-conf.yaml (-D arguments on YARN)
2. -p on ./bin/flink
3. ExecutionEnvironment.setParallelism()
4. Operator.setParallelism()
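The priority chain described in the issue can be sketched as a simple resolver in which each later, higher-priority source overrides the earlier ones. Method names and the -1 "not set" sentinel below are illustrative assumptions, not Flink's implementation:

```java
// Toy resolver for the documented priority order (low to high):
// config file, -p on the CLI, ExecutionEnvironment.setParallelism(),
// Operator.setParallelism(). A value of -1 means "not set".
public class ParallelismResolver {

    static int resolve(int fromConfigFile, int fromCliP, int fromEnvironment, int fromOperator) {
        int parallelism = fromConfigFile;         // lowest priority, always present
        if (fromCliP > 0)        parallelism = fromCliP;
        if (fromEnvironment > 0) parallelism = fromEnvironment;
        if (fromOperator > 0)    parallelism = fromOperator;  // highest priority
        return parallelism;
    }

    public static void main(String[] args) {
        System.out.println(resolve(4, 8, -1, -1));  // 8: -p overrides the config file
        System.out.println(resolve(4, 8, -1, 16));  // 16: operator setting wins over all
    }
}
```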
[jira] [Assigned] (FLINK-1920) Passing -D akka.ask.timeout=5 min to yarn client does not work
[ https://issues.apache.org/jira/browse/FLINK-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reassigned FLINK-1920: - Assignee: Robert Metzger

Passing -D akka.ask.timeout=5 min to yarn client does not work -- Key: FLINK-1920 URL: https://issues.apache.org/jira/browse/FLINK-1920 Project: Flink Issue Type: Bug Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger

That's probably an issue of the command line parsing. Variations like -D akka.ask.timeout=5 min are also not working.
[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets
[ https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508978#comment-14508978 ] ASF GitHub Bot commented on FLINK-1615: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-95570894 Where is your git repository? So that I can checkout your commit and merge it?
[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-95570894 Where is your git repository? So that I can checkout your commit and merge it?
[jira] [Created] (FLINK-1935) Reimplement PersistentKafkaSource using high level Kafka API
Robert Metzger created FLINK-1935: - Summary: Reimplement PersistentKafkaSource using high level Kafka API Key: FLINK-1935 URL: https://issues.apache.org/jira/browse/FLINK-1935 Project: Flink Issue Type: Improvement Components: Kafka Connector, Streaming Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 0.9

The current PersistentKafkaSource in Flink has some limitations that I seek to overcome by reimplementing it using Kafka's high-level API (and manually committing the offsets to ZK). The current PersistentKafkaSource does not integrate with existing Kafka tools (for example for monitoring the lag); that integration only works when the offsets are committed to ZK directly. All the communication with Zookeeper is implemented manually in our current code, which is prone to errors and inefficiencies.
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509002#comment-14509002 ] ASF GitHub Bot commented on FLINK-1398: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/308#issuecomment-95577570 +1 for not specializing code in the `DataSet`
A `TupleUtil` would be fine, in my opinion

A new DataSet function: extractElementFromTuple --- Key: FLINK-1398 URL: https://issues.apache.org/jira/browse/FLINK-1398 Project: Flink Issue Type: Wish Reporter: Felix Neutatz Assignee: Felix Neutatz Priority: Minor

This is the use case:
{code:java}
DataSet<Tuple2<Integer, Double>> data = env.fromElements(new Tuple2<Integer, Double>(1, 2.0));
data.map(new ElementFromTuple());

public static final class ElementFromTuple implements MapFunction<Tuple2<Integer, Double>, Double> {
    @Override
    public Double map(Tuple2<Integer, Double> value) {
        return value.f1;
    }
}
{code}
It would be awesome if we had something like this:
{code:java}
data.extractElement(1);
{code}
This means that we implement a function for DataSet which extracts a certain element from a given Tuple.
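In the spirit of the `TupleUtil` suggestion, the extraction can live in a small helper rather than as a method on `DataSet`. A simplified, self-contained sketch with a stand-in Tuple2 and plain Java lists in place of Flink's DataSet (names here are illustrative, not Flink's API):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Minimal stand-ins: Tuple2 mimics Flink's tuple, and a List plays the DataSet.
public class TupleUtilDemo {

    static class Tuple2<T0, T1> {
        final T0 f0;
        final T1 f1;
        Tuple2(T0 f0, T1 f1) { this.f0 = f0; this.f1 = f1; }
    }

    // The TupleUtil-style helper: extract field f1 from every tuple,
    // mirroring what data.extractElement(1) would do.
    static <T0, T1> List<T1> extractSecondField(List<Tuple2<T0, T1>> data) {
        return data.stream().map(t -> t.f1).collect(Collectors.toList());
    }

    static List<Double> demo() {
        List<Tuple2<Integer, Double>> data =
                Arrays.asList(new Tuple2<>(1, 2.0), new Tuple2<>(3, 4.0));
        return extractSecondField(data);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [2.0, 4.0]
    }
}
```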
[GitHub] flink pull request: [FLINK-1398] Introduce extractSingleField() in...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/308#issuecomment-95577570 +1 for not specializing code in the `DataSet`. A `TupleUtil` would be fine, in my opinion. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
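A `TupleUtil`-style extractor, as suggested here, might look roughly like the following. The sketch uses a hand-rolled `Tuple2` rather than Flink's tuple classes, and the names are illustrative only, not Flink API:

```java
// Minimal stand-in for a two-field tuple; Flink's own Tuple2 exposes the
// same public fields f0 and f1.
class Tuple2<T0, T1> {
    final T0 f0;
    final T1 f1;

    Tuple2(T0 f0, T1 f1) {
        this.f0 = f0;
        this.f1 = f1;
    }
}

class TupleUtil {
    // Extract a field by position. A real implementation over Flink tuples
    // could dispatch via Tuple.getField(int) instead of a hard-coded switch.
    static Object extractElement(Tuple2<?, ?> t, int index) {
        switch (index) {
            case 0: return t.f0;
            case 1: return t.f1;
            default: throw new IndexOutOfBoundsException("tuple index: " + index);
        }
    }
}
```

Keeping this in a utility class avoids growing the `DataSet` API itself, which is the point of the +1 above.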
[GitHub] flink pull request: [FLINK-1472] Fixed Web frontend config overvie...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95579057 I agree to enforce normalization in the `ConfigConstants` by adding a test case that reflectively gets all fields and checks whether they start with DEFAULT_ or end in _KEY. The mapping between keys and default values will probably have to be done manually, unless we add annotations above the key fields declaring what the default value is. The type of the default value should help to figure out which getter (getInteger, getDouble) to use for the correct conversion.
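The reflective naming-convention check described in the comment could be sketched as follows. `FakeConfigConstants` and `BadConstants` are stand-ins invented for illustration, not Flink's real `ConfigConstants`:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Illustrative stand-in whose fields follow the proposed convention.
class FakeConfigConstants {
    public static final String TASK_MANAGER_NUM_TASK_SLOTS_KEY = "taskmanager.numberOfTaskSlots";
    public static final int DEFAULT_TASK_MANAGER_NUM_TASK_SLOTS = 1;
}

// Illustrative stand-in that violates the convention.
class BadConstants {
    public static final String numberOfTaskSlots = "x";
}

class ConstantsNamingCheck {
    // Returns true when every public static field is either a DEFAULT_* value
    // or a *_KEY configuration key, as the proposed test case would enforce.
    static boolean allFieldsFollowConvention(Class<?> clazz) {
        for (Field f : clazz.getFields()) {
            if (!Modifier.isStatic(f.getModifiers())) {
                continue;
            }
            String name = f.getName();
            if (!name.startsWith("DEFAULT_") && !name.endsWith("_KEY")) {
                return false;
            }
        }
        return true;
    }
}
```

Running this once per constants class in a unit test would catch naming drift automatically, as suggested.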
[jira] [Commented] (FLINK-1472) Web frontend config overview shows wrong value
[ https://issues.apache.org/jira/browse/FLINK-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509006#comment-14509006 ] ASF GitHub Bot commented on FLINK-1472: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95579057 I agree to enforce normalization in the `ConfigConstants` by adding a test case that reflectively gets all fields and checks whether they start with DEFAULT_ or end in _KEY. The mapping between keys and default values will probably have to be done manually, unless we add annotations above the key fields declaring what the default value is. The type of the default value should help to figure out which getter (getInteger, getDouble) to use for the correct conversion. Web frontend config overview shows wrong value -- Key: FLINK-1472 URL: https://issues.apache.org/jira/browse/FLINK-1472 Project: Flink Issue Type: Bug Components: Webfrontend Affects Versions: master Reporter: Ufuk Celebi Assignee: Mingliang Qi Priority: Minor The web frontend shows configuration values even if they could not be correctly parsed. For example, I've configured the number of buffers as 123.000, which cannot be parsed as an Integer by GlobalConfiguration, so the default value is used. Still, the web frontend shows the unused value 123.000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508609#comment-14508609 ] Till Rohrmann commented on FLINK-1745: -- That sounds really good [~chiwanpark] and [~raghav.chalapa...@gmail.com] :-). We can use this issue to track the progress of the exact kNN implementation and I'll add another one for the approximated kNN. Add k-nearest-neighbours algorithm to machine learning library -- Key: FLINK-1745 URL: https://issues.apache.org/jira/browse/FLINK-1745 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Chiwan Park Labels: ML, Starter Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial it is still used as a mean to classify data and to do regression. Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1933) Add distance measure interface and basic implementation to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508612#comment-14508612 ] Till Rohrmann commented on FLINK-1933: -- Good selection of distance measures. Should be more than enough to start with :-) Add distance measure interface and basic implementation to machine learning library --- Key: FLINK-1933 URL: https://issues.apache.org/jira/browse/FLINK-1933 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Chiwan Park Assignee: Chiwan Park Labels: ML Add a distance measure interface to calculate the distance between two vectors, and some implementations of the interface. In FLINK-1745, [~till.rohrmann] suggests an interface like the following: {code} trait DistanceMeasure { def distance(a: Vector, b: Vector): Double } {code} I think the following list of implementations is sufficient to provide to ML library users at first. * Manhattan distance [1] * Cosine distance [2] * Euclidean distance (and squared Euclidean distance) [3] * Tanimoto distance [4] * Minkowski distance [5] * Chebyshev distance [6] [1]: http://en.wikipedia.org/wiki/Taxicab_geometry [2]: http://en.wikipedia.org/wiki/Cosine_similarity [3]: http://en.wikipedia.org/wiki/Euclidean_distance [4]: http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29 [5]: http://en.wikipedia.org/wiki/Minkowski_distance [6]: http://en.wikipedia.org/wiki/Chebyshev_distance -- This message was sent by Atlassian JIRA (v6.3.4#6332)
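For illustration, the proposed interface (sketched in the ticket as a Scala trait) and two of the listed measures might look like this in Java, operating on plain double arrays rather than Flink ML vectors:

```java
// Java rendering of the proposed DistanceMeasure contract; the actual
// proposal is a Scala trait over Flink ML's Vector type.
interface DistanceMeasure {
    double distance(double[] a, double[] b);
}

// Euclidean distance: sqrt of the sum of squared coordinate differences.
class EuclideanDistance implements DistanceMeasure {
    public double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}

// Manhattan (taxicab) distance: sum of absolute coordinate differences.
class ManhattanDistance implements DistanceMeasure {
    public double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}
```

Each remaining measure in the list (cosine, Tanimoto, Minkowski, Chebyshev) would be one more small class against the same interface.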
[jira] [Commented] (FLINK-1867) TaskManagerFailureRecoveryITCase causes stalled travis builds
[ https://issues.apache.org/jira/browse/FLINK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508635#comment-14508635 ] ASF GitHub Bot commented on FLINK-1867: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943598 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/taskmanager/TaskManagerFailsITCase.scala --- @@ -231,9 +231,9 @@ with WordSpecLike with Matchers with BeforeAndAfterAll { config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numSlots) config.setInteger(ConfigConstants.LOCAL_INSTANCE_MANAGER_NUMBER_TASK_MANAGER, numTaskmanagers) config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1000 ms") -config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4000 ms") +config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s") --- End diff -- No, it doesn't; before and after this change the 4 tests complete in about 8 seconds. @tillrohrmann suggested that, since the actor system is local, there are some other mechanisms in play that signal failure in this case. TaskManagerFailureRecoveryITCase causes stalled travis builds - Key: FLINK-1867 URL: https://issues.apache.org/jira/browse/FLINK-1867 Project: Flink Issue Type: Bug Components: TaskManager, Tests Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Aljoscha Krettek There are currently tests on travis failing: https://travis-ci.org/apache/flink/jobs/57943063 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1867/1880] Raise test timeouts in hope ...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943704 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/jobmanager/JobManagerFailsITCase.scala --- @@ -18,24 +18,23 @@ package org.apache.flink.api.scala.runtime.jobmanager -import akka.actor.Status.{Success, Failure} +import akka.actor.Status.Success import akka.actor.{ActorSystem, PoisonPill} import akka.testkit.{ImplicitSender, TestKit} +import org.junit.Ignore +import org.junit.runner.RunWith +import org.scalatest.junit.JUnitRunner +import org.scalatest.{BeforeAndAfterAll, Matchers, WordSpecLike} + import org.apache.flink.configuration.{ConfigConstants, Configuration} import org.apache.flink.runtime.akka.AkkaUtils -import org.apache.flink.runtime.client.JobExecutionException -import org.apache.flink.runtime.jobgraph.{JobGraph, AbstractJobVertex} -import org.apache.flink.runtime.jobmanager.Tasks.{NoOpInvokable, BlockingNoOpInvokable} +import org.apache.flink.runtime.jobgraph.{AbstractJobVertex, JobGraph} +import org.apache.flink.runtime.jobmanager.Tasks.{BlockingNoOpInvokable, NoOpInvokable} import org.apache.flink.runtime.messages.JobManagerMessages._ import org.apache.flink.runtime.testingUtils.TestingMessages.DisableDisconnect -import org.apache.flink.runtime.testingUtils.TestingTaskManagerMessages.{JobManagerTerminated, -NotifyWhenJobManagerTerminated} +import org.apache.flink.runtime.testingUtils.TestingTaskManagerMessages.{JobManagerTerminated, NotifyWhenJobManagerTerminated} import org.apache.flink.runtime.testingUtils.TestingUtils import org.apache.flink.test.util.ForkableFlinkMiniCluster -import org.junit.Ignore -import org.junit.runner.RunWith -import org.scalatest.junit.JUnitRunner -import org.scalatest.{BeforeAndAfterAll, Matchers, WordSpecLike} @Ignore("Contains a bug with Akka 2.2.1") --- End diff -- True, removing it
[jira] [Resolved] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels resolved FLINK-1486. --- Resolution: Implemented Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Maximilian Michels Assignee: Maximilian Michels Priority: Minor Labels: usability Fix For: 0.9 The output of the {{print}} method of {{DataSet}} is mainly used for debugging purposes. Currently, it is difficult to identify the output. I would suggest adding another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
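The proposed prefix behavior can be sketched outside Flink as a small helper; `PrefixPrinter` is a hypothetical illustration, whereas the actual change adds `print(String)` on `DataSet` itself:

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative helper: prepend an identifying prefix to each element's
// string form, as data.print("MyDataSet: ") is proposed to do.
class PrefixPrinter {
    static List<String> format(String prefix, List<?> elements) {
        return elements.stream()
                .map(e -> prefix + e)
                .collect(Collectors.toList());
    }

    static void printWithPrefix(String prefix, List<?> elements) {
        // Separating formatting from printing keeps the formatting testable.
        format(prefix, elements).forEach(System.out::println);
    }
}
```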
[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets
[ https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508743#comment-14508743 ] ASF GitHub Bot commented on FLINK-1615: --- Github user Elbehery commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-95512206 I did rebase against master before creating the PR, but that was a long time ago. Could this be the cause of the conflicts? Introduces a new InputFormat for Tweets --- Key: FLINK-1615 URL: https://issues.apache.org/jira/browse/FLINK-1615 Project: Flink Issue Type: New Feature Components: flink-contrib Affects Versions: 0.8.1 Reporter: mustafa elbehery Priority: Minor An event-driven parser for Tweets into Java POJOs. It parses all the important parts of the tweet into Java objects. Tested on a cluster, and the performance is pretty good. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1867) TaskManagerFailureRecoveryITCase causes stalled travis builds
[ https://issues.apache.org/jira/browse/FLINK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508643#comment-14508643 ] ASF GitHub Bot commented on FLINK-1867: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943824 --- Diff: flink-tests/src/test/java/org/apache/flink/test/recovery/AbstractProcessFailureRecoveryTest.java --- @@ -112,9 +112,9 @@ public void testTaskManagerProcessFailure() { Tuple2<String, Object> localAddress = new Tuple2<String, Object>("localhost", jobManagerPort); Configuration jmConfig = new Configuration(); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 s"); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4 s"); - jmConfig.setInteger(ConfigConstants.AKKA_WATCH_THRESHOLD, 2); + jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 ms"); --- End diff -- That's a typo, will fix it. TaskManagerFailureRecoveryITCase causes stalled travis builds - Key: FLINK-1867 URL: https://issues.apache.org/jira/browse/FLINK-1867 Project: Flink Issue Type: Bug Components: TaskManager, Tests Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Aljoscha Krettek There are currently tests on travis failing: https://travis-ci.org/apache/flink/jobs/57943063 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1867/1880] Raise test timeouts in hope ...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943849 --- Diff: flink-tests/src/test/java/org/apache/flink/test/recovery/AbstractProcessFailureRecoveryTest.java --- @@ -112,9 +112,9 @@ public void testTaskManagerProcessFailure() { Tuple2<String, Object> localAddress = new Tuple2<String, Object>("localhost", jobManagerPort); Configuration jmConfig = new Configuration(); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 s"); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4 s"); - jmConfig.setInteger(ConfigConstants.AKKA_WATCH_THRESHOLD, 2); + jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 ms"); --- End diff -- Seemed to run flawlessly, though. :smile: I ran the tests about 100 times by now without seeing another failure in the targeted tests.
[GitHub] flink pull request: [FLINK-1867/1880] Raise test timeouts in hope ...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943875 --- Diff: flink-tests/src/test/java/org/apache/flink/test/recovery/AbstractProcessFailureRecoveryTest.java --- @@ -112,9 +112,9 @@ public void testTaskManagerProcessFailure() { Tuple2<String, Object> localAddress = new Tuple2<String, Object>("localhost", jobManagerPort); Configuration jmConfig = new Configuration(); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 s"); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4 s"); - jmConfig.setInteger(ConfigConstants.AKKA_WATCH_THRESHOLD, 2); + jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 ms"); + jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s"); + jmConfig.setInteger(ConfigConstants.AKKA_WATCH_THRESHOLD, 20); --- End diff -- See above.
[GitHub] flink pull request: [FLINK-1867/1880] Raise test timeouts in hope ...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943824 --- Diff: flink-tests/src/test/java/org/apache/flink/test/recovery/AbstractProcessFailureRecoveryTest.java --- @@ -112,9 +112,9 @@ public void testTaskManagerProcessFailure() { Tuple2<String, Object> localAddress = new Tuple2<String, Object>("localhost", jobManagerPort); Configuration jmConfig = new Configuration(); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 s"); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4 s"); - jmConfig.setInteger(ConfigConstants.AKKA_WATCH_THRESHOLD, 2); + jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 ms"); --- End diff -- That's a typo, will fix it.
[jira] [Commented] (FLINK-1867) TaskManagerFailureRecoveryITCase causes stalled travis builds
[ https://issues.apache.org/jira/browse/FLINK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508645#comment-14508645 ] ASF GitHub Bot commented on FLINK-1867: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943849 --- Diff: flink-tests/src/test/java/org/apache/flink/test/recovery/AbstractProcessFailureRecoveryTest.java --- @@ -112,9 +112,9 @@ public void testTaskManagerProcessFailure() { Tuple2<String, Object> localAddress = new Tuple2<String, Object>("localhost", jobManagerPort); Configuration jmConfig = new Configuration(); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 s"); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4 s"); - jmConfig.setInteger(ConfigConstants.AKKA_WATCH_THRESHOLD, 2); + jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 ms"); --- End diff -- Seemed to run flawlessly, though. :smile: I ran the tests about 100 times by now without seeing another failure in the targeted tests. TaskManagerFailureRecoveryITCase causes stalled travis builds - Key: FLINK-1867 URL: https://issues.apache.org/jira/browse/FLINK-1867 Project: Flink Issue Type: Bug Components: TaskManager, Tests Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Aljoscha Krettek There are currently tests on travis failing: https://travis-ci.org/apache/flink/jobs/57943063 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1725) New Partitioner for better load balancing for skewed data
[ https://issues.apache.org/jira/browse/FLINK-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508652#comment-14508652 ] Gianmarco De Francisci Morales commented on FLINK-1725: --- Hi, The code looks good, but might there be a way to avoid converting the key to a string before hashing it? New Partitioner for better load balancing for skewed data - Key: FLINK-1725 URL: https://issues.apache.org/jira/browse/FLINK-1725 Project: Flink Issue Type: Improvement Components: New Components Affects Versions: 0.8.1 Reporter: Anis Nasir Assignee: Anis Nasir Labels: LoadBalancing, Partitioner Original Estimate: 336h Remaining Estimate: 336h Hi, We have recently studied the problem of load balancing in Storm [1]. In particular, we focused on key distribution of the stream for skewed data. We developed a new stream partitioning scheme (which we call Partial Key Grouping). It achieves better load balancing than key grouping while being more scalable than shuffle grouping in terms of memory. In the paper we show a number of mining algorithms that are easy to implement with partial key grouping, and whose performance can benefit from it. We think that it might also be useful for a larger class of algorithms. Partial key grouping is very easy to implement: it requires just a few lines of code in Java when implemented as a custom grouping in Storm [2]. For all these reasons, we believe it will be a nice addition to the standard Partitioners available in Flink. If the community thinks it's a good idea, we will be happy to offer support in the porting. References: [1]. https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf [2]. https://github.com/gdfm/partial-key-grouping -- This message was sent by Atlassian JIRA (v6.3.4#6332)
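A minimal, framework-free sketch of partial key grouping ("the power of both choices"): each key maps to two candidate channels, and every record is routed to whichever of the two has received less load so far. The concrete hash derivations below are illustrative stand-ins for the two independent hash functions used in the paper:

```java
// Illustrative sketch of a partial-key-grouping partitioner. Not the
// Storm or Flink implementation; hash functions are placeholder choices.
class PartialKeyPartitioner {
    private final long[] load; // records routed to each channel so far

    PartialKeyPartitioner(int numChannels) {
        this.load = new long[numChannels];
    }

    int partition(Object key) {
        // Derive two candidate channels from the key's hash.
        int h = key.hashCode();
        int first = Math.floorMod(h, load.length);
        int second = Math.floorMod(h * 31 + 17, load.length);
        // Route to the currently less loaded of the two candidates;
        // this is what splits a skewed (hot) key across two workers.
        int chosen = load[first] <= load[second] ? first : second;
        load[chosen]++;
        return chosen;
    }
}
```

Because any single key is confined to at most two channels, downstream aggregation only has to merge two partial states per key, unlike shuffle grouping.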
[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat
Github user Elbehery commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-95512206 I did rebase against master before creating the PR, but that was a long time ago. Could this be the cause of the conflicts?
[jira] [Commented] (FLINK-1792) Improve TM Monitoring: CPU utilization, hide graphs by default and show summary only
[ https://issues.apache.org/jira/browse/FLINK-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508619#comment-14508619 ] ASF GitHub Bot commented on FLINK-1792: --- Github user bhatsachin commented on the pull request: https://github.com/apache/flink/pull/553#issuecomment-95480100 The suggested changes have been made Improve TM Monitoring: CPU utilization, hide graphs by default and show summary only Key: FLINK-1792 URL: https://issues.apache.org/jira/browse/FLINK-1792 Project: Flink Issue Type: Sub-task Components: Webfrontend Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Sachin Bhat As per https://github.com/apache/flink/pull/421 from FLINK-1501, there are some enhancements to the current monitoring required - Get the CPU utilization in % from each TaskManager process - Remove the metrics graph from the overview and only show the current stats as numbers (cpu load, heap utilization) and add a button to enable the detailed graph. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1792] TM Monitoring: CPU utilization, h...
Github user bhatsachin commented on the pull request: https://github.com/apache/flink/pull/553#issuecomment-95480100 The suggested changes have been made
[jira] [Updated] (FLINK-1745) Add exact k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-1745: - Summary: Add exact k-nearest-neighbours algorithm to machine learning library (was: Add k-nearest-neighbours algorithm to machine learning library) Add exact k-nearest-neighbours algorithm to machine learning library Key: FLINK-1745 URL: https://issues.apache.org/jira/browse/FLINK-1745 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Chiwan Park Labels: ML, Starter Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial it is still used as a mean to classify data and to do regression. This issue focuses on the implementation of an exact kNN (H-BNLJ, H-BRJ) algorithm as proposed in [2]. Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
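The exact kNN variants mentioned here (H-BNLJ, H-BRJ) ultimately reduce to scoring candidate points against a query and keeping the k closest; a brute-force version of that inner step might look like this (an illustrative sketch, not the proposed Flink ML implementation):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Brute-force exact kNN: the innermost building block of a block-nested-loop
// kNN join. A distributed version would run this per pair of data blocks and
// then merge the per-block top-k lists.
class KnnBruteForce {
    static List<double[]> kNearest(double[] query, List<double[]> points, int k) {
        List<double[]> sorted = new ArrayList<>(points);
        // Squared distance preserves ordering, so the sqrt can be skipped.
        sorted.sort(Comparator.comparingDouble((double[] p) -> squaredDistance(query, p)));
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```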
[jira] [Updated] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-1745: - Description: Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial it is still used as a mean to classify data and to do regression. This issue Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] was: Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial it is still used as a mean to classify data and to do regression. Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] Add k-nearest-neighbours algorithm to machine learning library -- Key: FLINK-1745 URL: https://issues.apache.org/jira/browse/FLINK-1745 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Chiwan Park Labels: ML, Starter Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial it is still used as a mean to classify data and to do regression. This issue Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-1745: - Description: Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial it is still used as a mean to classify data and to do regression. This issue focuses on the implementation of an exact kNN (H-BNLJ, H-BRJ) algorithm as proposed in [2]. Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] was: Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial it is still used as a mean to classify data and to do regression. This issue Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] Add k-nearest-neighbours algorithm to machine learning library -- Key: FLINK-1745 URL: https://issues.apache.org/jira/browse/FLINK-1745 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Chiwan Park Labels: ML, Starter Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial it is still used as a mean to classify data and to do regression. This issue focuses on the implementation of an exact kNN (H-BNLJ, H-BRJ) algorithm as proposed in [2]. Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-1867) TaskManagerFailureRecoveryITCase causes stalled travis builds
[ https://issues.apache.org/jira/browse/FLINK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508637#comment-14508637 ] ASF GitHub Bot commented on FLINK-1867: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943704 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/jobmanager/JobManagerFailsITCase.scala --- @@ -18,24 +18,23 @@ package org.apache.flink.api.scala.runtime.jobmanager -import akka.actor.Status.{Success, Failure} +import akka.actor.Status.Success import akka.actor.{ActorSystem, PoisonPill} import akka.testkit.{ImplicitSender, TestKit} +import org.junit.Ignore +import org.junit.runner.RunWith +import org.scalatest.junit.JUnitRunner +import org.scalatest.{BeforeAndAfterAll, Matchers, WordSpecLike} + import org.apache.flink.configuration.{ConfigConstants, Configuration} import org.apache.flink.runtime.akka.AkkaUtils -import org.apache.flink.runtime.client.JobExecutionException -import org.apache.flink.runtime.jobgraph.{JobGraph, AbstractJobVertex} -import org.apache.flink.runtime.jobmanager.Tasks.{NoOpInvokable, BlockingNoOpInvokable} +import org.apache.flink.runtime.jobgraph.{AbstractJobVertex, JobGraph} +import org.apache.flink.runtime.jobmanager.Tasks.{BlockingNoOpInvokable, NoOpInvokable} import org.apache.flink.runtime.messages.JobManagerMessages._ import org.apache.flink.runtime.testingUtils.TestingMessages.DisableDisconnect -import org.apache.flink.runtime.testingUtils.TestingTaskManagerMessages.{JobManagerTerminated, -NotifyWhenJobManagerTerminated} +import org.apache.flink.runtime.testingUtils.TestingTaskManagerMessages.{JobManagerTerminated, NotifyWhenJobManagerTerminated} import org.apache.flink.runtime.testingUtils.TestingUtils import org.apache.flink.test.util.ForkableFlinkMiniCluster -import org.junit.Ignore -import org.junit.runner.RunWith -import org.scalatest.junit.JUnitRunner -import org.scalatest.{BeforeAndAfterAll, Matchers, WordSpecLike} @Ignore("Contains a bug with Akka 2.2.1") --- End diff -- True, removing it TaskManagerFailureRecoveryITCase causes stalled travis builds - Key: FLINK-1867 URL: https://issues.apache.org/jira/browse/FLINK-1867 Project: Flink Issue Type: Bug Components: TaskManager, Tests Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Aljoscha Krettek There are currently tests on travis failing: https://travis-ci.org/apache/flink/jobs/57943063 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1867/1880] Raise test timeouts in hope ...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943731 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/jobmanager/JobManagerFailsITCase.scala --- @@ -136,9 +135,9 @@ with WordSpecLike with Matchers with BeforeAndAfterAll { config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numSlots) config.setInteger(ConfigConstants.LOCAL_INSTANCE_MANAGER_NUMBER_TASK_MANAGER, numTaskmanagers) config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1000 ms") -config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4000 ms") +config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s") --- End diff -- As above, it doesn't affect test execution time.
[jira] [Commented] (FLINK-1867) TaskManagerFailureRecoveryITCase causes stalled travis builds
[ https://issues.apache.org/jira/browse/FLINK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508638#comment-14508638 ] ASF GitHub Bot commented on FLINK-1867: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943731 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/jobmanager/JobManagerFailsITCase.scala --- @@ -136,9 +135,9 @@ with WordSpecLike with Matchers with BeforeAndAfterAll { config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numSlots) config.setInteger(ConfigConstants.LOCAL_INSTANCE_MANAGER_NUMBER_TASK_MANAGER, numTaskmanagers) config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1000 ms") -config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4000 ms") +config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s") --- End diff -- As above, it doesn't affect test execution time. TaskManagerFailureRecoveryITCase causes stalled travis builds - Key: FLINK-1867 URL: https://issues.apache.org/jira/browse/FLINK-1867 Project: Flink Issue Type: Bug Components: TaskManager, Tests Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Aljoscha Krettek There are currently tests on travis failing: https://travis-ci.org/apache/flink/jobs/57943063 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1789] [core] [runtime] [java-api] Allow...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-95500680 This does not work if the user uses classes that are not available on the local machine since you don't add the additional class path entries in JobWithJars.buildUserCodeClassLoader(). Correct?
[GitHub] flink pull request: [FLINK-1799][scala] Fix handling of generic ar...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/582
[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-95511047 This looks good to merge. Any objections?
[jira] [Commented] (FLINK-1745) Add k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508606#comment-14508606 ] Till Rohrmann commented on FLINK-1745: -- I agree with grouping the nearest neighbours with respect to the queried vector. Add k-nearest-neighbours algorithm to machine learning library -- Key: FLINK-1745 URL: https://issues.apache.org/jira/browse/FLINK-1745 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Chiwan Park Labels: ML, Starter Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite simple, it is still used as a means to classify data and to do regression. Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]
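The exact kNN classification discussed above can be sketched as a naive brute-force search in plain Java. This is only an illustrative stand-alone sketch (class and method names are made up here, not part of Flink's ML library); its O(n)-per-query cost is exactly why the approximative variant below (FLINK-1934) is attractive for large data sets.

```java
import java.util.*;
import java.util.stream.*;

// Naive exact k-nearest-neighbours classification: for each query point,
// scan all training points, keep the k closest, and take a majority vote.
public class BruteForceKnn {

    /** Labelled training point (nested records are implicitly static). */
    public record Point(double[] features, String label) {}

    public static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    public static String classify(List<Point> training, double[] query, int k) {
        // Sort training points by distance to the query and take the k nearest.
        List<Point> nearest = training.stream()
            .sorted(Comparator.comparingDouble(p -> squaredDistance(p.features(), query)))
            .limit(k)
            .collect(Collectors.toList());
        // Majority vote over the labels of the k neighbours.
        Map<String, Long> votes = nearest.stream()
            .collect(Collectors.groupingBy(Point::label, Collectors.counting()));
        return Collections.max(votes.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        List<Point> training = List.of(
            new Point(new double[]{0.0, 0.0}, "a"),
            new Point(new double[]{0.1, 0.1}, "a"),
            new Point(new double[]{5.0, 5.0}, "b"),
            new Point(new double[]{5.1, 4.9}, "b"));
        System.out.println(classify(training, new double[]{0.2, 0.0}, 3)); // prints a
    }
}
```

Grouping neighbour candidates per queried vector, as Till suggests, corresponds to the `groupingBy` step above applied per query point.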
[jira] [Created] (FLINK-1934) Add approximative k-nearest-neighbours (kNN) algorithm to machine learning library
Till Rohrmann created FLINK-1934: Summary: Add approximative k-nearest-neighbours (kNN) algorithm to machine learning library Key: FLINK-1934 URL: https://issues.apache.org/jira/browse/FLINK-1934 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann kNN is still a widely used algorithm for classification and regression. However, due to the computational costs of an exact implementation, it does not scale well to large amounts of data. Therefore, it is worthwhile to also add an approximative kNN implementation as proposed in [1,2]. Resources: [1] https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf [2] http://www.computer.org/csdl/proceedings/wacv/2007/2794/00/27940028.pdf
[GitHub] flink pull request: [FLINK-1867/1880] Raise test timeouts in hope ...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943598 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/taskmanager/TaskManagerFailsITCase.scala --- @@ -231,9 +231,9 @@ with WordSpecLike with Matchers with BeforeAndAfterAll { config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numSlots) config.setInteger(ConfigConstants.LOCAL_INSTANCE_MANAGER_NUMBER_TASK_MANAGER, numTaskmanagers) config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1000 ms") -config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4000 ms") +config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s") --- End diff -- No, it doesn't. Before and after this change the 4 tests complete in about 8 seconds. @tillrohrmann suggested that, since the actor system is local, there are some other mechanisms in play that signal failure in this case.
[jira] [Commented] (FLINK-1867) TaskManagerFailureRecoveryITCase causes stalled travis builds
[ https://issues.apache.org/jira/browse/FLINK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508648#comment-14508648 ] ASF GitHub Bot commented on FLINK-1867: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28943875 --- Diff: flink-tests/src/test/java/org/apache/flink/test/recovery/AbstractProcessFailureRecoveryTest.java --- @@ -112,9 +112,9 @@ public void testTaskManagerProcessFailure() { Tuple2<String, Object> localAddress = new Tuple2<String, Object>("localhost", jobManagerPort); Configuration jmConfig = new Configuration(); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 s"); - jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4 s"); - jmConfig.setInteger(ConfigConstants.AKKA_WATCH_THRESHOLD, 2); + jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1 ms"); + jmConfig.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s"); + jmConfig.setInteger(ConfigConstants.AKKA_WATCH_THRESHOLD, 20); --- End diff -- See above. TaskManagerFailureRecoveryITCase causes stalled travis builds - Key: FLINK-1867 URL: https://issues.apache.org/jira/browse/FLINK-1867 Project: Flink Issue Type: Bug Components: TaskManager, Tests Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Aljoscha Krettek There are currently tests on travis failing: https://travis-ci.org/apache/flink/jobs/57943063
[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95488183 Do we want to break backwards compatibility or include a new method for printing on the client? After all, printing on the workers is a useful tool to debug the dataflow of a program.
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508656#comment-14508656 ] ASF GitHub Bot commented on FLINK-1486: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95488183 Do we want to break backwards compatibility or include a new method for printing on the client? After all, printing on the workers is a useful tool to debug the dataflow of a program. Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Maximilian Michels Assignee: Maximilian Michels Priority: Minor Labels: usability Fix For: 0.9 The output of the {{print}} method of {{DataSet}} is mainly used for debug purposes. Currently, it is difficult to identify the output. I would suggest to add another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be an overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat}
[jira] [Commented] (FLINK-1789) Allow adding of URLs to the usercode class loader
[ https://issues.apache.org/jira/browse/FLINK-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508704#comment-14508704 ] ASF GitHub Bot commented on FLINK-1789: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-95500680 This does not work if the user uses classes that are not available on the local machine since you don't add the additional class path entries in JobWithJars.buildUserCodeClassLoader(). Correct? Allow adding of URLs to the usercode class loader - Key: FLINK-1789 URL: https://issues.apache.org/jira/browse/FLINK-1789 Project: Flink Issue Type: Improvement Components: Distributed Runtime Reporter: Timo Walther Assignee: Timo Walther Priority: Minor Currently, there is no option to add custom classpath URLs to the FlinkUserCodeClassLoader. JARs always need to be shipped to the cluster even if they are already present on all nodes. It would be great if RemoteEnvironment also accepts valid classpath URLs and forwards them to BlobLibraryCacheManager.
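The mechanism under discussion in FLINK-1789, a usercode class loader that resolves classes both from shipped jars and from extra classpath entries already present on every node, boils down to `java.net.URLClassLoader`. A minimal self-contained sketch (the URLs are placeholders; Flink's actual FlinkUserCodeClassLoader differs in detail):

```java
import java.net.URL;
import java.net.URLClassLoader;

// Sketch: build a class loader over shipped jars plus extra classpath URLs.
// URLClassLoader searches all given URLs in order, so the two sets can
// simply be concatenated.
public class UserCodeClassLoaders {

    public static URLClassLoader create(URL[] jars, URL[] classpaths, ClassLoader parent) {
        URL[] all = new URL[jars.length + classpaths.length];
        System.arraycopy(jars, 0, all, 0, jars.length);
        System.arraycopy(classpaths, 0, all, jars.length, classpaths.length);
        return new URLClassLoader(all, parent);
    }

    public static void main(String[] args) throws Exception {
        URL jar = new URL("file:///tmp/job.jar");  // placeholder shipped jar
        URL extra = new URL("file:///opt/libs/");  // placeholder node-local classpath entry
        try (URLClassLoader loader =
                 create(new URL[]{jar}, new URL[]{extra},
                        UserCodeClassLoaders.class.getClassLoader())) {
            System.out.println(loader.getURLs().length); // prints 2
        }
    }
}
```

Aljoscha's objection corresponds to the case where only the first array is populated when the loader is built on the client side: classes reachable solely via the extra classpath entries would then not resolve.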
[jira] [Resolved] (FLINK-1799) Scala API does not support generic arrays
[ https://issues.apache.org/jira/browse/FLINK-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek resolved FLINK-1799. - Resolution: Fixed Fixed in https://github.com/apache/flink/commit/4672e95ef158bb5e52b1d493465d62429bbdb29e Scala API does not support generic arrays - Key: FLINK-1799 URL: https://issues.apache.org/jira/browse/FLINK-1799 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Aljoscha Krettek The Scala API does not support generic arrays at the moment. It throws a rather unhelpful error message ```InvalidTypesException: The given type is not a valid object array```. Code to reproduce the problem is given below: {code} def main(args: Array[String]) { foobar[Double] } def foobar[T: ClassTag: TypeInformation]: DataSet[Block[T]] = { val tpe = createTypeInformation[Array[T]] null } {code}
[jira] [Commented] (FLINK-1799) Scala API does not support generic arrays
[ https://issues.apache.org/jira/browse/FLINK-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508721#comment-14508721 ] ASF GitHub Bot commented on FLINK-1799: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/582 Scala API does not support generic arrays - Key: FLINK-1799 URL: https://issues.apache.org/jira/browse/FLINK-1799 Project: Flink Issue Type: Bug Reporter: Till Rohrmann Assignee: Aljoscha Krettek The Scala API does not support generic arrays at the moment. It throws a rather unhelpful error message ```InvalidTypesException: The given type is not a valid object array```. Code to reproduce the problem is given below: {code} def main(args: Array[String]) { foobar[Double] } def foobar[T: ClassTag: TypeInformation]: DataSet[Block[T]] = { val tpe = createTypeInformation[Array[T]] null } {code}
[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets
[ https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508741#comment-14508741 ] ASF GitHub Bot commented on FLINK-1615: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-95511047 This looks good to merge. Any objections? Introduces a new InputFormat for Tweets --- Key: FLINK-1615 URL: https://issues.apache.org/jira/browse/FLINK-1615 Project: Flink Issue Type: New Feature Components: flink-contrib Affects Versions: 0.8.1 Reporter: mustafa elbehery Priority: Minor An event-driven parser for Tweets into Java Pojos. It parses all the important parts of the tweet into Java objects. Tested on a cluster and the performance is pretty good.
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508774#comment-14508774 ] ASF GitHub Bot commented on FLINK-1398: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/308#issuecomment-95522921 I would put this into the `flink-contrib` module. A new DataSet function: extractElementFromTuple --- Key: FLINK-1398 URL: https://issues.apache.org/jira/browse/FLINK-1398 Project: Flink Issue Type: Wish Reporter: Felix Neutatz Assignee: Felix Neutatz Priority: Minor This is the use case: {code:xml} DataSet<Tuple2<Integer, Double>> data = env.fromElements(new Tuple2<Integer, Double>(1, 2.0)); data.map(new ElementFromTuple()); } public static final class ElementFromTuple implements MapFunction<Tuple2<Integer, Double>, Double> { @Override public Double map(Tuple2<Integer, Double> value) { return value.f1; } } {code} It would be awesome if we had something like this: {code:xml} data.extractElement(1); {code} This means that we implement a function for DataSet which extracts a certain element from a given Tuple.
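The convenience the issue wishes for, extracting one field from every tuple without writing a one-off MapFunction, can be sketched with a minimal stand-in for Flink's Tuple2 so the example stays self-contained; names here are illustrative, not Flink's actual API.

```java
import java.util.List;
import java.util.stream.Collectors;

// Self-contained sketch of the proposed extractElement() convenience:
// a tiny stand-in for org.apache.flink.api.java.tuple.Tuple2 plus a
// generic extractor for the second field.
public class ExtractElement {

    /** Minimal stand-in for Flink's Tuple2 (fields f0, f1). */
    public record Tuple2<T0, T1>(T0 f0, T1 f1) {}

    /** The effect of data.extractElement(1): map every tuple to its second field. */
    public static <T0, T1> List<T1> extractSecond(List<Tuple2<T0, T1>> data) {
        return data.stream().map(Tuple2::f1).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Tuple2<Integer, Double>> data = List.of(new Tuple2<>(1, 2.0));
        System.out.println(extractSecond(data)); // prints [2.0]
    }
}
```

In Flink itself this would presumably be a `MapFunction` generated from the field index, which is why the later comments debate whether such structured-data helpers belong in the core APIs or in the Table API.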
[jira] [Commented] (FLINK-1472) Web frontend config overview shows wrong value
[ https://issues.apache.org/jira/browse/FLINK-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508778#comment-14508778 ] ASF GitHub Bot commented on FLINK-1472: --- Github user qmlmoon commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95523387 That would be a nice way, but 1) key/value pairs are not always distinguished by _KEY suffix, we have to format all the fields in ConfigConstants 2) some values are not in the ConfigConstants. e.g task manager slot = 1, akka_startup_time=aka_ask_timeout In summary, we need a formatted one-to-one map of the key/value pairs in ConfigConstants Web frontend config overview shows wrong value -- Key: FLINK-1472 URL: https://issues.apache.org/jira/browse/FLINK-1472 Project: Flink Issue Type: Bug Components: Webfrontend Affects Versions: master Reporter: Ufuk Celebi Assignee: Mingliang Qi Priority: Minor The web frontend shows configuration values even if they could not be correctly parsed. For example I've configured the number of buffers as 123.000, which cannot be parsed as an Integer by GlobalConfiguration and the default value is used. Still, the web frontend shows the not used 123.000.
[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable
Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-95523580 OK, I see your point. I am thinking about using NetUtils.findConnectingAddress to determine which interface is used for the communication with the cluster. For this, I would need an IP of something in the cluster to connect to. What would be the best way to get this info? (Maybe I could use RemoteStreamEnvironment.host, but it is a private member, so there is probably a more standard way that I'm just overlooking.)
[jira] [Commented] (FLINK-1930) NullPointerException in vertex-centric iteration
[ https://issues.apache.org/jira/browse/FLINK-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508751#comment-14508751 ] ASF GitHub Bot commented on FLINK-1930: --- GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/619 [FLINK-1930] [runtime] Improve exception when bufferpools have been shutdown You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink buffer_exception Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/619.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #619 commit d6d7c3bcf4cf0a3f1164875a6fe3a093aeb2a68b Author: Stephan Ewen se...@apache.org Date: 2015-04-22T15:52:21Z [FLINK-1930] [runtime] Improve exception when bufferpools have been shut down. NullPointerException in vertex-centric iteration Key: FLINK-1930 URL: https://issues.apache.org/jira/browse/FLINK-1930 Project: Flink Issue Type: Bug Affects Versions: 0.9 Reporter: Vasia Kalavri Hello to my Squirrels, I came across this exception when having a vertex-centric iteration output followed by a group by. I'm not sure what is causing it, since I saw this error in a rather large pipeline, but I managed to reproduce it with [this code example | https://github.com/vasia/flink/commit/1b7bbca1a6130fbcfe98b4b9b43967eb4c61f309] and a sufficiently large dataset, e.g. [this one | http://snap.stanford.edu/data/com-DBLP.html] (I'm running this locally). It seems like a null Buffer in RecordWriter. The exception message is the following: Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed. 
at org.apache.flink.runtime.jobmanager.JobManager$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:319) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:37) at org.apache.flink.runtime.ActorLogMessages$anon$1.apply(ActorLogMessages.scala:30) at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) at org.apache.flink.runtime.ActorLogMessages$anon$1.applyOrElse(ActorLogMessages.scala:30) at akka.actor.Actor$class.aroundReceive(Actor.scala:465) at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:94) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) at akka.actor.ActorCell.invoke(ActorCell.scala:487) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) at akka.dispatch.Mailbox.run(Mailbox.scala:221) at akka.dispatch.Mailbox.exec(Mailbox.scala:231) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.lang.NullPointerException at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93) at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92) at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65) at org.apache.flink.runtime.iterative.task.IterationHeadPactTask.streamSolutionSetToFinalOutput(IterationHeadPactTask.java:405) at 
org.apache.flink.runtime.iterative.task.IterationHeadPactTask.run(IterationHeadPactTask.java:365) at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:360) at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:221) at java.lang.Thread.run(Thread.java:745)
[jira] [Assigned] (FLINK-1934) Add approximative k-nearest-neighbours (kNN) algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raghav Chalapathy reassigned FLINK-1934: Assignee: Raghav Chalapathy Add approximative k-nearest-neighbours (kNN) algorithm to machine learning library -- Key: FLINK-1934 URL: https://issues.apache.org/jira/browse/FLINK-1934 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Raghav Chalapathy Labels: ML kNN is still a widely used algorithm for classification and regression. However, due to the computational costs of an exact implementation, it does not scale well to large amounts of data. Therefore, it is worthwhile to also add an approximative kNN implementation as proposed in [1,2]. Resources: [1] https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf [2] http://www.computer.org/csdl/proceedings/wacv/2007/2794/00/27940028.pdf
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
Github user uce closed the pull request at: https://github.com/apache/flink/pull/617
[GitHub] flink pull request: [runtime] Bump Netty version to 4.27.Final and...
GitHub user uce reopened a pull request: https://github.com/apache/flink/pull/617 [runtime] Bump Netty version to 4.27.Final and add javassist @rmetzger, javassist is set in the root POM. Is it OK to leave it in flink-runtime as I have it now w/o version? You can merge this pull request into a Git repository by running: $ git pull https://github.com/uce/incubator-flink javassist Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/617.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #617 commit 6a91d722005d7f8d138e8553f92a134c52209de8 Author: Ufuk Celebi u...@apache.org Date: 2015-04-22T12:50:38Z [runtime] Bump Netty version to 4.27.Final and add javassist
[GitHub] flink pull request: [FLINK-1930] [runtime] Improve exception when ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/619#issuecomment-95517313 @uce Do you see any further implications to the network stack via this change (exception instead of null)?
[GitHub] flink pull request: [FLINK-1930] [runtime] Improve exception when ...
GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/619 [FLINK-1930] [runtime] Improve exception when bufferpools have been shutdown You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink buffer_exception Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/619.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #619 commit d6d7c3bcf4cf0a3f1164875a6fe3a093aeb2a68b Author: Stephan Ewen se...@apache.org Date: 2015-04-22T15:52:21Z [FLINK-1930] [runtime] Improve exception when bufferpools have been shut down.
[GitHub] flink pull request: [FLINK-1472] Fixed Web frontend config overvie...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95517673 Hi, sorry for the long wait on this. I really like the feature but the implementation is not scalable: If new config values are added this needs to be updated in several places now. Could you change ConfigConstants and add a static initializer block that builds the hash maps that you manually build in DefaultConfigKeyValues using reflection? The code would just need to loop through all fields that have _KEY at the end, and then find the matching default value without the _KEY at the end. From the default value field the type of the value can be determined and it can be added to the appropriate hash map. This way, the defaults will always stay up to date with the actual config constants.
[jira] [Commented] (FLINK-1472) Web frontend config overview shows wrong value
[ https://issues.apache.org/jira/browse/FLINK-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508756#comment-14508756 ] ASF GitHub Bot commented on FLINK-1472: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95517673 Hi, sorry for the long wait on this. I really like the feature but the implementation is not scalable: If new config values are added this needs to be updated in several places now. Could you change ConfigConstants and add a static initializer block that builds the hash maps that you manually build in DefaultConfigKeyValues using reflection? The code would just need to loop through all fields that have _KEY at the end, and then find the matching default value without the _KEY at the end. From the default value field the type of the value can be determined and it can be added to the appropriate hash map. This way, the defaults will always stay up to date with the actual config constants. Web frontend config overview shows wrong value -- Key: FLINK-1472 URL: https://issues.apache.org/jira/browse/FLINK-1472 Project: Flink Issue Type: Bug Components: Webfrontend Affects Versions: master Reporter: Ufuk Celebi Assignee: Mingliang Qi Priority: Minor The web frontend shows configuration values even if they could not be correctly parsed. For example I've configured the number of buffers as 123.000, which cannot be parsed as an Integer by GlobalConfiguration and the default value is used. Still, the web frontend shows the not used 123.000.
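The reflection-based pairing suggested above can be sketched as follows. DemoConstants below stands in for Flink's ConfigConstants, and the "same field name minus _KEY" rule is taken from the comment; it is purely illustrative (the comment right below this one points out that the real class does not follow it uniformly).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

// Sketch: loop over all public static fields ending in "_KEY" and pair each
// config key with the default-value field of the same name without the suffix.
public class ConfigDefaults {

    /** Stand-in for ConfigConstants with one key/default pair per setting. */
    public static class DemoConstants {
        public static final String TASK_SLOTS_KEY = "taskmanager.numberOfTaskSlots";
        public static final int TASK_SLOTS = 1;
        public static final String HEARTBEAT_PAUSE_KEY = "akka.watch.heartbeat.pause";
        public static final String HEARTBEAT_PAUSE = "20 s";
    }

    /** Builds a map from config key to its default value via reflection. */
    public static Map<String, Object> defaults(Class<?> constants) throws Exception {
        Map<String, Object> result = new HashMap<>();
        for (Field field : constants.getFields()) {
            if (Modifier.isStatic(field.getModifiers()) && field.getName().endsWith("_KEY")) {
                String key = (String) field.get(null);
                // Matching default: same field name without the "_KEY" suffix.
                String defaultName = field.getName().substring(0, field.getName().length() - 4);
                Field defaultField = constants.getField(defaultName);
                result.put(key, defaultField.get(null));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(defaults(DemoConstants.class).size()); // prints 2
    }
}
```

The type of each default (Integer, String, ...) falls out of `field.get(null)` automatically, which is the "type can be determined from the default value field" part of the suggestion.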
[GitHub] flink pull request: [FLINK-1398] Introduce extractSingleField() in...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/308#issuecomment-95522921 I would put this into the `flink-contrib` module.
[GitHub] flink pull request: [FLINK-1398] Introduce extractSingleField() in...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/308#issuecomment-95523597 @aljoscha Those are good points. I like the idea of cleaning the Java and Scala APIs from operations that work on structured data such as projection, aggregation, etc. and support those use cases through the Table API. Not sure if the Table API can serve as a complete replacement at this point in time, but moving it there is the right thing to do, IMO. But this discussion should happen on the dev mailing list, not in some PR comment thread ;-)
[jira] [Commented] (FLINK-1670) Collect method for streaming
[ https://issues.apache.org/jira/browse/FLINK-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508781#comment-14508781 ] ASF GitHub Bot commented on FLINK-1670: --- Github user ggevay commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-95523580 OK, I see your point. I am thinking about using NetUtils.findConnectingAddress to determine which interface is used for the communication with the cluster. For this, I would need an IP of something in the cluster to connect to. What would be the best way to get this info? (Maybe I could use RemoteStreamEnvironment.host, but it is a private member, so there is probably a more standard way that I'm just overlooking.) Collect method for streaming Key: FLINK-1670 URL: https://issues.apache.org/jira/browse/FLINK-1670 Project: Flink Issue Type: New Feature Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Gabor Gevay Priority: Minor A convenience method for streaming back the results of a job to the client. As the client itself is a bottleneck anyway an easy solution would be to provide a socket sink with degree of parallelism 1, from which a client utility can read.
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508782#comment-14508782 ] ASF GitHub Bot commented on FLINK-1398: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/308#issuecomment-95523597 @aljoscha Those are good points. I like the idea of cleaning the Java and Scala APIs from operations that work on structured data such as projection, aggregation, etc. and support those use cases through the Table API. Not sure if the Table API can serve as a complete replacement at this point in time, but moving it there is the right thing to do, IMO. But this discussion should happen on the dev mailing list, not in some PR comment thread ;-) A new DataSet function: extractElementFromTuple --- Key: FLINK-1398 URL: https://issues.apache.org/jira/browse/FLINK-1398 Project: Flink Issue Type: Wish Reporter: Felix Neutatz Assignee: Felix Neutatz Priority: Minor This is the use case: {code:xml} DataSet<Tuple2<Integer, Double>> data = env.fromElements(new Tuple2<Integer, Double>(1, 2.0)); data.map(new ElementFromTuple()); } public static final class ElementFromTuple implements MapFunction<Tuple2<Integer, Double>, Double> { @Override public Double map(Tuple2<Integer, Double> value) { return value.f1; } } {code} It would be awesome if we had something like this: {code:xml} data.extractElement(1); {code} This means that we implement a function for DataSet which extracts a certain element from a given Tuple.
[GitHub] flink pull request: [FLINK-1670] Made DataStream iterable
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-95528015 `NetUtils.findConnectingAddress` is a useful util, when you know that the endpoint is up already. If you can make the assumption that the master is running already you can use that. For remote environments, this is probably a fair assumption. For local environments, this may not matter anyways, localhost should work there. Beware of the setups where you start the cluster from one machine (that starts the master on that machine) and launch the program from the same one. This whole network discovery is simply a bit tricky ;-) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1670) Collect method for streaming
[ https://issues.apache.org/jira/browse/FLINK-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508803#comment-14508803 ] ASF GitHub Bot commented on FLINK-1670: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/581#issuecomment-95528015 `NetUtils.findConnectingAddress` is a useful util, when you know that the endpoint is up already. If you can make the assumption that the master is running already you can use that. For remote environments, this is probably a fair assumption. For local environments, this may not matter anyways, localhost should work there. Beware of the setups where you start the cluster from one machine (that starts the master on that machine) and launch the program from the same one. This whole network discovery is simply a bit tricky ;-)
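For reference, the core trick behind utilities like `NetUtils.findConnectingAddress` can be sketched with plain JDK calls: connecting a UDP socket sends no packets, but it makes the OS routing table pick the local interface that would be used to reach the given remote endpoint. This is an illustrative sketch of the technique, not Flink's actual implementation:

```java
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketException;

public class ConnectingAddress {

    // Returns the local address the OS would use to talk to the given remote
    // endpoint. connect() on a UDP socket transmits nothing; it only fixes the
    // route, so the remote side does not even have to be up.
    public static InetAddress localAddressFor(String host, int port) throws SocketException {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.connect(new InetSocketAddress(host, port));
            return socket.getLocalAddress();
        }
    }
}
```

This is why the comment above notes that the endpoint being "up already" matters mainly for TCP-based probing; the routing-table trick itself works regardless, but choosing a sensible remote address to probe against is still the open question in the thread.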
[GitHub] flink pull request: [FLINK-1789] [core] [runtime] [java-api] Allow...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-95564844 Yes, this is true, but the way it is implemented, the folders are not always added to the class loader. Maybe I'm wrong here, but JobWithJars.getUserCodeClassLoader and JobWithJars.buildUserCodeClassLoader don't add the URLs to the ClassLoader that they create.
[jira] [Commented] (FLINK-1789) Allow adding of URLs to the usercode class loader
[ https://issues.apache.org/jira/browse/FLINK-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508954#comment-14508954 ] ASF GitHub Bot commented on FLINK-1789: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/593#issuecomment-95564844 Yes, this is true, but the way it is implemented, the folders are not always added to the class loader. Maybe I'm wrong here, but JobWithJars.getUserCodeClassLoader and JobWithJars.buildUserCodeClassLoader don't add the URLs to the ClassLoader that they create. Allow adding of URLs to the usercode class loader - Key: FLINK-1789 URL: https://issues.apache.org/jira/browse/FLINK-1789 Project: Flink Issue Type: Improvement Components: Distributed Runtime Reporter: Timo Walther Assignee: Timo Walther Priority: Minor Currently, there is no option to add custom classpath URLs to the FlinkUserCodeClassLoader. JARs always need to be shipped to the cluster even if they are already present on all nodes. It would be great if RemoteEnvironment also accepted valid classpath URLs and forwarded them to BlobLibraryCacheManager.
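The point raised in the review amounts to actually handing the collected URLs to the `URLClassLoader` that gets created. A minimal sketch of such a builder (the class and method names here are hypothetical, not Flink's `JobWithJars` API):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class UserCodeClassLoaders {

    // Builds a class loader that resolves classes from the given jar/classpath
    // URLs, delegating to the parent first (standard URLClassLoader behaviour).
    // The crucial detail from the review comment: the URLs must be passed to
    // the constructor; a loader created without them silently ignores them.
    public static ClassLoader buildUserCodeClassLoader(URL[] urls, ClassLoader parent) {
        return new URLClassLoader(urls, parent);
    }
}
```

With parent-first delegation, classes already on the system classpath still resolve normally, while user-code classes are looked up in the supplied jar and directory URLs.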
[GitHub] flink pull request: [FLINK-1398] Introduce extractSingleField() in...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/308#issuecomment-95566780 Yes, I think we should start a discussion there. I just wanted to give the reasons for my opinion here.
[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple
[ https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14508957#comment-14508957 ] ASF GitHub Bot commented on FLINK-1398: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/308#issuecomment-95566780 Yes, I think we should start a discussion there. I just wanted to give the reasons for my opinion here.
[GitHub] flink pull request: [FLINK-938] Automatically configure the jobman...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/48#issuecomment-95579798 Thanks for the PR. I've looked into the other approach (#248) and I think you can safely close this one now.
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509016#comment-14509016 ] ASF GitHub Bot commented on FLINK-1486: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95580456 From what I have seen, most people expect print() to actually go to the client. It would be a break of backwards compatibility, agreed. Maybe one of the few that are necessary. Add a string to the print method to identify output --- Key: FLINK-1486 URL: https://issues.apache.org/jira/browse/FLINK-1486 Project: Flink Issue Type: Improvement Components: Local Runtime Reporter: Maximilian Michels Assignee: Maximilian Michels Priority: Minor Labels: usability Fix For: 0.9 The output of the {{print}} method of {{DataSet}} is mainly used for debug purposes. Currently, it is difficult to identify the output. I would suggest to add another {{print(String str)}} method which allows the user to supply a String to identify the output. This could be a prefix before the actual output or a format string (which might be an overkill). {code} DataSet<Integer> data = env.fromElements(1, 2, 3, 4, 5); {code} For example, {{data.print("MyDataSet: ")}} would print {noformat} MyDataSet: 1 MyDataSet: 2 ... {noformat}
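The proposed `print(String)` semantics are easy to pin down with a plain-Java sketch (no Flink involved; `printWithPrefix` is a hypothetical helper, not an existing API): each record goes on its own line, preceded by the user-supplied sink identifier.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PrefixedPrint {

    // Formats records the way data.print("MyDataSet: ") is proposed to:
    // one entry per record, each prefixed by the identifier.
    public static List<String> printWithPrefix(String prefix, List<?> records) {
        return records.stream()
                .map(r -> prefix + r)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        printWithPrefix("MyDataSet: ", Arrays.asList(1, 2, 3, 4, 5))
                .forEach(System.out::println);
    }
}
```

A format-string variant would replace the simple concatenation with `String.format(pattern, r)`, which is the "might be overkill" option mentioned in the issue.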
[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95580456 From what I have seen, most people expect print() to actually go to the client. It would be a break of backwards compatibility, agreed. Maybe one of the few that are necessary.
[jira] [Commented] (FLINK-1867) TaskManagerFailureRecoveryITCase causes stalled travis builds
[ https://issues.apache.org/jira/browse/FLINK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509029#comment-14509029 ] ASF GitHub Bot commented on FLINK-1867: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28961064 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/taskmanager/TaskManagerFailsITCase.scala --- @@ -231,9 +231,9 @@ with WordSpecLike with Matchers with BeforeAndAfterAll { config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numSlots) config.setInteger(ConfigConstants.LOCAL_INSTANCE_MANAGER_NUMBER_TASK_MANAGER, numTaskmanagers) config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1000 ms") -config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4000 ms") +config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s") --- End diff -- True, the heartbeats go only between actor systems, not actors. If everything is local, no heartbeats happen anyways. May as well drop this config entry then, I suppose... TaskManagerFailureRecoveryITCase causes stalled travis builds - Key: FLINK-1867 URL: https://issues.apache.org/jira/browse/FLINK-1867 Project: Flink Issue Type: Bug Components: TaskManager, Tests Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Aljoscha Krettek There are currently tests on travis failing: https://travis-ci.org/apache/flink/jobs/57943063
[GitHub] flink pull request: [FLINK-1472] Fixed Web frontend config overvie...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95580562 Yes, but then we should change this now and not build more code on top of this that can fail in the future if someone forgets to add the names to the correct hash set in some other class.
[jira] [Commented] (FLINK-1472) Web frontend config overview shows wrong value
[ https://issues.apache.org/jira/browse/FLINK-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509032#comment-14509032 ] ASF GitHub Bot commented on FLINK-1472: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95581307 I added a Jira for this: https://issues.apache.org/jira/browse/FLINK-1936 Web frontend config overview shows wrong value -- Key: FLINK-1472 URL: https://issues.apache.org/jira/browse/FLINK-1472 Project: Flink Issue Type: Bug Components: Webfrontend Affects Versions: master Reporter: Ufuk Celebi Assignee: Mingliang Qi Priority: Minor The web frontend shows configuration values even if they could not be correctly parsed. For example, I've configured the number of buffers as 123.000, which cannot be parsed as an Integer by GlobalConfiguration, so the default value is used. Still, the web frontend shows the unused 123.000.
[jira] [Created] (FLINK-1936) Normalize keys and default values in ConfigConstants.java
Aljoscha Krettek created FLINK-1936: --- Summary: Normalize keys and default values in ConfigConstants.java Key: FLINK-1936 URL: https://issues.apache.org/jira/browse/FLINK-1936 Project: Flink Issue Type: Improvement Reporter: Aljoscha Krettek We have to find a way to normalize these. This is, for example, required as part of [FLINK-1472]. One idea would be to force a _KEY suffix for the string key and a DEFAULT_ prefix for the default value. Another idea would be to specify the default value using an Annotation.
[GitHub] flink pull request: [FLINK-1472] Fixed Web frontend config overvie...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95581307 I added a Jira for this: https://issues.apache.org/jira/browse/FLINK-1936
[jira] [Commented] (FLINK-1828) Impossible to output data to an HBase table
[ https://issues.apache.org/jira/browse/FLINK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509034#comment-14509034 ] ASF GitHub Bot commented on FLINK-1828: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/571#discussion_r28961417 --- Diff: flink-staging/flink-hbase/pom.xml --- @@ -112,6 +112,12 @@ under the License. </exclusion> </exclusions> </dependency> + <dependency> --- End diff -- Fair enough. Then the dependency should not be in test scope, but in the default scope, so users get this dependency into their fat jar as well when using the HBase output format. May be worth to define a few exclusions, though, to not get the complete tail of transitive HBase dependencies (I think that even includes JRuby and so on) Impossible to output data to an HBase table --- Key: FLINK-1828 URL: https://issues.apache.org/jira/browse/FLINK-1828 Project: Flink Issue Type: Bug Components: Hadoop Compatibility Affects Versions: 0.9 Reporter: Flavio Pompermaier Labels: hadoop, hbase Fix For: 0.9 Right now it is not possible to use HBase TableOutputFormat as output format because Configurable.setConf is not called in the configure() method of the HadoopOutputFormatBase
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509039#comment-14509039 ] ASF GitHub Bot commented on FLINK-1486: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95583993 @StephanEwen Don't think printing on the client can be worse if the output still contains information about the producers (e.g. by a task id). IMO, a sink identifier could still make sense when you make multiple calls to print and want to distinguish easily between the outputs.
[GitHub] flink pull request: [FLINK-938] Automatically configure the jobmana...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/248#issuecomment-95586174 I've rebased this on the current master, added log messages when falling back to the default address (which I've renamed as Till suggested), and tested it. There is one remaining issue though: When starting the task manager, you use `hostname` as the default JM address. As @StephanEwen pointed out in another thread [1], this will only work for certain setups. Then the question is whether we really improve deployment with this or not. @rmetzger, you reported the original issue. Do you (and others of course) have an idea about this? [1] https://github.com/apache/flink/pull/581#issuecomment-94911176 PS: The rebased branch is here, if you have time to pick it up from there: https://github.com/uce/incubator-flink/tree/jmaddress
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509051#comment-14509051 ] ASF GitHub Bot commented on FLINK-1486: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95586431 Ah, okay. You mean we have two methods: 1. `print()` which just prints everything onto the client 2. `printWithTaskid()`, which prefixes lines with the task ID?
[GitHub] flink pull request: [FLINK-924] Add automatic dependency retrieval...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/35#issuecomment-95590430 Hi @qmlmoon, sorry for the long wait on this PR. Could you please rebase on top of the current master and also get rid of the merge commits in the process?
[jira] [Commented] (FLINK-924) Extend JarFileCreator to automatically include dependencies
[ https://issues.apache.org/jira/browse/FLINK-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509075#comment-14509075 ] ASF GitHub Bot commented on FLINK-924: -- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/35#issuecomment-95590430 Hi @qmlmoon, sorry for the long wait on this PR. Could you please rebase on top of the current master and also get rid of the merge commits in the process? Extend JarFileCreator to automatically include dependencies --- Key: FLINK-924 URL: https://issues.apache.org/jira/browse/FLINK-924 Project: Flink Issue Type: Improvement Reporter: Ufuk Celebi Assignee: Mingliang Qi Priority: Minor We have a simple {{JarFileCreator}}, which allows adding classes to a JAR file as follows: {code:java} JarFileCreator jfc = new JarFileCreator(jarFile); jfc.addClass(X.class); jfc.addClass(Y.class); jfc.createJarFile(); {code} The created file can then be used with the remote execution environment, which requires a JAR file to ship. I propose the following improvement: use [ASM|http://asm.ow2.org/] to extract all dependencies and create the JAR file automatically. There is an [old tutorial|http://asm.ow2.org/doc/tutorial-asm-2.0.html] (for ASM 2), which implements a {{DependencyVisitor}}. Unfortunately the code does not directly work with ASM 5, but it should be a good starting point.
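The mechanism a `JarFileCreator` builds on is the standard `java.util.jar` API; the ASM-based dependency discovery proposed above would only feed additional class names into it. A self-contained sketch of that packaging step (class and method names here are hypothetical, not Flink's actual implementation):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class SimpleJarCreator {

    // Writes the .class file of each given class into a jar stream. A dependency
    // visitor (e.g. ASM's) would extend the set of classes with transitive
    // dependencies before this step runs.
    public static void createJar(OutputStream out, Class<?>... classes) throws IOException {
        try (JarOutputStream jar = new JarOutputStream(out)) {
            for (Class<?> clazz : classes) {
                String entryName = clazz.getName().replace('.', '/') + ".class";
                jar.putNextEntry(new JarEntry(entryName));
                // Read the class bytes from the classpath that loaded the class.
                try (InputStream classBytes =
                        clazz.getClassLoader().getResourceAsStream(entryName)) {
                    if (classBytes == null) {
                        throw new IOException("Cannot locate class file for " + clazz);
                    }
                    classBytes.transferTo(jar); // Java 9+; copy in a loop on older JDKs
                }
                jar.closeEntry();
            }
        }
    }
}
```

The ASM part would sit in front of this: a `ClassVisitor` records every referenced type while reading each class, and the resulting closure of class names is what gets written to the jar.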
[jira] [Commented] (FLINK-938) Change start-cluster.sh script so that users don't have to configure the JobManager address
[ https://issues.apache.org/jira/browse/FLINK-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509010#comment-14509010 ] ASF GitHub Bot commented on FLINK-938: -- Github user uce commented on the pull request: https://github.com/apache/flink/pull/48#issuecomment-95579798 Thanks for the PR. I've looked into the other approach (#248) and I think you can safely close this one now. Change start-cluster.sh script so that users don't have to configure the JobManager address --- Key: FLINK-938 URL: https://issues.apache.org/jira/browse/FLINK-938 Project: Flink Issue Type: Improvement Components: Build System Reporter: Robert Metzger Assignee: Mingliang Qi Priority: Minor Fix For: 0.9 To improve the user experience, Flink should not require users to configure the JobManager's address on a cluster. In combination with FLINK-934, this would allow running Flink with decent performance on a cluster without setting a single configuration value.
[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95580149 Can you think of a case where printing on the client is worse than printing on the worker?
[GitHub] flink pull request: [FLINK-938] Automatically configure the jobman...
Github user qmlmoon closed the pull request at: https://github.com/apache/flink/pull/48
[jira] [Commented] (FLINK-938) Change start-cluster.sh script so that users don't have to configure the JobManager address
[ https://issues.apache.org/jira/browse/FLINK-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509013#comment-14509013 ] ASF GitHub Bot commented on FLINK-938: -- Github user qmlmoon closed the pull request at: https://github.com/apache/flink/pull/48
[jira] [Commented] (FLINK-1486) Add a string to the print method to identify output
[ https://issues.apache.org/jira/browse/FLINK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509012#comment-14509012 ] ASF GitHub Bot commented on FLINK-1486: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/372#issuecomment-95580149 Can you think of a case where printing on the client is worse than printing on the worker?
[jira] [Commented] (FLINK-1472) Web frontend config overview shows wrong value
[ https://issues.apache.org/jira/browse/FLINK-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14509017#comment-14509017 ] ASF GitHub Bot commented on FLINK-1472: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-95580562 Yes, but then we should change this now and not build more code on top of this that can fail in the future if someone forgets to add the names to the correct hash set in some other class.
[GitHub] flink pull request: [FLINK-1867/1880] Raise test timeouts in hope ...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28961064 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/taskmanager/TaskManagerFailsITCase.scala --- @@ -231,9 +231,9 @@ with WordSpecLike with Matchers with BeforeAndAfterAll { config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numSlots) config.setInteger(ConfigConstants.LOCAL_INSTANCE_MANAGER_NUMBER_TASK_MANAGER, numTaskmanagers) config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1000 ms") -config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4000 ms") +config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s") --- End diff -- True, the heartbeats go only between actor systems, not actors. If everything is local, no heartbeats happen anyways. May as well drop this config entry then, I suppose...
[GitHub] flink pull request: Fixed Configurable HadoopOutputFormat (FLINK-1...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/571#discussion_r28961417 --- Diff: flink-staging/flink-hbase/pom.xml --- @@ -112,6 +112,12 @@ under the License. </exclusion> </exclusions> </dependency> + <dependency> --- End diff -- Fair enough. Then the dependency should not be in test scope, but in the default scope, so users get this dependency into their fat jar as well when using the HBase output format. May be worth to define a few exclusions, though, to not get the complete tail of transitive HBase dependencies (I think that even includes JRuby and so on)
[GitHub] flink pull request: [FLINK-1867/1880] Raise test timeouts in hope ...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/612#discussion_r28961799 --- Diff: flink-tests/src/test/scala/org/apache/flink/api/scala/runtime/taskmanager/TaskManagerFailsITCase.scala --- @@ -231,9 +231,9 @@ with WordSpecLike with Matchers with BeforeAndAfterAll { config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numSlots) config.setInteger(ConfigConstants.LOCAL_INSTANCE_MANAGER_NUMBER_TASK_MANAGER, numTaskmanagers) config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_INTERVAL, "1000 ms") -config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "4000 ms") +config.setString(ConfigConstants.AKKA_WATCH_HEARTBEAT_PAUSE, "20 s") --- End diff -- Yes, removing them also from JobManagerFailsITCase