[GitHub] flink pull request: [FLINK-1745] Add exact k-nearest-neighbours al...
Github user danielblazevski commented on the pull request: https://github.com/apache/flink/pull/1220#issuecomment-219258096 @chiwanpark thanks! Putting the finishing touches on [approximate](https://github.com/danielblazevski/flink/blob/FLINK-1934/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/nn/zknn.scala) z-knn -- but also need to do a LSH version of approximate knn since z-knn only applies for dim < 30 (and z-value method is much quicker than LSH method for dim < 30, but at least LSH makes since for dim > 30). Pretty excited about the performance gain compared to exact knn. I'll be presenting on knn for Flink at a Scala meetup in NY at Spotify on May 24th and will definitely mention @chiwanpark and @tillrohrmann for all their help! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-3852] update the quickstart module to i...
Github user markreddy commented on the pull request: https://github.com/apache/flink/pull/1982#issuecomment-219230214 Thanks for the review @tzulitai I've pushed fixes for all your comments :thumbsup: --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (FLINK-3788) Programs extending App trait cause problems because of DelayedInit
[ https://issues.apache.org/jira/browse/FLINK-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Gevay updated FLINK-3788: --- Summary: Programs extending App trait cause problems because of DelayedInit (was: Local variable values are not distributed to job runners) > Programs extending App trait cause problems because of DelayedInit > -- > > Key: FLINK-3788 > URL: https://issues.apache.org/jira/browse/FLINK-3788 > Project: Flink > Issue Type: Bug > Components: DataSet API >Affects Versions: 1.0.0, 1.0.1 > Environment: Scala 2.11.8 > Sun JDK 1.8.0_65 or OpenJDK 1.8.0_77 > Fedora 25, 4.6.0-0.rc2.git3.1.fc25.x86_64 >Reporter: Andreas C. Osowski > Attachments: FLINK-3788.tgz > > > Variable values of non-elementary types aren't caught and distributed to job > runners, causing them to remain 'null' and causing NPEs upon access when > running on a cluster. Running locally through `flink-clients` works fine. > Changing parallelism or disabling the closure cleaner don't seem to have any > effect. > Minimal example, also see the attached archive. > {code:java} > case class IntWrapper(a1: Int) > val wrapped = IntWrapper(42) > env.readTextFile("myTextFile.txt").map(line => wrapped.toString).collect > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-3701) Cant call execute after first execution
[ https://issues.apache.org/jira/browse/FLINK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels resolved FLINK-3701. --- Resolution: Fixed Fixed in 48b469ad4f0da466b347071cea82913965645de3. > Cant call execute after first execution > --- > > Key: FLINK-3701 > URL: https://issues.apache.org/jira/browse/FLINK-3701 > Project: Flink > Issue Type: Bug > Components: Scala Shell >Affects Versions: 1.1.0 >Reporter: Nikolaas Steenbergen >Assignee: Maximilian Michels > Fix For: 1.1.0 > > > in the scala shell, local mode, version 1.0 this works: > {code} > Scala-Flink> var b = env.fromElements("a","b") > Scala-Flink> b.print > Scala-Flink> var c = env.fromElements("c","d") > Scala-Flink> c.print > {code} > in the current master (after c.print) this leads to : > {code} > java.lang.NullPointerException > at > org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1031) > at > org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:961) > at > org.apache.flink.api.java.ScalaShellRemoteEnvironment.execute(ScalaShellRemoteEnvironment.java:70) > at > org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:855) > at org.apache.flink.api.java.DataSet.collect(DataSet.java:410) > at org.apache.flink.api.java.DataSet.print(DataSet.java:1605) > at org.apache.flink.api.scala.DataSet.print(DataSet.scala:1615) > at .(:56) > at .() > at .(:7) > at .() > at $print() > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734) > at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983) > at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573) > at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604) > at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568) > at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:760) > at > scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:805) > at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717) > at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581) > at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588) > at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591) > at > scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882) > at > scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837) > at > scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837) > at > scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) > at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837) > at > org.apache.flink.api.scala.FlinkShell$.startShell(FlinkShell.scala:199) > at org.apache.flink.api.scala.FlinkShell$.main(FlinkShell.scala:127) > at org.apache.flink.api.scala.FlinkShell.main(FlinkShell.scala) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3911) Sort operation before a group reduce doesn't seem to be implemented on 1.0.2
[ https://issues.apache.org/jira/browse/FLINK-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283506#comment-15283506 ] Patrice Freydiere commented on FLINK-3911: -- Just for thought, it seems that a version check in the CLI client could be valuable and save time Patrice > Sort operation before a group reduce doesn't seem to be implemented on 1.0.2 > > > Key: FLINK-3911 > URL: https://issues.apache.org/jira/browse/FLINK-3911 > Project: Flink > Issue Type: Bug >Affects Versions: 1.0.2 > Environment: Linux Ubuntu, standalone cluster >Reporter: Patrice Freydiere > Labels: newbie > > i have this piece of code: > // group by id and sort on field order > DataSet> waysGeometry = > joinedWaysWithPoints.groupBy(0).sortGroup(1, Order.ASCENDING) > .reduceGroup(new > GroupReduceFunction , Tuple2 byte[]>>() { > @Override > public void > reduce(Iterable > values, > > Collector > out) throws Exception { > long id = -1; > and this exception when executing ; > ava.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce > at constructOSMStreams(ProcessOSM.java:112))' , caused an error: Unrecognized > driver strategy for GroupReduce driver: SORTED_GROUP_COMBINE > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.Exception: Unrecognized driver strategy for GroupReduce > driver: SORTED_GROUP_COMBINE > at > org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:90) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450) > ... 3 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3911) Sort operation before a group reduce doesn't seem to be implemented on 1.0.2
[ https://issues.apache.org/jira/browse/FLINK-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283505#comment-15283505 ] Patrice Freydiere commented on FLINK-3911: -- You got it Fabian, it works on local IDE, the job was recompiled using the proper version, but the script used to launch the job on the cluster was referencing a 0.10.2 flink client version :-( My mistake , Thank's for your valuable feedback, Patrice Le sam. 14 mai 2016 à 09:59, Fabian Hueske (JIRA)a > Sort operation before a group reduce doesn't seem to be implemented on 1.0.2 > > > Key: FLINK-3911 > URL: https://issues.apache.org/jira/browse/FLINK-3911 > Project: Flink > Issue Type: Bug >Affects Versions: 1.0.2 > Environment: Linux Ubuntu, standalone cluster >Reporter: Patrice Freydiere > Labels: newbie > > i have this piece of code: > // group by id and sort on field order > DataSet > waysGeometry = > joinedWaysWithPoints.groupBy(0).sortGroup(1, Order.ASCENDING) > .reduceGroup(new > GroupReduceFunction , Tuple2 byte[]>>() { > @Override > public void > reduce(Iterable > values, > > Collector > out) throws Exception { > long id = -1; > and this exception when executing ; > ava.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce > at constructOSMStreams(ProcessOSM.java:112))' , caused an error: Unrecognized > driver strategy for GroupReduce driver: SORTED_GROUP_COMBINE > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.Exception: Unrecognized driver strategy for GroupReduce > driver: SORTED_GROUP_COMBINE > at > org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:90) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450) > ... 3 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3911) Sort operation before a group reduce doesn't seem to be implemented on 1.0.2
[ https://issues.apache.org/jira/browse/FLINK-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283471#comment-15283471 ] Fabian Hueske commented on FLINK-3911: -- I verified that between Flink 0.10.x and 1.0.x the Enum ordinals of {{SORTED_GROUP_COMBINE}} and {{SORTED_GROUP_REDUCE}} shifted by one. I assume that you submit a job with a 0.10.x client to a 1.0.x cluster. How are you submitting the job? You need to make sure that the client client has the same version as the cluster you are submitting the job to. > Sort operation before a group reduce doesn't seem to be implemented on 1.0.2 > > > Key: FLINK-3911 > URL: https://issues.apache.org/jira/browse/FLINK-3911 > Project: Flink > Issue Type: Bug >Affects Versions: 1.0.2 > Environment: Linux Ubuntu, standalone cluster >Reporter: Patrice Freydiere > Labels: newbie > > i have this piece of code: > // group by id and sort on field order > DataSet> waysGeometry = > joinedWaysWithPoints.groupBy(0).sortGroup(1, Order.ASCENDING) > .reduceGroup(new > GroupReduceFunction , Tuple2 byte[]>>() { > @Override > public void > reduce(Iterable > values, > > Collector > out) throws Exception { > long id = -1; > and this exception when executing ; > ava.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce > at constructOSMStreams(ProcessOSM.java:112))' , caused an error: Unrecognized > driver strategy for GroupReduce driver: SORTED_GROUP_COMBINE > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.Exception: Unrecognized driver strategy for GroupReduce > driver: SORTED_GROUP_COMBINE > at > org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:90) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450) > ... 3 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-3911) Sort operation before a group reduce doesn't seem to be implemented on 1.0.2
[ https://issues.apache.org/jira/browse/FLINK-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283460#comment-15283460 ] Patrice Freydiere commented on FLINK-3911: -- Hi thank's for the followup, i just reran the process using the 0.10.2 on a linux local cluster and it works fine (with a job recompile on 0.10.2 version) i'll follow you requirement and run it from the IDE in a standalone execution, with 1.0.3 version, Patrice Le sam. 14 mai 2016 à 00:23, Fabian Hueske (JIRA)a > Sort operation before a group reduce doesn't seem to be implemented on 1.0.2 > > > Key: FLINK-3911 > URL: https://issues.apache.org/jira/browse/FLINK-3911 > Project: Flink > Issue Type: Bug >Affects Versions: 1.0.2 > Environment: Linux Ubuntu, standalone cluster >Reporter: Patrice Freydiere > Labels: newbie > > i have this piece of code: > // group by id and sort on field order > DataSet > waysGeometry = > joinedWaysWithPoints.groupBy(0).sortGroup(1, Order.ASCENDING) > .reduceGroup(new > GroupReduceFunction , Tuple2 byte[]>>() { > @Override > public void > reduce(Iterable > values, > > Collector > out) throws Exception { > long id = -1; > and this exception when executing ; > ava.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce > at constructOSMStreams(ProcessOSM.java:112))' , caused an error: Unrecognized > driver strategy for GroupReduce driver: SORTED_GROUP_COMBINE > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.Exception: Unrecognized driver strategy for GroupReduce > driver: SORTED_GROUP_COMBINE > at > org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:90) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450) > ... 3 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1745] Add exact k-nearest-neighbours al...
Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/1220#issuecomment-219203934 I would like to merge this PR. I've checked the output of k-NN join (with, without quadtree). Documentation looks good. To maintainers (@tillrohrmann @thvasilo): Could you do final check this PR? You do not need to bother code style issues. I'll address them before merging. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---