[GitHub] flink pull request: [FLINK-1745] Add exact k-nearest-neighbours al...

2016-05-14 Thread danielblazevski
Github user danielblazevski commented on the pull request:

https://github.com/apache/flink/pull/1220#issuecomment-219258096
  
@chiwanpark thanks!  Putting the finishing touches on 
[approximate](https://github.com/danielblazevski/flink/blob/FLINK-1934/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/nn/zknn.scala)
 z-knn -- but also need to do a LSH version of approximate knn since z-knn only 
applies for dim < 30 (and z-value method is much quicker than LSH method for 
dim < 30, but at least LSH makes since for dim > 30).  Pretty excited about the 
performance gain compared to exact knn.

I'll be presenting on knn for Flink at a Scala meetup in NY at Spotify on 
May 24th and will definitely mention @chiwanpark and @tillrohrmann for all 
their help!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-3852] update the quickstart module to i...

2016-05-14 Thread markreddy
Github user markreddy commented on the pull request:

https://github.com/apache/flink/pull/1982#issuecomment-219230214
  
Thanks for the review @tzulitai I've pushed fixes for all your comments 
:thumbsup:


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-3788) Programs extending App trait cause problems because of DelayedInit

2016-05-14 Thread Gabor Gevay (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Gevay updated FLINK-3788:
---
Summary: Programs extending App trait cause problems because of DelayedInit 
 (was: Local variable values are not distributed to job runners)

> Programs extending App trait cause problems because of DelayedInit
> --
>
> Key: FLINK-3788
> URL: https://issues.apache.org/jira/browse/FLINK-3788
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API
>Affects Versions: 1.0.0, 1.0.1
> Environment: Scala 2.11.8
> Sun JDK 1.8.0_65 or OpenJDK 1.8.0_77
> Fedora 25, 4.6.0-0.rc2.git3.1.fc25.x86_64
>Reporter: Andreas C. Osowski
> Attachments: FLINK-3788.tgz
>
>
> Variable values of non-elementary types aren't caught and distributed to job 
> runners, causing them to remain 'null' and causing NPEs upon access when 
> running on a cluster. Running locally through `flink-clients` works fine.
> Changing parallelism or disabling the closure cleaner don't seem to have any 
> effect.
> Minimal example, also see the attached archive.
> {code:java}
> case class IntWrapper(a1: Int)
> val wrapped = IntWrapper(42)
> env.readTextFile("myTextFile.txt").map(line => wrapped.toString).collect
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-3701) Cant call execute after first execution

2016-05-14 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-3701.
---
Resolution: Fixed

Fixed in 48b469ad4f0da466b347071cea82913965645de3.

> Cant call execute after first execution
> ---
>
> Key: FLINK-3701
> URL: https://issues.apache.org/jira/browse/FLINK-3701
> Project: Flink
>  Issue Type: Bug
>  Components: Scala Shell
>Affects Versions: 1.1.0
>Reporter: Nikolaas Steenbergen
>Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> in the scala shell, local mode, version 1.0 this works:
> {code}
> Scala-Flink> var b = env.fromElements("a","b")
> Scala-Flink> b.print
> Scala-Flink> var c = env.fromElements("c","d")
> Scala-Flink> c.print
> {code}
> in the current master (after c.print) this leads to :
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1031)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:961)
>   at 
> org.apache.flink.api.java.ScalaShellRemoteEnvironment.execute(ScalaShellRemoteEnvironment.java:70)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:855)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1605)
>   at org.apache.flink.api.scala.DataSet.print(DataSet.scala:1615)
>   at .(:56)
>   at .()
>   at .(:7)
>   at .()
>   at $print()
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
>   at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
>   at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
>   at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
>   at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
>   at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:760)
>   at 
> scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:805)
>   at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717)
>   at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581)
>   at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588)
>   at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591)
>   at 
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882)
>   at 
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
>   at 
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
>   at 
> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>   at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837)
>   at 
> org.apache.flink.api.scala.FlinkShell$.startShell(FlinkShell.scala:199)
>   at org.apache.flink.api.scala.FlinkShell$.main(FlinkShell.scala:127)
>   at org.apache.flink.api.scala.FlinkShell.main(FlinkShell.scala)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3911) Sort operation before a group reduce doesn't seem to be implemented on 1.0.2

2016-05-14 Thread Patrice Freydiere (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283506#comment-15283506
 ] 

Patrice Freydiere commented on FLINK-3911:
--

Just for thought,
it seems that a version check in the CLI client could be valuable and save
time


Patrice





> Sort operation before a group reduce doesn't seem to be implemented on 1.0.2
> 
>
> Key: FLINK-3911
> URL: https://issues.apache.org/jira/browse/FLINK-3911
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.2
> Environment: Linux Ubuntu, standalone cluster
>Reporter: Patrice Freydiere
>  Labels: newbie
>
> i have this piece of code: 
>  // group by id and sort on field order
> DataSet> waysGeometry = 
> joinedWaysWithPoints.groupBy(0).sortGroup(1, Order.ASCENDING)
> .reduceGroup(new 
> GroupReduceFunction, Tuple2 byte[]>>() {
> @Override
> public void 
> reduce(Iterable> values,
> 
> Collector> out) throws Exception {
> long id = -1;
> and this exception when executing ;
> ava.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce 
> at constructOSMStreams(ProcessOSM.java:112))' , caused an error: Unrecognized 
> driver strategy for GroupReduce driver: SORTED_GROUP_COMBINE
>   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456)
>   at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Unrecognized driver strategy for GroupReduce 
> driver: SORTED_GROUP_COMBINE
>   at 
> org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:90)
>   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450)
>   ... 3 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3911) Sort operation before a group reduce doesn't seem to be implemented on 1.0.2

2016-05-14 Thread Patrice Freydiere (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283505#comment-15283505
 ] 

Patrice Freydiere commented on FLINK-3911:
--

You got it Fabian,

it works on local IDE, the job was recompiled using the proper version,

but the script used to launch the job on the cluster was referencing a
0.10.2 flink client version  :-(

My mistake ,

Thank's for your valuable feedback,


Patrice



Le sam. 14 mai 2016 à 09:59, Fabian Hueske (JIRA)  a



> Sort operation before a group reduce doesn't seem to be implemented on 1.0.2
> 
>
> Key: FLINK-3911
> URL: https://issues.apache.org/jira/browse/FLINK-3911
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.2
> Environment: Linux Ubuntu, standalone cluster
>Reporter: Patrice Freydiere
>  Labels: newbie
>
> i have this piece of code: 
>  // group by id and sort on field order
> DataSet> waysGeometry = 
> joinedWaysWithPoints.groupBy(0).sortGroup(1, Order.ASCENDING)
> .reduceGroup(new 
> GroupReduceFunction, Tuple2 byte[]>>() {
> @Override
> public void 
> reduce(Iterable> values,
> 
> Collector> out) throws Exception {
> long id = -1;
> and this exception when executing ;
> ava.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce 
> at constructOSMStreams(ProcessOSM.java:112))' , caused an error: Unrecognized 
> driver strategy for GroupReduce driver: SORTED_GROUP_COMBINE
>   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456)
>   at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Unrecognized driver strategy for GroupReduce 
> driver: SORTED_GROUP_COMBINE
>   at 
> org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:90)
>   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450)
>   ... 3 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3911) Sort operation before a group reduce doesn't seem to be implemented on 1.0.2

2016-05-14 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283471#comment-15283471
 ] 

Fabian Hueske commented on FLINK-3911:
--

I verified that between Flink 0.10.x and 1.0.x the Enum ordinals of 
{{SORTED_GROUP_COMBINE}} and {{SORTED_GROUP_REDUCE}} shifted by one.
I assume that you submit a job with a 0.10.x client to a 1.0.x cluster. 

How are you submitting the job? 
You need to make sure that the client client has the same version as the 
cluster you are submitting the job to.


> Sort operation before a group reduce doesn't seem to be implemented on 1.0.2
> 
>
> Key: FLINK-3911
> URL: https://issues.apache.org/jira/browse/FLINK-3911
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.2
> Environment: Linux Ubuntu, standalone cluster
>Reporter: Patrice Freydiere
>  Labels: newbie
>
> i have this piece of code: 
>  // group by id and sort on field order
> DataSet> waysGeometry = 
> joinedWaysWithPoints.groupBy(0).sortGroup(1, Order.ASCENDING)
> .reduceGroup(new 
> GroupReduceFunction, Tuple2 byte[]>>() {
> @Override
> public void 
> reduce(Iterable> values,
> 
> Collector> out) throws Exception {
> long id = -1;
> and this exception when executing ;
> ava.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce 
> at constructOSMStreams(ProcessOSM.java:112))' , caused an error: Unrecognized 
> driver strategy for GroupReduce driver: SORTED_GROUP_COMBINE
>   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456)
>   at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Unrecognized driver strategy for GroupReduce 
> driver: SORTED_GROUP_COMBINE
>   at 
> org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:90)
>   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450)
>   ... 3 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3911) Sort operation before a group reduce doesn't seem to be implemented on 1.0.2

2016-05-14 Thread Patrice Freydiere (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283460#comment-15283460
 ] 

Patrice Freydiere commented on FLINK-3911:
--

Hi  thank's for the followup, i just reran the process using the 0.10.2 on
a linux local cluster and it works fine (with a job recompile on 0.10.2
version)

i'll follow you requirement and run it from the IDE in a standalone
execution, with 1.0.3 version,

Patrice


Le sam. 14 mai 2016 à 00:23, Fabian Hueske (JIRA)  a



> Sort operation before a group reduce doesn't seem to be implemented on 1.0.2
> 
>
> Key: FLINK-3911
> URL: https://issues.apache.org/jira/browse/FLINK-3911
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.0.2
> Environment: Linux Ubuntu, standalone cluster
>Reporter: Patrice Freydiere
>  Labels: newbie
>
> i have this piece of code: 
>  // group by id and sort on field order
> DataSet> waysGeometry = 
> joinedWaysWithPoints.groupBy(0).sortGroup(1, Order.ASCENDING)
> .reduceGroup(new 
> GroupReduceFunction, Tuple2 byte[]>>() {
> @Override
> public void 
> reduce(Iterable> values,
> 
> Collector> out) throws Exception {
> long id = -1;
> and this exception when executing ;
> ava.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce 
> at constructOSMStreams(ProcessOSM.java:112))' , caused an error: Unrecognized 
> driver strategy for GroupReduce driver: SORTED_GROUP_COMBINE
>   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:456)
>   at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Unrecognized driver strategy for GroupReduce 
> driver: SORTED_GROUP_COMBINE
>   at 
> org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:90)
>   at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450)
>   ... 3 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1745] Add exact k-nearest-neighbours al...

2016-05-14 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/1220#issuecomment-219203934
  
I would like to merge this PR. I've checked the output of k-NN join (with, 
without quadtree). Documentation looks good.

To maintainers (@tillrohrmann @thvasilo): Could you do final check this PR? 
You do not need to bother code style issues. I'll address them before merging.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---