[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread senorcarbone
Github user senorcarbone commented on the pull request:

https://github.com/apache/flink/pull/326#issuecomment-70796718
  

+1 for Gelly :)
It's good to give Flink libraries a personal name somehow; people remember these.
Though, to be honest, it would be more consistent in that case if the other add-ons
also had names following a certain theme instead of the Flink- prefix.

> On 21 Jan 2015, at 02:13, Kostas Tzoumas  wrote:
> 
> I think Gelly has way more personality than flink-graph
> 
> —
> Reply to this email directly or view it on GitHub.
> 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-1424) bin/flink run does not recognize -c parameter anymore

2015-01-20 Thread Carsten Brandt (JIRA)
Carsten Brandt created FLINK-1424:
-

 Summary: bin/flink run does not recognize -c parameter anymore
 Key: FLINK-1424
 URL: https://issues.apache.org/jira/browse/FLINK-1424
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: master
Reporter: Carsten Brandt


The bin/flink binary no longer recognizes the `-c` parameter, which specifies the
class to run:

{noformat}
$ ./flink run "/path/to/target/impro3-ws14-flink-1.0-SNAPSHOT.jar" -c 
de.tu_berlin.impro3.flink.etl.FollowerGraphGenerator /tmp/flink/testgraph.txt 
1
usage: emma-experiments-impro3-ss14-flink
   [-?]
emma-experiments-impro3-ss14-flink: error: unrecognized arguments: '-c'
{noformat}

Before this change, the command worked fine and executed the job.

I tracked it down to the following commit using `git bisect`:

{noformat}
93eadca782ee8c77f89609f6d924d73021dcdda9 is the first bad commit
commit 93eadca782ee8c77f89609f6d924d73021dcdda9
Author: Alexander Alexandrov 
Date:   Wed Dec 24 13:49:56 2014 +0200

[FLINK-1027] [cli] Added support for '--' and '-' prefixed tokens in CLI 
program arguments.

This closes #278

:04 04 a1358e6f7fe308b4d51a47069f190a29f87fdeda d6f11bbc9444227d5c6297ec908e44b9644289a9 M  flink-clients
{noformat}

https://github.com/apache/flink/commit/93eadca782ee8c77f89609f6d924d73021dcdda9





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread ktzoumas
Github user ktzoumas commented on the pull request:

https://github.com/apache/flink/pull/326#issuecomment-70767467
  
I think Gelly has way more personality than flink-graph





[jira] [Created] (FLINK-1423) No Tags for new release on the github repo

2015-01-20 Thread Carsten Brandt (JIRA)
Carsten Brandt created FLINK-1423:
-

 Summary: No Tags for new release on the github repo
 Key: FLINK-1423
 URL: https://issues.apache.org/jira/browse/FLINK-1423
 Project: Flink
  Issue Type: Task
  Components: Project Website
Affects Versions: 0.7.0-incubating
Reporter: Carsten Brandt
Priority: Minor


As far as I know, there is an official version 0.7 release, but there is no tag
for that version:

https://github.com/apache/flink/releases





[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/326#issuecomment-70759342
  
I'm very excited about this :-) I will have a look later as well.

Regarding the name: I like the name Gelly, but I would also prefer
flink-graph (or something along those lines). My past experience with Spargel
(which is also a nice name :-)) was that the name was confusing to people new
to the system, whereas something like flink-graph directly states what it is.




[GitHub] flink pull request: [FLINK-1296] Add sorter support for very large...

2015-01-20 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/249#issuecomment-70758622
  
Cluster tests worked fine. If no one objects I'll merge this later today.




[jira] [Updated] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread mustafa elbehery (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mustafa elbehery updated FLINK-629:
---
Attachment: SimpleTweetInputFormat.java
Tweet.java

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: Selection_006.png, SimpleTweetInputFormat.java, 
> Tweet.java, model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open
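
For illustration, a minimal sketch of the accessor proposed in the description above, written as a standalone helper (the helper class, its name, and the exception message are assumptions; Tuple#getField(int) is the existing positional accessor):

{code}
import org.apache.flink.api.java.tuple.Tuple;

public final class Tuples {

    private Tuples() {}

    // Fail fast with a meaningful message instead of a late NullPointerException
    // somewhere inside key grouping or an aggregation.
    public static <T> T getFieldNotNull(Tuple tuple, int pos) {
        T field = tuple.getField(pos);
        if (field == null) {
            throw new IllegalStateException("Field " + pos + " is null, but a non-null value was expected.");
        }
        return field;
    }
}
{code}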





[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread mustafa elbehery (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284534#comment-14284534
 ] 

mustafa elbehery commented on FLINK-629:


It is a proper POJO with all the setters and getters, and I have set the type
explicitly in the InputFormat, as mentioned above. I have attached the POJO and
the InputFormat .java files.

Thanks in advance.

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: Selection_006.png, SimpleTweetInputFormat.java, 
> Tweet.java, model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open





[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/326#issuecomment-70744282
  
I also prefer flink-graph.
After having a quick look at the commits, I would suggest squashing a couple
of them, e.g. fda6e4c, which is empty, or 393902c, which deletes 4 lines of
unused imports.




[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/326#issuecomment-70743431
  
Woot! :)

As for merging the streaming back then, it was @StephanEwen who came up 
with the solution, pasting the important parts here:

"
BTW: I used this way to do it:

Have two repositories (clones)
  - /data/repositories/flink
  - /data/repositories/flinkbak

Then do the following for every non-merge commit:
 - Check out the state after a commit in the backup (detached head)
 - Remove current streaming directory (physically and from the index)
 - Add it again (files and index), with the state of the cloned repo
 - Commit (git recreates the diffs in a way that they reflect the original 
commit plus any merges)

-

#!/bin/bash

for line in $(cat commits)
do
  cd /data/repositories/flinkbak
  author=`git --no-pager show -s --format='%an <%ae>' $line`
  message=`git --no-pager show -s --format='%s%n' $line`

  echo "picking commit $line from author $author"

  git checkout $line
  cd /data/repositories/flink
  rm -rf "/data/repositories/flink/flink-addons/flink-streaming"
  git rm -r "/data/repositories/flink/flink-addons/flink-streaming"
  cp -r "/data/repositories/flinkbak/flink-addons/flink-streaming" 
"/data/repositories/flink/flink-addons/flink-streaming"
  git add /data/repositories/flink/flink-addons/flink-streaming
  git commit --author "$author" --m "$message"
  
#  read -rsp $'Press any key to continue...\n' -n1 key
done
"




[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284480#comment-14284480
 ] 

Robert Metzger commented on FLINK-629:
--

I think the issue is that Kryo is currently not properly creating instances of
the types it is serializing. I think we need to fix this issue for types which
are instantiable.
As a quick workaround for you, can you try making your type a proper POJO type?
You can do that by fulfilling all requirements for POJO types (getters and
setters for private fields). The log output will show you why a type is not a POJO.
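
For reference, a minimal sketch of a type that fulfils those requirements: a public class with a public no-argument constructor and a public getter/setter pair for every non-public field (the field names below are made up):

{code}
public class Tweet {

    private String text;
    private long id;

    // A public no-argument constructor is required for POJO types.
    public Tweet() {}

    public String getText() { return text; }
    public void setText(String text) { this.text = text; }

    public long getId() { return id; }
    public void setId(long id) { this.id = id; }
}
{code}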

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: Selection_006.png, model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open





[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/326#issuecomment-70740654
  
Regarding the name: I would actually vote to call it "flink-graph".

Mh, back then @mbalassi and @gyfora figured that out. Maybe
they can remember how they did it.






[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/326#issuecomment-70725539
  
Hi @rmetzger! Thanks for starting on it so fast ^^

Regarding the name, we thought it'd be nice to have one. It was actually
@ktzoumas who came up with Gelly :)
Regarding the history, my intention was to keep it actually, but it seems I 
failed :disappointed: 
What should I do to preserve it?




[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/326#discussion_r23252081
  
--- Diff: flink-addons/flink-gelly/src/main/java/org/apache/flink/gelly/Edge.java ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.gelly;
+
+import java.io.Serializable;
+
+import org.apache.flink.api.java.tuple.Tuple3;
+
+public class Edge<K extends Comparable<K> & Serializable, V extends Serializable>
--- End diff --

This Edge class and the 3 other Edge* classes below are missing
Javadocs. The rest of the graph API is very well documented, but these classes
are not.
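
For illustration, a Javadoc along these lines would address this (the wording is hypothetical, and the Tuple3 base type is only inferred from the import above):

import java.io.Serializable;

import org.apache.flink.api.java.tuple.Tuple3;

/**
 * An Edge represents a link between two vertices, identified by the keys of
 * its source and target vertex, together with an associated edge value.
 *
 * @param <K> the key type of the source and target vertex IDs
 * @param <V> the type of the edge value
 */
public class Edge<K extends Comparable<K> & Serializable, V extends Serializable>
        extends Tuple3<K, K, V> {
    // ...
}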




[GitHub] flink pull request: [FLINK-1147][Java API] TypeInference on POJOs

2015-01-20 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/315#issuecomment-70723442
  
I didn't find any issues in the code. Good to merge.




[jira] [Updated] (FLINK-1303) HadoopInputFormat does not work with Scala API

2015-01-20 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1303:
--
Component/s: Scala API

> HadoopInputFormat does not work with Scala API
> --
>
> Key: FLINK-1303
> URL: https://issues.apache.org/jira/browse/FLINK-1303
> Project: Flink
>  Issue Type: Bug
>  Components: Scala API
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 0.9
>
>
> It fails because the HadoopInputFormat uses the Flink Tuple2 type. For this, 
> type extraction fails at runtime.





[GitHub] flink pull request: Implement the convenience methods count and co...

2015-01-20 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/210#issuecomment-70718732
  
I've implemented count and collect in the Scala API. There is still a
problem with the `ListAccumulator` for non-primitive objects (i.e. anything
other than Integer or Long), probably due to object reuse.
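
In case object reuse really is the culprit, a common pattern is to buffer a deep copy instead of the reused instance. A rough sketch, not the accumulator's actual code (the class name is made up; the serializer would come from the operator's runtime context):

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.typeutils.TypeSerializer;

public final class CopyingBuffer<T> {

    private final TypeSerializer<T> serializer;
    private final List<T> elements = new ArrayList<T>();

    public CopyingBuffer(TypeSerializer<T> serializer) {
        this.serializer = serializer;
    }

    public void add(T record) {
        // TypeSerializer#copy(T) creates an independent instance, so later
        // mutation of the reused record cannot corrupt the buffered element.
        elements.add(serializer.copy(record));
    }

    public List<T> get() {
        return elements;
    }
}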




[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/326#discussion_r23248831
  
--- Diff: flink-addons/flink-gelly/src/test/java/org/apache/flink/gelly/test/TestWeaklyConnected.java ---
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.gelly.test;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.util.Collection;
+import java.util.LinkedList;
+
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.gelly.Graph;
+import org.apache.flink.test.util.JavaProgramTestBase;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+import org.junit.runners.Parameterized.Parameters;
+
+@RunWith(Parameterized.class)
+public class TestWeaklyConnected extends JavaProgramTestBase {
--- End diff --

I think it's recommended now to use the `MultipleProgramsTestBase` instead
of the `JavaProgramTestBase`, because the MultipleProgramsTestBase is



[jira] [Created] (FLINK-1422) Missing usage example for "withParameters"

2015-01-20 Thread Alexander Alexandrov (JIRA)
Alexander Alexandrov created FLINK-1422:
---

 Summary: Missing usage example for "withParameters"
 Key: FLINK-1422
 URL: https://issues.apache.org/jira/browse/FLINK-1422
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.8
Reporter: Alexander Alexandrov
Priority: Trivial
 Fix For: 0.8.1


I am struggling to find a usage example of the "withParameters" method in the 
documentation. At the moment I only see this note:

{quote}
Note: As the content of broadcast variables is kept in-memory on each node, it 
should not become too large. For simpler things like scalar values you can 
simply make parameters part of the closure of a function, or use the 
withParameters(...) method to pass in a configuration.
{quote}
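
For reference, a minimal usage sketch of the kind such an example could show, written against the 0.8-era Java batch API (the "threshold" key and all values below are made up):

{code}
import org.apache.flink.api.common.functions.RichFilterFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

public class WithParametersExample {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Integer> input = env.fromElements(1, 2, 3, 4, 5);

        Configuration config = new Configuration();
        config.setInteger("threshold", 3);

        input.filter(new RichFilterFunction<Integer>() {
            private int threshold;

            @Override
            public void open(Configuration parameters) {
                // The Configuration passed via withParameters(...) shows up here.
                threshold = parameters.getInteger("threshold", 0);
            }

            @Override
            public boolean filter(Integer value) {
                return value > threshold;
            }
        }).withParameters(config)
          .print();

        // In 0.8, print() only defines a sink; execute() runs the job.
        env.execute();
    }
}
{code}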





[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/326#issuecomment-70717064
  
Great, I'm super excited to see the graph API being offered to the main 
project.

I'll start reviewing the code right away, to merge it as soon as possible.
One question upfront: how did you come up with the name Gelly?
Why don't we call the baby what it is: a graph API?

Should we consider moving the classes while preserving their history? That's
what we did with the streaming system when we merged it. Right now, basically
all the code from the graph API has one commit in its history (6c31f8e).




[jira] [Resolved] (FLINK-1372) TaskManager and JobManager do not log startup settings any more

2015-01-20 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi resolved FLINK-1372.

Resolution: Fixed

Fixed in 4e197d5..ce8acc4

> TaskManager and JobManager do not log startup settings any more
> ---
>
> Key: FLINK-1372
> URL: https://issues.apache.org/jira/browse/FLINK-1372
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> In prior versions, the jobmanager and taskmanager logged a lot of startup 
> options:
>  - Environment
>  - ports
>  - memory configuration
>  - network configuration
> Currently, they log very little. We should add the logging back in.





[GitHub] flink pull request: [FLINK-1372] [runtime] Fix akka logging

2015-01-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/321




[GitHub] flink pull request: Rename coGroupDataSet.scala to CoGroupDataSet....

2015-01-20 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/324#issuecomment-70715529
  
I renamed the files from coGroupDataSet.scala to CoGroupDataSet.scala and
crossDataSet.scala to CrossDataSet.scala to follow the Scala file naming convention.

I also moved out the UnfinishedCoGroupOperation class because it is a high-level
public class by itself and is not dependent on CoGroupOperation as a sealed trait.




[GitHub] flink pull request: [FLINK-1372] [runtime] Fix akka logging

2015-01-20 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/321#issuecomment-70705539
  
I've tested this and the logging works as before (pre-Akka), i.e.
everything is in the jm/tm log file. :-)

Logging of the temp directories got somehow lost on the way. I will 
re-enable it and compare the logged information with the old task manager 
before I merge this.




[GitHub] flink pull request: [FLINK-1296] Add sorter support for very large...

2015-01-20 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/249#issuecomment-70702082
  
OK I've rebased this PR on the current master. All tests are running 
locally (Travis is still flaky). @mxm is running some cluster tests. I'll push 
to master after his OK. :-)




[GitHub] flink pull request: [FLINK-1201] Add flink-gelly to flink-addons

2015-01-20 Thread vasia
GitHub user vasia opened a pull request:

https://github.com/apache/flink/pull/326

[FLINK-1201] Add flink-gelly to flink-addons

This PR adds an initial version of Gelly, a graph API for Flink, to the 
flink-addons project.
The development of Gelly took place as a collaboration on a separate 
project, which you can visit here: https://github.com/project-flink/flink-graph.
I have kept the commit history and I hope this won't make it too hard to 
review :)

We are currently working on adding documentation, a few more examples and 
tests and we'll also port remaining issues to JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vasia/flink flink-gelly

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/326.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #326


commit 1f378eda803783c053adc7b707bd8f938b0616a4
Author: Stephan Ewen 
Date:   2014-08-12T20:48:59Z

Initial commit

commit 22b7464cf73497f09a55b26be4d654cc339cf1ec
Author: Stephan Ewen 
Date:   2014-08-12T20:54:54Z

Initial Maven project structure and .gitignore

commit 6103cf9dea81c957a5727e351fecc4e32df017ba
Author: Stephan Ewen 
Date:   2014-08-27T15:56:32Z

Add example for parallel dense ID assignment

commit f44986d22b5913cba8e11b6e43a77effd925aaa1
Author: Stephan Ewen 
Date:   2014-08-30T17:19:21Z

First mockup of initial API functions

commit 0276986098f18ab18f71b6f81e273fc8c2f86886
Author: Stephan Ewen 
Date:   2014-10-08T10:18:04Z

Reset repository

commit c1336f6950d966220418fcbab241e7e702dcf498
Author: Stephan Ewen 
Date:   2014-10-08T10:24:05Z

Initial package

commit 8083be2d8b34921fd5e80ef6030aee4668ffbb4b
Author: Stephan Ewen 
Date:   2014-10-08T11:50:11Z

Stubs for basic classes

commit a708ced256cf6b4176a3d93ef6cb829b2a4eb1d8
Author: Stephan Ewen 
Date:   2014-10-08T12:29:46Z

fix project setup

commit 1ee9b73428602fa37dd8ae86be53aeb79e1af68c
Author: vasia 
Date:   2014-10-08T13:27:40Z

get undirected graph

commit 08301093d31b66f553a7c3d3d34c39ea9f1dd683
Author: vasia 
Date:   2014-10-09T11:22:24Z

edge abd value extend tuple types

commit 006a65397b152b08d555899ccedeaa6274a8c48e
Author: vasia 
Date:   2014-10-09T12:49:34Z

reverse and graph input methods

commit aa0eb20f5708bdc3e2439651ca7b10436bbde640
Author: Theodore Vasiloudis 
Date:   2014-10-09T14:11:03Z

Added Push-Gather-Apply, mapVertices, subgraph, outDegrees

Added:

mapVertices + Test
subgraph (Unfinished)
outDegrees
Push-Gather-Apply Neighborhood Model
Junit Dependency

Everything is untested as we are blocked on graph creation

commit 044d8097b3b4e4b2d60142bb9f6ad04b15d3fb94
Author: Theodore Vasiloudis 
Date:   2014-10-09T14:45:03Z

Merge branch 'hackathon' of github.com:vasia/flink-graphs into thvasilo

Conflicts:
src/main/java/flink/graphs/Edge.java
src/main/java/flink/graphs/Graph.java
src/main/java/flink/graphs/Vertex.java

commit 43e8b56f81006a85699c842a44b00b1b395474dd
Author: Kostas Tzoumas 
Date:   2014-10-10T11:02:17Z

Initial commit

commit 0b39cc70130a92d67b908a379fae3347d648dc99
Author: Theodore Vasiloudis 
Date:   2014-10-10T14:09:36Z

Reverted to using Tuples, added tests.

commit 3f2610453c64188e04b5e27b31c2145cfe37cee6
Author: Vasia Kalavri 
Date:   2014-10-10T16:14:38Z

Merge pull request #1 from thvasilo/thvasilo

Merge Theo's & Martin's hackathon work.

commit 35b865e8b2c87816e813a5966d367910aadd64a8
Author: vasia 
Date:   2014-10-10T16:19:37Z

keep 1 of the 2 readme files

commit fee97bb4e6daa287b7fd8a678395799fbc7905a0
Author: Kostas Tzoumas 
Date:   2014-10-21T06:11:02Z

Added Kostas' functions from the hackathon; added ExecutionEnvironment 
member to Graph

commit a405e0973d3157933e78a45fb447fcabb9528e7d
Author: vasia 
Date:   2014-10-21T15:37:50Z

reflect current status in README

commit 2c8d859cafcedddb463d895b584a4d29d1be5d26
Author: vasia 
Date:   2014-10-25T14:14:42Z

type information in Graph and in getUndirected method

commit 7ff3f8a463a73495bdc3d2e8f11a204c787b2c9c
Author: vasia 
Date:   2014-10-25T15:08:53Z

typeinfo in reverse and getOutdegrees

commit 5af110c10c2301bcca22e19afeb10fff542153e5
Author: vasia 
Date:   2014-10-25T16:22:11Z

add dependency to flink-test-utils

commit 9ad2c319bec98eec2bac9998cedba4e5b84e8139
Author: vasia 
Date:   2014-10-25T16:27:15Z

test for undirected, reverse and outDegrees

commit 393902c3b214c2b1de67f0d59e543ec7bc4847c7
Author: vasia 
Date:   2014-10-25T16:30:20Z

remove unused imports

commit c7e805fd747d8cceb4cb678bfa3257434d52e437
Author: vasia 
Date:   2014-10-26T19:40:32Z

made functions for getUndirected, outDegrees and reverse static inner

commit a8e4e

[GitHub] flink pull request: Rename coGroupDataSet.scala to CoGroupDataSet....

2015-01-20 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/324#issuecomment-70676663
  
Why do you want to move them?
On Jan 20, 2015 4:42 AM, "Henry Saputra"  wrote:

> FYI @aljoscha 
>
> —
> Reply to this email directly or view it on GitHub
> .
>




[jira] [Commented] (FLINK-1352) Buggy registration from TaskManager to JobManager

2015-01-20 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283945#comment-14283945
 ] 

Till Rohrmann commented on FLINK-1352:
--

That is a good point. I'll let the JobManager send a RegistrationRefused 
message to the TaskManager which will terminate itself in such a case.

I'm not sure if the TaskManager should try to connect to a JobManager only a
limited number of times or indefinitely. The pro of the former approach is
that in case of a permanent JobManager outage we don't have TaskManagers
lingering around forever if not stopped manually. However, there is always the
possibility that the connect-to-JobManager interval is too short for a
slow-starting JobManager, so some TaskManagers might wrongly terminate.
Which do you think outweighs the other?
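
Purely to illustrate the trade-off (this is not the actual TaskManager code; the attempt count, the delays, and the Runnable stand-in are made up), a bounded retry with an increasing pause could look like this:

{code}
public final class RegistrationRetry {

    // Returns true if registration succeeded within maxAttempts tries;
    // the caller decides whether a false result should terminate the TaskManager.
    public static boolean registerWithRetries(Runnable registerOnce, int maxAttempts)
            throws InterruptedException {
        long pauseMillis = 1000; // start in the range of seconds
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                registerOnce.run(); // stand-in for sending the registration message
                return true;
            } catch (RuntimeException e) {
                Thread.sleep(pauseMillis);
                pauseMillis = Math.min(pauseMillis * 2, 60000); // back off up to one minute
            }
        }
        return false;
    }
}
{code}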

> Buggy registration from TaskManager to JobManager
> -
>
> Key: FLINK-1352
> URL: https://issues.apache.org/jira/browse/FLINK-1352
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>
> The JobManager's InstanceManager may refuse the registration attempt from a 
> TaskManager, because it has this taskmanager already connected, or,in the 
> future, because the TaskManager has been blacklisted as unreliable.
> Upon refused registration, the instance ID is null, to signal that the
> registration was refused. The TaskManager reacts incorrectly to such messages,
> assuming successful registration.
> Possible solution: JobManager sends back a dedicated "RegistrationRefused" 
> message, if the instance manager returns null as the registration result. If 
> the TaskManager receives that before being registered, it knows that the 
> registration response was lost (which should not happen on TCP and it would 
> indicate a corrupt connection)
> Followup question: Does it make sense to have the TaskManager trying 
> indefinitely to connect to the JobManager. With increasing interval (from 
> seconds to minutes)?





[jira] [Commented] (FLINK-1351) Inconclusive error when TaskManager cannot connect to JobManager

2015-01-20 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283932#comment-14283932
 ] 

Till Rohrmann commented on FLINK-1351:
--

I could not reproduce the timeout error. What did you do exactly? Did you block 
the JobManager by some breakpoints?

I also tested the behaviour in case of an unreachable JobManager (wrong
address or the JobManager died). The TaskManager tries 10 times to connect to it
with a pause of 10 seconds in between. If it does not succeed, it prints
an error message saying that it could not connect to the JobManager and
terminates itself.

> Inconclusive error when TaskManager cannot connect to JobManager
> 
>
> Key: FLINK-1351
> URL: https://issues.apache.org/jira/browse/FLINK-1351
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, TaskManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Till Rohrmann
>
> The taskmanager currently registers at the jobmanager by resolving the akka 
> URL
> {code}
> val jobManager = context.actorSelection(jobManagerAkkaURL)
> {code}
> When the actor lookup fails (actor systems cannot connect), it gives an 
> unspecific timeout message. This is the case when the TaskManager cannot 
> connect to the JobManager.
> This should be fixed to give a conclusive error message.
> I suggest to add a test where TaskManager is started without JobManager actor 
> system being available.





[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread mustafa elbehery (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283815#comment-14283815
 ] 

mustafa elbehery commented on FLINK-629:


I have tried to set the type explicitly by implementing
ResultTypeQueryable, but the problem remains that Kryo does not create an
instance of the Tweet; it just creates a reference. Waiting for your reply.
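
For context, declaring the produced type explicitly usually looks like the sketch below (Tweet refers to the attached POJO; the class name and everything around getProducedType() are illustrative):

{code}
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
import org.apache.flink.api.java.typeutils.TypeExtractor;

public class TweetTypeProvider implements ResultTypeQueryable<Tweet> {

    @Override
    public TypeInformation<Tweet> getProducedType() {
        // Returns a PojoTypeInfo if Tweet fulfils the POJO rules,
        // otherwise a GenericTypeInfo that is serialized with Kryo.
        return TypeExtractor.getForClass(Tweet.class);
    }
}
{code}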

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: Selection_006.png, model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open





[GitHub] flink pull request: [FLINK-1353] Fixes the Execution to use the co...

2015-01-20 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/325

[FLINK-1353] Fixes the Execution to use the configured Akka timeout value

Forwards the JobManager's timeout value to the Execution to make the
Execution's timeout configurable.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixExecutionTimeout

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/325.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #325


commit db45a93fa9930634db22498a0e2ff87c6f78be5e
Author: Till Rohrmann 
Date:   2015-01-20T13:13:50Z

[FLINK-1353] [runtime] Loops through the JobManager's timeout to the 
Execution objects.






[jira] [Updated] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread mustafa elbehery (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mustafa elbehery updated FLINK-629:
---
Attachment: (was: Selection_006.png)

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: Selection_006.png, model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open





[jira] [Updated] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread mustafa elbehery (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mustafa elbehery updated FLINK-629:
---
Attachment: Selection_006.png

GenericType, not POJO. I have attached a screenshot of the debugger.

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: Selection_006.png, Selection_006.png, model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open





[jira] [Updated] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread mustafa elbehery (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mustafa elbehery updated FLINK-629:
---
Attachment: Selection_006.png

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: Selection_006.png, model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open





[GitHub] flink pull request: [FLINK-1296] Add sorter support for very large...

2015-01-20 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/249#issuecomment-70645554
  
+1 

I will merge this in the next batch.




[jira] [Commented] (FLINK-1353) Execution always uses DefaultAkkaAskTimeout, rather than the configured value

2015-01-20 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283731#comment-14283731
 ] 

Till Rohrmann commented on FLINK-1353:
--

Good catch. I'll forward the timeout set in the configuration to the Execution.

> Execution always uses DefaultAkkaAskTimeout, rather than the configured value
> -
>
> Key: FLINK-1353
> URL: https://issues.apache.org/jira/browse/FLINK-1353
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Till Rohrmann
> Fix For: 0.9
>
>






[jira] [Assigned] (FLINK-1319) Add static code analysis for UDFs

2015-01-20 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther reassigned FLINK-1319:
---

Assignee: Timo Walther

> Add static code analysis for UDFs
> -
>
> Key: FLINK-1319
> URL: https://issues.apache.org/jira/browse/FLINK-1319
> Project: Flink
>  Issue Type: New Feature
>  Components: Java API, Scala API
>Reporter: Stephan Ewen
>Assignee: Timo Walther
>Priority: Minor
>
> Flink's Optimizer takes information that tells it for UDFs which fields of 
> the input elements are accessed, modified, or forwarded/copied. This 
> information frequently helps to reuse partitionings, sorts, etc. It may speed 
> up programs significantly, as it can frequently eliminate sorts and shuffles, 
> which are costly.
> Right now, users can add lightweight annotations to UDFs to provide this 
> information (such as adding {{@ConstantFields("0->3, 1, 2->1")}}).
> We worked with static code analysis of UDFs before, to determine this 
> information automatically. This is an incredible feature, as it "magically" 
> makes programs faster.
> For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
> works surprisingly well in many cases. We used the "Soot" toolkit for the 
> static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
> not include any of the code so far.
> I propose to add this functionality to Flink, in the form of a drop-in 
> addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
> simply download a special "flink-code-analysis.jar" and drop it into the 
> "lib" folder to enable this functionality. We may even add a script to 
> "tools" that downloads that library automatically into the lib folder. This 
> should be legally fine, since we do not redistribute LGPL code and only 
> dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
> patentability, if I remember correctly).
> Prior work on this has been done by [~aljoscha] and [~skunert], which could 
> provide a code base to start with.
> *Appendix*
> Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
> Papers on static analysis and for optimization: 
> http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
> http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
> Quick introduction to the Optimizer: 
> http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
> (Section 6)
> Optimizer for Iterations: 
> http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
> (Sections 4.3 and 5.3)





[GitHub] flink pull request: [FLINK-1369] [types] Add support for Subclasse...

2015-01-20 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/316#issuecomment-70635445
  
Looks good. You should add generics, or at least `<?>`, to `TypeSerializer` in
the `PojoSerializer`. My IDE shows me lots of warnings.




[jira] [Assigned] (FLINK-1417) Automatically register nested types at Kryo

2015-01-20 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-1417:
-

Assignee: Robert Metzger

> Automatically register nested types at Kryo
> ---
>
> Key: FLINK-1417
> URL: https://issues.apache.org/jira/browse/FLINK-1417
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API, Scala API
>Reporter: Stephan Ewen
>Assignee: Robert Metzger
> Fix For: 0.9
>
>
> Currently, the {{GenericTypeInfo}} registers the class of the type at Kryo. 
> In order to get the best performance, it should recursively walk the classes 
> and make sure that it registered all contained subtypes.
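
A rough sketch of such a recursive registration (illustrative only; generic type arguments, arrays and collection element types are glossed over):

{code}
import com.esotericsoftware.kryo.Kryo;

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashSet;
import java.util.Set;

public final class RecursiveKryoRegistration {

    public static void register(Kryo kryo, Class<?> clazz) {
        registerRecursively(kryo, clazz, new HashSet<Class<?>>());
    }

    private static void registerRecursively(Kryo kryo, Class<?> clazz, Set<Class<?>> seen) {
        if (clazz == null || clazz.isPrimitive() || clazz.isArray() || !seen.add(clazz)) {
            return; // skip primitives and arrays, and break cycles
        }
        kryo.register(clazz);
        for (Field field : clazz.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers()) && !field.isSynthetic()) {
                registerRecursively(kryo, field.getType(), seen);
            }
        }
    }
}
{code}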





[jira] [Updated] (FLINK-1421) Implement a SAMOA Adapter for Flink Streaming

2015-01-20 Thread Gyula Fora (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora updated FLINK-1421:
--
Assignee: Paris Carbone  (was: Gyula Fora)

> Implement a SAMOA Adapter for Flink Streaming
> -
>
> Key: FLINK-1421
> URL: https://issues.apache.org/jira/browse/FLINK-1421
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Paris Carbone
>Assignee: Paris Carbone
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Yahoo's Samoa is an experimental incremental machine learning library that 
> builds on an abstract compositional data streaming model to write streaming 
> algorithms. The task is to provide an adapter from SAMOA topologies to 
> Flink-streaming job graphs in order to support Flink as a backend engine for 
> SAMOA tasks.
> A startup guide can be viewed here:
> https://docs.google.com/document/d/18glDJDYmnFGT1UGtZimaxZpGeeg1Ch14NgDoymhPk2A/pub
> The main working branch of the adapter :
> https://github.com/senorcarbone/samoa/tree/flink





[GitHub] flink pull request: [FLINK-1376] [runtime] Add proper shared slot ...

2015-01-20 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/318#discussion_r23212042
  
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/instance/SharedSlot.java ---
@@ -0,0 +1,154 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.instance;
+
+import org.apache.flink.runtime.AbstractID;
+import org.apache.flink.runtime.jobgraph.JobID;
+import org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroupAssignment;
+
+import java.util.HashSet;
+import java.util.Set;
+
+/**
+ * This class represents a shared slot. A shared slot can have multiple
+ * {@link org.apache.flink.runtime.instance.SimpleSlot} instances within itself. This allows to
+ * schedule multiple tasks simultaneously, enabling Flink's streaming capabilities.
+ *
+ * IMPORTANT: This class contains no synchronization. Thus, the caller has to guarantee proper
+ * synchronization. In the current implementation, all concurrently modifying operations are
+ * passed through a {@link SlotSharingGroupAssignment} object which is responsible for
+ * synchronization.
+ *
+ */
+public class SharedSlot extends Slot {
+
+   private final SlotSharingGroupAssignment assignmentGroup;
+
+   private final Set subSlots;
+
+   public SharedSlot(JobID jobID, Instance instance, int slotNumber,
+   SlotSharingGroupAssignment assignmentGroup, SharedSlot parent,
+   AbstractID groupID) {
+   super(jobID, instance, slotNumber, parent, groupID);
+
+   this.assignmentGroup = assignmentGroup;
+   this.subSlots = new HashSet();
+   }
+
+   public Set getSubSlots() {
+   return subSlots;
+   }
+
+   /**
+* Removes the simple slot from the {@link org.apache.flink.runtime.instance.SharedSlot}. Should
+* only be called through the
+* {@link org.apache.flink.runtime.jobmanager.scheduler.SlotSharingGroupAssignment} attribute
+* assignmentGroup.
+*
+* @param slot slot to be removed from the set of sub slots.
+* @return Number of remaining sub slots
+*/
+   public int freeSubSlot(Slot slot){
+   if(!subSlots.remove(slot)){
+   throw new IllegalArgumentException("Wrong shared slot for sub slot.");
+   }
+
+   return subSlots.size();
+   }
+
+   @Override
+   public int getNumberLeaves() {
+   int result = 0;
+
+   for(Slot slot: subSlots){
+   result += slot.getNumberLeaves();
+   }
+
+   return result;
+   }
+
+   @Override
+   public void cancel() {
+   // Guarantee that the operation is only executed once
+   if (markCancelled()) {
+   assignmentGroup.releaseSharedSlot(this);
+   }
+   }
+
+   /**
+* Release this shared slot. In order to do this:
+*
+* 1. Cancel and release all sub slots atomically with respect to the assigned assignment group.
+* 2. Set the state of the shared slot to be cancelled.
+* 3. Dispose the shared slot (returning the slot to the instance).
+*
+* After cancelAndReleaseSubSlots, the shared slot is marked to be dead. This prevents further
+* sub slot creation by the scheduler.
+*/
+   @Override
+   public void releaseSlot() {
+   assignmentGroup.releaseSharedSlot(this);
+   }
+
+   /**
+* Creates a new sub slot if the slot is not dead, yet. This method should only be called from
+* the assignment group instance to guarantee synchronization.
+*
+* @param jID id to identify tasks which can be deployed in this sub slot
+* @return new sub slot if the shared slot is still alive, otherwise null
+*/
+   public Simpl

[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283640#comment-14283640
 ] 

Robert Metzger commented on FLINK-629:
--

This screenshot from GitHub is related, I guess
(https://cloud.githubusercontent.com/assets/2375289/5814551/13a7bc88-a08d-11e4-9133-a0b0d0239533.png)?
So your problem is that your input format is getting a "null" instead of a
reference to a "Tweet" object?
Is Flink treating your POJO (Tweet) as a POJO or as a generic type?
What is {{TypeExtractor.createTypeInfo(Tweet.class)}} returning?
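
One quick way to check (a small sketch; Tweet stands for the attached POJO class):

{code}
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.TypeExtractor;

public final class TweetTypeCheck {

    public static void main(String[] args) {
        // Prints something like "PojoType<...>" if Tweet fulfils the POJO rules,
        // or a GenericTypeInfo if Flink falls back to Kryo.
        TypeInformation<?> info = TypeExtractor.createTypeInfo(Tweet.class);
        System.out.println(info);
    }
}
{code}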

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open





[jira] [Commented] (FLINK-629) Add support for null values to the java api

2015-01-20 Thread mustafa elbehery (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283628#comment-14283628
 ] 

mustafa elbehery commented on FLINK-629:


@rmetzger I have a problem with the Kryo serializer: it creates a reference to
my POJO and not an object. I am using an event-driven JSON parser, in which I
have to fill the object manually, and for performance I am trying to create
only one object. The problem now is that Kryo creates only a reference, not an
object.

> Add support for null values to the java api
> ---
>
> Key: FLINK-629
> URL: https://issues.apache.org/jira/browse/FLINK-629
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API
>Reporter: Stephan Ewen
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: github-import
> Fix For: pre-apache
>
> Attachments: model.tar.gz
>
>
> Currently, many runtime operations fail when encountering a null value. Tuple 
> serialization should allow null fields.
> I suggest to add a method to the tuples called `getFieldNotNull()` which 
> throws a meaningful exception when the accessed field is null. That way, we 
> simplify the logic of operators that should not deal with null fields, like 
> key grouping or aggregations.
> Even though SQL allows grouping and aggregating of null values, I suggest to 
> exclude this from the java api, because the SQL semantics of aggregating null 
> fields are messy.
>  Imported from GitHub 
> Url: https://github.com/stratosphere/stratosphere/issues/629
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels: enhancement, java api, 
> Milestone: Release 0.5.1
> Created at: Wed Mar 26 00:27:49 CET 2014
> State: open





[GitHub] flink pull request: [FLINK-610] Replace Avro with Kryo as the Gene...

2015-01-20 Thread Elbehery
Github user Elbehery commented on the pull request:

https://github.com/apache/flink/pull/271#issuecomment-70624472
  
@rmetzger I have a problem with the Kryo serializer: it creates a
reference to my POJO and not an object. I am using an event-driven JSON parser,
in which I have to fill the object manually, and for performance I am trying to
create only one object. The problem now is that Kryo creates only a reference, not
an object. Attached is a screenshot of the problem.

![selection_005](https://cloud.githubusercontent.com/assets/2375289/5814551/13a7bc88-a08d-11e4-9133-a0b0d0239533.png)
 

