[jira] [Commented] (FLINK-2457) Integrate Tuple0

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693572#comment-14693572
 ] 

ASF GitHub Bot commented on FLINK-2457:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130321100
  
I did those manually, too. I am just reworking the code to fix this up.


 Integrate Tuple0
 

 Key: FLINK-2457
 URL: https://issues.apache.org/jira/browse/FLINK-2457
 Project: Flink
  Issue Type: Improvement
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Tuple0 is not cleanly integrated:
  - missing serialization/deserialization support in the runtime
  - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., cannot create an instance of Tuple0
 Tuple0 is currently only used in the Python API, but will be integrated into the Storm compatibility layer, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2512]Add client.close() before throw Ru...

2015-08-12 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130323420
  
If you move the try up, you can certainly remove the manual close. 
Regarding the check in 103, it really depends on whether the Storm compat layer 
depends on having only a single job per client. Therefore I would keep it in 
and let it throw the RuntimeException as before. The finally block will then 
make sure that the client is closed.
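The restructuring suggested here (try moved up, manual close removed, the finally block guaranteeing the close on both the success and the error path) can be sketched as follows; `Client` and `runJob` are minimal hypothetical stand-ins, not Flink's actual client API:

```java
// Sketch of the pattern discussed above: a single finally clause closes the
// client whether runJob succeeds or throws. All names are hypothetical.
public class ClientCloseSketch {

    static class Client implements AutoCloseable {
        boolean closed = false;

        int runJob(int jobCount) {
            if (jobCount != 1) {
                // keep the sanity check and let it throw as before;
                // the finally block still guarantees the close
                throw new RuntimeException("expected exactly one job, got " + jobCount);
            }
            return 42;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // success path: no manual close needed before returning
    public static Client submit(int jobCount) {
        Client client = new Client();
        try {
            client.runJob(jobCount);
        } finally {
            client.close();
        }
        return client;
    }

    // error path: the RuntimeException propagates out of try,
    // but finally has already closed the client
    public static boolean closedEvenOnFailure() {
        Client client = new Client();
        try {
            client.runJob(2);
        } catch (RuntimeException expected) {
            // swallowed only for this demonstration
        } finally {
            client.close();
        }
        return client.closed;
    }

    public static void main(String[] args) {
        System.out.println(submit(1).closed); // prints "true"
    }
}
```

The point of the design choice: with try/finally there is exactly one close site, so no early `throw` can leak the client.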


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693593#comment-14693593
 ] 

ASF GitHub Bot commented on FLINK-2512:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130323420
  
If you move the try up, you can certainly remove the manual close. 
Regarding the check in 103, it really depends on whether the Storm compat layer 
depends on having only a single job per client. Therefore I would keep it in 
and let it throw the RuntimeException as before. The finally block will then 
make sure that the client is closed.


 Add client.close() before throw RuntimeException
 

 Key: FLINK-2512
 URL: https://issues.apache.org/jira/browse/FLINK-2512
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor







[jira] [Commented] (FLINK-2457) Integrate Tuple0

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693595#comment-14693595
 ] 

ASF GitHub Bot commented on FLINK-2457:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130323723
  
I did the following changes:
 - added Tuple0 to `TupleGenerator.modifyTupleType()`
 - changed `LinkedList` to `ArrayList` in TupleGenerator to create code for `TupleX.java`
 - added methods `toString()`, `equals()`, and `hashCode()` to `Tuple0`
(Of course, I ran TupleGenerator to test it.)

This PR should be ready for merging now.
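For illustration, the value semantics that a zero-arity tuple needs can be sketched as follows; `Tuple0Sketch` is a hypothetical standalone stand-in, not Flink's actual `Tuple0` class:

```java
// Hypothetical stand-in for a zero-arity tuple: it carries no fields, so all
// instances are equal, share one hash code, and print the same representation.
public class Tuple0Sketch {

    @Override
    public String toString() {
        // no fields to render
        return "()";
    }

    @Override
    public boolean equals(Object o) {
        // any other instance is equal, since there is nothing to compare
        return this == o || o instanceof Tuple0Sketch;
    }

    @Override
    public int hashCode() {
        // constant hash: all instances are indistinguishable
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(new Tuple0Sketch().equals(new Tuple0Sketch())); // prints "true"
    }
}
```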




[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693603#comment-14693603
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130324909
  
Functions also need to extend RichFunction to have access to `open()` and 
`close()`.
I think the two things are different enough that any striving for consistency 
is actually pretty random.
If your thoughts currently revolve around the RuntimeContext, it appears 
more consistent. If your thoughts are on the life-cycle methods, it seems 
inconsistent. Random.

I think you should go ahead and just call them Rich. It is just a name, 
and what matters is that the JavaDocs describe what it actually means...


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User functions that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.
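The distinction discussed in this thread (a plain format versus a "rich" variant that gets a {{RuntimeContext}} injected by the runtime) can be sketched with minimal hypothetical interfaces; none of these names are Flink's actual classes:

```java
// Minimal hypothetical sketch of the "rich" pattern: the plain interface only
// reads records, while the rich variant additionally receives runtime state.
import java.util.HashMap;
import java.util.Map;

public class RichFormatSketch {

    // stripped-down stand-in for a RuntimeContext
    interface RuntimeContext {
        int getIndexOfThisSubtask();
        Map<String, Long> getAccumulators();
    }

    interface InputFormat {
        String nextRecord();
    }

    // "rich" variant: same contract plus runtime-context injection
    static abstract class RichInputFormat implements InputFormat {
        private RuntimeContext context;

        public void setRuntimeContext(RuntimeContext context) {
            this.context = context;
        }

        public RuntimeContext getRuntimeContext() {
            return context;
        }
    }

    static class CountingFormat extends RichInputFormat {
        @Override
        public String nextRecord() {
            // accumulator access is only possible because this format is rich
            getRuntimeContext().getAccumulators().merge("records", 1L, Long::sum);
            return "record-from-subtask-" + getRuntimeContext().getIndexOfThisSubtask();
        }
    }

    // simulates what the runtime would do: inject the context, then read
    public static long runOnce(CountingFormat format) {
        Map<String, Long> accumulators = new HashMap<>();
        format.setRuntimeContext(new RuntimeContext() {
            public int getIndexOfThisSubtask() { return 0; }
            public Map<String, Long> getAccumulators() { return accumulators; }
        });
        format.nextRecord();
        return accumulators.get("records");
    }
}
```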





[jira] [Commented] (FLINK-2457) Integrate Tuple0

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693570#comment-14693570
 ] 

ASF GitHub Bot commented on FLINK-2457:
---

Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130320695
  
You are right, Tuple0 support for the TupleGenerator is not important. 
Actually, I meant the change of `import java.util.LinkedList;` to `import 
java.util.ArrayList;`; otherwise these changes get lost if the TupleGenerator is 
executed.




[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130318970
  
For transformation functions, there is a clear case for thin versus 
rich, because of Java 8 lambdas.
Input formats are a different game. They are super rich by default anyway.




[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0

2015-08-12 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130320695
  
You are right, Tuple0 support for the TupleGenerator is not important. 
Actually, I meant the change of `import java.util.LinkedList;` to `import 
java.util.ArrayList;`; otherwise these changes get lost if the TupleGenerator is 
executed.




[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130324909
  
Functions also need to extend RichFunction to have access to `open()` and 
`close()`.
I think the two things are different enough that any striving for consistency 
is actually pretty random.
If your thoughts currently revolve around the RuntimeContext, it appears 
more consistent. If your thoughts are on the life-cycle methods, it seems 
inconsistent. Random.

I think you should go ahead and just call them Rich. It is just a name, 
and what matters is that the JavaDocs describe what it actually means...




[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130324770
  
Rich does not refer to the number of methods but to the fact that it has the 
RuntimeContext available. All non-rich variants do not get state injected. This 
follows a naming convention in Flink. `AbstractInputFormat` might be a more 
intuitive name for novices, but I'm more inclined toward naming consistency. 




[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0

2015-08-12 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130323723
  
I did the following changes:
 - added Tuple0 to `TupleGenerator.modifyTupleType()`
 - changed `LinkedList` to `ArrayList` in TupleGenerator to create code for `TupleX.java`
 - added methods `toString()`, `equals()`, and `hashCode()` to `Tuple0`
(Of course, I ran TupleGenerator to test it.)

This PR should be ready for merging now.




[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693602#comment-14693602
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130324770
  
Rich does not refer to the number of methods but to the fact that it has the 
RuntimeContext available. All non-rich variants do not get state injected. This 
follows a naming convention in Flink. `AbstractInputFormat` might be a more 
intuitive name for novices, but I'm more inclined toward naming consistency. 




[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...

2015-08-12 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1010#issuecomment-130443578
  
Thanks @StephanEwen merging ...




[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...

2015-08-12 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-130388161
  
No worries =)




[jira] [Assigned] (FLINK-2501) [py] Remove the need to specify types for transformations

2015-08-12 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-2501:
---

Assignee: Chesnay Schepler

 [py] Remove the need to specify types for transformations
 -

 Key: FLINK-2501
 URL: https://issues.apache.org/jira/browse/FLINK-2501
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler

 Currently, users of the Python API have to provide type arguments when using 
 a UDF, like so:
 {code}
 d1.map(Mapper(), (INT, STRING))
 {code}
 Instead, it would be really convenient to be able to do this:
 {code}
 d1.map(Mapper())
 {code}
 The intention behind this issue is convenience, and it's also not really 
 pythonic to specify types.
 Before I'll go into possible solutions, let me summarize the way these type 
 arguments are currently used, and in general how types are handled:
 The type argument passed is actually an object of the type it represents: 
 INT is a constant int value, whereas STRING is a constant string value. You 
 could just as well write the following and it would still work.
 {code}
 d1.map(Mapper(), (1, ImNotATypInfo))
 {code}
 This object is transmitted to the java side during the plan binding (and is 
 now an actual Tuple2<Integer, String>), then passed to the type extractor, 
 and the resulting TypeInformation is saved in the java counterpart of the udf, 
 which all implement the ResultTypeQueryable interface. 
 The TypeInformation object is only used by the Java API; python never touches 
 it. Instead, at runtime, the serializers used between python and java check 
 the classes of the values passed and are thus generated dynamically.
 This means that if a UDF does not pass the type it claims to pass, the 
 Python API won't complain, but the underlying java API will when its 
 serializers fail.
 Now let's talk solutions.
 In discussions on the mailing list, pretty much two proposals were made:
 # Add a way to disable/circumvent type checks during the plan phase in the 
 Java API and generate serializers dynamically.
 # Have objects always in serialized form on the java side, stored in a single 
 bytearray or a Tuple2 containing a key/value pair.
 These proposals vary wildly in the changes necessary to the system:
 # How can we change the Java API to support this?
 This proposal would hardly change the way the Python API works, or even touch 
 the related source code. It mostly deals with the Java API. Since I'm not too 
 familiar with the Plan processing life-cycle on the java side, I can't assess 
 which classes would have to be changed.
 # How can we make this work within the limits of the Java API?
 This is the exact opposite; it changes nothing in the Java API. Instead, the 
 following issues would have to be solved:
 * Alter the plan to extract keys before keyed operations, while hiding these 
 keys from the UDF. This is exactly how KeySelectors (will) work, and as such 
 is generally solved. In fact, this solution would make a few things easier in 
 regards to KeySelectors.
 * Rework all operations that currently rely on Java API functions that need 
 deserialized data, for example Projections or the upcoming Aggregations. 
 This generally means implementing them in python, or with special java UDFs 
 (they could de-/serialize data within the udf call, or work on serialized 
 data).
 * Change (De)Serializers accordingly
 * Implement a reliable sorting mechanism on the python side that does not 
 consume all memory
 Personally I prefer the second option, as it
 # does not modify the Java API; it works within its well-tested limits
 # involves plan changes similar to issues that are already being worked on (KeySelectors)
 # needs a sorting implementation anyway (for chained reducers)
 # keeps data in serialized form, which was already a performance-related consideration
 While the first option could work and would most likely require less work, I feel 
 that many of the things required for option 2 will be implemented eventually 
 anyway.
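The runtime type handling described above (serializers chosen from the classes of the values that actually arrive, so a type mismatch only surfaces when serialization fails) can be sketched roughly as follows; `DynamicSerializerSketch` and its encodings are hypothetical illustrations, not the Python API's actual bridge code:

```java
// Sketch of dynamic, runtime-class-based serializer dispatch: the declared
// type is never consulted, so a wrong declaration fails only at this point.
import java.nio.charset.StandardCharsets;

public class DynamicSerializerSketch {

    static byte[] serialize(Object value) {
        // dispatch on the runtime class of the value
        if (value instanceof Integer) {
            int v = (Integer) value;
            return new byte[] {
                (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v
            };
        } else if (value instanceof String) {
            return ((String) value).getBytes(StandardCharsets.UTF_8);
        }
        // a UDF that emits an undeclared type only fails here, at runtime
        throw new IllegalArgumentException("no serializer for " + value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(serialize(7).length);     // int encoded in 4 bytes
        System.out.println(serialize("abc").length); // 3 UTF-8 bytes
    }
}
```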





[jira] [Commented] (FLINK-1745) Add exact k-nearest-neighbours algorithm to machine learning library

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694338#comment-14694338
 ] 

ASF GitHub Bot commented on FLINK-1745:
---

Github user kno10 commented on the pull request:

https://github.com/apache/flink/pull/696#issuecomment-130469648
  
R-trees are hard to parallelize.
For distributed and gigabyte-size data, an approximate approach is 
preferable, like the one we discuss in this article:

E. Schubert, A. Zimek, H.-P. Kriegel
Fast and Scalable Outlier Detection with Approximate Nearest Neighbor 
Ensembles
In Proceedings of the 20th International Conference on Database Systems for 
Advanced Applications (DASFAA), Hanoi, Vietnam: 19–36, 2015. 

We discuss an approach that is easy to parallelize. It needs sorting and a 
sliding window (or blocks), so it is not strict MapReduce, but it should be a 
good match for Flink. The hardest part is to get the different space-filling 
curves right and efficient. The other components (random projections to reduce 
dimensionality, an ensemble to improve quality, and list inversions to also build 
reverse kNN that then allow accelerating methods such as LOF) are much easier.

The main drawback of most of these kNN-join approaches (including ours) is 
that they only work with Minkowski norms. There are many more interesting 
distance functions than that...

We also discuss why the space-filling curves appear to give better results 
for kNN, while LSH etc. work better for radius joins. LSH is another option, 
but it cannot guarantee to find k neighbors, and parameter tuning is tricky. So 
you may want to have a look at this recent ensemble approach instead.
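As a rough illustration of the space-filling-curve component mentioned here: a 2-D Morton (Z-order) code interleaves the bits of the two coordinates into a single sort key, under which nearby points often stay close, which is what makes sort-plus-sliding-window kNN feasible. This is a generic sketch of the idea, not the implementation from the cited paper:

```java
// Tiny 2-D Z-order (Morton) code: bit-interleave two coordinates into one
// 1-D sort key. Generic sketch of the idea, not the paper's implementation.
public class ZOrderSketch {

    // interleave the lower 16 bits of x and y into a 32-bit Morton code
    public static int interleave(int x, int y) {
        int code = 0;
        for (int i = 0; i < 16; i++) {
            code |= ((x >> i) & 1) << (2 * i);     // x bits on even positions
            code |= ((y >> i) & 1) << (2 * i + 1); // y bits on odd positions
        }
        return code;
    }

    public static void main(String[] args) {
        // points on the diagonal get monotonically increasing codes
        System.out.println(interleave(0, 0)); // prints "0"
        System.out.println(interleave(1, 1)); // prints "3"
        System.out.println(interleave(2, 2)); // prints "12"
    }
}
```

Sorting points by this key and scanning a sliding window yields approximate neighbor candidates; multiple shifted/rotated curves form the ensemble that improves recall.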


 Add exact k-nearest-neighbours algorithm to machine learning library
 

 Key: FLINK-1745
 URL: https://issues.apache.org/jira/browse/FLINK-1745
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Till Rohrmann
  Labels: ML, Starter

 Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial, 
 it is still used as a means to classify data and to do regression. This issue 
 focuses on the implementation of an exact kNN (H-BNLJ, H-BRJ) algorithm as 
 proposed in [2].
 Could be a starter task.
 Resources:
 [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm]
 [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]





[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...

2015-08-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1010




[jira] [Updated] (FLINK-2493) Simplify names of example program JARs

2015-08-12 Thread chenliang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chenliang updated FLINK-2493:
-
Description: 
I find the names of the example JARs a bit annoying.

Why not name the file {{examples/ConnectedComponents.jar}} rather than 
{{examples/flink-java-examples-0.10-SNAPSHOT-ConnectedComponents.jar}}

And combine the flink-java-examples and flink-scala-examples projects into one 
examples project.

  was:
I find the names of the example JARs a bit annoying.

Why not name the file {{examples/ConnectedComponents.jar}} rather than 
{{examples/flink-java-examples-0.10-SNAPSHOT-ConnectedComponents.jar}}


 Simplify names of example program JARs
 --

 Key: FLINK-2493
 URL: https://issues.apache.org/jira/browse/FLINK-2493
 Project: Flink
  Issue Type: Improvement
  Components: Examples
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: chenliang
Priority: Minor
  Labels: easyfix, starter

 I find the names of the example JARs a bit annoying.
 Why not name the file {{examples/ConnectedComponents.jar}} rather than 
 {{examples/flink-java-examples-0.10-SNAPSHOT-ConnectedComponents.jar}}
 And combine the flink-java-examples and flink-scala-examples projects into one 
 examples project.





[jira] [Commented] (FLINK-1901) Create sample operator for Dataset

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694548#comment-14694548
 ] 

ASF GitHub Bot commented on FLINK-1901:
---

Github user ChengXiangLi commented on a diff in the pull request:

https://github.com/apache/flink/pull/949#discussion_r36934499
  
--- Diff: flink-core/src/test/java/org/apache/flink/api/common/operators/util/RandomSamplerTest.java ---
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.common.operators.util;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.commons.math3.stat.inference.KolmogorovSmirnovTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Set;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * This test suite tries to verify that all the random samplers work as expected, focusing mainly on:
+ * <ul>
+ * <li>Does the sampled result fit the input parameters? We check parameters like sample fraction,
+ * sample size, with/without replacement, and so on.</li>
+ * <li>Is the sampled result selected randomly? We verify this by measuring how it is distributed
+ * over the source data. We run a Kolmogorov-Smirnov (KS) test between the random samplers and
+ * default reference samplers which are distributed well-proportioned over the source data. If a
+ * random sampler selects elements randomly from the source, it will be distributed
+ * well-proportioned over the source data as well, and the KS test will fail to strongly reject
+ * the null hypothesis that the distributions of the sampling gaps are the same.</li>
+ * </ul>
+ *
+ * @see <a href="https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test">Kolmogorov-Smirnov test</a>
+ */
+public class RandomSamplerTest {
+	private static final int SOURCE_SIZE = 1;
+	private static KolmogorovSmirnovTest ksTest;
+	private static List<Double> source;
+	private static final int DEFFAULT_PARTITION_NUMBER = 10;
+	private List<Double>[] sourcePartitions = new List[DEFFAULT_PARTITION_NUMBER];
+
+	@BeforeClass
+	public static void init() {
+		// initiate source data set.
+		source = new ArrayList<Double>(SOURCE_SIZE);
+		for (int i = 0; i < SOURCE_SIZE; i++) {
+			source.add((double) i);
+		}
+
+		ksTest = new KolmogorovSmirnovTest();
+	}
+
+	private void initSourcePartition() {
+		for (int i = 0; i < DEFFAULT_PARTITION_NUMBER; i++) {
+			sourcePartitions[i] = new LinkedList<Double>();
+		}
+		for (int i = 0; i < SOURCE_SIZE; i++) {
+			int index = i % DEFFAULT_PARTITION_NUMBER;
+			sourcePartitions[index].add((double) i);
+		}
+	}
+
+	@Test(expected = java.lang.IllegalArgumentException.class)
+	public void testBernoulliSamplerWithUnexpectedFraction1() {
+		verifySamplerFraction(-1, false);
+	}
+
+	@Test(expected = java.lang.IllegalArgumentException.class)
+	public void testBernoulliSamplerWithUnexpectedFraction2() {
+		verifySamplerFraction(2, false);
+	}
+
+	@Test
+	public void testBernoulliSamplerFraction() {
+		verifySamplerFraction(0.01, false);
+		verifySamplerFraction(0.05, false);
+		verifySamplerFraction(0.1, false);
+		verifySamplerFraction(0.3, false);
+		verifySamplerFraction(0.5, false);
+		verifySamplerFraction(0.854, false);

[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694566#comment-14694566
 ] 

ASF GitHub Bot commented on FLINK-2512:
---

Github user ffbin commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130509029
  
@uce @hsaputra  Thanks. I have moved the try up and rely on the finally block to 
close the client.




[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14694658#comment-14694658
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130524804
  
@mxm 
Hi, I fixed the StringBuffer and added the test.
Please take a look and check whether it's correct.
Thank you!


 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h







[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693074#comment-14693074
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/1008#discussion_r36834144
  
--- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/util/ClassLoaderUtilsTest.java ---
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.util;
+
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import java.io.File;
+import java.io.FileOutputStream;
+import java.net.URL;
+import java.net.URLClassLoader;
+import java.util.jar.JarFile;
+
+/**
+ * Tests that validate the {@link ClassLoaderUtil}.
+ */
+public class ClassLoaderUtilsTest {
+
+	@Test
+	public void testWithURLClassLoader() {
+		File validJar = null;
+		File invalidJar = null;
+
+		try {
+			// file with jar contents
+			validJar = File.createTempFile("flink-url-test", ".tmp");
+			JarFileCreator jarFileCreator = new JarFileCreator(validJar);
+			jarFileCreator.addClass(ClassLoaderUtilsTest.class);
+			jarFileCreator.createJarFile();
+
+			// validate that the JAR is correct and the test setup is not broken
+			try {
+				new JarFile(validJar.getAbsolutePath());
+			}
+			catch (Exception e) {
+				e.printStackTrace();
+				fail("test setup broken: cannot create a valid jar file");
+			}
+
+			// file with some random contents
+			invalidJar = File.createTempFile("flink-url-test", ".tmp");
+			try (FileOutputStream invalidout = new FileOutputStream(invalidJar)) {
+				invalidout.write(new byte[] { -1, 1, -2, 3, -3, 4, });
+			}
+
+			// non existing file
+			File nonExisting = File.createTempFile("flink-url-test", ".tmp");
+			assertTrue("Cannot create and delete temp file", nonExisting.delete());
+
+			// create a URL classloader with
+			// - a HTTP URL
+			// - a file URL for an existing jar file
+			// - a file URL for an existing file that is not a jar file
+			// - a file URL for a non-existing file
+			URL[] urls = {
+				new URL("http", "localhost", 26712, "/some/file/path"),
+				new URL("file", null, validJar.getAbsolutePath()),
+				new URL("file", null, invalidJar.getAbsolutePath()),
+				new URL("file", null, nonExisting.getAbsolutePath()),
+			};
+
+			URLClassLoader loader = new URLClassLoader(urls, getClass().getClassLoader());
+			String info = ClassLoaderUtil.getUserCodeClassLoaderInfo(loader);
+
+			assertTrue(info.indexOf("/some/file/path") > 0);
+			assertTrue(info.indexOf(validJar.getAbsolutePath() + "' (valid") > 0);
+			assertTrue(info.indexOf(invalidJar.getAbsolutePath() + "' (invalid JAR") > 0);
+			assertTrue(info.indexOf(nonExisting.getAbsolutePath() + "' (missing") > 0);
+
+			System.out.println(info);
+		}
+		catch (Exception e) {
+			e.printStackTrace();

[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0

2015-08-12 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130211866
  
Any news about this PR?  @twalthr : Are you going to review it again?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2457) Integrate Tuple0

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693091#comment-14693091
 ] 

ASF GitHub Bot commented on FLINK-2457:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130211866
  
Any news about this PR?  @twalthr : Are you going to review it again?


 Integrate Tuple0
 

 Key: FLINK-2457
 URL: https://issues.apache.org/jira/browse/FLINK-2457
 Project: Flink
  Issue Type: Improvement
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Tuple0 is not cleanly integrated:
   - missing serialization/deserialization support in the runtime
   - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., it cannot 
 create an instance of Tuple0
 Tuple0 is currently only used in the Python API, but will be integrated into 
 the Storm compatibility layer, too.
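To make the gap concrete, here is a minimal, self-contained sketch of the arity-zero special case. `Tuple0Sketch`, its nested `Tuple0`, and this `getTupleClass` are illustrative stand-ins that only borrow Flink's names; they are not the actual Flink implementation:

```java
// Sketch: why arity zero is special. Tuple0 carries no fields, so it can be
// a singleton, and an arity-indexed lookup must handle 0 explicitly.
public class Tuple0Sketch {

    static final class Tuple0 {
        static final Tuple0 INSTANCE = new Tuple0();   // stateless singleton
        private Tuple0() {}
        public int getArity() { return 0; }
    }

    // In the spirit of Tuple.getTupleClass(int): the arity-0 case is the one
    // FLINK-2457 adds; other arities are omitted from this sketch.
    static Class<?> getTupleClass(int arity) {
        if (arity == 0) {
            return Tuple0.class;
        }
        throw new IllegalArgumentException("only arity 0 is sketched here");
    }

    public static void main(String[] args) {
        System.out.println(getTupleClass(0).getSimpleName());  // prints Tuple0
    }
}
```

Because Tuple0 has no fields, a runtime serializer can likewise treat it as a stateless singleton and write zero bytes.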



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693104#comment-14693104
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130213092
  
@HuangWHWHW The `read()` method of the `BufferedReader` object returns `-1` 
once the end of the stream has been reached.

A couple of things I noticed apart from the `retryForever` issue. I wonder 
if we can fix these with this pull request as well:

1. The control flow of the `streamFromSocket` function is hard to predict 
because there are many `while` loops with `break`, `continue`, or `throw` 
statements.
2. We could use `StringBuilder` instead of `StringBuffer` in this class. 
`StringBuilder` is faster for single-threaded access.
3. The function reads a single character at a time from the socket. It is 
more efficient to use a buffer and read several characters at once.

@HuangWHWHW You asked how you could count the number of retries in a unit 
test. Typically, you would insert a `Mock` or a `Spy` into your test method. 
Unfortunately, that does not work here because the socket variable is 
overwritten on each retry. So for this test, I would recommend creating a 
local `ServerSocket` and letting the function connect to it. You can then 
control the failures from your test socket.
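Points 2 and 3 above can be sketched with plain JDK classes. `readAll` below is a hypothetical helper, with a `StringReader` standing in for the socket's `InputStreamReader`; it is an illustration of the suggested pattern, not the actual `SocketTextStreamFunction` code:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Read in chunks into a StringBuilder instead of one character at a time
// into a StringBuffer.
public class BufferedSocketRead {

    // Reads everything from 'in' until EOF (read() returns -1), buffered.
    static String readAll(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder out = new StringBuilder();  // single-threaded: StringBuilder suffices
        char[] buf = new char[8192];              // read several characters at once
        int n;
        while ((n = reader.read(buf)) != -1) {    // -1 signals end of stream
            out.append(buf, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the socket's InputStreamReader here.
        String result = readAll(new StringReader("hello\nworld\n"));
        System.out.println(result.equals("hello\nworld\n") ? "OK" : "FAIL");  // prints OK
    }
}
```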


 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h







[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-12 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130217420
  
@mxm 
Hi, thank you for the suggestions.
I will try to follow them and improve the test.
I understood most of your points, and I also read the class documentation of 
BufferedReader.read().
While doing the test I found that BufferedReader.read() never stops until it 
reads the next char from the socket server, or throws an Exception when the 
socket is closed.
Returning -1 from BufferedReader.read() only seems to work for text files, 
not for socket messages.
I also looked for help on the net; some people suggested calling 
Socket.setSoTimeout() so that BufferedReader.read() stops.
But that approach is not satisfactory either, since it throws an exception.






[GitHub] flink pull request: [FLINK-2437] Fix default constructor detection...

2015-08-12 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/960#issuecomment-130219402
  
Your changes look good to merge.




[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693141#comment-14693141
 ] 

ASF GitHub Bot commented on FLINK-2437:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/960#issuecomment-130219402
  
Your changes look good to merge.


 TypeExtractor.analyzePojo has some problems around the default constructor 
 detection
 

 Key: FLINK-2437
 URL: https://issues.apache.org/jira/browse/FLINK-2437
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor

 If a class does have a default constructor, but the user forgot to make it 
 public, then TypeExtractor.analyzePojo still thinks everything is OK, so it 
 creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up.
 Furthermore, a return null seems to be missing from the then case of the if 
 after catching the NoSuchMethodException which would also cause a headache 
 for PojoSerializer.
 An additional minor issue is that the word class is printed twice in 
 several places, because class.toString also prepends it to the class name.





[jira] [Commented] (FLINK-2077) Rework Path class and add extend support for Windows paths

2015-08-12 Thread GaoLun (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693036#comment-14693036
 ] 

GaoLun commented on FLINK-2077:
---

Hi, Fabian.
What do you mean by a path like '//host/dir1/dir2'? 
Between host, dir1, and dir2 there are several '/' separators. How should we 
pick out dir1 and dir2 using the slash '/'?

 Rework Path class and add extend support for Windows paths
 --

 Key: FLINK-2077
 URL: https://issues.apache.org/jira/browse/FLINK-2077
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: GaoLun
Priority: Minor
  Labels: starter

 The class {{org.apache.flink.core.fs.Path}} handles paths for Flink's 
 {{FileInputFormat}} and {{FileOutputFormat}}. Over time, this class has 
 become quite hard to read and modify. 
 It would benefit from some cleaning and refactoring. Along with the 
 refactoring, support for Windows paths like {{//host/dir1/dir2}} could be 
 added.
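As a rough illustration of the `//host/dir1/dir2` format mentioned above, the following self-contained sketch (not Flink's `Path` class; `parse` is a hypothetical helper) treats the leading double slash as introducing a host component, after which ordinary '/' splitting applies:

```java
import java.util.Arrays;

// Sketch of splitting a UNC-style path //host/dir1/dir2 into its components.
public class UncPathSketch {

    static String[] parse(String path) {
        if (!path.startsWith("//")) {
            return path.split("/");        // plain relative/absolute path
        }
        // strip the leading "//"; the first component is then the host
        return path.substring(2).split("/");
    }

    public static void main(String[] args) {
        // prints [host, dir1, dir2]
        System.out.println(Arrays.toString(parse("//host/dir1/dir2")));
    }
}
```

The design choice here is that only the leading `//` is special; every later `/` is an ordinary separator, so `dir1` and `dir2` fall out of a normal split.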





[GitHub] flink pull request: [FLINK-2512]Add client.close() before throw Ru...

2015-08-12 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130205016
  
I think the best thing would be to just move the `try` up a little.




[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...

2015-08-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1008




[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693086#comment-14693086
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1008


 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.





[jira] [Resolved] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-12 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2509.
-
Resolution: Implemented

Implemented in eeec1912b478ed43a045449d82e0a2fd3700d720

 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.





[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693070#comment-14693070
 ] 

ASF GitHub Bot commented on FLINK-2512:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130205016
  
I think the best thing would be to just move the `try` up a little.
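The suggested shape — move the `try` up so a `finally` block always closes the client, while the sanity check still throws its `RuntimeException` — can be sketched as follows. `Client` and `submit` are stand-ins for illustration, not Flink's actual client API:

```java
// Sketch: the finally block closes the client on both the success and the
// exception path, so no manual close() before the throw is needed.
public class CloseInFinally {

    static class Client implements AutoCloseable {
        boolean closed = false;
        public void close() { closed = true; }
    }

    static void submit(Client client, int runningJobs) {
        try {
            if (runningJobs != 1) {
                // same RuntimeException as before; finally still closes the client
                throw new RuntimeException("Expected exactly one job, found " + runningJobs);
            }
            // ... submit the topology ...
        } finally {
            client.close();
        }
    }

    public static void main(String[] args) {
        Client c = new Client();
        try {
            submit(c, 2);
        } catch (RuntimeException e) {
            System.out.println("closed=" + c.closed);  // prints closed=true
        }
    }
}
```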


 Add client.close() before throw RuntimeException
 

 Key: FLINK-2512
 URL: https://issues.apache.org/jira/browse/FLINK-2512
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor







[jira] [Closed] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-12 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2509.
---

 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.





[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-12 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130213092
  
@HuangWHWHW The `read()` method of the `BufferedReader` object returns `-1` 
once the end of the stream has been reached.

A couple of things I noticed apart from the `retryForever` issue. I wonder 
if we can fix these with this pull request as well:

1. The control flow of the `streamFromSocket` function is hard to predict 
because there are many `while` loops with `break`, `continue`, or `throw` 
statements.
2. We could use `StringBuilder` instead of `StringBuffer` in this class. 
`StringBuilder` is faster for single-threaded access.
3. The function reads a single character at a time from the socket. It is 
more efficient to use a buffer and read several characters at once.

@HuangWHWHW You asked how you could count the number of retries in a unit 
test. Typically, you would insert a `Mock` or a `Spy` into your test method. 
Unfortunately, that does not work here because the socket variable is 
overwritten on each retry. So for this test, I would recommend creating a 
local `ServerSocket` and letting the function connect to it. You can then 
control the failures from your test socket.
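The test setup suggested above can be sketched with JDK sockets only. `connectWithRetry` is a hypothetical stand-in for the retry loop under test, not the real `SocketTextStreamFunction`; the point is that a local `ServerSocket` lets the test observe how many attempts were needed:

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Instead of mocking, connect a retrying client to a local ServerSocket
// that the test controls.
public class RetryCountTest {

    // Tries to connect up to maxRetry times; returns the attempt number
    // that succeeded, or -1 if all attempts failed.
    static int connectWithRetry(int port, int maxRetry) throws IOException {
        for (int attempt = 1; attempt <= maxRetry; attempt++) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress("localhost", port), 1000);
                return attempt;                // connected on this attempt
            } catch (ConnectException ce) {
                // server not reachable yet; retry
            }
        }
        return -1;                             // gave up
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            int port = server.getLocalPort();
            // server is listening: the first attempt succeeds
            System.out.println(connectWithRetry(port, 3));  // prints 1
        }
    }
}
```

To test the failure path, the test can simply not start (or close) the server and assert that `-1` comes back after `maxRetry` attempts.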




[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-12 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130222078
  
Actually point 3 is not so bad because we're using a buffered reader that 
fills the buffer and does not read a character from the socket on every call to 
`read()`.

The `read()` method may throw an Exception or return -1. So we need to 
handle both of these cases. If closed properly, the socket should send the EOF 
event and the `read()` method returns -1.
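Handling both signals — a clean `-1` EOF and an `IOException` for a broken connection — can be sketched like this; `drain` and `StreamEnd` are illustrative names, with a `StringReader` standing in for the socket stream:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Both end-of-stream cases mentioned above must be handled:
// read() returning -1 (clean EOF) and an IOException (connection broke).
public class EofHandling {

    enum StreamEnd { EOF, ERROR }

    static StreamEnd drain(Reader in) {
        try {
            int c;
            while ((c = in.read()) != -1) {
                // process character c ...
            }
            return StreamEnd.EOF;      // peer closed the stream cleanly
        } catch (IOException e) {
            return StreamEnd.ERROR;    // connection broke mid-stream
        }
    }

    public static void main(String[] args) {
        System.out.println(drain(new StringReader("data")));  // prints EOF
    }
}
```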




[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693076#comment-14693076
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130206658
  
@mxm 
@StephanEwen 
Hi, I ran a test for this today and hit another problem.
SocketTextStreamFunction uses BufferedReader.read() to get the data that is 
sent by the socket server.
Will this BufferedReader.read() ever return -1 at the end of the sent 
message?
If it never does, there is another bug: the following code would never be 
reachable:

if (data == -1) {
	socket.close();
	long retry = 0;
	boolean success = false;
	while ((retry < maxRetry || retryForever) && !success) {
		if (!retryForever) {
			retry++;
		}
		LOG.warn("Lost connection to server socket. Retrying in "
				+ (CONNECTION_RETRY_SLEEP / 1000) + " seconds...");
		try {
			socket = new Socket();
			socket.connect(new InetSocketAddress(hostname, port),
					CONNECTION_TIMEOUT_TIME);
			success = true;
		} catch (ConnectException ce) {
			Thread.sleep(CONNECTION_RETRY_SLEEP);
			socket.close();
		}
	}

	if (success) {
		LOG.info("Server socket is reconnected.");
	} else {
		LOG.error("Could not reconnect to server socket.");
		break;
	}
	reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
	continue;
}


 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h







[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693127#comment-14693127
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130217420
  
@mxm 
Hi, thank you for the suggestions.
I will try to follow them and improve the test.
I understood most of your points, and I also read the class documentation of 
BufferedReader.read().
While doing the test I found that BufferedReader.read() never stops until it 
reads the next char from the socket server, or throws an Exception when the 
socket is closed.
Returning -1 from BufferedReader.read() only seems to work for text files, 
not for socket messages.
I also looked for help on the net; some people suggested calling 
Socket.setSoTimeout() so that BufferedReader.read() stops.
But that approach is not satisfactory either, since it throws an exception.




 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h







[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693147#comment-14693147
 ] 

ASF GitHub Bot commented on FLINK-2507:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/1007#issuecomment-130219835
  
Will merge this...


 Rename the function tansformAndEmit in 
 org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
 --

 Key: FLINK-2507
 URL: https://issues.apache.org/jira/browse/FLINK-2507
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor







[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...

2015-08-12 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/1007#issuecomment-130219835
  
Will merge this...




[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-12 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130206658
  
@mxm 
@StephanEwen 
Hi, I ran a test for this today and hit another problem.
SocketTextStreamFunction uses BufferedReader.read() to get the data that is 
sent by the socket server.
Will this BufferedReader.read() ever return -1 at the end of the sent 
message?
If it never does, there is another bug: the following code would never be 
reachable:

if (data == -1) {
	socket.close();
	long retry = 0;
	boolean success = false;
	while ((retry < maxRetry || retryForever) && !success) {
		if (!retryForever) {
			retry++;
		}
		LOG.warn("Lost connection to server socket. Retrying in "
				+ (CONNECTION_RETRY_SLEEP / 1000) + " seconds...");
		try {
			socket = new Socket();
			socket.connect(new InetSocketAddress(hostname, port),
					CONNECTION_TIMEOUT_TIME);
			success = true;
		} catch (ConnectException ce) {
			Thread.sleep(CONNECTION_RETRY_SLEEP);
			socket.close();
		}
	}

	if (success) {
		LOG.info("Server socket is reconnected.");
	} else {
		LOG.error("Could not reconnect to server socket.");
		break;
	}
	reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
	continue;
}




[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...

2015-08-12 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-130206616
  
Very nice addition! :) 




[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693075#comment-14693075
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-130206616
  
Very nice addition! :) 


 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.





[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...

2015-08-12 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/1008#discussion_r36834144
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/util/ClassLoaderUtilsTest.java
 ---
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.util;
+
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import java.io.File;
+import java.io.FileOutputStream;
+import java.net.URL;
+import java.net.URLClassLoader;
+import java.util.jar.JarFile;
+
+/**
+ * Tests that validate the {@link ClassLoaderUtil}.
+ */
+public class ClassLoaderUtilsTest {
+
+   @Test
+   public void testWithURLClassLoader() {
+   File validJar = null;
+   File invalidJar = null;
+   
+   try {
+   // file with jar contents
+			validJar = File.createTempFile("flink-url-test", ".tmp");
+			JarFileCreator jarFileCreator = new JarFileCreator(validJar);
+			jarFileCreator.addClass(ClassLoaderUtilsTest.class);
+			jarFileCreator.createJarFile();
+
+			// validate that the JAR is correct and the test setup is not broken
+			try {
+				new JarFile(validJar.getAbsolutePath());
+			}
+			catch (Exception e) {
+				e.printStackTrace();
+				fail("test setup broken: cannot create a valid jar file");
+			}
+
+			// file with some random contents
+			invalidJar = File.createTempFile("flink-url-test", ".tmp");
+			try (FileOutputStream invalidout = new FileOutputStream(invalidJar)) {
+				invalidout.write(new byte[] { -1, 1, -2, 3, -3, 4, });
+			}
+
+			// non existing file
+			File nonExisting = File.createTempFile("flink-url-test", ".tmp");
+			assertTrue("Cannot create and delete temp file", nonExisting.delete());
+
+			// create a URL classloader with
+			// - a HTTP URL
+			// - a file URL for an existing jar file
+			// - a file URL for an existing file that is not a jar file
+			// - a file URL for a non-existing file
+
+			URL[] urls = {
+				new URL("http", "localhost", 26712, "/some/file/path"),
+				new URL("file", null, validJar.getAbsolutePath()),
+				new URL("file", null, invalidJar.getAbsolutePath()),
+				new URL("file", null, nonExisting.getAbsolutePath()),
+			};
+
+			URLClassLoader loader = new URLClassLoader(urls, getClass().getClassLoader());
+			String info = ClassLoaderUtil.getUserCodeClassLoaderInfo(loader);
+
+			assertTrue(info.indexOf("/some/file/path") > 0);
+			assertTrue(info.indexOf(validJar.getAbsolutePath() + "' (valid") > 0);
+			assertTrue(info.indexOf(invalidJar.getAbsolutePath() + "' (invalid JAR") > 0);
+			assertTrue(info.indexOf(nonExisting.getAbsolutePath() + "' (missing") > 0);
+
+			System.out.println(info);
+   }
+   catch (Exception e) {
+   e.printStackTrace();
+   fail(e.getMessage());
+   }
+   finally {
+   if (validJar != null) {
+   //noinspection ResultOfMethodCallIgnored
+   validJar.delete();
+   

[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693133#comment-14693133
 ] 

ASF GitHub Bot commented on FLINK-2437:
---

Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/960#discussion_r36837638
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java 
---
@@ -1328,24 +1329,29 @@ else if(typeHierarchy.size() <= 1) {
		List<Method> methods = getAllDeclaredMethods(clazz);
		for (Method method : methods) {
			if (method.getName().equals("readObject") || method.getName().equals("writeObject")) {
-				LOG.info("Class "+clazz+" contains custom serialization methods we do not call.");
+				LOG.info(clazz+" contains custom serialization methods we do not call.");
				return null;
			}
		}

		// Try retrieving the default constructor, if it does not have one
		// we cannot use this because the serializer uses it.
+		Constructor defaultConstructor = null;
		try {
-			clazz.getDeclaredConstructor();
+			defaultConstructor = clazz.getDeclaredConstructor();
		} catch (NoSuchMethodException e) {
			if (clazz.isInterface() || Modifier.isAbstract(clazz.getModifiers())) {
-				LOG.info("Class " + clazz + " is abstract or an interface, having a concrete " +
+				LOG.info(clazz + " is abstract or an interface, having a concrete " +
						"type can increase performance.");
			} else {
-				LOG.info("Class " + clazz + " must have a default constructor to be used as a POJO.");
+				LOG.info(clazz + " must have a default constructor to be used as a POJO.");
				return null;
			}
		}
+		if(defaultConstructor != null && (defaultConstructor.getModifiers() & Modifier.PUBLIC) == 0) {
--- End diff --

`if(defaultConstructor != null && Modifier.isPublic(defaultConstructor.getModifiers()))` 
seems more readable to me, but your approach is fine too.
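The check under discussion can be demonstrated standalone; `hasPublicDefaultConstructor` and `PrivateCtor` below are hypothetical helpers for illustration, not the `TypeExtractor` code itself:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

// Detect the FLINK-2437 case: a default constructor that exists but is not
// public, which must be rejected for POJO serialization.
public class CtorCheck {

    static class PrivateCtor {
        private PrivateCtor() {}   // exists, but not public
    }

    static boolean hasPublicDefaultConstructor(Class<?> clazz) {
        try {
            Constructor<?> ctor = clazz.getDeclaredConstructor();
            // Modifier.isPublic reads more clearly than masking with Modifier.PUBLIC
            return Modifier.isPublic(ctor.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;          // no default constructor at all
        }
    }

    public static void main(String[] args) {
        System.out.println(hasPublicDefaultConstructor(PrivateCtor.class));  // prints false
        System.out.println(hasPublicDefaultConstructor(String.class));       // prints true
    }
}
```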


 TypeExtractor.analyzePojo has some problems around the default constructor 
 detection
 

 Key: FLINK-2437
 URL: https://issues.apache.org/jira/browse/FLINK-2437
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor

 If a class does have a default constructor, but the user forgot to make it 
 public, then TypeExtractor.analyzePojo still thinks everything is OK, so it 
 creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up.
 Furthermore, a return null seems to be missing from the then case of the if 
 after catching the NoSuchMethodException which would also cause a headache 
 for PojoSerializer.
 An additional minor issue is that the word class is printed twice in 
 several places, because class.toString also prepends it to the class name.





[GitHub] flink pull request: [FLINK-2437] Fix default constructor detection...

2015-08-12 Thread mxm
Github user mxm commented on a diff in the pull request:

https://github.com/apache/flink/pull/960#discussion_r36837638
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java 
---
@@ -1328,24 +1329,29 @@ else if(typeHierarchy.size() <= 1) {
List<Method> methods = getAllDeclaredMethods(clazz);
for (Method method : methods) {
if (method.getName().equals("readObject") || 
method.getName().equals("writeObject")) {
-   LOG.info("Class " + clazz + " contains custom 
serialization methods we do not call.");
+   LOG.info(clazz + " contains custom serialization 
methods we do not call.");
return null;
}
}
 
// Try retrieving the default constructor, if it does not have 
one
// we cannot use this because the serializer uses it.
+   Constructor defaultConstructor = null;
try {
-   clazz.getDeclaredConstructor();
+   defaultConstructor = clazz.getDeclaredConstructor();
} catch (NoSuchMethodException e) {
if (clazz.isInterface() || 
Modifier.isAbstract(clazz.getModifiers())) {
-   LOG.info("Class " + clazz + " is abstract or an 
interface, having a concrete " +
+   LOG.info(clazz + " is abstract or an interface, 
having a concrete " +
"type can increase 
performance.");
} else {
-   LOG.info("Class " + clazz + " must have a 
default constructor to be used as a POJO.");
+   LOG.info(clazz + " must have a default 
constructor to be used as a POJO.");
return null;
}
}
+   if(defaultConstructor != null && 
(defaultConstructor.getModifiers() & Modifier.PUBLIC) == 0) {
--- End diff --

`if(defaultConstructor != null && 
Modifier.isPublic(defaultConstructor.getModifiers())` seems to be more readable 
to me, but your approach is fine too.
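As a standalone illustration of the check being discussed (the class and helper names below are made up for this example, not Flink's actual code), the bitmask test `(modifiers & Modifier.PUBLIC) == 0` is equivalent to `!Modifier.isPublic(modifiers)`:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

public class CtorCheck {

    // Example class with a non-public default constructor -- the case FLINK-2437 fixes.
    static class HiddenCtor {
        private HiddenCtor() {}
    }

    // Returns true only if clazz declares a *public* no-argument constructor.
    // The bitmask form used in the PR, (modifiers & Modifier.PUBLIC) == 0,
    // is the negation of Modifier.isPublic(modifiers).
    static boolean hasPublicDefaultConstructor(Class<?> clazz) {
        Constructor<?> defaultConstructor;
        try {
            defaultConstructor = clazz.getDeclaredConstructor();
        } catch (NoSuchMethodException e) {
            return false; // no default constructor at all
        }
        return Modifier.isPublic(defaultConstructor.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(hasPublicDefaultConstructor(String.class));     // true: public String()
        System.out.println(hasPublicDefaultConstructor(HiddenCtor.class)); // false: constructor is private
    }
}
```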


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-12 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130228505
  
Hi, there are two more questions:
1. When using a StringBuilder, does it mean that we should use 
BufferedReader.readLine() instead of BufferedReader.read()?
2. Could you tell me how to make BufferedReader.read() return -1? I 
tried many ways and they all failed.




[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693183#comment-14693183
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130228505
  
Hi, there are two more questions:
1. When using a StringBuilder, does it mean that we should use 
BufferedReader.readLine() instead of BufferedReader.read()?
2. Could you tell me how to make BufferedReader.read() return -1? I 
tried many ways and they all failed.


 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h







[jira] [Closed] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector

2015-08-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi closed FLINK-2507.
-
   Resolution: Fixed
Fix Version/s: 0.10

Via 54311aa.

 Rename the function tansformAndEmit in 
 org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
 --

 Key: FLINK-2507
 URL: https://issues.apache.org/jira/browse/FLINK-2507
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor
 Fix For: 0.10








[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...

2015-08-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1007




[jira] [Commented] (FLINK-1901) Create sample operator for Dataset

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693288#comment-14693288
 ] 

ASF GitHub Bot commented on FLINK-1901:
---

Github user thvasilo commented on a diff in the pull request:

https://github.com/apache/flink/pull/949#discussion_r36844879
  
--- Diff: 
flink-core/src/test/java/org/apache/flink/api/common/operators/util/RandomSamplerTest.java
 ---
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.common.operators.util;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.commons.math3.stat.inference.KolmogorovSmirnovTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Set;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * This test suite tries to verify that all the random samplers work as 
expected, mainly focusing on:
+ * <ul>
+ * <li>Does the sampled result fit the input parameters? We check parameters 
like sample fraction, sample size,
+ * w/o replacement, and so on.</li>
+ * <li>Is the sampled result selected randomly? We verify this by measuring 
how it is distributed over the source data.
+ * We run a Kolmogorov-Smirnov (KS) test between the random samplers and 
default reference samplers which are distributed
+ * well-proportioned over the source data. If a random sampler selects elements 
randomly from the source, it will be distributed
+ * well-proportioned over the source data as well, and the KS test will fail to 
strongly reject the null hypothesis that
+ * the distributions of sampling gaps are the same.
+ * </li>
+ * </ul>
+ *
+ * @see <a 
href="https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test">Kolmogorov 
Smirnov test</a>
+ */
+public class RandomSamplerTest {
+   private final static int SOURCE_SIZE = 1;
+   private static KolmogorovSmirnovTest ksTest;
+   private static List<Double> source;
+   private final static int DEFFAULT_PARTITION_NUMBER = 10;
+   private List<Double>[] sourcePartitions = new 
List[DEFFAULT_PARTITION_NUMBER];
+
+   @BeforeClass
+   public static void init() {
+   // initiate source data set.
+   source = new ArrayList<Double>(SOURCE_SIZE);
+   for (int i = 0; i < SOURCE_SIZE; i++) {
+   source.add((double) i);
+   }
+   
+   ksTest = new KolmogorovSmirnovTest();
+   }
+
+   private void initSourcePartition() {
+   for (int i = 0; i < DEFFAULT_PARTITION_NUMBER; i++) {
+   sourcePartitions[i] = new LinkedList<Double>();
+   }
+   for (int i = 0; i < SOURCE_SIZE; i++) {
+   int index = i % DEFFAULT_PARTITION_NUMBER;
+   sourcePartitions[index].add((double) i);
+   }
+   }
+   
+   @Test(expected = java.lang.IllegalArgumentException.class)
+   public void testBernoulliSamplerWithUnexpectedFraction1() {
+   verifySamplerFraction(-1, false);
+   }
+   
+   @Test(expected = java.lang.IllegalArgumentException.class)
+   public void testBernoulliSamplerWithUnexpectedFraction2() {
+   verifySamplerFraction(2, false);
+   }
+   
+   @Test
+   public void testBernoulliSamplerFraction() {
+   verifySamplerFraction(0.01, false);
+   verifySamplerFraction(0.05, false);
+   verifySamplerFraction(0.1, false);
+   verifySamplerFraction(0.3, false);
+   verifySamplerFraction(0.5, false);
+   verifySamplerFraction(0.854, false);
+   
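The fraction tests above (the quoted diff is truncated here) can be pictured with a minimal standalone Bernoulli sampler. This is an assumed simplification for illustration, not Flink's actual `BernoulliSampler`: each element is kept independently with probability `fraction`, so the sampled size should be close to `fraction * source size`, and out-of-range fractions are rejected with `IllegalArgumentException`, matching the `testBernoulliSamplerWithUnexpectedFraction` cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class BernoulliFractionSketch {

    // Keep each element independently with probability `fraction`.
    // Simplified stand-in for the sampler under test, not Flink's code.
    static <T> List<T> bernoulliSample(List<T> source, double fraction, Random rnd) {
        if (fraction < 0 || fraction > 1) {
            // Mirrors the expected IllegalArgumentException in the tests above.
            throw new IllegalArgumentException("fraction must be in [0, 1], got " + fraction);
        }
        List<T> sampled = new ArrayList<T>();
        for (T element : source) {
            if (rnd.nextDouble() < fraction) {
                sampled.add(element);
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<Integer>();
        for (int i = 0; i < 10000; i++) {
            source.add(i);
        }
        List<Integer> sampled = bernoulliSample(source, 0.1, new Random());
        System.out.println(sampled.size()); // close to 1000 in expectation
    }
}
```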

[jira] [Commented] (FLINK-2491) Operators are not participating in state checkpointing in some cases

2015-08-12 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693304#comment-14693304
 ] 

Márton Balassi commented on FLINK-2491:
---

This is troublesome: when setting the log level to debug, it shows that the 
`StreamTask` never calls a checkpoint on the sink. I am looking into it.

 Operators are not participating in state checkpointing in some cases
 

 Key: FLINK-2491
 URL: https://issues.apache.org/jira/browse/FLINK-2491
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Robert Metzger
Assignee: Márton Balassi
Priority: Critical
 Fix For: 0.10


 While implementing a test case for the Kafka Consumer, I came across the 
 following bug:
 Consider the following topology, with the operator parallelism in parentheses:
 Source (2) -- Sink (1).
 In this setup, the {{snapshotState()}} method is called on the source, but 
 not on the Sink.
 The sink receives the generated data.
 Only one of the two sources is generating data.
 I've implemented a test case for this, you can find it here: 
 https://github.com/rmetzger/flink/blob/para_checkpoint_bug/flink-tests/src/test/java/org/apache/flink/test/checkpointing/ParallelismChangeCheckpoinedITCase.java





[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693214#comment-14693214
 ] 

ASF GitHub Bot commented on FLINK-2507:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1007


 Rename the function tansformAndEmit in 
 org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
 --

 Key: FLINK-2507
 URL: https://issues.apache.org/jira/browse/FLINK-2507
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor







[jira] [Updated] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-08-12 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2437:
--
Affects Version/s: 0.9.0
   0.10

 TypeExtractor.analyzePojo has some problems around the default constructor 
 detection
 

 Key: FLINK-2437
 URL: https://issues.apache.org/jira/browse/FLINK-2437
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Affects Versions: 0.10, 0.9.0
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor
 Fix For: 0.10, 0.9.1


 If a class does have a default constructor, but the user forgot to make it 
 public, then TypeExtractor.analyzePojo still thinks everything is OK, so it 
 creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up.
 Furthermore, a return null seems to be missing from the then case of the if 
 after catching the NoSuchMethodException which would also cause a headache 
 for PojoSerializer.
 An additional minor issue is that the word class is printed twice in 
 several places, because class.toString also prepends it to the class name.





[GitHub] flink pull request: [FLINK-2437] Fix default constructor detection...

2015-08-12 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/960#issuecomment-130251642
  
Thanks for your contribution!




[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693298#comment-14693298
 ] 

ASF GitHub Bot commented on FLINK-2437:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/960#issuecomment-130251642
  
Thanks for your contribution!


 TypeExtractor.analyzePojo has some problems around the default constructor 
 detection
 

 Key: FLINK-2437
 URL: https://issues.apache.org/jira/browse/FLINK-2437
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Affects Versions: 0.10, 0.9.0
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor
 Fix For: 0.10, 0.9.1


 If a class does have a default constructor, but the user forgot to make it 
 public, then TypeExtractor.analyzePojo still thinks everything is OK, so it 
 creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up.
 Furthermore, a return null seems to be missing from the then case of the if 
 after catching the NoSuchMethodException which would also cause a headache 
 for PojoSerializer.
 An additional minor issue is that the word class is printed twice in 
 several places, because class.toString also prepends it to the class name.





[jira] [Closed] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-08-12 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-2437.
-
Resolution: Fixed

 TypeExtractor.analyzePojo has some problems around the default constructor 
 detection
 

 Key: FLINK-2437
 URL: https://issues.apache.org/jira/browse/FLINK-2437
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Affects Versions: 0.10, 0.9.0
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor
 Fix For: 0.10, 0.9.1


 If a class does have a default constructor, but the user forgot to make it 
 public, then TypeExtractor.analyzePojo still thinks everything is OK, so it 
 creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up.
 Furthermore, a return null seems to be missing from the then case of the if 
 after catching the NoSuchMethodException which would also cause a headache 
 for PojoSerializer.
 An additional minor issue is that the word class is printed twice in 
 several places, because class.toString also prepends it to the class name.





[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693199#comment-14693199
 ] 

ASF GitHub Bot commented on FLINK-2512:
---

Github user ffbin commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130233136
  
@uce Thanks. What about removing `if(client.getTopologyJobId(name) != null) 
{...}` in line 103? submitTopologyWithOpts() already checks this at the head 
of the function and will throw AlreadyAliveException. Then the client is closed in the finally{} block.
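A sketch of the shape this discussion converges on: keep the duplicate-topology check inside the `try`, and let `finally` close the client on every path. The `TopologyClient` interface below is a stand-in for the real `FlinkClient`, whose API may differ.

```java
// Minimal stand-in for the client discussed in the thread; the real
// FlinkClient API may differ. This only illustrates the try/finally shape.
interface TopologyClient {
    String getTopologyJobId(String name);
    void close();
}

public class SubmitSketch {

    static void submit(TopologyClient client, String name) {
        try {
            // If the topology already exists, fail fast. The finally block
            // below still closes the client, so no manual close is needed
            // before the throw.
            if (client.getTopologyJobId(name) != null) {
                throw new RuntimeException("Topology with name " + name + " already exists");
            }
            // client.submitTopologyWithOpts(...) would go here.
        } finally {
            client.close(); // runs on both success and failure
        }
    }

    public static void main(String[] args) {
        TopologyClient ok = new TopologyClient() {
            public String getTopologyJobId(String name) { return null; } // not running yet
            public void close() { System.out.println("client closed"); }
        };
        submit(ok, "word-count"); // prints "client closed"
    }
}
```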


 Add client.close() before throw RuntimeException
 

 Key: FLINK-2512
 URL: https://issues.apache.org/jira/browse/FLINK-2512
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor







[GitHub] flink pull request: [FLINK-2512]Add client.close() before throw Ru...

2015-08-12 Thread ffbin
Github user ffbin commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130233136
  
@uce Thanks. What about removing `if(client.getTopologyJobId(name) != null) 
{...}` in line 103? submitTopologyWithOpts() already checks this at the head 
of the function and will throw AlreadyAliveException. Then the client is closed in the finally{} block.




[jira] [Commented] (FLINK-2399) Fail when actor versions don't match

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693235#comment-14693235
 ] 

ASF GitHub Bot commented on FLINK-2399:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/945#issuecomment-130241985
  
I've already added a version check between `JobClient` and `JobManager`. 
Will there be any further review of this?


 Fail when actor versions don't match
 

 Key: FLINK-2399
 URL: https://issues.apache.org/jira/browse/FLINK-2399
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, TaskManager
Affects Versions: 0.9, master
Reporter: Ufuk Celebi
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.10


 Problem: there can be subtle errors when actors from different Flink versions 
 communicate with each other, for example when an old client (e.g. Flink 0.9) 
 communicates with a new JobManager (e.g. Flink 0.10-SNAPSHOT).
 We can check that the versions match on first communication between the 
 actors and fail if they don't match.
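A minimal sketch of the fail-fast check the issue describes (method and message wording are assumed here, not Flink's actual actor protocol): on first communication, compare the reported versions and abort on mismatch.

```java
public class VersionCheck {

    // Illustrative handshake check: fail fast on the first communication
    // between actors if the reported versions differ. Names are assumed,
    // not taken from Flink's actual JobClient/JobManager messages.
    static void verifyVersions(String clientVersion, String jobManagerVersion) {
        if (!clientVersion.equals(jobManagerVersion)) {
            throw new RuntimeException("Actor version mismatch: client "
                    + clientVersion + " vs JobManager " + jobManagerVersion);
        }
    }

    public static void main(String[] args) {
        verifyVersions("0.10-SNAPSHOT", "0.10-SNAPSHOT");
        System.out.println("versions match");
    }
}
```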





[jira] [Commented] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment

2015-08-12 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693282#comment-14693282
 ] 

Márton Balassi commented on FLINK-2508:
---

Thanks for spotting this. I think the purpose of caching the current environment 
was to be able to set the `TestStreamEnvironment` as context, but the 
current state of the code seems a bit messy.

 Confusing sharing of StreamExecutionEnvironment
 ---

 Key: FLINK-2508
 URL: https://issues.apache.org/jira/browse/FLINK-2508
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Stephan Ewen
 Fix For: 0.10


 In the {{StreamExecutionEnvironment}}, the environment is once created and 
 then shared with a static variable to all successive calls to 
 {{getExecutionEnvironment()}}. But it can be overridden by calls to 
 {{createLocalEnvironment()}} and {{createRemoteEnvironment()}}.
 This seems a bit un-intuitive, and probably creates confusion when 
 dispatching multiple streaming jobs from within the same JVM.
 Why is it even necessary to cache the current execution environment?
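A toy model of the caching behavior the issue describes (not Flink's actual implementation): a static "current" environment that later `createLocalEnvironment()` calls silently replace, which is exactly what makes dispatching multiple jobs from one JVM confusing.

```java
public class EnvSketch {

    // Static cache shared by all callers, mimicking the behavior described
    // in FLINK-2508. The "kind" field is illustrative only.
    static EnvSketch current;
    final String kind;

    EnvSketch(String kind) {
        this.kind = kind;
    }

    static EnvSketch getExecutionEnvironment() {
        if (current == null) {
            current = new EnvSketch("context"); // created once, then shared
        }
        return current;
    }

    static EnvSketch createLocalEnvironment() {
        current = new EnvSketch("local"); // silently overrides the shared environment
        return current;
    }

    public static void main(String[] args) {
        System.out.println(getExecutionEnvironment().kind); // "context" on first call
    }
}
```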





[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-12 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130237642
  
> 1. When using a StringBuilder, does it mean that we should use 
> BufferedReader.readLine() instead of BufferedReader.read()?

Reading by character is the way to go if we use a custom `delimiter`. If 
our delimiter was `\n` then it would be OK to read entire lines.

> 2. Could you tell me how to make BufferedReader.read() return -1? I 
> tried many ways and they all failed.

Ok :) Here is a minimal working example where `read()` returns `-1`:

```java
public static void main(String[] args) throws IOException {

ServerSocket socket = new ServerSocket(12345);

final SocketAddress socketAddress = socket.getLocalSocketAddress();

new Thread(new Runnable() {
@Override
public void run() {
Socket socket = new Socket();

try {
socket.connect(socketAddress);
} catch (IOException e) {
e.printStackTrace();
}

try {
BufferedReader bufferedReader = new 
BufferedReader(new InputStreamReader(socket.getInputStream()));
System.out.println((bufferedReader.read()));
} catch (IOException e) {
e.printStackTrace();
}
}
}).start();

Socket channel = socket.accept();

channel.close();
}
```
Output:
```
-1
```
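Building on that, the read-by-character approach with a custom delimiter can be sketched like this (the helper below is illustrative, not the actual `SocketTextStreamFunction` code): characters are collected into a `StringBuilder` until the delimiter or EOF (`-1`) is hit.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class DelimiterRead {

    // Collect characters into a StringBuilder until the custom delimiter
    // or end of stream. read() returning -1 signals EOF.
    static String readUntil(BufferedReader reader, char delimiter) throws IOException {
        StringBuilder buffer = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            if (c == delimiter) {
                break;
            }
            buffer.append((char) c);
        }
        return buffer.toString();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("foo|bar"));
        System.out.println(readUntil(reader, '|')); // foo
        System.out.println(readUntil(reader, '|')); // bar
    }
}
```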




[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693224#comment-14693224
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130237642
  
> 1. When using a StringBuilder, does it mean that we should use 
> BufferedReader.readLine() instead of BufferedReader.read()?

Reading by character is the way to go if we use a custom `delimiter`. If 
our delimiter was `\n` then it would be OK to read entire lines.

> 2. Could you tell me how to make BufferedReader.read() return -1? I 
> tried many ways and they all failed.

Ok :) Here is a minimal working example where `read()` returns `-1`:

```java
public static void main(String[] args) throws IOException {

ServerSocket socket = new ServerSocket(12345);

final SocketAddress socketAddress = socket.getLocalSocketAddress();

new Thread(new Runnable() {
@Override
public void run() {
Socket socket = new Socket();

try {
socket.connect(socketAddress);
} catch (IOException e) {
e.printStackTrace();
}

try {
BufferedReader bufferedReader = new 
BufferedReader(new InputStreamReader(socket.getInputStream()));
System.out.println((bufferedReader.read()));
} catch (IOException e) {
e.printStackTrace();
}
}
}).start();

Socket channel = socket.accept();

channel.close();
}
```
Output:
```
-1
```


 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h







[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693245#comment-14693245
 ] 

ASF GitHub Bot commented on FLINK-2437:
---

Github user ggevay commented on a diff in the pull request:

https://github.com/apache/flink/pull/960#discussion_r36843332
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java 
---
@@ -1328,24 +1329,29 @@ else if(typeHierarchy.size() <= 1) {
List<Method> methods = getAllDeclaredMethods(clazz);
for (Method method : methods) {
if (method.getName().equals("readObject") || 
method.getName().equals("writeObject")) {
-   LOG.info("Class " + clazz + " contains custom 
serialization methods we do not call.");
+   LOG.info(clazz + " contains custom serialization 
methods we do not call.");
return null;
}
}
 
// Try retrieving the default constructor, if it does not have 
one
// we cannot use this because the serializer uses it.
+   Constructor defaultConstructor = null;
try {
-   clazz.getDeclaredConstructor();
+   defaultConstructor = clazz.getDeclaredConstructor();
} catch (NoSuchMethodException e) {
if (clazz.isInterface() || 
Modifier.isAbstract(clazz.getModifiers())) {
-   LOG.info("Class " + clazz + " is abstract or an 
interface, having a concrete " +
+   LOG.info(clazz + " is abstract or an interface, 
having a concrete " +
"type can increase 
performance.");
} else {
-   LOG.info("Class " + clazz + " must have a 
default constructor to be used as a POJO.");
+   LOG.info(clazz + " must have a default 
constructor to be used as a POJO.");
return null;
}
}
+   if(defaultConstructor != null && 
(defaultConstructor.getModifiers() & Modifier.PUBLIC) == 0) {
--- End diff --

You are right, I have changed it.


 TypeExtractor.analyzePojo has some problems around the default constructor 
 detection
 

 Key: FLINK-2437
 URL: https://issues.apache.org/jira/browse/FLINK-2437
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor

 If a class does have a default constructor, but the user forgot to make it 
 public, then TypeExtractor.analyzePojo still thinks everything is OK, so it 
 creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up.
 Furthermore, a return null seems to be missing from the then case of the if 
 after catching the NoSuchMethodException which would also cause a headache 
 for PojoSerializer.
 An additional minor issue is that the word class is printed twice in 
 several places, because class.toString also prepends it to the class name.





[GitHub] flink pull request: [FLINK-2399] Version checks for Job Manager an...

2015-08-12 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/945#issuecomment-130241985
  
I've already added a version check between `JobClient` and `JobManager`. 
Will there be any further review of this?




[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693294#comment-14693294
 ] 

ASF GitHub Bot commented on FLINK-2437:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/960


 TypeExtractor.analyzePojo has some problems around the default constructor 
 detection
 

 Key: FLINK-2437
 URL: https://issues.apache.org/jira/browse/FLINK-2437
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor

 If a class does have a default constructor, but the user forgot to make it 
 public, then TypeExtractor.analyzePojo still thinks everything is OK, so it 
 creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up.
 Furthermore, a return null seems to be missing from the then case of the if 
 after catching the NoSuchMethodException which would also cause a headache 
 for PojoSerializer.
 An additional minor issue is that the word class is printed twice in 
 several places, because class.toString also prepends it to the class name.





[jira] [Updated] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection

2015-08-12 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2437:
--
Fix Version/s: 0.9.1
   0.10

 TypeExtractor.analyzePojo has some problems around the default constructor 
 detection
 

 Key: FLINK-2437
 URL: https://issues.apache.org/jira/browse/FLINK-2437
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Affects Versions: 0.10, 0.9.0
Reporter: Gabor Gevay
Assignee: Gabor Gevay
Priority: Minor
 Fix For: 0.10, 0.9.1


 If a class does have a default constructor, but the user forgot to make it 
 public, then TypeExtractor.analyzePojo still thinks everything is OK, so it 
 creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up.
 Furthermore, a return null seems to be missing from the then case of the if 
 after catching the NoSuchMethodException which would also cause a headache 
 for PojoSerializer.
 An additional minor issue is that the word class is printed twice in 
 several places, because class.toString also prepends it to the class name.





[jira] [Commented] (FLINK-2457) Integrate Tuple0

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693292#comment-14693292
 ] 

ASF GitHub Bot commented on FLINK-2457:
---

Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130251096
  
The typeutil classes look good. I see that you have modified the 
TupleXXBuilders, have you modified them by hand or by running the 
`TupleGenerator`? I can't see the modified `TupleGenerator` in your PR.


 Integrate Tuple0
 

 Key: FLINK-2457
 URL: https://issues.apache.org/jira/browse/FLINK-2457
 Project: Flink
  Issue Type: Improvement
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Tuple0 is not cleanly integrated:
   - missing serialization/deserialization support in runtime
  - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., it cannot create 
 an instance of Tuple0
 Tuple0 is currently only used in Python API, but will be integrated into 
 Storm compatibility, too.
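A simplified model of what the fix needs (stand-in classes below, not Flink's real `Tuple0` or `Tuple.getTupleClass`): the arity-to-class lookup must contain an explicit entry for arity zero, otherwise requesting arity 0 fails.

```java
public class TupleClassSketch {

    // Stand-in tuple classes; Flink's real Tuple0..Tuple25 live in flink-java.
    static class Tuple0 {}
    static class Tuple1 {}

    // Lookup indexed by arity. The arity-0 entry is exactly what FLINK-2457
    // says is missing from Tuple.getTupleClass.
    private static final Class<?>[] CLASSES = new Class<?>[] { Tuple0.class, Tuple1.class };

    static Class<?> getTupleClass(int arity) {
        if (arity < 0 || arity >= CLASSES.length) {
            throw new IllegalArgumentException("unsupported arity: " + arity);
        }
        return CLASSES[arity];
    }

    public static void main(String[] args) {
        System.out.println(getTupleClass(0).getSimpleName()); // Tuple0
    }
}
```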





[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693775#comment-14693775
 ] 

ASF GitHub Bot commented on FLINK-2512:
---

Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130363877
  
Since ``getTopologyJobId`` can also throw an exception, we could just move the 
try-catch up to include the call to that method and rely on finally to close 
the client.
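A minimal, self-contained sketch of the pattern being suggested. The `Client` class here is a made-up stand-in, not Flink's real client: the point is that widening the try block lets the finally clause close the client on every exit path, including when `getTopologyJobId` throws.

```java
// Made-up stand-in for the client in the discussion; only close() matters here.
class Client {
    boolean closed = false;

    String getTopologyJobId(String name) {
        if (name == null) {
            throw new RuntimeException("Topology not found"); // failure path
        }
        return "job-" + name;
    }

    void close() {
        closed = true;
    }
}

class SubmitWithCleanup {
    // The try block starts before getTopologyJobId(), so the finally
    // clause closes the client whether the lookup succeeds or throws.
    static String submit(Client client, String topology) {
        try {
            return client.getTopologyJobId(topology);
        } finally {
            client.close();
        }
    }
}
```

With this shape there is no need for a manual client.close() before each throw, which is what the review comments converge on.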


 Add client.close() before throw RuntimeException
 

 Key: FLINK-2512
 URL: https://issues.apache.org/jira/browse/FLINK-2512
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor







[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693831#comment-14693831
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-130369569
  
I thought not, but you are right, I pushed the wrong branch.
Sorry, git-fail on my side!

I'll push an update in a bit...


 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.





[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693766#comment-14693766
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-130362926
  
Hi @StephanEwen, why was this PR merged without addressing the existing 
comments?


 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.





[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0

2015-08-12 Thread twalthr
Github user twalthr commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130251096
  
The typeutil classes look good. I see that you have modified the 
TupleXXBuilders. Have you modified them by hand or by running the 
`TupleGenerator`? I can't see the modified `TupleGenerator` in your PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2480][test]Add tests for PrintSinkFunct...

2015-08-12 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/991#issuecomment-130295740
  
Your pull request doesn't compile: 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/74504427/log.txt




[jira] [Commented] (FLINK-2480) Improving tests coverage for org.apache.flink.streaming.api

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693457#comment-14693457
 ] 

ASF GitHub Bot commented on FLINK-2480:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/991#issuecomment-130295740
  
Your pull request doesn't compile: 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/74504427/log.txt


 Improving tests coverage for org.apache.flink.streaming.api
 ---

 Key: FLINK-2480
 URL: https://issues.apache.org/jira/browse/FLINK-2480
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
 Fix For: 0.10

   Original Estimate: 504h
  Remaining Estimate: 504h

 The streaming API is quite a bit newer than the other code, so it is not that 
 well covered with tests.





[jira] [Commented] (FLINK-2491) Operators are not participating in state checkpointing in some cases

2015-08-12 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693387#comment-14693387
 ] 

Márton Balassi commented on FLINK-2491:
---

Here is the root cause. [1]

[1] 
https://github.com/apache/flink/blob/master/flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java#L415

The same-parallelism case works because of chaining.

 Operators are not participating in state checkpointing in some cases
 

 Key: FLINK-2491
 URL: https://issues.apache.org/jira/browse/FLINK-2491
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Robert Metzger
Assignee: Márton Balassi
Priority: Critical
 Fix For: 0.10


 While implementing a test case for the Kafka Consumer, I came across the 
 following bug:
 Consider the following topology, with the operator parallelism in parentheses:
 Source (2) -- Sink (1).
 In this setup, the {{snapshotState()}} method is called on the source, but 
 not on the Sink.
 The sink receives the generated data.
 Only one of the two sources is generating data.
 I've implemented a test case for this, you can find it here: 
 https://github.com/rmetzger/flink/blob/para_checkpoint_bug/flink-tests/src/test/java/org/apache/flink/test/checkpointing/ParallelismChangeCheckpoinedITCase.java





[jira] [Updated] (FLINK-2506) HBase connection closing down (table distributed over more than 1 region server - Flink Cluster-Mode)

2015-08-12 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2506:
--
Description: 
If I fill a default table (create 'test-table', 'someCf') with the 
HBaseWriteExample.java program from the HBase add-on library, a table 
without a start/end key is created. 
The data reading works great with the HBaseReadExample.java.

Nevertheless, if I manually create a test table that is distributed over more 
than one region server (create 'test-table2', 'someCf', {NUMREGIONS => 3, 
SPLITALGO => 'HexStringSplit'}), the run is canceled with the following error 
message: 

{noformat}
grips2
Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=35, exceptions:
Fri Aug 07 11:18:29 CEST 2015, 
org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: 
hconnection-0x47bf79d7 closed
Fri Aug 07 11:18:38 CEST 2015, 
org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: 
hconnection-0x47bf79d7 closed
[... the same "hconnection-0x47bf79d7 closed" IOException repeats roughly every 
10 to 20 seconds through Fri Aug 07 11:26:40 CEST; the message is truncated 
here ...]
{noformat}

[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-12 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130311909
  
Thank you!
I'll try again.




[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693496#comment-14693496
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-130311909
  
Thank you!
I'll try again.


 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h







[jira] [Commented] (FLINK-2457) Integrate Tuple0

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693466#comment-14693466
 ] 

ASF GitHub Bot commented on FLINK-2457:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130298853
  
I modified them by hand. I was not aware of `TupleGenerator` and just had a look 
into it. I am not sure `Tuple0` can be included appropriately. For example, it is 
not a generic class; it is implemented as a soft singleton. Extending 
`TupleGenerator` would result in special handling of Tuple0 in every place. Thus, 
adding it manually and not generating the source code for it seems better to me.
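The "soft singleton" idea can be sketched like this. This is a simplified illustration written for this discussion, not Flink's actual Tuple0: one canonical INSTANCE is provided, but the constructor stays usable so serialization frameworks can still create instances reflectively, and all instances compare equal.

```java
import java.io.Serializable;

// Simplified illustration of a zero-arity tuple as a "soft" singleton:
// a canonical instance exists, but construction is not forbidden, so
// serializers can still instantiate the class reflectively.
class Tuple0 implements Serializable {
    private static final long serialVersionUID = 1L;

    public static final Tuple0 INSTANCE = new Tuple0();

    public int getArity() {
        return 0;
    }

    // With no fields, every Tuple0 is interchangeable with every other.
    @Override
    public boolean equals(Object other) {
        return other instanceof Tuple0;
    }

    @Override
    public int hashCode() {
        return 0;
    }
}
```

Because the class has no type parameters and no fields, a generator built around per-field templates would indeed need to special-case it everywhere, which supports keeping it hand-written.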


 Integrate Tuple0
 

 Key: FLINK-2457
 URL: https://issues.apache.org/jira/browse/FLINK-2457
 Project: Flink
  Issue Type: Improvement
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Tuple0 is not cleanly integrated:
   - missing serialization/deserialization support in runtime
  - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., cannot create 
 an instance of Tuple0
 Tuple0 is currently only used in Python API, but will be integrated into 
 Storm compatibility, too.





[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0

2015-08-12 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130298853
  
I modified them by hand. I was not aware of `TupleGenerator` and just had a look 
into it. I am not sure `Tuple0` can be included appropriately. For example, it is 
not a generic class; it is implemented as a soft singleton. Extending 
`TupleGenerator` would result in special handling of Tuple0 in every place. Thus, 
adding it manually and not generating the source code for it seems better to me.




[jira] [Commented] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment

2015-08-12 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693474#comment-14693474
 ] 

Maximilian Michels commented on FLINK-2508:
---

Just like in {{ExecutionEnvironment}}, we can have a static variable which holds 
a factory. By setting the factory, we can change the default environment 
returned for tests, local, or cluster execution.

That would also remove the clutter in the {{getExecutionEnvironment()}} method.
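The factory idea can be sketched as follows. All names here are invented for illustration and are not Flink's API: instead of caching one environment in a static field, a swappable factory decides what getExecutionEnvironment() returns.

```java
// Invented names for illustration; not Flink's actual API.
interface EnvFactory {
    Env create();
}

class Env {
    final String kind;

    Env(String kind) {
        this.kind = kind;
    }
}

class Envs {
    // The swappable factory replaces a cached static environment instance.
    private static EnvFactory factory = () -> new Env("local");

    static void setFactory(EnvFactory f) {
        factory = f;
    }

    static Env getExecutionEnvironment() {
        // No cached instance: each call asks the current factory.
        return factory.create();
    }
}
```

A test harness would install its own factory once, and every subsequent getExecutionEnvironment() call picks it up, with no override-and-cache interplay to reason about.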

 Confusing sharing of StreamExecutionEnvironment
 ---

 Key: FLINK-2508
 URL: https://issues.apache.org/jira/browse/FLINK-2508
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Stephan Ewen
 Fix For: 0.10


 In the {{StreamExecutionEnvironment}}, the environment is once created and 
 then shared with a static variable to all successive calls to 
 {{getExecutionEnvironment()}}. But it can be overridden by calls to 
 {{createLocalEnvironment()}} and {{createRemoteEnvironment()}}.
 This seems a bit un-intuitive, and probably creates confusion when 
 dispatching multiple streaming jobs from within the same JVM.
 Why is it even necessary to cache the current execution environment?





[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693546#comment-14693546
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130318553
  
We are back to square one ;-)

  - `Function`: single abstract method
  - `RichFunction` = 5 methods. I see how that gets rich.

  - `InputFormat`: 8 methods
  - `RichInputFormat`: 10 methods.
We could call it `SlightlyMoreRichInputFormat` ;-) I cannot help but find 
this very confusing.

Why the urgency to stick with the Rich prefix?


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.
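The shape of the requested change can be sketched as follows, using simplified types invented here for illustration rather than Flink's real interfaces: a base format interface plus a "rich" abstract class that only adds the runtime-context getter and setter, the two extra methods counted in the thread.

```java
// Simplified stand-ins; not Flink's real InputFormat/RuntimeContext.
interface InputFormat<T> {
    T nextRecord();
}

class RuntimeContext {
    final int subtaskIndex;

    RuntimeContext(int subtaskIndex) {
        this.subtaskIndex = subtaskIndex;
    }
}

abstract class RichInputFormat<T> implements InputFormat<T> {
    private RuntimeContext context;

    // The two additional methods under discussion: context plumbing only.
    public void setRuntimeContext(RuntimeContext context) {
        this.context = context;
    }

    public RuntimeContext getRuntimeContext() {
        if (context == null) {
            throw new IllegalStateException("RuntimeContext not set");
        }
        return context;
    }
}
```

The runtime would call setRuntimeContext before opening the format; user code then reads the parallel subtask index or accumulators from the context, mirroring how RichFunction works for transformation functions.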





[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130318553
  
We are back to square one ;-)

  - `Function`: single abstract method
  - `RichFunction` = 5 methods. I see how that gets rich.

  - `InputFormat`: 8 methods
  - `RichInputFormat`: 10 methods.
We could call it `SlightlyMoreRichInputFormat` ;-) I cannot help but find 
this very confusing.

Why the urgency to stick with the Rich prefix?




[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693549#comment-14693549
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130318970
  
For transformation functions, there is a clear case for thin versus rich, 
because of Java 8 lambdas.
Input formats are a different game. They are super rich by default anyway.


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.





[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693553#comment-14693553
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130319611
  
The suggestion `AbstractInputFormat` was not so bad, in my opinion.
If you want a name that explains what's happening, you can always call it 
`InputFormatWithContext` ;-)



 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.





[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693554#comment-14693554
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130319656
  
To keep it consistent with the remaining API. For functions you need to 
extend a RichFunction if you want to have access to the RuntimeContext. I agree 
that the name is not perfect (and I think everybody else got your point as 
well) but I think it's a valid point to aim for consistency.



 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.





[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130319611
  
The suggestion `AbstractInputFormat` was not so bad, in my opinion.
If you want a name that explains what's happening, you can always call it 
`InputFormatWithContext` ;-)





[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130316822
  
Agree, Tuple0 should probably not go through the tuple generator.




[jira] [Commented] (FLINK-2457) Integrate Tuple0

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693527#comment-14693527
 ] 

ASF GitHub Bot commented on FLINK-2457:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/983#issuecomment-130316822
  
Agree, Tuple0 should probably not go through the tuple generator.


 Integrate Tuple0
 

 Key: FLINK-2457
 URL: https://issues.apache.org/jira/browse/FLINK-2457
 Project: Flink
  Issue Type: Improvement
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Tuple0 is not cleanly integrated:
   - missing serialization/deserialization support in runtime
  - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., cannot create 
 an instance of Tuple0
 Tuple0 is currently only used in Python API, but will be integrated into 
 Storm compatibility, too.





[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130319656
  
To keep it consistent with the remaining API. For functions you need to 
extend a RichFunction if you want to have access to the RuntimeContext. I agree 
that the name is not perfect (and I think everybody else got your point as 
well) but I think it's a valid point to aim for consistency.





[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1010#issuecomment-130415584
  
+1, good to merge




[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130384903
  
Oh, apologies. I only saw the first comment on the email thread. I guess
it's more or less settled. I'll leave it up to you guys to make a final
decision on this. :')
On Aug 12, 2015 10:59 PM, Sachin Goel sachingoel0...@gmail.com wrote:

 I agree with Stephan's argument that addition of context to I/O formats is
 a very marginal enhancement. He literally stole my words. :')
 However, from my perspective, when I first started using Flink, Rich meant
 runtime context. The idea of open and close wasn't nearly as exciting as
 the runtime context.
 What if we changed back to the original name mentioned on JIRA and made it
 `ContextAwareInputFormat`? Would everyone be okay with that?
 On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote:

 Functions also need to extend RichFunction to have access to open() and
 close().
 I think the two things are different enough that any striving for
 consistency is actually pretty random.
 If your thoughts currently revolve around the RuntimeContext, it appears
 more consistent. If your thoughts are on the life-cycle methods, it seems
 inconsistent. Random.

 I think you should go ahead and just call them Rich. It is just a name,
 and what matters is that the JavaDocs describe what it actually means...

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/flink/pull/966#issuecomment-130324909.







[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693925#comment-14693925
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-130388161
  
No worries =)


 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.





[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...

2015-08-12 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1010#issuecomment-130386385
  
+1

We agreed that we would play it loose with style, but this kind of cleanup 
helps readability.

I will send a PR to change the checkstyle configuration to be more strict on 
this kind of violation.




[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-12 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130383984
  
I agree with Stephan's argument that addition of context to I/O formats is
a very marginal enhancement. He literally stole my words. :')
However, from my perspective, when I first started using Flink, Rich meant
runtime context. The idea of open and close wasn't nearly as exciting as
the runtime context.
What if we changed back to the original name mentioned on JIRA and made it
`ContextAwareInputFormat`? Would everyone be okay with that?
On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote:

 Functions also need to extend RichFunction to have access to open() and
 close().
 I think the two things are different enough that any striving for
 consistency is actually pretty random.
 If your thoughts currently revolve around the RuntimeContext, it appears
 more consistent. If your thoughts are on the life-cycle methods, it seems
 inconsistent. Random.

 I think you should go ahead and just call them Rich. It is just a name,
 and what matters is that the JavaDocs describe what it actually means...

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/flink/pull/966#issuecomment-130324909.






[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693893#comment-14693893
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130383984
  
I agree with Stephan's argument that adding context to I/O formats is
a very marginal enhancement. He literally stole my words. :')
However, from my perspective, when I first started using Flink, Rich meant
runtime context. The idea of open and close wasn't nearly as exciting as
the runtime context.
What if we changed back to the original name mentioned on JIRA and made it
`ContextAwareInputFormat`? Would everyone be okay with that?
On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote:

 Functions also need to extend RichFunction to have access to open() and
 close().
 I think the two things are different enough that any striving for
 consistency is actually pretty random.
 If your thoughts currently revolve around the RuntimeContext, it appears
 more consistent. If your thoughts are on the life cycle methods, it seems
 inconsistent. Random.

 I think you should go ahead and just call them Rich. It is just a name,
 and what matters is that the JavaDocs describe what it actually means...

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/flink/pull/966#issuecomment-130324909.




 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User functions that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.
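
A hypothetical sketch (not the actual Flink API) of the mechanism under discussion: the runtime injects a context into the format before open() is called, mirroring how RichFunction works for user functions. All class and method names here (RichFormatSketch, RichInputFormat, setRuntimeContext, CountingFormat) are illustrative stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

public class RichFormatSketch {

    // Minimal stand-in for Flink's RuntimeContext: subtask index plus accumulators.
    static class RuntimeContext {
        final int indexOfThisSubtask;
        final Map<String, Long> accumulators = new HashMap<>();
        RuntimeContext(int index) { this.indexOfThisSubtask = index; }
    }

    // A format that wants context would extend a base class like this,
    // just as user functions extend RichFunction today.
    static abstract class RichInputFormat {
        private RuntimeContext context;
        void setRuntimeContext(RuntimeContext context) { this.context = context; }
        RuntimeContext getRuntimeContext() { return context; }
        abstract void open();
    }

    static class CountingFormat extends RichInputFormat {
        @Override
        void open() {
            // With a context available, the format can register accumulators.
            getRuntimeContext().accumulators.put("records-read", 0L);
        }
    }

    public static void main(String[] args) {
        CountingFormat format = new CountingFormat();
        // The runtime would perform this injection before opening the format.
        format.setRuntimeContext(new RuntimeContext(0));
        format.open();
        System.out.println(format.getRuntimeContext().accumulators); // {records-read=0}
    }
}
```

The naming debate above (Rich vs. ContextAware) only affects what the base class is called; the injection pattern itself is the same either way.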



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693899#comment-14693899
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130384903
  
Oh, apologies. I only saw the first comment on the email thread. I guess
it's more or less settled. I'll leave it up to you guys to make a final
decision on this. :')
On Aug 12, 2015 10:59 PM, Sachin Goel sachingoel0...@gmail.com wrote:

 I agree with Stephan's argument that adding context to I/O formats is
 a very marginal enhancement. He literally stole my words. :')
 However, from my perspective, when I first started using Flink, Rich meant
 runtime context. The idea of open and close wasn't nearly as exciting as
 the runtime context.
 What if we changed back to the original name mentioned on JIRA and made it
 `ContextAwareInputFormat`? Would everyone be okay with that?
 On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote:

 Functions also need to extend RichFunction to have access to open() and
 close().
 I think the two things are different enough that any striving for
 consistency is actually pretty random.
 If your thoughts currently revolve around the RuntimeContext, it appears
 more consistent. If your thoughts are on the life cycle methods, it seems
 inconsistent. Random.

 I think you should go ahead and just call them Rich. It is just a name,
 and what matters is that the JavaDocs describe what it actually means...

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/flink/pull/966#issuecomment-130324909.







[GitHub] flink pull request: [FLINK-2512]Add client.close() before throw Ru...

2015-08-12 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/1009#issuecomment-130363877
  
Since ``getTopologyJobId`` can also throw an exception, we could just move
the try-catch up to include the call to that method and rely on finally to
close the client.
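
A minimal sketch of the pattern being suggested: the fallible call moves inside the try block, so the finally clause alone is responsible for closing the client on both the success and the failure path. ``NimbusClient`` and ``getTopologyJobId`` here are self-contained stand-ins, not the real Storm compatibility classes.

```java
public class ClientClosePattern {

    // Stand-in for the client resource that must always be closed.
    static class NimbusClient {
        boolean closed = false;
        void close() { closed = true; }
    }

    // Stand-in for a call that may throw a RuntimeException.
    static String getTopologyJobId(String name) {
        if (name == null) {
            throw new RuntimeException("Topology not found");
        }
        return "job-" + name;
    }

    // The call that can throw lives inside the try block; the client is
    // closed exactly once, in finally, whether or not an exception occurs.
    static String fetchJobId(NimbusClient client, String name) {
        try {
            return getTopologyJobId(name);
        } finally {
            client.close();
        }
    }

    public static void main(String[] args) {
        NimbusClient ok = new NimbusClient();
        System.out.println(fetchJobId(ok, "wordcount")); // job-wordcount
        System.out.println(ok.closed);                   // true

        NimbusClient failing = new NimbusClient();
        try {
            fetchJobId(failing, null);
        } catch (RuntimeException e) {
            System.out.println(failing.closed);          // true: closed despite the exception
        }
    }
}
```

This removes the need for a manual close() before every throw, since finally covers the exception path as well.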




[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...

2015-08-12 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1010#issuecomment-130383158
  
I like this.
I would actually like to make this a checkstyle rule. Most of the code is
already in this shape; only occasional files go with a different style.
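
For illustration, the spacing convention the cleanup enforces (a space on both sides of the plus sign in string concatenation), shown with a hypothetical snippet:

```java
public class ConcatStyle {
    public static void main(String[] args) {
        String name = "Flink";
        // Before the cleanup (the style a checkstyle rule would flag):
        // String msg = "Hello, "+name+"!";
        // After the cleanup: quotes and plus signs separated by spaces.
        String msg = "Hello, " + name + "!";
        System.out.println(msg); // Hello, Flink!
    }
}
```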



