[jira] [Commented] (FLINK-2457) Integrate Tuple0
[ https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693572#comment-14693572 ] ASF GitHub Bot commented on FLINK-2457: --- Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130321100 I did those manually, too. I am just reworking the code to fix this up. Integrate Tuple0 Key: FLINK-2457 URL: https://issues.apache.org/jira/browse/FLINK-2457 Project: Flink Issue Type: Improvement Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Minor Tuple0 is not cleanly integrated: - missing serialization/deserialization support in runtime - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., it cannot create an instance of Tuple0 Tuple0 is currently only used in the Python API, but will be integrated into the Storm compatibility layer, too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
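The arity-zero gap described above can be illustrated with a small sketch. This is not Flink's actual `Tuple.getTupleClass` implementation; `Tuple0` and `Tuple1` here are hypothetical stand-ins showing how an arity-indexed lookup table can cover arity zero:

```java
// Minimal sketch (not Flink's actual code) of an arity-indexed tuple factory
// that handles arity zero, the gap described in FLINK-2457.
public class TupleFactorySketch {

    public static class Tuple0 {
        @Override public String toString() { return "()"; }
        @Override public boolean equals(Object o) { return o instanceof Tuple0; }
        @Override public int hashCode() { return 0; }
    }

    public static class Tuple1 { public Object f0; }

    // Lookup table indexed by arity; slot 0 must be populated for Tuple0.
    private static final Class<?>[] CLASSES = { Tuple0.class, Tuple1.class };

    public static Class<?> getTupleClass(int arity) {
        if (arity < 0 || arity >= CLASSES.length) {
            throw new IllegalArgumentException("unsupported arity: " + arity);
        }
        return CLASSES[arity];
    }

    public static void main(String[] args) throws Exception {
        // arity 0 now yields a usable class, so an instance can be created
        Object t0 = getTupleClass(0).getDeclaredConstructor().newInstance();
        System.out.println(t0);
    }
}
```

The point of the sketch is only that arity zero needs an explicit entry in whatever table or switch maps arities to tuple classes.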
[GitHub] flink pull request: [FLINK-2512]Add client.close() before throw Ru...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130323420 If you move the try up, you can certainly remove the manual close. Regarding the check in line 103, it really depends on whether the Storm compat layer depends on having only a single job per client. Therefore I would keep it in and let it throw the RuntimeException as before. The finally block will then make sure that the client is closed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException
[ https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693593#comment-14693593 ] ASF GitHub Bot commented on FLINK-2512: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130323420 If you move the try up, you can certainly remove the manual close. Regarding the check in line 103, it really depends on whether the Storm compat layer depends on having only a single job per client. Therefore I would keep it in and let it throw the RuntimeException as before. The finally block will then make sure that the client is closed. Add client.close() before throw RuntimeException Key: FLINK-2512 URL: https://issues.apache.org/jira/browse/FLINK-2512 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor
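The pattern discussed in this thread can be sketched in plain Java. `Client` here is a hypothetical stand-in for Flink's client class, used only to show that moving the try up and relying on a single finally clause closes the client on both the success path and the RuntimeException path:

```java
// Sketch of the try/finally pattern from the review comments above.
public class ClientCloseSketch {

    static class Client implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
        void run(boolean failCheck) {
            // stands in for the "only a single job per client" check
            if (failCheck) {
                throw new RuntimeException("more than one job per client");
            }
        }
    }

    static void submit(Client client, boolean failCheck) {
        try {
            client.run(failCheck); // may throw; no manual close needed before it
        } finally {
            client.close();        // runs on both normal and exceptional exit
        }
    }

    public static void main(String[] args) {
        Client ok = new Client();
        submit(ok, false);
        System.out.println(ok.closed);

        Client bad = new Client();
        try { submit(bad, true); } catch (RuntimeException expected) { }
        System.out.println(bad.closed);
    }
}
```

Both print `true`: the client ends up closed whether or not the RuntimeException was thrown, which is exactly why the manual `close()` before the throw becomes redundant.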
[jira] [Commented] (FLINK-2457) Integrate Tuple0
[ https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693595#comment-14693595 ] ASF GitHub Bot commented on FLINK-2457: --- Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130323723 I made the following changes: - added Tuple0 to `TupleGenerator.modifyTupleType()` - changed `LinkedList` to `ArrayList` in TupleGenerator to create code for `TupleX.java` - added methods `toString()`, `equals()`, and `hashCode()` to `Tuple0` (Of course, I ran TupleGenerator to test it.) This PR should be ready for merging now. Integrate Tuple0 Key: FLINK-2457 URL: https://issues.apache.org/jira/browse/FLINK-2457 Project: Flink Issue Type: Improvement Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Minor Tuple0 is not cleanly integrated: - missing serialization/deserialization support in runtime - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., it cannot create an instance of Tuple0 Tuple0 is currently only used in the Python API, but will be integrated into the Storm compatibility layer, too.
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693603#comment-14693603 ] ASF GitHub Bot commented on FLINK-1819: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130324909 Functions also need to extend RichFunction to have access to `open()` and `close()`. I think the two things are different enough that any striving for consistency is actually pretty random. If your thoughts currently revolve around the RuntimeContext, it appears more consistent. If your thoughts are on the life-cycle methods, it seems inconsistent. Random. I think you should go ahead and just call them Rich. It is just a name, and what matters is that the JavaDocs describe what it actually means... Allow access to RuntimeContext from Input and OutputFormats --- Key: FLINK-1819 URL: https://issues.apache.org/jira/browse/FLINK-1819 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 0.9, 0.8.1 Reporter: Fabian Hueske Assignee: Sachin Goel Priority: Minor Fix For: 0.9 User functions that extend a RichFunction can access a {{RuntimeContext}} which gives the parallel id of the task and access to Accumulators and BroadcastVariables. Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.
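The "rich" convention being debated above can be shown with a simplified sketch. These are not Flink's exact classes; the names and the minimal `RuntimeContext` are illustrative, assuming only what the thread describes: the plain interface stays lambda-friendly, while the Rich variant adds life-cycle hooks and runtime-injected context:

```java
// Simplified sketch of the thin-vs-rich function split discussed above.
public class RichFunctionSketch {

    interface MapFunction<I, O> { O map(I value); }  // thin: usable as a lambda

    static class RuntimeContext {
        final int subtaskIndex;                      // e.g. parallel id of the task
        RuntimeContext(int subtaskIndex) { this.subtaskIndex = subtaskIndex; }
    }

    static abstract class RichMapFunction<I, O> implements MapFunction<I, O> {
        private RuntimeContext ctx;
        void setRuntimeContext(RuntimeContext ctx) { this.ctx = ctx; }
        RuntimeContext getRuntimeContext() { return ctx; }
        void open() { }   // life-cycle hooks only the rich variant has
        void close() { }
    }

    static String runExample(int subtaskIndex, int value) {
        RichMapFunction<Integer, String> f = new RichMapFunction<Integer, String>() {
            @Override public String map(Integer v) {
                return "subtask " + getRuntimeContext().subtaskIndex + ": " + v;
            }
        };
        f.setRuntimeContext(new RuntimeContext(subtaskIndex)); // injected by the runtime
        f.open();
        String out = f.map(value);
        f.close();
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runExample(3, 7));
    }
}
```

The sketch shows why the naming argument is "random": the rich variant differs both in having the context and in having the life-cycle methods, so either feature can be taken as the defining one.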
[jira] [Commented] (FLINK-2457) Integrate Tuple0
[ https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693570#comment-14693570 ] ASF GitHub Bot commented on FLINK-2457: --- Github user twalthr commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130320695 You are right, Tuple0 support for the TupleGenerator is not important. Actually, I meant the change of `import java.util.LinkedList;` to `import java.util.ArrayList;`; otherwise these changes get lost if the TupleGenerator is executed. Integrate Tuple0 Key: FLINK-2457 URL: https://issues.apache.org/jira/browse/FLINK-2457 Project: Flink Issue Type: Improvement Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Minor Tuple0 is not cleanly integrated: - missing serialization/deserialization support in runtime - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., it cannot create an instance of Tuple0 Tuple0 is currently only used in the Python API, but will be integrated into the Storm compatibility layer, too.
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130318970 For transformation functions, there is a clear case for thin versus rich, for Java8 lambdas. Input formats are a different game. They are super rich by default anyways.
[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0
Github user twalthr commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130320695 You are right, Tuple0 support for the TupleGenerator is not important. Actually, I meant the change of `import java.util.LinkedList;` to `import java.util.ArrayList;`; otherwise these changes get lost if the TupleGenerator is executed.
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130324909 Functions also need to extend RichFunction to have access to `open()` and `close()`. I think the two things are different enough that any striving for consistency is actually pretty random. If your thoughts currently revolve around the RuntimeContext, it appears more consistent. If your thoughts are on the life-cycle methods, it seems inconsistent. Random. I think you should go ahead and just call them Rich. It is just a name, and what matters is that the JavaDocs describe what it actually means...
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130324770 Rich does not refer to the number of methods but to the fact that it has the RuntimeContext available. All non-rich variants do not get state inserted. This follows a naming convention in Flink. `AbstractInputFormat` might be a more intuitive name for novices, but I'm more inclined towards naming consistency.
[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0
Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130323723 I made the following changes: - added Tuple0 to `TupleGenerator.modifyTupleType()` - changed `LinkedList` to `ArrayList` in TupleGenerator to create code for `TupleX.java` - added methods `toString()`, `equals()`, and `hashCode()` to `Tuple0` (Of course, I ran TupleGenerator to test it.) This PR should be ready for merging now.
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693602#comment-14693602 ] ASF GitHub Bot commented on FLINK-1819: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130324770 Rich does not refer to the number of methods but to the fact that it has the RuntimeContext available. All non-rich variants do not get state inserted. This follows a naming convention in Flink. `AbstractInputFormat` might be a more intuitive name for novices, but I'm more inclined towards naming consistency. Allow access to RuntimeContext from Input and OutputFormats --- Key: FLINK-1819 URL: https://issues.apache.org/jira/browse/FLINK-1819 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 0.9, 0.8.1 Reporter: Fabian Hueske Assignee: Sachin Goel Priority: Minor Fix For: 0.9 User functions that extend a RichFunction can access a {{RuntimeContext}} which gives the parallel id of the task and access to Accumulators and BroadcastVariables. Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.
[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...
Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/1010#issuecomment-130443578 Thanks @StephanEwen merging ...
[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...
Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/1008#issuecomment-130388161 No worries =)
[jira] [Assigned] (FLINK-2501) [py] Remove the need to specify types for transformations
[ https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler reassigned FLINK-2501: --- Assignee: Chesnay Schepler [py] Remove the need to specify types for transformations - Key: FLINK-2501 URL: https://issues.apache.org/jira/browse/FLINK-2501 Project: Flink Issue Type: Improvement Components: Python API Reporter: Chesnay Schepler Assignee: Chesnay Schepler Currently, users of the Python API have to provide type arguments when using a UDF, like so: {code} d1.map(Mapper(), (INT, STRING)) {code} Instead, it would be really convenient to be able to do this: {code} d1.map(Mapper()) {code} The intention behind this issue is convenience, and it's also not really pythonic to specify types. Before I go into possible solutions, let me summarize the way these type arguments are currently used, and in general how types are handled: The type argument passed is actually an object of the type it represents, as INT is a constant int value, whereas STRING is a constant string value. You could as well write the following and it would still work. {code} d1.map(Mapper(), (1, ImNotATypInfo)) {code} This object is transmitted to the java side during the plan binding (and is now an actual Tuple2<Integer, String>), then passed to the type extractor, and the resulting TypeInformation saved in the java counterpart of the udf, which all implement the ResultTypeQueryable interface. The TypeInformation object is only used by the Java API; python never touches it. Instead, at runtime, the serializers used between python and java check the classes of the values passed and are thus generated dynamically. This means that, if a UDF does not pass the type it claims to pass, the Python API won't complain, but the underlying java API will when its serializers fail. Now let's talk solutions.
In discussions on the mailing list, pretty much 2 proposals were made: # Add a way to disable/circumvent type checks during the plan phase in the Java API and generate serializers dynamically. # Have objects always in serialized form on the java side, stored in a single bytearray or a Tuple2 containing a key/value pair. These proposals vary wildly in the changes necessary to the system: # How can we change the Java API to support this? This proposal would hardly change the way the Python API works, or even touch the related source code. It mostly deals with the Java API. Since I'm not too familiar with the Plan processing life-cycle on the java side, I can't assess which classes would have to be changed. # How can we make this work within the limits of the Java API? is the exact opposite: it changes nothing in the Java API. Instead, the following issues would have to be solved: * Alter the plan to extract keys before keyed operations, while hiding these keys from the UDF. This is exactly how KeySelectors (will) work, and as such is generally solved. In fact, this solution would make a few things easier in regards to KeySelectors. * Rework all operations that currently rely on Java API functions that need deserialized data, for example Projections or the upcoming Aggregations. This generally means implementing them in python, or with special Java UDFs (they could de-/serialize data within the udf call, or work on serialized data).
* Change (De)Serializers accordingly * Implement a reliable, not all-memory-consuming sorting mechanism on the python side Personally I prefer the second option, as it # does not modify the Java API, it works within its well-tested limits # Plan changes are similar to issues that are already worked on (KeySelectors) # Sorting implementation was necessary anyway (for chained reducers) # having data in serialized form was a performance-related consideration already While the first option could work, and would most likely require less work, I feel like many of the things required for option 2 will be implemented eventually anyway.
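The dynamic-serializer behaviour described in this issue (serializers chosen from the runtime class of each value, not from a declared type) can be sketched in a few lines. The tag bytes, method names, and supported types below are invented for illustration; this is not the Python API's actual wire format:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch: pick the serializer by inspecting value.getClass() at runtime,
// which is why a wrong declared type only fails when serialization runs.
public class DynamicSerializerSketch {

    static final byte TAG_INT = 0, TAG_STRING = 1;

    static byte[] serialize(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            // dispatch on the actual class, not on a declared type
            if (value instanceof Integer) {
                out.writeByte(TAG_INT);
                out.writeInt((Integer) value);
            } else if (value instanceof String) {
                out.writeByte(TAG_STRING);
                out.writeUTF((String) value);
            } else {
                throw new IllegalArgumentException("no serializer for " + value.getClass());
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // the failure surfaces only at runtime
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(42).length); // tag byte + 4-byte int = 5
        System.out.println(serialize("ab")[0]);   // the string tag
    }
}
```

An unsupported value only fails here, at serialization time, mirroring the observation above that a UDF passing a wrong type is caught by the serializers rather than by the Python API.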
[jira] [Commented] (FLINK-1745) Add exact k-nearest-neighbours algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694338#comment-14694338 ] ASF GitHub Bot commented on FLINK-1745: --- Github user kno10 commented on the pull request: https://github.com/apache/flink/pull/696#issuecomment-130469648 R-trees are hard to parallelize. For distributed and gigabyte-size data, an approximate approach is preferable, like the one we discuss in this article: E. Schubert, A. Zimek, H.-P. Kriegel: Fast and Scalable Outlier Detection with Approximate Nearest Neighbor Ensembles. In Proceedings of the 20th International Conference on Database Systems for Advanced Applications (DASFAA), Hanoi, Vietnam: 19–36, 2015. We discuss an approach that is easy to parallelize. It needs sorting and a sliding window (or blocks), so it is not strict MapReduce, but it should be a good match for Flink. The hardest part is to get the different space-filling curves right and efficient. The other components (random projections to reduce dimensionality, an ensemble to improve quality, and list inversions to also build reverse kNN, which then allows accelerating methods such as LOF) are much easier. The main drawback of most of these kNN-join approaches (including ours) is that they only work with Minkowski norms. There are much more interesting distance functions than that... We also discuss why the space-filling curves appear to give better results for kNN, while LSH etc. work better for radius joins. LSH is another option, but it cannot guarantee to find k neighbors and parameter tuning is tricky. So you may want to have a look at this recent ensemble approach instead.
Add exact k-nearest-neighbours algorithm to machine learning library Key: FLINK-1745 URL: https://issues.apache.org/jira/browse/FLINK-1745 Project: Flink Issue Type: New Feature Components: Machine Learning Library Reporter: Till Rohrmann Assignee: Till Rohrmann Labels: ML, Starter Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial, it is still used as a means to classify data and to do regression. This issue focuses on the implementation of an exact kNN (H-BNLJ, H-BRJ) algorithm as proposed in [2]. Could be a starter task. Resources: [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm] [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]
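The sort-plus-sliding-window idea from the comment above can be shown with a toy version. Real implementations project points onto a space-filling curve; here the "curve" is just the coordinate sum, purely for illustration, and the window size is an arbitrary assumption:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy sketch of approximate kNN: sort by a 1-D projection, then compute
// exact distances only inside a window of nearby candidates.
public class WindowKnnSketch {

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return Math.sqrt(s);
    }

    // Stand-in for a space-filling-curve key; a real implementation would
    // use e.g. a Z-order or Hilbert curve here.
    static double key(double[] p) {
        double s = 0; for (double x : p) s += x; return s;
    }

    static List<double[]> knn(List<double[]> data, double[] q, int k, int window) {
        List<double[]> sorted = new ArrayList<>(data);
        sorted.sort(Comparator.comparingDouble(WindowKnnSketch::key));
        // locate q's position on the curve, then scan only a window around it
        int pos = 0;
        while (pos < sorted.size() && key(sorted.get(pos)) < key(q)) pos++;
        int lo = Math.max(0, pos - window), hi = Math.min(sorted.size(), pos + window);
        List<double[]> cand = new ArrayList<>(sorted.subList(lo, hi));
        cand.sort(Comparator.comparingDouble(p -> dist(p, q)));
        return cand.subList(0, Math.min(k, cand.size()));
    }

    public static void main(String[] args) {
        List<double[]> data = new ArrayList<>();
        for (int i = 0; i < 100; i++) data.add(new double[]{ i, i });
        double[] q = { 50.2, 50.2 };
        System.out.println(knn(data, q, 3, 10).get(0)[0]); // 50.0, the true nearest
    }
}
```

Sorting plus a window scan is what makes the approach parallelize well: the sort is a standard distributed primitive, and the window only needs neighbouring partitions' boundary elements.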
[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1010
[jira] [Updated] (FLINK-2493) Simplify names of example program JARs
[ https://issues.apache.org/jira/browse/FLINK-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenliang updated FLINK-2493: - Description: I find the names of the example JARs a bit annoying. Why not name the file {{examples/ConnectedComponents.jar}} rather than {{examples/flink-java-examples-0.10-SNAPSHOT-ConnectedComponents.jar}} And combine the flink-java-examples and flink-scala-examples projects into one examples project. was: I find the names of the example JARs a bit annoying. Why not name the file {{examples/ConnectedComponents.jar}} rather than {{examples/flink-java-examples-0.10-SNAPSHOT-ConnectedComponents.jar}} Simplify names of example program JARs -- Key: FLINK-2493 URL: https://issues.apache.org/jira/browse/FLINK-2493 Project: Flink Issue Type: Improvement Components: Examples Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: chenliang Priority: Minor Labels: easyfix, starter I find the names of the example JARs a bit annoying. Why not name the file {{examples/ConnectedComponents.jar}} rather than {{examples/flink-java-examples-0.10-SNAPSHOT-ConnectedComponents.jar}} And combine the flink-java-examples and flink-scala-examples projects into one examples project.
[jira] [Commented] (FLINK-1901) Create sample operator for Dataset
[ https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694548#comment-14694548 ] ASF GitHub Bot commented on FLINK-1901: --- Github user ChengXiangLi commented on a diff in the pull request: https://github.com/apache/flink/pull/949#discussion_r36934499 --- Diff: flink-core/src/test/java/org/apache/flink/api/common/operators/util/RandomSamplerTest.java --- @@ -0,0 +1,425 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ */ +package org.apache.flink.api.common.operators.util; + +import com.google.common.base.Preconditions; +import com.google.common.collect.Lists; +import com.google.common.collect.Sets; +import org.apache.commons.math3.stat.inference.KolmogorovSmirnovTest; +import org.junit.BeforeClass; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.Comparator; +import java.util.Iterator; +import java.util.LinkedList; +import java.util.List; +import java.util.Set; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertTrue; + +/** + * This test suite tries to verify whether all the random samplers work as expected, mainly focusing on: + * <ul> + * <li>Does the sampled result fit the input parameters? We check parameters like sample fraction, sample size, + * w/o replacement, and so on.</li> + * <li>Is the sampled result randomly selected? We verify this by measuring how it is distributed over the source data. + * We run a Kolmogorov-Smirnov (KS) test between the random samplers and default reference samplers which are distributed + * well-proportioned over the source data. If a random sampler selects elements randomly from the source, it will be distributed + * well-proportioned over the source data as well. The KS test will then fail to reject the null hypothesis that + * the distributions of the sampling gaps are the same. + * </li> + * </ul> + * + * @see <a href="https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test">Kolmogorov-Smirnov test</a> + */ +public class RandomSamplerTest { + private final static int SOURCE_SIZE = 1; + private static KolmogorovSmirnovTest ksTest; + private static List<Double> source; + private final static int DEFFAULT_PARTITION_NUMBER = 10; + private List<Double>[] sourcePartitions = new List[DEFFAULT_PARTITION_NUMBER]; + + @BeforeClass + public static void init() { + // initiate source data set.
+ source = new ArrayList<Double>(SOURCE_SIZE); + for (int i = 0; i < SOURCE_SIZE; i++) { + source.add((double) i); + } + + ksTest = new KolmogorovSmirnovTest(); + } + + private void initSourcePartition() { + for (int i = 0; i < DEFFAULT_PARTITION_NUMBER; i++) { + sourcePartitions[i] = new LinkedList<Double>(); + } + for (int i = 0; i < SOURCE_SIZE; i++) { + int index = i % DEFFAULT_PARTITION_NUMBER; + sourcePartitions[index].add((double) i); + } + } + + @Test(expected = java.lang.IllegalArgumentException.class) + public void testBernoulliSamplerWithUnexpectedFraction1() { + verifySamplerFraction(-1, false); + } + + @Test(expected = java.lang.IllegalArgumentException.class) + public void testBernoulliSamplerWithUnexpectedFraction2() { + verifySamplerFraction(2, false); + } + + @Test + public void testBernoulliSamplerFraction() { + verifySamplerFraction(0.01, false); + verifySamplerFraction(0.05, false); + verifySamplerFraction(0.1, false); + verifySamplerFraction(0.3, false); + verifySamplerFraction(0.5, false); + verifySamplerFraction(0.854, false); +
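The quoted test suite checks, among other things, that a Bernoulli sampler's output size concentrates around `fraction * n` and that out-of-range fractions are rejected. A minimal sketch of such a sampler (not Flink's actual `BernoulliSampler`, just the idea being tested):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Minimal Bernoulli sampler sketch: keep each element independently with
// probability `fraction`, so the sample size concentrates around fraction * n.
public class BernoulliSamplerSketch {

    static List<Double> sample(List<Double> input, double fraction, Random rnd) {
        if (fraction < 0 || fraction > 1) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        List<Double> out = new ArrayList<>();
        for (double v : input) {
            if (rnd.nextDouble() < fraction) { out.add(v); } // independent coin flip
        }
        return out;
    }

    public static void main(String[] args) {
        List<Double> source = new ArrayList<>();
        for (int i = 0; i < 10000; i++) source.add((double) i);
        int size = sample(source, 0.1, new Random(42)).size();
        // with n = 10000 and p = 0.1, the standard deviation is ~30,
        // so the size is essentially always within a few hundred of 1000
        System.out.println(Math.abs(size - 1000) < 200);
    }
}
```

This mirrors the two assertions visible in the quoted diff: illegal fractions throw `IllegalArgumentException`, and legal fractions yield a sample of roughly the expected size.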
[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException
[ https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694566#comment-14694566 ] ASF GitHub Bot commented on FLINK-2512: --- Github user ffbin commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130509029 @uce @hsaputra Thanks. I have moved the try up and rely on the finally block to close the client. Add client.close() before throw RuntimeException Key: FLINK-2512 URL: https://issues.apache.org/jira/browse/FLINK-2512 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694658#comment-14694658 ] ASF GitHub Bot commented on FLINK-2490: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130524804 @mxm Hi, I fixed the StringBuffer and added the test. Please take a look and check whether it's correct. Thank you! Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h
[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693074#comment-14693074 ] ASF GitHub Bot commented on FLINK-2509: --- Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/1008#discussion_r36834144 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/util/ClassLoaderUtilsTest.java --- @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.util; + +import static org.junit.Assert.*; + +import org.junit.Test; + +import java.io.File; +import java.io.FileOutputStream; +import java.net.URL; +import java.net.URLClassLoader; +import java.util.jar.JarFile; + +/** + * Tests that validate the {@link ClassLoaderUtil}.
+ */ +public class ClassLoaderUtilsTest { + + @Test + public void testWithURLClassLoader() { + File validJar = null; + File invalidJar = null; + + try { + // file with jar contents + validJar = File.createTempFile("flink-url-test", ".tmp"); + JarFileCreator jarFileCreator = new JarFileCreator(validJar); + jarFileCreator.addClass(ClassLoaderUtilsTest.class); + jarFileCreator.createJarFile(); + + // validate that the JAR is correct and the test setup is not broken + try { + new JarFile(validJar.getAbsolutePath()); + } + catch (Exception e) { + e.printStackTrace(); + fail("test setup broken: cannot create a valid jar file"); + } + + // file with some random contents + invalidJar = File.createTempFile("flink-url-test", ".tmp"); + try (FileOutputStream invalidout = new FileOutputStream(invalidJar)) { + invalidout.write(new byte[] { -1, 1, -2, 3, -3, 4, }); + } + + // non existing file + File nonExisting = File.createTempFile("flink-url-test", ".tmp"); + assertTrue("Cannot create and delete temp file", nonExisting.delete()); + + + // create a URL classloader with + // - a HTTP URL + // - a file URL for an existing jar file + // - a file URL for an existing file that is not a jar file + // - a file URL for a non-existing file + + URL[] urls = { + new URL("http", "localhost", 26712, "/some/file/path"), + new URL("file", null, validJar.getAbsolutePath()), + new URL("file", null, invalidJar.getAbsolutePath()), + new URL("file", null, nonExisting.getAbsolutePath()), + }; + + URLClassLoader loader = new URLClassLoader(urls, getClass().getClassLoader()); + String info = ClassLoaderUtil.getUserCodeClassLoaderInfo(loader); + + assertTrue(info.indexOf("/some/file/path") > 0); + assertTrue(info.indexOf(validJar.getAbsolutePath() + "' (valid)") > 0); + assertTrue(info.indexOf(invalidJar.getAbsolutePath() + "' (invalid JAR)") > 0); + assertTrue(info.indexOf(nonExisting.getAbsolutePath() + "' (missing)") > 0); + + System.out.println(info); + } + catch (Exception e) { + e.printStackTrace(); +
[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0
Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130211866 Any news about this PR? @twalthr : Are you going to review it again?
[jira] [Commented] (FLINK-2457) Integrate Tuple0
[ https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693091#comment-14693091 ] ASF GitHub Bot commented on FLINK-2457: --- Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130211866 Any news about this PR? @twalthr : Are you going to review it again? Integrate Tuple0 Key: FLINK-2457 URL: https://issues.apache.org/jira/browse/FLINK-2457 Project: Flink Issue Type: Improvement Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Minor Tuple0 is not cleanly integrated: - missing serialization/deserialization support in runtime - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., it cannot create an instance of Tuple0 Tuple0 is currently only used in the Python API, but will be integrated into the Storm compatibility layer, too.
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693104#comment-14693104 ] ASF GitHub Bot commented on FLINK-2490: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130213092 @HuangWHWHW The `read()` method of the `BufferedReader` object returns `-1` in case the end of the stream has been reached. A couple of things I noticed apart from the `retryForever` issue. I wonder if we can fix these with this pull request as well: 1. The control flow of the `streamFromSocket` function is hard to predict because there are many `while` loops with `break`, `continue`, or `throw` statements. 2. We could use `StringBuilder` instead of `StringBuffer` in this class. `StringBuilder` is faster in the case of single-threaded access. 3. The function reads a single character at a time from the socket. It is more efficient to use a buffer and read several characters at once. @HuangWHWHW You asked how you could count the number of retries in a unit test. Typically, you would insert a `Mock` or a `Spy` into your test method. Unfortunately, this does not work here because the socket variable is overwritten in case of a retry. So for this test, I would recommend creating a local `ServerSocket` and letting the function connect to this socket. You can then control the failures from your test socket. Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130217420 @mxm Hi, thank you for the suggestions. I will try to follow them and improve the test. I understand most of your points, and I have also read the class documentation of BufferedReader.read(). When I was doing the test, I found that BufferedReader.read() never returns until it reads the next char from the socket server, or throws an Exception when the socket is closed. Returning -1 from BufferedReader.read() seems to work only for text files, not for socket messages. I also found advice online to call Socket.setSoTimeout() so that BufferedReader.read() returns, but that is not satisfying either, since it throws an exception. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2437] Fix default constructor detection...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/960#issuecomment-130219402 Your changes look good to merge. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection
[ https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693141#comment-14693141 ] ASF GitHub Bot commented on FLINK-2437: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/960#issuecomment-130219402 Your changes look good to merge. TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437 Project: Flink Issue Type: Bug Components: Type Serialization System Reporter: Gabor Gevay Assignee: Gabor Gevay Priority: Minor If a class does have a default constructor, but the user forgot to make it public, then TypeExtractor.analyzePojo still thinks everything is OK, so it creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up. Furthermore, a return null seems to be missing from the then case of the if after catching the NoSuchMethodException which would also cause a headache for PojoSerializer. An additional minor issue is that the word class is printed twice in several places, because class.toString also prepends it to the class name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
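The failure mode described in this issue can be reproduced with plain reflection, without any Flink classes. The sketch below (class and method names are invented for illustration) shows why `getDeclaredConstructor()` alone is not a sufficient check: it also finds non-public constructors, which is exactly what fooled `TypeExtractor.analyzePojo`.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

public class CtorDemo {
    // A class whose no-arg constructor exists but is not public.
    static class NonPublicCtor {
        private NonPublicCtor() {} // present, but not public
    }

    // Hypothetical helper: returns true only if clazz has a PUBLIC no-arg constructor,
    // i.e. the check analyzePojo should have performed.
    public static boolean hasAccessibleNoArgCtor(Class<?> clazz) {
        try {
            Constructor<?> c = clazz.getDeclaredConstructor();
            return Modifier.isPublic(c.getModifiers());
        } catch (NoSuchMethodException e) {
            return false; // no no-arg constructor at all
        }
    }

    public static void main(String[] args) throws Exception {
        // getDeclaredConstructor() succeeds even though the ctor is private,
        // so a naive "does it throw?" check reports everything is fine...
        NonPublicCtor.class.getDeclaredConstructor();
        // ...but the constructor is not usable as a public default constructor,
        // which is where a serializer relying on it would blow up later.
        System.out.println(hasAccessibleNoArgCtor(NonPublicCtor.class));
    }
}
```

This is only an illustration of the reflection behavior, not the actual Flink fix; the PR for this issue adds the modifier check directly in `analyzePojo`.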
[jira] [Commented] (FLINK-2077) Rework Path class and add extend support for Windows paths
[ https://issues.apache.org/jira/browse/FLINK-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693036#comment-14693036 ] GaoLun commented on FLINK-2077: --- Hi, Fabian. What do you mean by 'path like //host/dir1/dir2'? Within dir1 or dir2 there may be several '/'. How do we pick out dir1 and dir2 using the slash '/'? Rework Path class and add extend support for Windows paths -- Key: FLINK-2077 URL: https://issues.apache.org/jira/browse/FLINK-2077 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: GaoLun Priority: Minor Labels: starter The class {{org.apache.flink.core.fs.Path}} handles paths for Flink's {{FileInputFormat}} and {{FileOutputFormat}}. Over time, this class has become quite hard to read and modify. It would benefit from some cleaning and refactoring. Along with the refactoring, support for Windows paths like {{//host/dir1/dir2}} could be added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2512]Add client.close() before throw Ru...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130205016 I think the best thing would be to just move the `try` up a little. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1008 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693086#comment-14693086 ] ASF GitHub Bot commented on FLINK-2509: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1008 Improve error messages when user code classes are not found --- Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.10 When a job fails because the user code classes are not found, we should add some information about the class loader and class path into the exception message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2509. - Resolution: Implemented Implemented in eeec1912b478ed43a045449d82e0a2fd3700d720 Improve error messages when user code classes are not found --- Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.10 When a job fails because the user code classes are not found, we should add some information about the class loader and class path into the exception message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException
[ https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693070#comment-14693070 ] ASF GitHub Bot commented on FLINK-2512: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130205016 I think the best thing would be to just move the `try` up a little. Add client.close() before throw RuntimeException Key: FLINK-2512 URL: https://issues.apache.org/jira/browse/FLINK-2512 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2509. --- Improve error messages when user code classes are not found --- Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.10 When a job fails because the user code classes are not found, we should add some information about the class loader and class path into the exception message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130213092 @HuangWHWHW The `read()` method of the `BufferedReader` object returns `-1` in case the end of the stream has been reached. A couple of things I noticed apart from the `retryForever` issue. I wonder if we can fix these with this pull request as well: 1. The control flow of the `streamFromSocket` function is hard to predict because there are many `while` loops with `break`, `continue`, or `throw` statements. 2. We could use `StringBuilder` instead of `StringBuffer` in this class. `StringBuilder` is faster in the case of single-threaded access. 3. The function reads a single character at a time from the socket. It is more efficient to use a buffer and read several characters at once. @HuangWHWHW You asked how you could count the number of retries in a unit test. Typically, you would insert a `Mock` or a `Spy` into your test method. Unfortunately, this does not work here because the socket variable is overwritten in case of a retry. So for this test, I would recommend creating a local `ServerSocket` and letting the function connect to this socket. You can then control the failures from your test socket. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
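The local-`ServerSocket` approach suggested above can be sketched roughly as follows. This is a hypothetical test helper (all names invented, not Flink code): it accepts connections on an ephemeral port, counts them, and drops each one immediately, so a reconnecting client's retries can be counted without mocks.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for a retry test: counts incoming connections and
// closes each one right away, forcing the client under test to retry.
public class RetryCountingServer implements AutoCloseable {
    private final ServerSocket server;
    private final AtomicInteger connections = new AtomicInteger();
    private final Thread acceptor;

    public RetryCountingServer() throws IOException {
        server = new ServerSocket(0); // port 0: let the OS pick a free port
        acceptor = new Thread(() -> {
            try {
                while (true) {
                    Socket s = server.accept();
                    connections.incrementAndGet();
                    s.close(); // drop the connection immediately to trigger a retry
                }
            } catch (IOException ignored) {
                // server socket closed -> stop accepting
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
    }

    public int getPort() { return server.getLocalPort(); }

    public int getConnectionCount() { return connections.get(); }

    @Override
    public void close() throws IOException { server.close(); }
}
```

A test would point the source function (or any reconnecting client) at `getPort()` and assert on `getConnectionCount()` afterwards; closing the server lets the test also exercise the "could not reconnect" path.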
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130222078 Actually point 3 is not so bad because we're using a buffered reader that fills the buffer and does not read a character from the socket on every call to `read()`. The `read()` method may throw an Exception or return -1. So we need to handle both of these cases. If closed properly, the socket should send the EOF event and the `read()` method returns -1. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
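The clean-close case described above, where the peer sends EOF and `read()` returns `-1` rather than throwing, can be seen with a small self-contained demo (names invented; plain JDK sockets, no Flink code). An abortive close is the other case and surfaces as an `IOException`, so a robust reader must handle both.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;

public class EofDemo {
    // Reads until end of stream; read() returning -1 is the EOF signal.
    public static int readAll(BufferedReader reader, StringBuilder out) throws IOException {
        int data;
        int count = 0;
        while ((data = reader.read()) != -1) { // -1 only after the peer closes cleanly
            out.append((char) data);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            try (Socket client = new Socket("localhost", server.getLocalPort());
                 Socket accepted = server.accept()) {
                // server writes a few bytes, then closes -> client sees EOF, not an exception
                accepted.getOutputStream().write("hello".getBytes("UTF-8"));
                accepted.close();
                BufferedReader reader =
                        new BufferedReader(new InputStreamReader(client.getInputStream(), "UTF-8"));
                StringBuilder sb = new StringBuilder();
                int n = readAll(reader, sb);
                System.out.println(sb + " (" + n + " chars, then -1)");
            }
        }
    }
}
```

This also answers the earlier question in the thread: `read()` does return `-1` over a socket, but only once the remote side actually closes the connection.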
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693076#comment-14693076 ] ASF GitHub Bot commented on FLINK-2490: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130206658 @mxm @StephanEwen Hi, I did a test for this today and found another problem. SocketTextStreamFunction uses BufferedReader.read() to get the buffer sent by the socket server. Will BufferedReader.read() ever return -1 at the end of the sent message? If not, there is another bug: the following code will never be reachable:

    if (data == -1) {
        socket.close();
        long retry = 0;
        boolean success = false;
        while ((retry < maxRetry || retryForever) && !success) {
            if (!retryForever) {
                retry++;
            }
            LOG.warn("Lost connection to server socket. Retrying in "
                    + (CONNECTION_RETRY_SLEEP / 1000) + " seconds...");
            try {
                socket = new Socket();
                socket.connect(new InetSocketAddress(hostname, port), CONNECTION_TIMEOUT_TIME);
                success = true;
            } catch (ConnectException ce) {
                Thread.sleep(CONNECTION_RETRY_SLEEP);
                socket.close();
            }
        }
        if (success) {
            LOG.info("Server socket is reconnected.");
        } else {
            LOG.error("Could not reconnect to server socket.");
            break;
        }
        reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        continue;
    }

Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693127#comment-14693127 ] ASF GitHub Bot commented on FLINK-2490: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130217420 @mxm Hi, thank you for the suggestions. I will try to follow them and improve the test. I understand most of your points, and I have also read the class documentation of BufferedReader.read(). When I was doing the test, I found that BufferedReader.read() never returns until it reads the next char from the socket server, or throws an Exception when the socket is closed. Returning -1 from BufferedReader.read() seems to work only for text files, not for socket messages. I also found advice online to call Socket.setSoTimeout() so that BufferedReader.read() returns, but that is not satisfying either, since it throws an exception. Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
[ https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693147#comment-14693147 ] ASF GitHub Bot commented on FLINK-2507: --- Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/1007#issuecomment-130219835 Will merge this... Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector -- Key: FLINK-2507 URL: https://issues.apache.org/jira/browse/FLINK-2507 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/1007#issuecomment-130219835 Will merge this... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130206658 @mxm @StephanEwen Hi, I did a test for this today and found another problem. SocketTextStreamFunction uses BufferedReader.read() to get the buffer sent by the socket server. Will BufferedReader.read() ever return -1 at the end of the sent message? If not, there is another bug: the following code will never be reachable:

    if (data == -1) {
        socket.close();
        long retry = 0;
        boolean success = false;
        while ((retry < maxRetry || retryForever) && !success) {
            if (!retryForever) {
                retry++;
            }
            LOG.warn("Lost connection to server socket. Retrying in "
                    + (CONNECTION_RETRY_SLEEP / 1000) + " seconds...");
            try {
                socket = new Socket();
                socket.connect(new InetSocketAddress(hostname, port), CONNECTION_TIMEOUT_TIME);
                success = true;
            } catch (ConnectException ce) {
                Thread.sleep(CONNECTION_RETRY_SLEEP);
                socket.close();
            }
        }
        if (success) {
            LOG.info("Server socket is reconnected.");
        } else {
            LOG.error("Could not reconnect to server socket.");
            break;
        }
        reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        continue;
    }

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...
Github user uce commented on the pull request: https://github.com/apache/flink/pull/1008#issuecomment-130206616 Very nice addition! :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693075#comment-14693075 ] ASF GitHub Bot commented on FLINK-2509: --- Github user uce commented on the pull request: https://github.com/apache/flink/pull/1008#issuecomment-130206616 Very nice addition! :) Improve error messages when user code classes are not found --- Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.10 When a job fails because the user code classes are not found, we should add some information about the class loader and class path into the exception message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...
Github user uce commented on a diff in the pull request: https://github.com/apache/flink/pull/1008#discussion_r36834144 --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/util/ClassLoaderUtilsTest.java ---
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.util;
+
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import java.io.File;
+import java.io.FileOutputStream;
+import java.net.URL;
+import java.net.URLClassLoader;
+import java.util.jar.JarFile;
+
+/**
+ * Tests that validate the {@link ClassLoaderUtil}.
+ */
+public class ClassLoaderUtilsTest {
+
+	@Test
+	public void testWithURLClassLoader() {
+		File validJar = null;
+		File invalidJar = null;
+
+		try {
+			// file with jar contents
+			validJar = File.createTempFile("flink-url-test", ".tmp");
+			JarFileCreator jarFileCreator = new JarFileCreator(validJar);
+			jarFileCreator.addClass(ClassLoaderUtilsTest.class);
+			jarFileCreator.createJarFile();
+
+			// validate that the JAR is correct and the test setup is not broken
+			try {
+				new JarFile(validJar.getAbsolutePath());
+			}
+			catch (Exception e) {
+				e.printStackTrace();
+				fail("test setup broken: cannot create a valid jar file");
+			}
+
+			// file with some random contents
+			invalidJar = File.createTempFile("flink-url-test", ".tmp");
+			try (FileOutputStream invalidout = new FileOutputStream(invalidJar)) {
+				invalidout.write(new byte[] { -1, 1, -2, 3, -3, 4, });
+			}
+
+			// non existing file
+			File nonExisting = File.createTempFile("flink-url-test", ".tmp");
+			assertTrue("Cannot create and delete temp file", nonExisting.delete());
+
+
+			// create a URL classloader with
+			// - a HTTP URL
+			// - a file URL for an existing jar file
+			// - a file URL for an existing file that is not a jar file
+			// - a file URL for a non-existing file
+
+			URL[] urls = {
+				new URL("http", "localhost", 26712, "/some/file/path"),
+				new URL("file", null, validJar.getAbsolutePath()),
+				new URL("file", null, invalidJar.getAbsolutePath()),
+				new URL("file", null, nonExisting.getAbsolutePath()),
+			};
+
+			URLClassLoader loader = new URLClassLoader(urls, getClass().getClassLoader());
+			String info = ClassLoaderUtil.getUserCodeClassLoaderInfo(loader);
+
+			assertTrue(info.indexOf("/some/file/path") > 0);
+			assertTrue(info.indexOf(validJar.getAbsolutePath() + "' (valid") > 0);
+			assertTrue(info.indexOf(invalidJar.getAbsolutePath() + "' (invalid JAR") > 0);
+			assertTrue(info.indexOf(nonExisting.getAbsolutePath() + "' (missing") > 0);
+
+			System.out.println(info);
+		}
+		catch (Exception e) {
+			e.printStackTrace();
+			fail(e.getMessage());
+		}
+		finally {
+			if (validJar != null) {
+				//noinspection ResultOfMethodCallIgnored
+				validJar.delete();
+
[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection
[ https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693133#comment-14693133 ] ASF GitHub Bot commented on FLINK-2437: --- Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/960#discussion_r36837638 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java ---
@@ -1328,24 +1329,29 @@ else if(typeHierarchy.size() <= 1) {
 		List<Method> methods = getAllDeclaredMethods(clazz);
 		for (Method method : methods) {
 			if (method.getName().equals("readObject") || method.getName().equals("writeObject")) {
-				LOG.info("Class "+clazz+" contains custom serialization methods we do not call.");
+				LOG.info(clazz+" contains custom serialization methods we do not call.");
 				return null;
 			}
 		}

 		// Try retrieving the default constructor, if it does not have one
 		// we cannot use this because the serializer uses it.
+		Constructor defaultConstructor = null;
 		try {
-			clazz.getDeclaredConstructor();
+			defaultConstructor = clazz.getDeclaredConstructor();
 		} catch (NoSuchMethodException e) {
 			if (clazz.isInterface() || Modifier.isAbstract(clazz.getModifiers())) {
-				LOG.info("Class " + clazz + " is abstract or an interface, having a concrete " +
+				LOG.info(clazz + " is abstract or an interface, having a concrete " +
 						"type can increase performance.");
 			} else {
-				LOG.info("Class " + clazz + " must have a default constructor to be used as a POJO.");
+				LOG.info(clazz + " must have a default constructor to be used as a POJO.");
 				return null;
 			}
 		}
+		if(defaultConstructor != null && (defaultConstructor.getModifiers() & Modifier.PUBLIC) == 0) {
--- End diff --
`if(defaultConstructor != null && Modifier.isPublic(defaultConstructor.getModifiers())` seems to be more readable to me but your approach is fine too.
TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437 Project: Flink Issue Type: Bug Components: Type Serialization System Reporter: Gabor Gevay Assignee: Gabor Gevay Priority: Minor If a class does have a default constructor, but the user forgot to make it public, then TypeExtractor.analyzePojo still thinks everything is OK, so it creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up. Furthermore, a return null seems to be missing from the then case of the if after catching the NoSuchMethodException which would also cause a headache for PojoSerializer. An additional minor issue is that the word class is printed twice in several places, because class.toString also prepends it to the class name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2437] Fix default constructor detection...
Github user mxm commented on a diff in the pull request: https://github.com/apache/flink/pull/960#discussion_r36837638 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java ---
@@ -1328,24 +1329,29 @@ else if(typeHierarchy.size() <= 1) {
 		List<Method> methods = getAllDeclaredMethods(clazz);
 		for (Method method : methods) {
 			if (method.getName().equals("readObject") || method.getName().equals("writeObject")) {
-				LOG.info("Class "+clazz+" contains custom serialization methods we do not call.");
+				LOG.info(clazz+" contains custom serialization methods we do not call.");
 				return null;
 			}
 		}

 		// Try retrieving the default constructor, if it does not have one
 		// we cannot use this because the serializer uses it.
+		Constructor defaultConstructor = null;
 		try {
-			clazz.getDeclaredConstructor();
+			defaultConstructor = clazz.getDeclaredConstructor();
 		} catch (NoSuchMethodException e) {
 			if (clazz.isInterface() || Modifier.isAbstract(clazz.getModifiers())) {
-				LOG.info("Class " + clazz + " is abstract or an interface, having a concrete " +
+				LOG.info(clazz + " is abstract or an interface, having a concrete " +
 						"type can increase performance.");
 			} else {
-				LOG.info("Class " + clazz + " must have a default constructor to be used as a POJO.");
+				LOG.info(clazz + " must have a default constructor to be used as a POJO.");
 				return null;
 			}
 		}
+		if(defaultConstructor != null && (defaultConstructor.getModifiers() & Modifier.PUBLIC) == 0) {
--- End diff --
`if(defaultConstructor != null && Modifier.isPublic(defaultConstructor.getModifiers())` seems to be more readable to me but your approach is fine too. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
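The two modifier checks discussed in this review are equivalent: `(mods & Modifier.PUBLIC) == 0` is the same test as `!Modifier.isPublic(mods)` (note the quoted one-liner in the review omits the negation). A tiny standalone check, with invented class and method names, not Flink code:

```java
import java.lang.reflect.Modifier;

public class ModifierCheck {
    // Bitmask form, as used in the PR under review.
    static boolean isNonPublicByMask(int mods) {
        return (mods & Modifier.PUBLIC) == 0;
    }

    // Helper form suggested in the review (with the negation made explicit).
    static boolean isNonPublicByHelper(int mods) {
        return !Modifier.isPublic(mods);
    }

    public static void main(String[] args) throws Exception {
        // String has a public no-arg constructor, so both checks report "public".
        int mods = String.class.getDeclaredConstructor().getModifiers();
        System.out.println(isNonPublicByMask(mods) + " " + isNonPublicByHelper(mods));
    }
}
```

Since both forms behave identically, the choice is purely about readability, which matches mxm's "your approach is fine too" remark.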
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130228505 Hi, there are two more questions: 1. In using StringBuilder, does it mean that we should use BufferedReader.readLine() instead of BufferedReader.read()? 2. Could you tell me how to make BufferedReader.read() return -1? I tried many ways and they all failed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693183#comment-14693183 ] ASF GitHub Bot commented on FLINK-2490: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130228505 Hi, there are two more questions: 1. In using StringBuilder, does it mean that we should use BufferedReader.readLine() instead of BufferedReader.read()? 2. Could you tell me how to make BufferedReader.read() return -1? I tried many ways and they all failed. Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
[ https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Márton Balassi closed FLINK-2507. - Resolution: Fixed Fix Version/s: 0.10 Via 54311aa. Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector -- Key: FLINK-2507 URL: https://issues.apache.org/jira/browse/FLINK-2507 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor Fix For: 0.10 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1007 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-1901) Create sample operator for Dataset
[ https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693288#comment-14693288 ] ASF GitHub Bot commented on FLINK-1901: --- Github user thvasilo commented on a diff in the pull request: https://github.com/apache/flink/pull/949#discussion_r36844879 --- Diff: flink-core/src/test/java/org/apache/flink/api/common/operators/util/RandomSamplerTest.java --- @@ -0,0 +1,425 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.flink.api.common.operators.util; + +import com.google.common.base.Preconditions; +import com.google.common.collect.Lists; +import com.google.common.collect.Sets; +import org.apache.commons.math3.stat.inference.KolmogorovSmirnovTest; +import org.junit.BeforeClass; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.Comparator; +import java.util.Iterator; +import java.util.LinkedList; +import java.util.List; +import java.util.Set; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertTrue; + +/** + * This test suite try to verify whether all the random samplers work as we expected, which mainly focus on: + * <ul> + * <li>Does sampled result fit into input parameters? we check parameters like sample fraction, sample size, + * w/o replacement, and so on.</li> + * <li>Does sampled result randomly selected? we verify this by measure how much does it distributed on source data. + * Run Kolmogorov-Smirnov (KS) test between the random samplers and default reference samplers which is distributed + * well-proportioned on source data. If random sampler select elements randomly from source, it would distributed + * well-proportioned on source data as well. The KS test will fail to strongly reject the null hypothesis that + * the distributions of sampling gaps are the same. + * </li> + * </ul> + * + * @see <a href="https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test">Kolmogorov Smirnov test</a> + */ +public class RandomSamplerTest { + private final static int SOURCE_SIZE = 1; + private static KolmogorovSmirnovTest ksTest; + private static List<Double> source; + private final static int DEFFAULT_PARTITION_NUMBER=10; + private List<Double>[] sourcePartitions = new List[DEFFAULT_PARTITION_NUMBER]; + + @BeforeClass + public static void init() { + // initiate source data set. 
+ source = new ArrayList<Double>(SOURCE_SIZE); + for (int i = 0; i < SOURCE_SIZE; i++) { + source.add((double) i); + } + + ksTest = new KolmogorovSmirnovTest(); + } + + private void initSourcePartition() { + for (int i=0; i<DEFFAULT_PARTITION_NUMBER; i++) { + sourcePartitions[i] = new LinkedList<Double>(); + } + for (int i = 0; i < SOURCE_SIZE; i++) { + int index = i % DEFFAULT_PARTITION_NUMBER; + sourcePartitions[index].add((double)i); + } + } + + @Test(expected = java.lang.IllegalArgumentException.class) + public void testBernoulliSamplerWithUnexpectedFraction1() { + verifySamplerFraction(-1, false); + } + + @Test(expected = java.lang.IllegalArgumentException.class) + public void testBernoulliSamplerWithUnexpectedFraction2() { + verifySamplerFraction(2, false); + } + + @Test + public void testBernoulliSamplerFraction() { + verifySamplerFraction(0.01, false); + verifySamplerFraction(0.05, false); + verifySamplerFraction(0.1, false); + verifySamplerFraction(0.3, false); + verifySamplerFraction(0.5, false); + verifySamplerFraction(0.854, false); +
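The two-sample Kolmogorov-Smirnov statistic that the test's javadoc describes (and delegates to commons-math3's `KolmogorovSmirnovTest`) is simply the maximum gap between the two empirical CDFs. The dependency-free sketch below illustrates the idea only; it is not code from the PR.

```java
import java.util.Arrays;

public class KsSketch {
    // Two-sample KS statistic: the maximum vertical distance between the
    // empirical CDFs of the two samples. Equal values are consumed together
    // so that identical samples yield a statistic of exactly 0.
    static double ksStatistic(double[] a, double[] b) {
        double[] x = a.clone(), y = b.clone();
        Arrays.sort(x);
        Arrays.sort(y);
        double d = 0.0;
        int i = 0, j = 0;
        while (i < x.length && j < y.length) {
            double v = Math.min(x[i], y[j]);
            while (i < x.length && x[i] == v) i++; // advance past ties in x
            while (j < y.length && y[j] == v) j++; // advance past ties in y
            d = Math.max(d, Math.abs((double) i / x.length - (double) j / y.length));
        }
        return d;
    }

    public static void main(String[] args) {
        double[] same = {1, 2, 3, 4, 5};
        double[] shifted = {11, 12, 13, 14, 15};
        System.out.println(ksStatistic(same, same.clone())); // 0.0: identical samples
        System.out.println(ksStatistic(same, shifted));      // 1.0: disjoint samples
    }
}
```

A small statistic means the test "fails to reject" the hypothesis that the sampler's gap distribution matches the reference sampler's, which is exactly the acceptance criterion the javadoc states.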
[jira] [Commented] (FLINK-2491) Operators are not participating in state checkpointing in some cases
[ https://issues.apache.org/jira/browse/FLINK-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693304#comment-14693304 ] Márton Balassi commented on FLINK-2491: --- This is troublesome, when setting log level to debug it shows that the `StreamTask` never calls a checkpoint on the sink. I am looking into it. Operators are not participating in state checkpointing in some cases Key: FLINK-2491 URL: https://issues.apache.org/jira/browse/FLINK-2491 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Robert Metzger Assignee: Márton Balassi Priority: Critical Fix For: 0.10 While implementing a test case for the Kafka Consumer, I came across the following bug: Consider the following topology, with the operator parallelism in parentheses: Source (2) -- Sink (1). In this setup, the {{snapshotState()}} method is called on the source, but not on the Sink. The sink receives the generated data. The only one of the two sources is generating data. I've implemented a test case for this, you can find it here: https://github.com/rmetzger/flink/blob/para_checkpoint_bug/flink-tests/src/test/java/org/apache/flink/test/checkpointing/ParallelismChangeCheckpoinedITCase.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
[ https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693214#comment-14693214 ] ASF GitHub Bot commented on FLINK-2507: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1007 Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector -- Key: FLINK-2507 URL: https://issues.apache.org/jira/browse/FLINK-2507 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection
[ https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated FLINK-2437: -- Affects Version/s: 0.9.0 0.10 TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437 Project: Flink Issue Type: Bug Components: Type Serialization System Affects Versions: 0.10, 0.9.0 Reporter: Gabor Gevay Assignee: Gabor Gevay Priority: Minor Fix For: 0.10, 0.9.1 If a class does have a default constructor, but the user forgot to make it public, then TypeExtractor.analyzePojo still thinks everything is OK, so it creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up. Furthermore, a return null seems to be missing from the then case of the if after catching the NoSuchMethodException which would also cause a headache for PojoSerializer. An additional minor issue is that the word class is printed twice in several places, because class.toString also prepends it to the class name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
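The failure mode described in the issue, a default constructor that exists but is not public, can be reproduced with a few lines of reflection. This is an illustrative sketch of the kind of check the fix needs, not Flink's actual TypeExtractor code; `Broken` and `hasPublicDefaultConstructor` are made-up names.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;

public class PojoCheck {
    // Hypothetical POJO whose no-arg constructor exists but is package-private:
    // reflection finds it, yet a serializer in another package cannot invoke it.
    public static class Broken {
        public int field;
        Broken() {} // not public -- the case analyzePojo missed
    }

    // The constructor must both exist and be public for the class to be
    // safely treated as a POJO.
    static boolean hasPublicDefaultConstructor(Class<?> clazz) {
        try {
            Constructor<?> ctor = clazz.getDeclaredConstructor();
            return Modifier.isPublic(ctor.getModifiers());
        } catch (NoSuchMethodException e) {
            return false; // no default constructor at all
        }
    }

    public static void main(String[] args) {
        System.out.println(hasPublicDefaultConstructor(Broken.class)); // false
        System.out.println(hasPublicDefaultConstructor(String.class)); // true
    }
}
```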
[GitHub] flink pull request: [FLINK-2437] Fix default constructor detection...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/960#issuecomment-130251642 Thanks for your contribution!
[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection
[ https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693298#comment-14693298 ] ASF GitHub Bot commented on FLINK-2437: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/960#issuecomment-130251642 Thanks for your contribution! TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437 Project: Flink Issue Type: Bug Components: Type Serialization System Affects Versions: 0.10, 0.9.0 Reporter: Gabor Gevay Assignee: Gabor Gevay Priority: Minor Fix For: 0.10, 0.9.1 If a class does have a default constructor, but the user forgot to make it public, then TypeExtractor.analyzePojo still thinks everything is OK, so it creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up. Furthermore, a return null seems to be missing from the then case of the if after catching the NoSuchMethodException which would also cause a headache for PojoSerializer. An additional minor issue is that the word class is printed twice in several places, because class.toString also prepends it to the class name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection
[ https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels closed FLINK-2437. - Resolution: Fixed TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437 Project: Flink Issue Type: Bug Components: Type Serialization System Affects Versions: 0.10, 0.9.0 Reporter: Gabor Gevay Assignee: Gabor Gevay Priority: Minor Fix For: 0.10, 0.9.1 If a class does have a default constructor, but the user forgot to make it public, then TypeExtractor.analyzePojo still thinks everything is OK, so it creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up. Furthermore, a return null seems to be missing from the then case of the if after catching the NoSuchMethodException which would also cause a headache for PojoSerializer. An additional minor issue is that the word class is printed twice in several places, because class.toString also prepends it to the class name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException
[ https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693199#comment-14693199 ] ASF GitHub Bot commented on FLINK-2512: --- Github user ffbin commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130233136 @uce Thanks. What about removing if(client.getTopologyJobId(name) != null) {...} in line 103? submitTopologyWithOpts() already checks it at the head of the function and will throw AlreadyAliveException. Then in finally{} close the client. Add client.close() before throw RuntimeException Key: FLINK-2512 URL: https://issues.apache.org/jira/browse/FLINK-2512 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2512]Add client.close() before throw Ru...
Github user ffbin commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130233136 @uce Thanks. What about removing if(client.getTopologyJobId(name) != null) {...} in line 103? submitTopologyWithOpts() already checks it at the head of the function and will throw AlreadyAliveException. Then in finally{} close the client.
[jira] [Commented] (FLINK-2399) Fail when actor versions don't match
[ https://issues.apache.org/jira/browse/FLINK-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693235#comment-14693235 ] ASF GitHub Bot commented on FLINK-2399: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/945#issuecomment-130241985 I've already added a version check between `JobClient` and `JobManager`. Will there be any further review of this? Fail when actor versions don't match Key: FLINK-2399 URL: https://issues.apache.org/jira/browse/FLINK-2399 Project: Flink Issue Type: Improvement Components: JobManager, TaskManager Affects Versions: 0.9, master Reporter: Ufuk Celebi Assignee: Sachin Goel Priority: Minor Fix For: 0.10 Problem: there can be subtle errors when actors from different Flink versions communicate with each other, for example when an old client (e.g. Flink 0.9) communicates with a new JobManager (e.g. Flink 0.10-SNAPSHOT). We can check that the versions match on first communication between the actors and fail if they don't match. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
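The check the issue asks for amounts to comparing version strings on first contact and failing fast on a mismatch. A minimal sketch with illustrative names only; the real check lives in the JobClient/JobManager actor messages, not in a helper like this:

```java
public class VersionCheck {
    // On first communication, each actor sends its version string and the
    // peer refuses to proceed on a mismatch, instead of failing later with
    // subtle serialization or protocol errors.
    static void verifyPeerVersion(String localVersion, String peerVersion) {
        if (!localVersion.equals(peerVersion)) {
            throw new IllegalStateException(
                    "Version mismatch: local " + localVersion + " vs peer " + peerVersion);
        }
    }

    public static void main(String[] args) {
        verifyPeerVersion("0.10-SNAPSHOT", "0.10-SNAPSHOT"); // ok, nothing happens
        try {
            verifyPeerVersion("0.10-SNAPSHOT", "0.9"); // old client, new JobManager
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```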
[jira] [Commented] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment
[ https://issues.apache.org/jira/browse/FLINK-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693282#comment-14693282 ] Márton Balassi commented on FLINK-2508: --- Thanks for spotting this, I think the purpose of having the current environment cached was to be able to set the `TestStreamEnvironment` as context, but the current state of the code seems a bit messy. Confusing sharing of StreamExecutionEnvironment --- Key: FLINK-2508 URL: https://issues.apache.org/jira/browse/FLINK-2508 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Stephan Ewen Fix For: 0.10 In the {{StreamExecutionEnvironment}}, the environment is once created and then shared with a static variable to all successive calls to {{getExecutionEnvironment()}}. But it can be overridden by calls to {{createLocalEnvironment()}} and {{createRemoteEnvironment()}}. This seems a bit un-intuitive, and probably creates confusion when dispatching multiple streaming jobs from within the same JVM. Why is it even necessary to cache the current execution environment? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130237642 > 1. In using StringBuilder, does it mean that we should use BufferedReader.readLine() instead of BufferedReader.read()? Reading by character is the way to go if we use a custom `delimiter`. If our delimiter was `\n` then it would be ok to read entire lines. > 2. Could you tell me how to make the BufferedReader.read() return -1? I tried many ways that all failed. Ok :) Here is a minimal working example where `read()` returns `-1`: ```java public static void main(String[] args) throws IOException { ServerSocket socket = new ServerSocket(12345); final SocketAddress socketAddress = socket.getLocalSocketAddress(); new Thread(new Runnable() { @Override public void run() { Socket socket = new Socket(); try { socket.connect(socketAddress); } catch (IOException e) { e.printStackTrace(); } try { BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream())); System.out.println((bufferedReader.read())); } catch (IOException e) { e.printStackTrace(); } } }).start(); Socket channel = socket.accept(); channel.close(); } ``` Output: ``` -1 ```
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693224#comment-14693224 ] ASF GitHub Bot commented on FLINK-2490: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130237642 > 1. In using StringBuilder, does it mean that we should use BufferedReader.readLine() instead of BufferedReader.read()? Reading by character is the way to go if we use a custom `delimiter`. If our delimiter was `\n` then it would be ok to read entire lines. > 2. Could you tell me how to make the BufferedReader.read() return -1? I tried many ways that all failed. Ok :) Here is a minimal working example where `read()` returns `-1`: ```java public static void main(String[] args) throws IOException { ServerSocket socket = new ServerSocket(12345); final SocketAddress socketAddress = socket.getLocalSocketAddress(); new Thread(new Runnable() { @Override public void run() { Socket socket = new Socket(); try { socket.connect(socketAddress); } catch (IOException e) { e.printStackTrace(); } try { BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream())); System.out.println((bufferedReader.read())); } catch (IOException e) { e.printStackTrace(); } } }).start(); Socket channel = socket.accept(); channel.close(); } ``` Output: ``` -1 ``` Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection
[ https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693245#comment-14693245 ] ASF GitHub Bot commented on FLINK-2437: --- Github user ggevay commented on a diff in the pull request: https://github.com/apache/flink/pull/960#discussion_r36843332 --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/TypeExtractor.java --- @@ -1328,24 +1329,29 @@ else if(typeHierarchy.size() <= 1) { List<Method> methods = getAllDeclaredMethods(clazz); for (Method method : methods) { if (method.getName().equals("readObject") || method.getName().equals("writeObject")) { - LOG.info("Class " + clazz + " contains custom serialization methods we do not call."); + LOG.info(clazz + " contains custom serialization methods we do not call."); return null; } } // Try retrieving the default constructor, if it does not have one // we cannot use this because the serializer uses it. + Constructor defaultConstructor = null; try { - clazz.getDeclaredConstructor(); + defaultConstructor = clazz.getDeclaredConstructor(); } catch (NoSuchMethodException e) { if (clazz.isInterface() || Modifier.isAbstract(clazz.getModifiers())) { - LOG.info("Class " + clazz + " is abstract or an interface, having a concrete " + + LOG.info(clazz + " is abstract or an interface, having a concrete " + "type can increase performance."); } else { - LOG.info("Class " + clazz + " must have a default constructor to be used as a POJO."); + LOG.info(clazz + " must have a default constructor to be used as a POJO."); return null; } } + if(defaultConstructor != null && (defaultConstructor.getModifiers() & Modifier.PUBLIC) == 0) { --- End diff -- You are right, I have changed it. 
TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437 Project: Flink Issue Type: Bug Components: Type Serialization System Reporter: Gabor Gevay Assignee: Gabor Gevay Priority: Minor If a class does have a default constructor, but the user forgot to make it public, then TypeExtractor.analyzePojo still thinks everything is OK, so it creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up. Furthermore, a return null seems to be missing from the then case of the if after catching the NoSuchMethodException which would also cause a headache for PojoSerializer. An additional minor issue is that the word class is printed twice in several places, because class.toString also prepends it to the class name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2399] Version checks for Job Manager an...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/945#issuecomment-130241985 I've already added a version check between `JobClient` and `JobManager`. Will there be any further review of this?
[jira] [Commented] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection
[ https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693294#comment-14693294 ] ASF GitHub Bot commented on FLINK-2437: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/960 TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437 Project: Flink Issue Type: Bug Components: Type Serialization System Reporter: Gabor Gevay Assignee: Gabor Gevay Priority: Minor If a class does have a default constructor, but the user forgot to make it public, then TypeExtractor.analyzePojo still thinks everything is OK, so it creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up. Furthermore, a return null seems to be missing from the then case of the if after catching the NoSuchMethodException which would also cause a headache for PojoSerializer. An additional minor issue is that the word class is printed twice in several places, because class.toString also prepends it to the class name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-2437) TypeExtractor.analyzePojo has some problems around the default constructor detection
[ https://issues.apache.org/jira/browse/FLINK-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated FLINK-2437: -- Fix Version/s: 0.9.1 0.10 TypeExtractor.analyzePojo has some problems around the default constructor detection Key: FLINK-2437 URL: https://issues.apache.org/jira/browse/FLINK-2437 Project: Flink Issue Type: Bug Components: Type Serialization System Affects Versions: 0.10, 0.9.0 Reporter: Gabor Gevay Assignee: Gabor Gevay Priority: Minor Fix For: 0.10, 0.9.1 If a class does have a default constructor, but the user forgot to make it public, then TypeExtractor.analyzePojo still thinks everything is OK, so it creates a PojoTypeInfo. Then PojoSerializer.createInstance blows up. Furthermore, a return null seems to be missing from the then case of the if after catching the NoSuchMethodException which would also cause a headache for PojoSerializer. An additional minor issue is that the word class is printed twice in several places, because class.toString also prepends it to the class name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2457) Integrate Tuple0
[ https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693292#comment-14693292 ] ASF GitHub Bot commented on FLINK-2457: --- Github user twalthr commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130251096 The typeutil classes look good. I see that you have modified the TupleXXBuilders, have you modified them by hand or by running the `TupleGenerator`? I can't see the modified `TupleGenerator` in your PR. Integrate Tuple0 Key: FLINK-2457 URL: https://issues.apache.org/jira/browse/FLINK-2457 Project: Flink Issue Type: Improvement Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Minor Tuple0 is not cleanly integrated: - missing serialization/deserialization support in runtime - Tuple.getTupleClass(int arity) cannot handle arity zero, ie, cannot create an instance of Tuple0 Tuple0 is currently only used in Python API, but will be integrated into Storm compatibility, too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException
[ https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693775#comment-14693775 ] ASF GitHub Bot commented on FLINK-2512: --- Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130363877 Since ``getTopologyJobId`` can also throw exception, we could just move up the try-catch to include the call to that method and rely on finally to close the client. Add client.close() before throw RuntimeException Key: FLINK-2512 URL: https://issues.apache.org/jira/browse/FLINK-2512 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
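The shape of the fix being discussed in this thread (widen the try so that getTopologyJobId() is covered too, and let finally close the client instead of closing manually before each throw) can be sketched as below. `Client` and the method names are stand-ins for illustration, not the actual flink-storm-compat types.

```java
public class SubmitSketch {
    // Stand-in for the Storm client; close() runs in finally on every path.
    interface Client extends AutoCloseable {
        String getTopologyJobId(String name) throws Exception;
        void submit(String name) throws Exception;
        @Override
        void close();
    }

    static void submitTopology(Client client, String name) {
        try {
            // The duplicate-name check is now inside the try, so a throw
            // here still reaches the finally block.
            if (client.getTopologyJobId(name) != null) {
                throw new RuntimeException("Topology with name " + name + " already exists");
            }
            client.submit(name);
        } catch (RuntimeException e) {
            throw e; // rethrow as-is, do not double-wrap
        } catch (Exception e) {
            throw new RuntimeException("Submission failed", e);
        } finally {
            client.close(); // success, duplicate-name throw, and failure all close
        }
    }
}
```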
[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693831#comment-14693831 ] ASF GitHub Bot commented on FLINK-2509: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1008#issuecomment-130369569 I thought not, but you are right, I pushed the wrong branch. Sorry, git-fail on my side! I'll push an update in a bit... Improve error messages when user code classes are not found --- Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.10 When a job fails because the user code classes are not found, we should add some information about the class loader and class path into the exception message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693766#comment-14693766 ] ASF GitHub Bot commented on FLINK-2509: --- Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/1008#issuecomment-130362926 Hi @StephanEwen, why is this PR merged without addressing existing comments? Improve error messages when user code classes are not found --- Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.10 When a job fails because the user code classes are not found, we should add some information about the class loader and class path into the exception message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0
Github user twalthr commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130251096 The typeutil classes look good. I see that you have modified the TupleXXBuilders, have you modified them by hand or by running the `TupleGenerator`? I can't see the modified `TupleGenerator` in your PR.
[GitHub] flink pull request: [FLINK-2480][test]Add tests for PrintSinkFunct...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/991#issuecomment-130295740 Your pull request doesn't compile: https://s3.amazonaws.com/archive.travis-ci.org/jobs/74504427/log.txt
[jira] [Commented] (FLINK-2480) Improving tests coverage for org.apache.flink.streaming.api
[ https://issues.apache.org/jira/browse/FLINK-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693457#comment-14693457 ] ASF GitHub Bot commented on FLINK-2480: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/991#issuecomment-130295740 Your pull request doesn't compile: https://s3.amazonaws.com/archive.travis-ci.org/jobs/74504427/log.txt Improving tests coverage for org.apache.flink.streaming.api --- Key: FLINK-2480 URL: https://issues.apache.org/jira/browse/FLINK-2480 Project: Flink Issue Type: Test Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Fix For: 0.10 Original Estimate: 504h Remaining Estimate: 504h The streaming API is quite a bit newer than the other code so it is not that well covered with tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2491) Operators are not participating in state checkpointing in some cases
[ https://issues.apache.org/jira/browse/FLINK-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693387#comment-14693387 ] Márton Balassi commented on FLINK-2491: --- Here is the root cause. [1] [1] https://github.com/apache/flink/blob/master/flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java#L415 The same parallelism case works because of chaining. Operators are not participating in state checkpointing in some cases Key: FLINK-2491 URL: https://issues.apache.org/jira/browse/FLINK-2491 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Robert Metzger Assignee: Márton Balassi Priority: Critical Fix For: 0.10 While implementing a test case for the Kafka Consumer, I came across the following bug: Consider the following topology, with the operator parallelism in parentheses: Source (2) -- Sink (1). In this setup, the {{snapshotState()}} method is called on the source, but not on the Sink. The sink receives the generated data. The only one of the two sources is generating data. I've implemented a test case for this, you can find it here: https://github.com/rmetzger/flink/blob/para_checkpoint_bug/flink-tests/src/test/java/org/apache/flink/test/checkpointing/ParallelismChangeCheckpoinedITCase.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (FLINK-2506) HBase connection closing down (table distributed over more than 1 region server - Flink Cluster-Mode)
[ https://issues.apache.org/jira/browse/FLINK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated FLINK-2506: -- Description: If I fill a default table (create 'test-table', 'someCf') with the HBaseWriteExample.java program from the HBase addon library, a table without a start/end key is created. Reading the data works fine with HBaseReadExample.java. However, if I manually create a test table that is distributed over more than one region server (create 'test-table2', 'someCf', {NUMREGIONS => 3, SPLITALGO => 'HexStringSplit'}), the run is canceled with the following error message: {noformat} grips2 Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions: Fri Aug 07 11:18:29 CEST 2015, org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: hconnection-0x47bf79d7 closed Fri Aug 07 11:18:38 CEST 2015, org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: hconnection-0x47bf79d7 closed (the same "java.io.IOException: hconnection-0x47bf79d7 closed" retry entry repeats at 10-20 second intervals; the last retry shown is Fri Aug 07 11:26:40 CEST
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130311909 Thank you! I'll try again. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693496#comment-14693496 ] ASF GitHub Bot commented on FLINK-2490: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-130311909 Thank you! I'll try again. Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h
[jira] [Commented] (FLINK-2457) Integrate Tuple0
[ https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693466#comment-14693466 ] ASF GitHub Bot commented on FLINK-2457: --- Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130298853 I modified it by hand. I was not aware of `TupleGenerator` and just had a look into it. I am not sure `Tuple0` can be included appropriately. For example, it is not a generic class; it is implemented as a soft singleton. Extending `TupleGenerator` would result in special handling of Tuple0 in every place. Thus, adding it manually and not generating its source code seems better to me. Integrate Tuple0 Key: FLINK-2457 URL: https://issues.apache.org/jira/browse/FLINK-2457 Project: Flink Issue Type: Improvement Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Minor Tuple0 is not cleanly integrated: - missing serialization/deserialization support in runtime - Tuple.getTupleClass(int arity) cannot handle arity zero, i.e., cannot create an instance of Tuple0 Tuple0 is currently only used in the Python API, but will be integrated into Storm compatibility, too.
[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0
Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130298853 I modified it by hand. I was not aware of `TupleGenerator` and just had a look into it. I am not sure `Tuple0` can be included appropriately. For example, it is not a generic class; it is implemented as a soft singleton. Extending `TupleGenerator` would result in special handling of Tuple0 in every place. Thus, adding it manually and not generating its source code seems better to me.
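The "soft singleton" design mjsax describes can be illustrated with a short, self-contained sketch. The class below is a simplified stand-in for Flink's Tuple0, not its actual implementation:

```java
// Hypothetical sketch of a "soft singleton": Tuple0 has no fields, so a
// shared INSTANCE can be reused everywhere, but the constructor stays
// public so serializers or reflection-based code may still create fresh
// instances when they insist on it.
public class Tuple0Sketch {

    static class Tuple0 {
        public static final Tuple0 INSTANCE = new Tuple0();

        public Tuple0() {}  // public on purpose: a "soft", unenforced singleton

        public int getArity() {
            return 0;  // arity zero: the tuple carries no fields
        }
    }

    public static void main(String[] args) {
        if (Tuple0.INSTANCE.getArity() != 0) throw new AssertionError();
        // Distinct instances remain possible, hence "soft" singleton:
        if (new Tuple0() == Tuple0.INSTANCE) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Because such a class has no type parameters to iterate over, a generator that emits Tuple1..TupleN from a template would need a special branch for it, which is the argument for maintaining it by hand.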
[jira] [Commented] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment
[ https://issues.apache.org/jira/browse/FLINK-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693474#comment-14693474 ] Maximilian Michels commented on FLINK-2508: --- Just like in {{ExecutionEnvironment}}, we can have a static variable which holds a factory. By setting the factory, we can change the default environment returned for tests, local, or cluster execution. That would also remove the clutter in the {{getExecutionEnvironment()}} method. Confusing sharing of StreamExecutionEnvironment --- Key: FLINK-2508 URL: https://issues.apache.org/jira/browse/FLINK-2508 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Stephan Ewen Fix For: 0.10 In the {{StreamExecutionEnvironment}}, the environment is created once and then shared via a static variable with all successive calls to {{getExecutionEnvironment()}}. But it can be overridden by calls to {{createLocalEnvironment()}} and {{createRemoteEnvironment()}}. This seems a bit unintuitive, and probably creates confusion when dispatching multiple streaming jobs from within the same JVM. Why is it even necessary to cache the current execution environment?
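The factory idea Maximilian describes can be sketched as follows. All class and method names here are illustrative assumptions, not Flink's actual API:

```java
// Sketch of a swappable environment factory: a static hook decides what
// getExecutionEnvironment() returns, instead of caching a shared instance.
public class EnvFactorySketch {

    interface StreamEnvironment {}

    static class LocalEnvironment implements StreamEnvironment {}

    static class TestEnvironment implements StreamEnvironment {}

    interface EnvironmentFactory {
        StreamEnvironment createEnvironment();
    }

    // Default: create a fresh local environment per call.
    private static EnvironmentFactory factory = LocalEnvironment::new;

    // Tests or cluster entry points install their own factory here.
    static void initializeContextEnvironment(EnvironmentFactory f) {
        factory = f;
    }

    static StreamEnvironment getExecutionEnvironment() {
        return factory.createEnvironment();
    }

    public static void main(String[] args) {
        if (!(getExecutionEnvironment() instanceof LocalEnvironment)) throw new AssertionError();
        initializeContextEnvironment(TestEnvironment::new);
        if (!(getExecutionEnvironment() instanceof TestEnvironment)) throw new AssertionError();
        System.out.println("ok");
    }
}
```

With this shape there is no cached environment to override; each caller gets whatever the currently installed factory produces.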
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693546#comment-14693546 ] ASF GitHub Bot commented on FLINK-1819: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130318553 We are back to square one ;-) - `Function`: single abstract method - `RichFunction`: 5 methods. I see how that gets rich. - `InputFormat`: 8 methods - `RichInputFormat`: 10 methods. We could call it `SlightlyMoreRichInputFormat` ;-) I cannot help but find this very confusing. Why the urgency to stick with the Rich prefix? Allow access to RuntimeContext from Input and OutputFormats --- Key: FLINK-1819 URL: https://issues.apache.org/jira/browse/FLINK-1819 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 0.9, 0.8.1 Reporter: Fabian Hueske Assignee: Sachin Goel Priority: Minor Fix For: 0.9 User functions that extend a RichFunction can access a {{RuntimeContext}} which gives the parallel id of the task and access to Accumulators and BroadcastVariables. Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.
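The enhancement the ticket asks for can be sketched in a few lines. The interfaces below are simplified stand-ins for Flink's, and the names are assumptions for illustration only:

```java
// Sketch of FLINK-1819: an input format variant that is handed a runtime
// context by the framework before open(), mirroring what RichFunction
// already offers to user functions.
public class RichFormatSketch {

    static class RuntimeContext {
        final int indexOfThisSubtask;  // parallel id of the task
        RuntimeContext(int index) { this.indexOfThisSubtask = index; }
    }

    interface InputFormat {
        void open();
    }

    static abstract class RichInputFormat implements InputFormat {
        private RuntimeContext context;

        public void setRuntimeContext(RuntimeContext ctx) { this.context = ctx; }

        public RuntimeContext getRuntimeContext() { return context; }
    }

    public static void main(String[] args) {
        RichInputFormat format = new RichInputFormat() {
            @Override
            public void open() {
                // a real format could now read accumulators, broadcast
                // variables, or the subtask index via getRuntimeContext()
            }
        };
        format.setRuntimeContext(new RuntimeContext(3));
        format.open();
        if (format.getRuntimeContext().indexOfThisSubtask != 3) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The naming debate below is entirely about what to call this subclass; the mechanics stay the same either way.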
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130318553 We are back to square one ;-) - `Function`: single abstract method - `RichFunction`: 5 methods. I see how that gets rich. - `InputFormat`: 8 methods - `RichInputFormat`: 10 methods. We could call it `SlightlyMoreRichInputFormat` ;-) I cannot help but find this very confusing. Why the urgency to stick with the Rich prefix?
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693549#comment-14693549 ] ASF GitHub Bot commented on FLINK-1819: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130318970 For transformation functions, there is a clear case for thin versus rich, for Java8 lambdas. Input formats are a different game. They are super rich by default anyways.
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693553#comment-14693553 ] ASF GitHub Bot commented on FLINK-1819: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130319611 The suggestion `AbstractInputFormat` was not so bad, in my opinion. If you want a name that explains what's happening, you can always call it `InputFormatWithContext` ;-)
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693554#comment-14693554 ] ASF GitHub Bot commented on FLINK-1819: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130319656 To keep it consistent with the remaining API. For functions you need to extend a RichFunction if you want to have access to the RuntimeContext. I agree that the name is not perfect (and I think everybody else got your point as well) but I think it's a valid point to aim for consistency.
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130319611 The suggestion `AbstractInputFormat` was not so bad, in my opinion. If you want a name that explains what's happening, you can always call it `InputFormatWithContext` ;-)
[GitHub] flink pull request: [FLINK-2457] Integrate Tuple0
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130316822 Agree, Tuple0 should probably not go through the tuple generator.
[jira] [Commented] (FLINK-2457) Integrate Tuple0
[ https://issues.apache.org/jira/browse/FLINK-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693527#comment-14693527 ] ASF GitHub Bot commented on FLINK-2457: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/983#issuecomment-130316822 Agree, Tuple0 should probably not go through the tuple generator.
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130319656 To keep it consistent with the remaining API. For functions you need to extend a RichFunction if you want to have access to the RuntimeContext. I agree that the name is not perfect (and I think everybody else got your point as well) but I think it's a valid point to aim for consistency.
[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1010#issuecomment-130415584 +1, good to merge
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130384903 Oh apologies. I only saw the first comment on the email thread. I guess it's more or less settled. I'll leave it up to you guys to make a final decision on this. :') On Aug 12, 2015 10:59 PM, Sachin Goel sachingoel0...@gmail.com wrote: I agree with Stephan's argument that the addition of context to I/O formats is a very marginal enhancement. He literally stole my words. :') However, from my perspective, when I first started using flink, Rich meant runtime context. The idea of open and close wasn't nearly as exciting as the runtime context. What if we changed back to the original name mentioned on jira and make it `ContextAwareInputFormat`? Would everyone be okay with that? On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote: Functions also need to extend RichFunction to have access to open() and close(). I think the two things are different enough that any strife for consistency is actually pretty random. If your thoughts currently revolve around the RuntimeContext, it appears more consistent. If your thoughts are on the life cycle methods, it seems inconsistent. Random. I think you should go ahead and just call them Rich. It is just a name, and what matters is that the JavaDocs describe what it actually means... — Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/966#issuecomment-130324909.
[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693925#comment-14693925 ] ASF GitHub Bot commented on FLINK-2509: --- Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/1008#issuecomment-130388161 No worries =) Improve error messages when user code classes are not found --- Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.10 When a job fails because the user code classes are not found, we should add some information about the class loader and class path into the exception message.
[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...
Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/1010#issuecomment-130386385 +1 We agreed that we would play it loose with style, but this kind of cleanup helps readability. I will send a PR to change the checkstyle to be more strict on this kind of violation.
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130383984 I agree with Stephan's argument that the addition of context to I/O formats is a very marginal enhancement. He literally stole my words. :') However, from my perspective, when I first started using flink, Rich meant runtime context. The idea of open and close wasn't nearly as exciting as the runtime context. What if we changed back to the original name mentioned on jira and make it `ContextAwareInputFormat`? Would everyone be okay with that? On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote: Functions also need to extend RichFunction to have access to open() and close(). I think the two things are different enough that any strife for consistency is actually pretty random. If your thoughts currently revolve around the RuntimeContext, it appears more consistent. If your thoughts are on the life cycle methods, it seems inconsistent. Random. I think you should go ahead and just call them Rich. It is just a name, and what matters is that the JavaDocs describe what it actually means... — Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/966#issuecomment-130324909.
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693893#comment-14693893 ] ASF GitHub Bot commented on FLINK-1819: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130383984 I agree with Stephan's argument that the addition of context to I/O formats is a very marginal enhancement. He literally stole my words. :') However, from my perspective, when I first started using flink, Rich meant runtime context. The idea of open and close wasn't nearly as exciting as the runtime context. What if we changed back to the original name mentioned on jira and make it `ContextAwareInputFormat`? Would everyone be okay with that? On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote: Functions also need to extend RichFunction to have access to open() and close(). I think the two things are different enough that any strife for consistency is actually pretty random. If your thoughts currently revolve around the RuntimeContext, it appears more consistent. If your thoughts are on the life cycle methods, it seems inconsistent. Random. I think you should go ahead and just call them Rich. It is just a name, and what matters is that the JavaDocs describe what it actually means... — Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/966#issuecomment-130324909.
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693899#comment-14693899 ] ASF GitHub Bot commented on FLINK-1819: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-130384903 Oh apologies. I only saw the first comment on the email thread. I guess it's more or less settled. I'll leave it up to you guys to make a final decision on this. :') On Aug 12, 2015 10:59 PM, Sachin Goel sachingoel0...@gmail.com wrote: I agree with Stephan's argument that the addition of context to I/O formats is a very marginal enhancement. He literally stole my words. :') However, from my perspective, when I first started using flink, Rich meant runtime context. The idea of open and close wasn't nearly as exciting as the runtime context. What if we changed back to the original name mentioned on jira and make it `ContextAwareInputFormat`? Would everyone be okay with that? On Aug 12, 2015 8:10 PM, Stephan Ewen notificati...@github.com wrote: Functions also need to extend RichFunction to have access to open() and close(). I think the two things are different enough that any strife for consistency is actually pretty random. If your thoughts currently revolve around the RuntimeContext, it appears more consistent. If your thoughts are on the life cycle methods, it seems inconsistent. Random. I think you should go ahead and just call them Rich. It is just a name, and what matters is that the JavaDocs describe what it actually means... — Reply to this email directly or view it on GitHub https://github.com/apache/flink/pull/966#issuecomment-130324909.
[GitHub] flink pull request: [FLINK-2512]Add client.close() before throw Ru...
Github user hsaputra commented on the pull request: https://github.com/apache/flink/pull/1009#issuecomment-130363877 Since ``getTopologyJobId`` can also throw an exception, we could just move the try-catch up to include the call to that method and rely on finally to close the client.
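hsaputra's suggestion is the standard try/finally resource pattern, sketched below. The client and method names are illustrative stand-ins for the code under review, not the actual PR:

```java
// Sketch: move the fallible getTopologyJobId call inside the try block so
// the finally clause closes the client on both success and failure paths,
// removing the need for a manual close() before throwing.
public class CloseClientSketch {

    static class Client implements AutoCloseable {
        boolean closed = false;

        String getTopologyJobId(String topologyName) {
            if (topologyName == null) {
                throw new RuntimeException("topology not found");
            }
            return "job-" + topologyName;
        }

        @Override
        public void close() { closed = true; }
    }

    static String submit(Client client, String topologyName) {
        try {
            // may throw RuntimeException; no manual close() needed before it
            return client.getTopologyJobId(topologyName);
        } finally {
            client.close();  // runs whether or not the call above threw
        }
    }

    public static void main(String[] args) {
        Client client = new Client();
        boolean threw = false;
        try {
            submit(client, null);
        } catch (RuntimeException e) {
            threw = true;
        }
        if (!threw || !client.closed) throw new AssertionError();
        System.out.println("ok");
    }
}
```

The exception still propagates to the caller as before; only the cleanup is centralized in the finally block.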
[GitHub] flink pull request: [CLEANUP] Add space between quotes and plus si...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1010#issuecomment-130383158 I like this. I would actually like to make this a checkstyle rule. Most of the code is in this shape; occasional files go with a different style.