[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14687386#comment-14687386 ]

ASF GitHub Bot commented on FLINK-1819:
---------------------------------------

Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130078564

Okay. I'll hold off on making any changes right now. We should reach a consensus first.

Allow access to RuntimeContext from Input and OutputFormats
-----------------------------------------------------------
Key: FLINK-1819
URL: https://issues.apache.org/jira/browse/FLINK-1819
Project: Flink
Issue Type: Improvement
Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
Fix For: 0.9

User functions that extend a RichFunction can access a {{RuntimeContext}}, which gives the parallel id of the task and access to Accumulators and BroadcastVariables. Right now, Input- and OutputFormats cannot access their {{RuntimeContext}}.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
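The improvement above mirrors how RichFunction gets its context injected by the runtime before any user code runs. A minimal, self-contained sketch of that injection pattern follows; it is not Flink's actual API, and all class and method names (RichFormatSketch, RichInputFormatSketch, describeTask) are illustrative stand-ins:

```java
// Hypothetical sketch of context injection for input formats, modeled on
// RichFunction: the runtime hands the format a RuntimeContext via a setter
// before open() is called. Not Flink's real implementation.
public class RichFormatSketch {

    /** Minimal stand-in for Flink's RuntimeContext. */
    interface RuntimeContext {
        int getIndexOfThisSubtask();
    }

    /** An input format base class that, like RichFunction, can be handed a context. */
    static abstract class RichInputFormatSketch {
        private RuntimeContext context;

        public void setRuntimeContext(RuntimeContext context) {
            this.context = context;
        }

        public RuntimeContext getRuntimeContext() {
            if (context == null) {
                throw new IllegalStateException("Context not yet injected by the runtime");
            }
            return context;
        }
    }

    /** A concrete format can then read, e.g., its parallel subtask index. */
    static class MyFormat extends RichInputFormatSketch {
        String describeTask() {
            return "subtask-" + getRuntimeContext().getIndexOfThisSubtask();
        }
    }

    public static void main(String[] args) {
        MyFormat format = new MyFormat();
        format.setRuntimeContext(() -> 3); // the runtime injects the context
        System.out.println(format.describeTask()); // prints "subtask-3"
    }
}
```

The setter-based design keeps the format's constructor free of runtime concerns, which is why RichFunction uses the same shape.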
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user sachingoel0101 commented on the pull request:
https://github.com/apache/flink/pull/966#issuecomment-130078564

Okay. I'll hold off on making any changes right now. We should reach a consensus first.

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException
[ https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692814#comment-14692814 ]

ASF GitHub Bot commented on FLINK-2512:
---------------------------------------

GitHub user ffbin opened a pull request:
https://github.com/apache/flink/pull/1009

[FLINK-2512] Add client.close() before throw RuntimeException

In line 129, the client is closed in the finally{} block before the exception is thrown. But in line 105, the exception is thrown without closing the client. So it is better to close the client before throwing the RuntimeException.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ffbin/flink FLINK-2512
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1009.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #1009

commit 7510fdb6a8a082d8be85ba7968e9e2760edf2af0
Author: ffbin 869218...@qq.com
Date: 2015-08-12T02:06:21Z
[FLINK-2512] Add client.close() before throw RuntimeException

Add client.close() before throw RuntimeException
------------------------------------------------
Key: FLINK-2512
URL: https://issues.apache.org/jira/browse/FLINK-2512
Project: Flink
Issue Type: Bug
Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor
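The bug described in the pull request is a classic resource leak: an early throw that sits outside the try/finally never reaches the finally block that closes the client. A self-contained sketch of the pattern and the fix (the Client class and submit method are hypothetical stand-ins, not Flink code):

```java
// Illustrative sketch of the fix pattern: close the client before an early
// throw, because that throw happens before the try/finally is entered and so
// the finally-based close never runs for it.
public class CloseBeforeThrow {

    /** Hypothetical stand-in for the real client. */
    static class Client implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    static Client lastClient; // exposed so the example can inspect the leak

    static void submit(boolean failEarly) {
        Client client = new Client();
        lastClient = client;
        if (failEarly) {
            // The fix: close before throwing. Without this line, the client
            // stays open because the finally below is never reached.
            client.close();
            throw new RuntimeException("submission failed");
        }
        try {
            // ... do the actual work ...
        } finally {
            client.close(); // only covers exceptions thrown inside the try block
        }
    }

    public static void main(String[] args) {
        try {
            submit(true);
        } catch (RuntimeException e) {
            System.out.println("client closed: " + lastClient.closed); // prints "client closed: true"
        }
    }
}
```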
[jira] [Updated] (FLINK-2506) HBase connection closing down (table distributed over more than 1 region server - Flink Cluster-Mode)
[ https://issues.apache.org/jira/browse/FLINK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lydia Ickler updated FLINK-2506:
--------------------------------
Description:
If I fill a default table (create 'test-table', 'someCf') with the HBaseWriteExample.java program from the HBase addon library, then a table without start/end key is created. The data reading works great with the HBaseReadExample.java. Nevertheless, if I manually create a test-table that is distributed over more than one region server (create 'test-table2', 'someCf', {NUMREGIONS => 3, SPLITALGO => 'HexStringSplit'}), the run is canceled with the following error message:

{noformat}
grips2 Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Fri Aug 07 11:18:29 CEST 2015, org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: hconnection-0x47bf79d7 closed
(the same "hconnection-0x47bf79d7 closed" exception repeats at 10- to 20-second intervals through Fri Aug 07 11:26:40 CEST 2015; log truncated)
{noformat}
[jira] [Commented] (FLINK-2077) Rework Path class and add extend support for Windows paths
[ https://issues.apache.org/jira/browse/FLINK-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681419#comment-14681419 ]

Fabian Hueske commented on FLINK-2077:
--------------------------------------

Hi [~gallenvara_bg], thanks for picking up this issue. I'll assign it to you. Let me know if you have questions.

Rework Path class and add extend support for Windows paths
----------------------------------------------------------
Key: FLINK-2077
URL: https://issues.apache.org/jira/browse/FLINK-2077
Project: Flink
Issue Type: Improvement
Components: Core
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor
Labels: starter

The class {{org.apache.flink.core.fs.Path}} handles paths for Flink's {{FileInputFormat}} and {{FileOutputFormat}}. Over time, this class has become quite hard to read and modify. It would benefit from some cleaning and refactoring. Along with the refactoring, support for Windows paths like {{//host/dir1/dir2}} could be added.
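The kind of Windows-path handling the refactoring could add can be sketched with two small predicates: one for UNC-style paths such as {{//host/dir1/dir2}} and one for drive-letter prefixes. This is purely illustrative and not Flink's Path implementation; both method names are invented for the example:

```java
// Hedged sketch of Windows path recognition: UNC paths ("//host/...") and
// drive-letter paths ("C:/..." or "/C:/..."). Illustrative only.
public class WindowsPathSketch {

    /** A UNC path starts with a double slash followed by a host name. */
    static boolean isWindowsUncPath(String path) {
        return path.startsWith("//") && path.length() > 2 && path.charAt(2) != '/';
    }

    /** Detects a Windows drive prefix, optionally behind a leading slash. */
    static boolean hasWindowsDrive(String path) {
        int start = path.startsWith("/") ? 1 : 0;
        return path.length() >= start + 2
                && Character.isLetter(path.charAt(start))
                && path.charAt(start + 1) == ':';
    }

    public static void main(String[] args) {
        System.out.println(isWindowsUncPath("//host/dir1/dir2")); // true
        System.out.println(hasWindowsDrive("C:/data"));           // true
        System.out.println(isWindowsUncPath("/tmp/file"));        // false
    }
}
```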
[jira] [Updated] (FLINK-1901) Create sample operator for Dataset
[ https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Theodore Vasiloudis updated FLINK-1901:
---------------------------------------
Description:
In order to be able to implement Stochastic Gradient Descent and a number of other machine learning algorithms, we need to have a way to take a random sample from a Dataset. We need to be able to sample with or without replacement from the Dataset, choose the relative or exact size of the sample, set a seed for reproducibility, and support sampling within iterations.

was:
In order to be able to implement Stochastic Gradient Descent and a number of other machine learning algorithms, we need to have a way to take a random sample from a Dataset. We need to be able to sample with or without replacement from the Dataset, choose the relative size of the sample, and set a seed for reproducibility.

Create sample operator for Dataset
----------------------------------
Key: FLINK-1901
URL: https://issues.apache.org/jira/browse/FLINK-1901
Project: Flink
Issue Type: Improvement
Components: Core
Reporter: Theodore Vasiloudis
Assignee: Chengxiang Li

In order to be able to implement Stochastic Gradient Descent and a number of other machine learning algorithms, we need to have a way to take a random sample from a Dataset. We need to be able to sample with or without replacement from the Dataset, choose the relative or exact size of the sample, set a seed for reproducibility, and support sampling within iterations.
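The requested semantics (with/without replacement, relative or exact size, seeded for reproducibility) can be sketched on a plain List rather than a DataSet. This is not the operator the issue proposes, just an illustration of the two sampling modes; method names are invented for the example:

```java
// Sketch of the two sampling modes the issue asks for, on a List:
// Bernoulli sampling (without replacement, relative size) and
// with-replacement sampling (exact size), both seeded.
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SampleSketch {

    /** Without replacement: keep each element independently with probability `fraction`. */
    static <T> List<T> sampleBernoulli(List<T> data, double fraction, long seed) {
        Random rnd = new Random(seed);
        List<T> out = new ArrayList<>();
        for (T t : data) {
            if (rnd.nextDouble() < fraction) {
                out.add(t);
            }
        }
        return out;
    }

    /** With replacement: draw `n` independent uniform picks (exact sample size). */
    static <T> List<T> sampleWithReplacement(List<T> data, int n, long seed) {
        Random rnd = new Random(seed);
        List<T> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(data.get(rnd.nextInt(data.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 100; i++) data.add(i);
        // Same seed yields the same sample: the reproducibility requirement.
        System.out.println(sampleBernoulli(data, 0.1, 42L)
                .equals(sampleBernoulli(data, 0.1, 42L))); // true
        System.out.println(sampleWithReplacement(data, 5, 42L).size()); // 5
    }
}
```

Note the Bernoulli variant only hits the requested fraction in expectation; an exact-size sample without replacement would need something like reservoir sampling instead.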
[jira] [Commented] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged
[ https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681562#comment-14681562 ]

ASF GitHub Bot commented on FLINK-2277:
---------------------------------------

Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/1005#issuecomment-129805238

I like the change, but I would like to change it to an overloaded method, rather than adding a parameter with default value. In the current way, it breaks the binary API compatibility. With an overloaded method, it would not break it.

In Scala API delta Iterations can not be set to unmanaged
---------------------------------------------------------
Key: FLINK-2277
URL: https://issues.apache.org/jira/browse/FLINK-2277
Project: Flink
Issue Type: Improvement
Components: Scala API
Reporter: Aljoscha Krettek
Assignee: PJ Van Aeken
Labels: Starter

DeltaIteration.java has method solutionSetUnManaged(). In the Scala API this could be added as an optional parameter on iterateDelta() on the DataSet.
[GitHub] flink pull request: [FLINK-2277] In Scala API delta Iterations can...
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/1005#issuecomment-129805238

I like the change, but I would like to change it to an overloaded method, rather than adding a parameter with default value. In the current way, it breaks the binary API compatibility. With an overloaded method, it would not break it.
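The binary-compatibility argument above can be made concrete in Java: adding an overload leaves the old method and its bytecode signature untouched, whereas changing an existing signature (which is effectively what adding a Scala default parameter does) breaks callers compiled against the old version. The method below is a simplified stand-in, not the real iterateDelta:

```java
// Sketch of the overload-vs-default-parameter point. The old single-argument
// method keeps its exact signature, so previously compiled callers still link;
// the new capability lives in a separate overload.
public class OverloadCompat {

    // Old method: untouched, existing compiled callers keep working.
    static String iterateDelta(int maxIterations) {
        return iterateDelta(maxIterations, false);
    }

    // New overload carrying the extra flag.
    static String iterateDelta(int maxIterations, boolean solutionSetUnManaged) {
        return maxIterations + ":" + (solutionSetUnManaged ? "unmanaged" : "managed");
    }

    public static void main(String[] args) {
        System.out.println(iterateDelta(10));       // prints "10:managed"
        System.out.println(iterateDelta(10, true)); // prints "10:unmanaged"
    }
}
```

Had the old method instead gained a new parameter, its descriptor in the compiled class file would change and old callers would fail at link time with a NoSuchMethodError, even though the source still compiles.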
[jira] [Commented] (FLINK-2499) start-cluster.sh can start multiple TaskManager on the same node
[ https://issues.apache.org/jira/browse/FLINK-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681438#comment-14681438 ]

Maximilian Michels commented on FLINK-2499:
-------------------------------------------

When executed more than once, {{start-cluster.sh}} only starts a task manager. We should give the user a warning that an existing cluster is still running. Currently, it says
{noformat}
Starting jobmanager daemon on host dataslave.
Starting taskmanager daemon on host dataslave.
{noformat}
However, it only starts the task manager because the new job manager fails and quits afterwards.

start-cluster.sh can start multiple TaskManager on the same node
----------------------------------------------------------------
Key: FLINK-2499
URL: https://issues.apache.org/jira/browse/FLINK-2499
Project: Flink
Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Chen He

{noformat}
11562 JobHistoryServer
3251 Main
10596 Jps
17934 RunJar
6879 Main
8837 Main
19215 RunJar
28902 DataNode
6627 TaskManager
642 NodeManager
10408 RunJar
10210 TaskManager
5067 TaskManager
357 ApplicationHistoryServer
3540 RunJar
28501 ResourceManager
28572 SecondaryNameNode
17630 QuorumPeerMain
9069 TaskManager
{noformat}

If we keep executing start-cluster.sh, it may generate an unbounded number of TaskManagers on a single system. And the nohup command in start-cluster.sh generates a nohup.out file that disturbs any other nohup processes in the system.
[jira] [Updated] (FLINK-2503) Inconsistencies in FileInputFormat hierarchy
[ https://issues.apache.org/jira/browse/FLINK-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maximilian Michels updated FLINK-2503:
--------------------------------------
Description:
From a thread in the user mailing list ("Invalid argument reading a file containing a Kryo object"). I think that there are some inconsistencies in the hierarchy of InputFormats. The BinaryOutputFormat/TypeSerializerInputFormat should somehow inherit the behaviour of the FileInputFormat (so respect unsplittable and enumerateNestedFiles), but they don't take those flags into account. Moreover, in the TypeSerializerInputFormat there's a "// TODO: fix this shit" that should maybe be removed or fixed :) Also, keeping testForUnsplittable and decorateInputStream aligned is somehow dangerous. And maybe the visibility of getBlockIndexForPosition should be changed to protected?

My need was to implement a TypeSerializerInputFormat<RowBundle>, but to achieve that I had to make a lot of overrides. Am I doing something wrong, or could those input formats be improved? This is my InputFormat code (remark: from the comment "Copied from FileInputFormat (overriding TypeSerializerInputFormat)" on, the code is copied-and-pasted from FileInputFormat, thus MY code ends there):

{code:java}
public class RowBundleInputFormat extends TypeSerializerInputFormat<RowBundle> {

	private static final long serialVersionUID = 1L;
	private static final Logger LOG = LoggerFactory.getLogger(RowBundleInputFormat.class);

	/** The fraction that the last split may be larger than the others. */
	private static final float MAX_SPLIT_SIZE_DISCREPANCY = 1.1f;

	private boolean objectRead;

	public RowBundleInputFormat() {
		super(new GenericTypeInfo<RowBundle>(RowBundle.class));
		unsplittable = true;
	}

	@Override
	protected FSDataInputStream decorateInputStream(FSDataInputStream inputStream, FileInputSplit fileSplit) throws Throwable {
		return inputStream;
	}

	@Override
	protected boolean testForUnsplittable(FileStatus pathFile) {
		return true;
	}

	@Override
	public void open(FileInputSplit split) throws IOException {
		super.open(split);
		objectRead = false;
	}

	@Override
	public boolean reachedEnd() throws IOException {
		return this.objectRead;
	}

	@Override
	public RowBundle nextRecord(RowBundle reuse) throws IOException {
		RowBundle yourObject = super.nextRecord(reuse);
		this.objectRead = true; // read only one object
		return yourObject;
	}

	// ------------------------------------------------------------------
	// Copied from FileInputFormat (overriding TypeSerializerInputFormat)
	// ------------------------------------------------------------------

	@Override
	public FileInputSplit[] createInputSplits(int minNumSplits) throws IOException { ... }

	private long addNestedFiles(Path path, List<FileStatus> files, long length, boolean logExcludedFiles) throws IOException { ... }

	private int getBlockIndexForPosition(BlockLocation[] blocks, long offset, long halfSplitSize, int startIndex) { ... }
}
{code}
[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...
Github user HuangWHWHW commented on the pull request:
https://github.com/apache/flink/pull/977#issuecomment-129785617

@StephanEwen Hi, I fixed all of your review comments except the one about using a single thread (for server and connection). I plan to do some more complex tests later. So, if you think it isn't necessary, I'll change it.
[jira] [Commented] (FLINK-2477) Add test for SocketClientSink
[ https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681494#comment-14681494 ]

ASF GitHub Bot commented on FLINK-2477:
---------------------------------------

Github user HuangWHWHW commented on the pull request:
https://github.com/apache/flink/pull/977#issuecomment-129785617

@StephanEwen Hi, I fixed all of your review comments except the one about using a single thread (for server and connection). I plan to do some more complex tests later. So, if you think it isn't necessary, I'll change it.

Add test for SocketClientSink
-----------------------------
Key: FLINK-2477
URL: https://issues.apache.org/jira/browse/FLINK-2477
Project: Flink
Issue Type: Test
Components: Streaming
Affects Versions: 0.10
Environment: win7 32bit; linux
Reporter: Huang Wei
Priority: Minor
Fix For: 0.10
Original Estimate: 168h
Remaining Estimate: 168h

Add some tests for SocketClientSink.
[GitHub] flink pull request: [FLINK-2357] Update Node installation instruct...
Github user StephanEwen commented on the pull request:
https://github.com/apache/flink/pull/1006#issuecomment-129802503

Super, thank you! Will merge this...
[GitHub] flink pull request: [FLINK-1984] Integrate Flink with Apache Mesos
Github user uce commented on the pull request:
https://github.com/apache/flink/pull/948#issuecomment-129785986

Thanks for the reply at the mailing list. I will try out your PR this week and have a look at the code. Sorry for the delay. I needed to clear some more time, because it is a big addition. :) Thanks for all your effort.
[jira] [Commented] (FLINK-1984) Integrate Flink with Apache Mesos
[ https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681496#comment-14681496 ]

ASF GitHub Bot commented on FLINK-1984:
---------------------------------------

Github user uce commented on the pull request:
https://github.com/apache/flink/pull/948#issuecomment-129785986

Thanks for the reply at the mailing list. I will try out your PR this week and have a look at the code. Sorry for the delay. I needed to clear some more time, because it is a big addition. :) Thanks for all your effort.

Integrate Flink with Apache Mesos
---------------------------------
Key: FLINK-1984
URL: https://issues.apache.org/jira/browse/FLINK-1984
Project: Flink
Issue Type: New Feature
Components: New Components
Reporter: Robert Metzger
Priority: Minor
Attachments: 251.patch

There are some users asking for an integration of Flink into Mesos. There also is a pending pull request for adding Mesos support for Flink: https://github.com/apache/flink/pull/251 But the PR is insufficiently tested. I'll add the code of the pull request to this JIRA in case somebody wants to pick it up in the future.
[jira] [Created] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
fangfengbin created FLINK-2507:
-------------------------------
Summary: Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
Key: FLINK-2507
URL: https://issues.apache.org/jira/browse/FLINK-2507
Project: Flink
Issue Type: Bug
Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor
[jira] [Commented] (FLINK-2477) Add test for SocketClientSink
[ https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681306#comment-14681306 ]

ASF GitHub Bot commented on FLINK-2477:
---------------------------------------

Github user HuangWHWHW commented on a diff in the pull request:
https://github.com/apache/flink/pull/977#discussion_r36714718

--- Diff: flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketClientSinkTest.java ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions;
+
+import java.io.IOException;
+import java.net.Socket;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.SocketClientSink;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+
+import org.junit.Test;
+
+import static java.lang.Thread.sleep;
+import static org.junit.Assert.*;
+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+
+/**
+ * Tests for the {@link org.apache.flink.streaming.api.functions.sink.SocketClientSink}.
+ */
+public class SocketClientSinkTest {
+
+	private final String host = "127.0.0.1";
+	private int port = ;
+	private String access;
+	public SocketServer.ServerThread th = null;
+
+	class SocketServer extends Thread {
+
+		private ServerSocket server = null;
+		private Socket sk = null;
+		private BufferedReader rdr = null;
+		private PrintWriter wtr = null;
+
+		private SocketServer(int port) {
+			while (port > 0) {
--- End diff --

Sorry, you may ignore this reply. I've got the picture.

Add test for SocketClientSink
-----------------------------
Key: FLINK-2477
URL: https://issues.apache.org/jira/browse/FLINK-2477
Project: Flink
Issue Type: Test
Components: Streaming
Affects Versions: 0.10
Environment: win7 32bit; linux
Reporter: Huang Wei
Priority: Minor
Fix For: 0.10
Original Estimate: 168h
Remaining Estimate: 168h

Add some tests for SocketClientSink.
[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...
Github user HuangWHWHW commented on a diff in the pull request:
https://github.com/apache/flink/pull/977#discussion_r36714718

--- Diff: flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketClientSinkTest.java ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions;
+
+import java.io.IOException;
+import java.net.Socket;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.SocketClientSink;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+
+import org.junit.Test;
+
+import static java.lang.Thread.sleep;
+import static org.junit.Assert.*;
+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+
+/**
+ * Tests for the {@link org.apache.flink.streaming.api.functions.sink.SocketClientSink}.
+ */
+public class SocketClientSinkTest {
+
+	private final String host = "127.0.0.1";
+	private int port = ;
+	private String access;
+	public SocketServer.ServerThread th = null;
+
+	class SocketServer extends Thread {
+
+		private ServerSocket server = null;
+		private Socket sk = null;
+		private BufferedReader rdr = null;
+		private PrintWriter wtr = null;
+
+		private SocketServer(int port) {
+			while (port > 0) {
--- End diff --

Sorry, you may ignore this reply. I've got the picture.
[GitHub] flink pull request: [FLINK-1901] [core] Create sample operator for...
Github user thvasilo commented on the pull request:
https://github.com/apache/flink/pull/949#issuecomment-129744713

Agreed, we can take a look at the optimized algorithm later.
[jira] [Commented] (FLINK-2504) ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed failed spuriously
[ https://issues.apache.org/jira/browse/FLINK-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681515#comment-14681515 ]

Sachin Goel commented on FLINK-2504:
------------------------------------

Yes, I had discovered those before I sent a shout-out on IRC. I wanted to know if someone else had gotten a failed build on this test before re-opening the ticket.

ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed failed spuriously
-------------------------------------------------------------------------------------
Key: FLINK-2504
URL: https://issues.apache.org/jira/browse/FLINK-2504
Project: Flink
Issue Type: Bug
Reporter: Till Rohrmann

The test {{ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed}} failed in one of my Travis builds: https://travis-ci.org/tillrohrmann/flink/jobs/74881883
[jira] [Updated] (FLINK-2506) HBase table that is distributed over more than 1 region server (Flink Cluster-Mode)
[ https://issues.apache.org/jira/browse/FLINK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lydia Ickler updated FLINK-2506:
--------------------------------
Fix Version/s: (was: 0.10)
Description:
If I fill a default table (create 'test-table', 'someCf') with the HBaseWriteExample.java program from the HBase addon library, then a table without start/end key is created. The data reading works great with the HBaseReadExample.java. Nevertheless, if I manually create a test-table that is distributed over more than one region server (create 'test-table2', 'someCf', {NUMREGIONS => 3, SPLITALGO => 'HexStringSplit'}), the run is canceled with the following error message:

{noformat}
grips2 Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Fri Aug 07 11:18:29 CEST 2015, org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: hconnection-0x47bf79d7 closed
(the same "hconnection-0x47bf79d7 closed" exception repeats at 10- to 20-second intervals through Fri Aug 07 11:26:40 CEST 2015; log truncated)
{noformat}
[GitHub] flink pull request: [FLINK-1984] Integrate Flink with Apache Mesos
Github user ankurcha commented on the pull request: https://github.com/apache/flink/pull/948#issuecomment-129765378 @StephanEwen Thanks for the pointer, I replied on the mailing list thread. Any code-review comments for this pull request? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (FLINK-2077) Rework Path class and add extend support for Windows paths
[ https://issues.apache.org/jira/browse/FLINK-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-2077: - Assignee: GaoLun Rework Path class and add extend support for Windows paths -- Key: FLINK-2077 URL: https://issues.apache.org/jira/browse/FLINK-2077 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: GaoLun Priority: Minor Labels: starter The class {{org.apache.flink.core.fs.Path}} handles paths for Flink's {{FileInputFormat}} and {{FileOutputFormat}}. Over time, this class has become quite hard to read and modify. It would benefit from some cleaning and refactoring. Along with the refactoring, support for Windows paths like {{//host/dir1/dir2}} could be added. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
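For illustration, recognizing Windows-style paths of the kind mentioned above could look roughly like this. This is a hypothetical helper, not the actual {{org.apache.flink.core.fs.Path}} implementation; the class and method names are made up:

```java
// Hypothetical sketch -- not Flink's actual Path class.
class WindowsPathSketch {

    // A UNC-style share path such as "//host/dir1/dir2": two leading
    // slashes followed by a non-empty host component.
    public static boolean isWindowsSharePath(String path) {
        return path.startsWith("//") && path.length() > 2 && path.charAt(2) != '/';
    }

    // A drive-letter path such as "C:/dir" or "C:\\dir".
    public static boolean hasWindowsDrive(String path) {
        return path.length() >= 2
                && Character.isLetter(path.charAt(0))
                && path.charAt(1) == ':';
    }

    public static void main(String[] args) {
        System.out.println(isWindowsSharePath("//host/dir1/dir2")); // true
        System.out.println(hasWindowsDrive("C:/flink/data"));       // true
        System.out.println(isWindowsSharePath("/usr/local"));       // false
    }
}
```

A refactored Path class could branch on checks like these when normalizing and validating path strings.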
[jira] [Commented] (FLINK-1984) Integrate Flink with Apache Mesos
[ https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681423#comment-14681423 ] ASF GitHub Bot commented on FLINK-1984: --- Github user ankurcha commented on the pull request: https://github.com/apache/flink/pull/948#issuecomment-129765378 @StephanEwen Thanks for the pointer, I replied on the mailing list thread. Any code-review comments for this pull request? Integrate Flink with Apache Mesos - Key: FLINK-1984 URL: https://issues.apache.org/jira/browse/FLINK-1984 Project: Flink Issue Type: New Feature Components: New Components Reporter: Robert Metzger Priority: Minor Attachments: 251.patch There are some users asking for an integration of Flink into Mesos. There also is a pending pull request for adding Mesos support for Flink: https://github.com/apache/flink/pull/251 But the PR is insufficiently tested. I'll add the code of the pull request to this JIRA in case somebody wants to pick it up in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2504) ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed failed spuriously
[ https://issues.apache.org/jira/browse/FLINK-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681475#comment-14681475 ] Till Rohrmann commented on FLINK-2504: -- [~StephanEwen] the output of Travis is all I got. There is apparently a problem with the watchdog script for my repository which prevented the upload. [~sachingoel0101] your stack trace looks similar to FLINK-1455. ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed failed spuriously - Key: FLINK-2504 URL: https://issues.apache.org/jira/browse/FLINK-2504 Project: Flink Issue Type: Bug Reporter: Till Rohrmann The test {{ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed}} failed in one of my Travis builds: https://travis-ci.org/tillrohrmann/flink/jobs/74881883 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2502) FiniteStormSpout documentation does not render correctly
[ https://issues.apache.org/jira/browse/FLINK-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681563#comment-14681563 ] ASF GitHub Bot commented on FLINK-2502: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1002#issuecomment-129805634 Looks good, will merge this... FiniteStormSpout documentation does not render correctly --- Key: FLINK-2502 URL: https://issues.apache.org/jira/browse/FLINK-2502 Project: Flink Issue Type: Bug Components: Documentation Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Trivial Attachments: screenshot.png The code examples do not render correctly, due to missing empty lines. !screenshot.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2502] FiniteStormSpout documentation doe...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1002#issuecomment-129805634 Looks good, will merge this...
[jira] [Updated] (FLINK-2506) HBase connection closing down (table distributed over more than 1 region server - Flink Cluster-Mode)
[ https://issues.apache.org/jira/browse/FLINK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lydia Ickler updated FLINK-2506: Description: If I fill a default table (create 'test-table', 'someCf') with the HBaseWriteExample.java program from the HBase add-on library, a table without start/end keys is created. Reading the data back works fine with HBaseReadExample.java. However, if I manually create a test table that is distributed over more than one region server (create 'test-table2', 'someCf', {NUMREGIONS => 3, SPLITALGO => 'HexStringSplit'}), the run is canceled with the following error message:
grips2 Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
Fri Aug 07 11:18:29 CEST 2015, org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: hconnection-0x47bf79d7 closed
Fri Aug 07 11:18:38 CEST 2015, org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: hconnection-0x47bf79d7 closed
[the same "hconnection-0x47bf79d7 closed" retry entry repeats at 10-20 second intervals through Fri Aug 07 11:26:40 CEST 2015]
[jira] [Commented] (FLINK-2499) start-cluster.sh can start multiple TaskManager on the same node
[ https://issues.apache.org/jira/browse/FLINK-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681507#comment-14681507 ] Ufuk Celebi commented on FLINK-2499: I agree with Max. We can print a warning or fail if the user starts a cluster and a JM or TM is already running. We can also check whether the processes started up properly before continuing. Before, this was detected because you couldn't start another TM instance, etc. start-cluster.sh can start multiple TaskManager on the same node Key: FLINK-2499 URL: https://issues.apache.org/jira/browse/FLINK-2499 Project: Flink Issue Type: Bug Affects Versions: 0.8.1 Reporter: Chen He jps output after several runs:
11562 JobHistoryServer
3251 Main
10596 Jps
17934 RunJar
6879 Main
8837 Main
19215 RunJar
28902 DataNode
6627 TaskManager
642 NodeManager
10408 RunJar
10210 TaskManager
5067 TaskManager
357 ApplicationHistoryServer
3540 RunJar
28501 ResourceManager
28572 SecondaryNameNode
17630 QuorumPeerMain
9069 TaskManager
If we keep executing start-cluster.sh, it can spawn an unbounded number of TaskManagers on a single system. In addition, the nohup command in start-cluster.sh creates a nohup.out file that interferes with any other nohup processes on the system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
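The guard proposed in the comment above could be sketched as follows. start-cluster.sh itself is a shell script, so this Java version is only an illustration of the idea; the pid-file name and location are made up, not Flink's actual convention:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: refuse to start (or warn) when a pid file from a
// previous TaskManager launch is still present.
class StartGuardSketch {

    // A leftover pid file indicates a TaskManager may already be running.
    public static boolean alreadyRunning(Path pidFile) {
        return Files.exists(pidFile);
    }

    public static void main(String[] args) throws IOException {
        Path pidFile = Paths.get(System.getProperty("java.io.tmpdir"), "flink-taskmanager.pid");
        if (alreadyRunning(pidFile)) {
            System.err.println("Warning: a TaskManager appears to be running already (found " + pidFile + ")");
        } else {
            // Record this process' pid so the next invocation can detect us.
            Files.write(pidFile, String.valueOf(ProcessHandle.current().pid()).getBytes());
            System.out.println("Started; pid recorded in " + pidFile);
        }
    }
}
```

A real fix would also verify that the recorded pid still belongs to a live TaskManager process before refusing to start, since stale pid files are common after crashes.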
[jira] [Commented] (FLINK-2477) Add test for SocketClientSink
[ https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681305#comment-14681305 ] ASF GitHub Bot commented on FLINK-2477: --- Github user HuangWHWHW commented on a diff in the pull request: https://github.com/apache/flink/pull/977#discussion_r36714696
--- Diff: flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketClientSinkTest.java ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions;
+
+import java.io.IOException;
+import java.net.Socket;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.SocketClientSink;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+
+import org.junit.Test;
+
+import static java.lang.Thread.sleep;
+import static org.junit.Assert.*;
+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+
+/**
+ * Tests for the {@link org.apache.flink.streaming.api.functions.sink.SocketClientSink}.
+ */
+public class SocketClientSinkTest {
+
+	private final String host = "127.0.0.1";
+	private int port = ;
+	private String access;
+	public SocketServer.ServerThread th = null;
+
+	class SocketServer extends Thread {
+
+		private ServerSocket server = null;
+		private Socket sk = null;
+		private BufferedReader rdr = null;
+		private PrintWriter wtr = null;
+
+		private SocketServer(int port) {
+			while (port > 0) {
+				try {
+					this.server = new ServerSocket(port);
+					break;
+				} catch (Exception e) {
+					--port;
+					if (port > 0) {
+						continue;
+					} else {
+						e.printStackTrace();
+					}
+				}
+			}
+		}
+
+		public void run() {
+			System.out.println("Listening...");
+			try {
+				sk = server.accept();
--- End diff --
Sorry, you may ignore this reply. I've got the picture. Add test for SocketClientSink - Key: FLINK-2477 URL: https://issues.apache.org/jira/browse/FLINK-2477 Project: Flink Issue Type: Test Components: Streaming Affects Versions: 0.10 Environment: win7 32bit; linux Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h Add some tests for SocketClientSink. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...
Github user HuangWHWHW commented on a diff in the pull request: https://github.com/apache/flink/pull/977#discussion_r36714696
--- Diff: flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketClientSinkTest.java ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions;
+
+import java.io.IOException;
+import java.net.Socket;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.SocketClientSink;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+
+import org.junit.Test;
+
+import static java.lang.Thread.sleep;
+import static org.junit.Assert.*;
+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+
+/**
+ * Tests for the {@link org.apache.flink.streaming.api.functions.sink.SocketClientSink}.
+ */
+public class SocketClientSinkTest {
+
+	private final String host = "127.0.0.1";
+	private int port = ;
+	private String access;
+	public SocketServer.ServerThread th = null;
+
+	class SocketServer extends Thread {
+
+		private ServerSocket server = null;
+		private Socket sk = null;
+		private BufferedReader rdr = null;
+		private PrintWriter wtr = null;
+
+		private SocketServer(int port) {
+			while (port > 0) {
+				try {
+					this.server = new ServerSocket(port);
+					break;
+				} catch (Exception e) {
+					--port;
+					if (port > 0) {
+						continue;
+					} else {
+						e.printStackTrace();
+					}
+				}
+			}
+		}
+
+		public void run() {
+			System.out.println("Listening...");
+			try {
+				sk = server.accept();
--- End diff --
Sorry, you may ignore this reply. I've got the picture.
[jira] [Created] (FLINK-2506) HBase table that is distributed over more than 1 region server
Lydia Ickler created FLINK-2506: --- Summary: HBase table that is distributed over more than 1 region server Key: FLINK-2506 URL: https://issues.apache.org/jira/browse/FLINK-2506 Project: Flink Issue Type: Bug Components: Examples Affects Versions: 0.10 Reporter: Lydia Ickler Fix For: 0.10 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2502) FiniteStormSpout documentation does not render correctly
[ https://issues.apache.org/jira/browse/FLINK-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2502. --- FiniteStormSpout documentation does not render correctly --- Key: FLINK-2502 URL: https://issues.apache.org/jira/browse/FLINK-2502 Project: Flink Issue Type: Bug Components: Documentation Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Trivial Fix For: 0.10 Attachments: screenshot.png The code examples do not render correctly, due to missing empty lines. !screenshot.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2500][Streaming]remove a unwanted if ...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1001
[jira] [Commented] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged
[ https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681626#comment-14681626 ] ASF GitHub Bot commented on FLINK-2277: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1005 In Scala API delta Iterations can not be set to unmanaged - Key: FLINK-2277 URL: https://issues.apache.org/jira/browse/FLINK-2277 Project: Flink Issue Type: Improvement Components: Scala API Reporter: Aljoscha Krettek Assignee: PJ Van Aeken Labels: Starter DeltaIteration.java has a method solutionSetUnManaged(). In the Scala API, this could be added as an optional parameter on iterateDelta() on the DataSet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (FLINK-2502) FiniteStormSpout documentation does not render correctly
[ https://issues.apache.org/jira/browse/FLINK-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2502. - Resolution: Fixed Fix Version/s: 0.10 Fixed via 5bb855bac7441701495ce47db7ba03ab0e0c6963 FiniteStormSpout documentation does not render correctly --- Key: FLINK-2502 URL: https://issues.apache.org/jira/browse/FLINK-2502 Project: Flink Issue Type: Bug Components: Documentation Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Trivial Fix For: 0.10 Attachments: screenshot.png The code examples do not render correctly, due to missing empty lines. !screenshot.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2386) Implement Kafka connector using the new Kafka Consumer API
[ https://issues.apache.org/jira/browse/FLINK-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681686#comment-14681686 ] ASF GitHub Bot commented on FLINK-2386: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/996#issuecomment-129849082 I'll take a stab at checking out this monster ;-) Implement Kafka connector using the new Kafka Consumer API -- Key: FLINK-2386 URL: https://issues.apache.org/jira/browse/FLINK-2386 Project: Flink Issue Type: Improvement Components: Kafka Connector Reporter: Robert Metzger Assignee: Robert Metzger Once Kafka has released its new consumer API, we should provide a connector for that version. The release will probably be called 0.9 or 0.8.3. The connector will be mostly compatible with Kafka 0.8.2.x, except for committing offsets to the broker (the new connector expects a coordinator to be available on Kafka). To work around that, we can provide a configuration option to commit offsets to ZooKeeper (managed by Flink code). For 0.9/0.8.3 it will be fully compatible. It will not be compatible with 0.8.1 because of mismatching Kafka messages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment
Stephan Ewen created FLINK-2508: --- Summary: Confusing sharing of StreamExecutionEnvironment Key: FLINK-2508 URL: https://issues.apache.org/jira/browse/FLINK-2508 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Stephan Ewen Fix For: 0.10 In the {{StreamExecutionEnvironment}}, the environment is created once and then shared via a static variable with all successive calls to {{getExecutionEnvironment()}}. But it can be overridden by calls to {{createLocalEnvironment()}} and {{createRemoteEnvironment()}}. This seems a bit unintuitive and probably creates confusion when dispatching multiple streaming jobs from within the same JVM. Why is it even necessary to cache the current execution environment? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
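The surprising sharing behavior described in the issue can be mimicked in miniature. This is a hypothetical sketch, not Flink's actual StreamExecutionEnvironment; it only illustrates why a statically cached "current" environment that factory methods silently replace is confusing:

```java
// Simplified mimic of the pattern the issue describes -- NOT Flink's code.
class EnvSketch {

    private static EnvSketch current;   // one cache shared by every caller in the JVM
    private final String kind;

    private EnvSketch(String kind) { this.kind = kind; }

    public String kind() { return kind; }

    // Returns the cached environment, creating a default one on first use.
    public static EnvSketch getExecutionEnvironment() {
        if (current == null) {
            current = new EnvSketch("default");
        }
        return current;
    }

    // Silently replaces the cached environment, so unrelated code that later
    // calls getExecutionEnvironment() suddenly gets a local environment.
    public static EnvSketch createLocalEnvironment() {
        current = new EnvSketch("local");
        return current;
    }
}
```

With this structure, two independent jobs built in the same JVM can end up sharing (or unexpectedly swapping) their environment, which is exactly the confusion the issue raises.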
[jira] [Commented] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged
[ https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681580#comment-14681580 ] ASF GitHub Bot commented on FLINK-2277: --- Github user PieterJanVanAeken commented on the pull request: https://github.com/apache/flink/pull/1005#issuecomment-129817057 I changed the implementation to overloaded methods and added one for integer-based keyFields, a method I overlooked in my previous commit. In Scala API delta Iterations can not be set to unmanaged - Key: FLINK-2277 URL: https://issues.apache.org/jira/browse/FLINK-2277 Project: Flink Issue Type: Improvement Components: Scala API Reporter: Aljoscha Krettek Assignee: PJ Van Aeken Labels: Starter DeltaIteration.java has a method solutionSetUnManaged(). In the Scala API, this could be added as an optional parameter on iterateDelta() on the DataSet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2277] In Scala API delta Iterations can...
Github user PieterJanVanAeken commented on the pull request: https://github.com/apache/flink/pull/1005#issuecomment-129817057 I changed the implementation to overloaded methods and added one for integer-based keyFields, a method I overlooked in my previous commit.
[jira] [Commented] (FLINK-2500) Some reviews to improve code quality
[ https://issues.apache.org/jira/browse/FLINK-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681629#comment-14681629 ] ASF GitHub Bot commented on FLINK-2500: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1001 Some reviews to improve code quality Key: FLINK-2500 URL: https://issues.apache.org/jira/browse/FLINK-2500 Project: Flink Issue Type: Improvement Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 672h Remaining Estimate: 672h I reviewed the Streaming module and have some suggestions to improve the code quality (i.e., reduce the complexity of loops, fix memory leaks, ...). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2357) New JobManager Runtime Web Frontend
[ https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681628#comment-14681628 ] ASF GitHub Bot commented on FLINK-2357: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1006 New JobManager Runtime Web Frontend --- Key: FLINK-2357 URL: https://issues.apache.org/jira/browse/FLINK-2357 Project: Flink Issue Type: New Feature Components: Webfrontend Affects Versions: 0.10 Reporter: Stephan Ewen Attachments: Webfrontend Mockup.pdf We need to rework the JobManager web frontend. The current web frontend is limited and has a lot of design issues:
- It does not display any progress while operators are running. This is especially problematic for streaming jobs.
- It has no graph representation of the data flows.
- It does not allow looking into execution attempts.
- It has no hook to deal with the upcoming live accumulators.
- The architecture is not very modular/extensible.
I propose to add a new JobManager web frontend:
- Based on Netty HTTP (very lightweight)
- Using REST-style URLs for jobs and vertices
- Integrating the D3 graph renderer of the previews with the runtime monitor
- With details on execution attempts
- First-class visualization of records processed and bytes processed
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2277] In Scala API delta Iterations can...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1005
[GitHub] flink pull request: [FLINK-2502] FiniteStormSpout documentation doe...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1002
[GitHub] flink pull request: [FLINK-2357] Update Node installation instruct...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1006
[jira] [Commented] (FLINK-2502) FiniteStormSpout documentation does not render correctly
[ https://issues.apache.org/jira/browse/FLINK-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681627#comment-14681627 ] ASF GitHub Bot commented on FLINK-2502: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/1002 FiniteStormSpout documentation does not render correctly --- Key: FLINK-2502 URL: https://issues.apache.org/jira/browse/FLINK-2502 Project: Flink Issue Type: Bug Components: Documentation Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Trivial Attachments: screenshot.png The code examples do not render correctly, due to missing empty lines. !screenshot.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129849477 Yes, existing formats should be converted to rich formats. The name `Abstract..` makes more sense if it becomes the new default way of interfacing with the rich input/output formats. I still think calling it `Rich...` makes more sense in terms of consistency with the user-defined functions.
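The `Rich`-vs-`Abstract` naming debate above is about a base class that implements the `RuntimeContext` plumbing once while leaving the actual I/O methods abstract. A minimal, self-contained sketch of that idea; the class and method names here are illustrative assumptions, not the code that was eventually merged into Flink, and `RuntimeContext` is reduced to a one-method stub:

```java
// Stub of the context a format would receive at runtime
// (assumption: reduced to a single method for illustration).
interface RuntimeContext {
    int getIndexOfThisSubtask();
}

// Hypothetical base class: implements RuntimeContext access once,
// leaving the input-format contract abstract for subclasses.
abstract class RichInputFormat<OT> {

    private transient RuntimeContext runtimeContext;

    public void setRuntimeContext(RuntimeContext context) {
        this.runtimeContext = context;
    }

    public RuntimeContext getRuntimeContext() {
        if (runtimeContext == null) {
            throw new IllegalStateException("The RuntimeContext has not been set yet.");
        }
        return runtimeContext;
    }

    // The I/O methods stay abstract, as with the plain InputFormat interface.
    public abstract void open(int splitNumber);
    public abstract boolean reachedEnd();
    public abstract OT nextRecord(OT reuse);
}
```

Existing formats could then extend such a class to become "rich" without changing their I/O logic, which is exactly the compatibility question debated in this thread.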
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681689#comment-14681689 ] ASF GitHub Bot commented on FLINK-1819: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129849477 Yes, existing formats should be converted to rich formats. The name `Abstract..` makes more sense if it becomes the new default way of interfacing with the rich input/output formats. I still think calling it `Rich...` makes more sense in terms of consistency with the user-defined functions. Allow access to RuntimeContext from Input and OutputFormats --- Key: FLINK-1819 URL: https://issues.apache.org/jira/browse/FLINK-1819 Project: Flink Issue Type: Improvement Components: Local Runtime Affects Versions: 0.9, 0.8.1 Reporter: Fabian Hueske Assignee: Sachin Goel Priority: Minor Fix For: 0.9 User functions that extend a RichFunction can access a {{RuntimeContext}}, which gives the parallel id of the task and access to Accumulators and BroadcastVariables. Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.
[jira] [Closed] (FLINK-2500) Some reviews to improve code quality
[ https://issues.apache.org/jira/browse/FLINK-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2500. --- Some reviews to improve code quality Key: FLINK-2500 URL: https://issues.apache.org/jira/browse/FLINK-2500 Project: Flink Issue Type: Improvement Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 672h Remaining Estimate: 672h I reviewed the Streaming module and there are some suggestions to improve the code quality (i.e. reduce the complexity of the loops, fix memory leaks...).
[jira] [Resolved] (FLINK-2500) Some reviews to improve code quality
[ https://issues.apache.org/jira/browse/FLINK-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2500. - Resolution: Fixed Fixed via b42fbf7a81c5b57dcf9760825edb175ffd944fb2 Thank you for the contribution!
[GitHub] flink pull request: [FLINK-2277] In Scala API delta Iterations can...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1005#issuecomment-129831692 Thanks! I know it is extra code, but allowing existing code to continue running without a recompile is something that people really appreciate. Merging this...
[jira] [Commented] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged
[ https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681600#comment-14681600 ] ASF GitHub Bot commented on FLINK-2277: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1005#issuecomment-129831692 Thanks! I know it is extra code, but allowing existing code to continue running without a recompile is something that people really appreciate. Merging this... In Scala API delta Iterations can not be set to unmanaged - Key: FLINK-2277 URL: https://issues.apache.org/jira/browse/FLINK-2277 Project: Flink Issue Type: Improvement Components: Scala API Reporter: Aljoscha Krettek Assignee: PJ Van Aeken Labels: Starter DeltaIteration.java has the method solutionSetUnManaged(). In the Scala API this could be added as an optional parameter on iterateDelta() on the DataSet.
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681603#comment-14681603 ] ASF GitHub Bot commented on FLINK-1819: --- Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129834217 Okay. I think calling them `Rich` may be overkill. I will change the names to `Abstract`. Do we keep the existing formats *non-rich* or would it be okay to make all of them *rich* [which is the case right now]? The problem with not making them *rich* is that we would then be limiting any extending classes to never be *rich*.
[jira] [Resolved] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged
[ https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2277. - Resolution: Fixed Fix Version/s: 0.10 Fixed via f50ae26a2fb4a0c7f5b390e2f0f5528be9f61730 Thank you for the contribution!
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681674#comment-14681674 ] ASF GitHub Bot commented on FLINK-1819: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129847518 I'm in favor of making the existing formats rich.
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129847518 I'm in favor of making the existing formats rich.
[GitHub] flink pull request: [WIP][FLINK-2386] Add new Kafka Consumers
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/996#issuecomment-129849082 I'll take a stab at checking out this monster ;-)
[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...
Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/977#issuecomment-129854454 @StephanEwen Hi, I pushed a new fix for the exception. Would you please check whether it's correct? There is also another PR (https://github.com/apache/flink/pull/991) whose CI failed. Since I can't see the CI details yet, could you do me a favor and have a look?
[jira] [Commented] (FLINK-2477) Add test for SocketClientSink
[ https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681711#comment-14681711 ] ASF GitHub Bot commented on FLINK-2477: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/977#issuecomment-129854454 @StephanEwen Hi, I pushed a new fix for the exception. Would you please check whether it's correct? There is also another PR (https://github.com/apache/flink/pull/991) whose CI failed. Since I can't see the CI details yet, could you do me a favor and have a look? Add test for SocketClientSink - Key: FLINK-2477 URL: https://issues.apache.org/jira/browse/FLINK-2477 Project: Flink Issue Type: Test Components: Streaming Affects Versions: 0.10 Environment: win7 32bit; linux Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h Add some tests for SocketClientSink.
[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/977#issuecomment-129829151 Looks much better. Two more comments:

1. It would be good to print the exception in addition to failing the test, because otherwise the test only fails and gives no indication why. A typical pattern is:

```java
try {
    // some stuff
} catch (Exception e) {
    e.printStackTrace();
    Assert.fail(e.getMessage());
}
```

Here, the test prints nothing when working properly, but prints the error when it fails.

2. `Assert.fail()` does not work when used in spawned threads. The reason is that JUnit communicates the failure with a special `AssertionFailedError`, which needs to reach the invoking framework. That does not happen if the `fail()` method is called in a spawned thread. Here is how you can do it. It is a bit clumsy, because you need to forward the exception to the main thread, but it works well:

```java
final AtomicReference<Throwable> error = new AtomicReference<>();

Thread thread = new Thread("server thread") {
    @Override
    public void run() {
        try {
            doStuff();
        } catch (Throwable t) {
            error.set(t);
        }
    }
};
thread.start();

// do some test logic

thread.join();

if (error.get() != null) {
    Throwable t = error.get();
    t.printStackTrace();
    fail("Error in spawned thread: " + t.getMessage());
}
```
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user sachingoel0101 commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129834217 Okay. I think calling them `Rich` may be overkill. I will change the names to `Abstract`. Do we keep the existing formats *non-rich* or would it be okay to make all of them *rich* [which is the case right now]? The problem with not making them *rich* is that we would then be limiting any extending classes to never be *rich*.
[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
[ https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681622#comment-14681622 ] ASF GitHub Bot commented on FLINK-2507: --- GitHub user ffbin opened a pull request: https://github.com/apache/flink/pull/1007 [FLINK-2507]Rename the function tansformAndEmit The function name tansformAndEmit in `org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector` is misspelled; transformAndEmit would be better. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ffbin/flink FLINK-2507 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1007.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1007 commit 39058ceb9dd3e5baf96daf1a3d2a9bd050ebf702 Author: ffbin 869218...@qq.com Date: 2015-08-11T10:42:39Z [FLINK-2507]Rename the function tansformAndEmit Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector -- Key: FLINK-2507 URL: https://issues.apache.org/jira/browse/FLINK-2507 Project: Flink Issue Type: Bug Components: flink-contrib Affects Versions: 0.8.1 Reporter: fangfengbin Assignee: fangfengbin Priority: Minor
[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...
GitHub user ffbin opened a pull request: https://github.com/apache/flink/pull/1007 [FLINK-2507]Rename the function tansformAndEmit The function name tansformAndEmit in `org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector` is misspelled; transformAndEmit would be better. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ffbin/flink FLINK-2507 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1007.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1007 commit 39058ceb9dd3e5baf96daf1a3d2a9bd050ebf702 Author: ffbin 869218...@qq.com Date: 2015-08-11T10:42:39Z [FLINK-2507]Rename the function tansformAndEmit
[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...
Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/1007#issuecomment-129850783 Why were the file permissions changed from 644 to 755? The other changes seem good.
[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
[ https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681694#comment-14681694 ] ASF GitHub Bot commented on FLINK-2507: --- Github user chiwanpark commented on the pull request: https://github.com/apache/flink/pull/1007#issuecomment-129850783 Why were the file permissions changed from 644 to 755? The other changes seem good.
[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...
Github user ffbin commented on the pull request: https://github.com/apache/flink/pull/1007#issuecomment-129854098 @chiwanpark Thank you very much. I submitted the code on Linux and changed the permissions incorrectly. I will submit again.
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-129864531 @HuangWHWHW `retryForever` is just a convenience variable for `maxRetry < 0`. Your fix is correct because the loop will only execute if `maxRetry > 0` and thus not execute at all if it should retry forever. It would be great if you added a test that checks for the correct number of retries. In case of infinite retries, just check up to a certain number (e.g. 100 retries).
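The retry semantics described in this comment (the loop body only runs for a positive retry budget; a negative budget means retry forever) can be sketched as follows. The class and method names and the `IntPredicate`-based connection attempt are illustrative assumptions for this thread, not the actual `SocketTextStreamFunction` code:

```java
import java.util.function.IntPredicate;

class RetryPolicy {
    /**
     * Runs {@code attempt} until it succeeds or the retry budget is spent,
     * and returns the number of retries performed. Assumed convention,
     * mirroring the discussion above: maxRetry < 0 retries forever,
     * maxRetry == 0 never retries.
     */
    static int runWithRetries(int maxRetry, IntPredicate attempt) {
        int retries = 0;
        // For maxRetry >= 0 the loop body runs at most maxRetry times;
        // for maxRetry < 0 the bound never applies (retry forever).
        while ((maxRetry < 0 || retries < maxRetry) && !attempt.test(retries)) {
            retries++;
        }
        return retries;
    }
}
```

A test along the lines suggested here would assert the retry count for a bounded budget and, for the infinite case, stop checking after a fixed number of attempts (e.g. 100).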
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681771#comment-14681771 ] ASF GitHub Bot commented on FLINK-2490: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-129864531 @HuangWHWHW `retryForever` is just a convenience variable for `maxRetry < 0`. Your fix is correct because the loop will only execute if `maxRetry > 0` and thus not execute at all if it should retry forever. It would be great if you added a test that checks for the correct number of retries. In case of infinite retries, just check up to a certain number (e.g. 100 retries). Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681594#comment-14681594 ] ASF GitHub Bot commented on FLINK-1819: --- Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129830222 Okay, so how about calling the abstract base class `AbstractInputFormat` and having it implement the runtime context while leaving the other methods abstract?
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129830222 Okay, so how about calling the abstract base class `AbstractInputFormat` and having it implement the runtime context while leaving the other methods abstract?
[jira] [Commented] (FLINK-2423) Properly test checkpoint notifications
[ https://issues.apache.org/jira/browse/FLINK-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681704#comment-14681704 ] ASF GitHub Bot commented on FLINK-2423: --- Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/980#issuecomment-129851909 Merging... Properly test checkpoint notifications -- Key: FLINK-2423 URL: https://issues.apache.org/jira/browse/FLINK-2423 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora Assignee: Márton Balassi Checkpoint notifications (via the CheckpointNotifier interface) are currently not properly tested. A test should be included to verify that checkpoint notifications are eventually called on successful checkpoints, and that they are only called once per checkpointID.
[GitHub] flink pull request: [FLINK-2423] [streaming] ITCase for checkpoint...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/980#issuecomment-129851909 Merging...
[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129826022 @StephanEwen @fhueske The `Rich` prefix might seem a bit odd, but it is a naming convention that is consistent with the user-defined functions which have the RuntimeContext available. As for abstract classes vs. interfaces, I agree with @sachingoel0101 that it makes sense to implement access to the RuntimeContext once for the user. We may also add other rich functions; if we implemented that using an interface we would have to change it later on or add another interface. I think Input/Output formats should be rich by default. However, we might not want to break the API for existing external implementations.
[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats
[ https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681584#comment-14681584 ] ASF GitHub Bot commented on FLINK-1819: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/966#issuecomment-129826022 @StephanEwen @fhueske The `Rich` prefix might seem a bit odd, but it is a naming convention that is consistent with the user-defined functions which have the RuntimeContext available. As for abstract classes vs. interfaces, I agree with @sachingoel0101 that it makes sense to implement access to the RuntimeContext once for the user. We may also add other rich functions; if we implemented that using an interface we would have to change it later on or add another interface. I think Input/Output formats should be rich by default. However, we might not want to break the API for existing external implementations.
[jira] [Closed] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged
[ https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2277. ---
[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
[ https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681708#comment-14681708 ] ASF GitHub Bot commented on FLINK-2507: --- Github user ffbin commented on the pull request: https://github.com/apache/flink/pull/1007#issuecomment-129854098 @chiwanpark Thank you very much. I submitted the code on Linux and changed the permissions incorrectly. I will submit again.
[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...
Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-129875331 @mxm Ok, I'll add a test. One small difficulty is that I can't get the retry count in the test, since the retry counter is a local variable. So can I add a function to get the retry count?
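One way to do what is asked here: promote the local retry counter to a (volatile) field and expose a getter, so the test thread can observe it. A hypothetical sketch, not the actual SocketTextStreamFunction; the retry-forever convention (`maxRetry < 0`) is assumed from the discussion above:

```java
import java.util.function.IntPredicate;

// Hypothetical source whose retry counter is observable from a test.
class RetryingSource {
    private volatile int retries;   // volatile: may be read from the test thread

    /** Getter the test can call to check the number of retries performed. */
    public int getRetries() {
        return retries;
    }

    void run(int maxRetry, IntPredicate connect) {
        retries = 0;
        // Assumed convention: maxRetry < 0 means retry forever.
        while ((maxRetry < 0 || retries < maxRetry) && !connect.test(retries)) {
            retries++;
        }
    }
}
```

The design choice is simply to move the counter from a local variable into object state; the behavior of the loop is unchanged.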
[jira] [Commented] (FLINK-2423) Properly test checkpoint notifications
[ https://issues.apache.org/jira/browse/FLINK-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681827#comment-14681827 ] ASF GitHub Bot commented on FLINK-2423: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/980
[jira] [Created] (FLINK-2509) Improve error messages when user code classes are not found
Stephan Ewen created FLINK-2509: --- Summary: Improve error messages when user code classes are not found Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink Issue Type: Bug Components: Core Affects Versions: 0.10 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.10 When a job fails because the user code classes are not found, we should add some information about the class loader and class path into the exception message.
[jira] [Comment Edited] (FLINK-2501) [py] Remove the need to specify types for transformations
[ https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680732#comment-14680732 ] Chesnay Schepler edited comment on FLINK-2501 at 8/11/15 2:49 PM: -- hm. one question that still remains is how would we tell the Java API what size the emitted tuples have? This is the primary reason i didn't list such a solution. (3) has really nice things going for it: you save bandwidth as you don't send separate keys around; you save computation power by not having to extract keys; you reduce complexity since you don't have to alter the program plan or discard/hide the keys. But unless a solution for the above issue is brought up (2) seems like the way to go. Unless I'm misunderstanding something. Regarding your other points: projections is the only operation i see right now that wouldn't need a special implementation with (3), as it allows access to individual fields. to skip the sort implementation one could modify (2) to work on a tuple of keys. so for grouped operations, instead of Tuple2byte[], byte[] you work on a Tuple2TupleXbyte[],.., byte[]. this would make sorts equivalent for (2) and (3). and yes,the binary data would contain type information, was (Author: zentol): hm. one question that still remains is how would we tell the Java API whether a UDF emits a basic type (which would just be a byte[]) or an arbitrarily nested tuple? This is the primary reason i didn't list such a solution. (3) has really nice things going for it: you save bandwidth as you don't send separate keys around; you save computation power by not having to extract keys; you reduce complexity since you don't have to alter the program plan or discard/hide the keys. But unless a solution for the above issue is brought up (2) seems like the way to go. Unless I'm misunderstanding something. 
Regarding your other points: projections are the only operation I see right now that wouldn't need a special implementation with (3), as they allow access to individual fields. To skip the sort implementation, one could modify (2) to work on a tuple of keys: for grouped operations, instead of a Tuple2<byte[], byte[]> you work on a Tuple2<TupleX<byte[], ...>, byte[]>. This would make sorts equivalent for (2) and (3). And yes, the binary data would contain type information.
[jira] [Commented] (FLINK-2501) [py] Remove the need to specify types for transformations
[ https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681906#comment-14681906 ] Stephan Ewen commented on FLINK-2501: - Can the Java API always think in Tuple2 (key/value), with the Python API interpreting them? I am not too familiar with Python and the Python API, so I am just thinking out loud here ;-) [py] Remove the need to specify types for transformations - Key: FLINK-2501 URL: https://issues.apache.org/jira/browse/FLINK-2501 Project: Flink Issue Type: Improvement Components: Python API Reporter: Chesnay Schepler Currently, users of the Python API have to provide type arguments when using a UDF, like so: {code} d1.map(Mapper(), (INT, STRING)) {code} Instead, it would be really convenient to be able to do this: {code} d1.map(Mapper()) {code} The intention behind this issue is convenience; it's also not really pythonic to specify types. Before I go into possible solutions, let me summarize the way these type arguments are currently used, and in general how types are handled: The type argument passed is actually an object of the type it represents: INT is a constant int value, whereas STRING is a constant string value. You could just as well write the following and it would still work: {code} d1.map(Mapper(), (1, ImNotATypInfo)) {code} This object is transmitted to the Java side during plan binding (where it becomes an actual Tuple2<Integer, String>), then passed to the type extractor, and the resulting TypeInformation is saved in the Java counterpart of the UDF; these counterparts all implement the ResultTypeQueryable interface. The TypeInformation object is only used by the Java API; Python never touches it. Instead, at runtime, the serializers used between Python and Java check the classes of the values passed and are thus generated dynamically.
This means that, if a UDF does not pass the type it claims to pass, the Python API won't complain, but the underlying Java API will when its serializers fail. Now let's talk solutions. In discussions on the mailing list, pretty much two proposals were made: # Add a way to disable/circumvent type checks during the plan phase in the Java API and generate serializers dynamically. # Have objects always in serialized form on the Java side, stored in a single byte array or a Tuple2 containing a key/value pair. These proposals vary wildly in the changes necessary to the system: # How can we change the Java API to support this? This proposal would hardly change the way the Python API works, or even touch the related source code; it mostly deals with the Java API. Since I'm not too familiar with the plan processing life-cycle on the Java side, I can't assess which classes would have to be changed. # How can we make this work within the limits of the Java API? is the exact opposite: it changes nothing in the Java API. Instead, the following issues would have to be solved: * Alter the plan to extract keys before keyed operations, while hiding these keys from the UDF. This is exactly how KeySelectors (will) work, and as such is generally solved. In fact, this solution would make a few things easier with regard to KeySelectors. * Rework all operations that currently rely on Java API functions that need deserialized data, for example Projections or the upcoming Aggregations; this generally means implementing them in Python, or with special Java UDFs (they could de-/serialize data within the UDF call, or work on serialized data).
* Change (de)serializers accordingly. * Implement a reliable, not all-memory-consuming sorting mechanism on the Python side. Personally I prefer the second option, as it # does not modify the Java API; it works within its well-tested limits # involves plan changes similar to issues that are already being worked on (KeySelectors) # needs a sorting implementation that was necessary anyway (for chained reducers) # keeps data in serialized form, which was a performance-related consideration already. While the first option could work, and would most likely require less work, I feel like many of the things required for option 2 will be implemented eventually anyway.
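The heart of the modified proposal (2) is that grouping and sorting only ever compare serialized key bytes, so the Java side needs no Python type information at all. A minimal, self-contained sketch of that idea in plain Java (illustrative only; these are not the actual Flink operators or the Python API runtime):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SerializedGrouping {
    // Lexicographic comparator over raw key bytes. This is all a
    // grouping/sorting operator needs: no deserialization, no type info.
    static final Comparator<byte[]> BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) return c;
        }
        return Integer.compare(a.length, b.length);
    };

    public static void main(String[] args) {
        // Records arrive as (serialized key, serialized value) pairs;
        // the values stay opaque byte[] from Java's point of view.
        List<Map.Entry<byte[], byte[]>> records = List.of(
            Map.entry("b".getBytes(StandardCharsets.UTF_8), "v1".getBytes(StandardCharsets.UTF_8)),
            Map.entry("a".getBytes(StandardCharsets.UTF_8), "v2".getBytes(StandardCharsets.UTF_8)),
            Map.entry("a".getBytes(StandardCharsets.UTF_8), "v3".getBytes(StandardCharsets.UTF_8)));

        // Group (and implicitly sort) purely by key bytes.
        TreeMap<byte[], List<byte[]>> groups = new TreeMap<>(BYTES);
        for (Map.Entry<byte[], byte[]> r : records) {
            groups.computeIfAbsent(r.getKey(), k -> new ArrayList<>()).add(r.getValue());
        }

        for (Map.Entry<byte[], List<byte[]>> g : groups.entrySet()) {
            System.out.println(new String(g.getKey(), StandardCharsets.UTF_8)
                + " -> " + g.getValue().size() + " value(s)");
        }
    }
}
```

With a composite key (the Tuple2<TupleX<byte[], ...>, byte[]> variant), the same comparison is simply applied field by field over the key tuple.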
[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681842#comment-14681842 ] ASF GitHub Bot commented on FLINK-2509: --- GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/1008 [FLINK-2509] Add class loader info to exception message when user code classes are not found This pull request adds the following type of info to the exception message:
```
Cannot load user class: com.foo.bar.SomeClass
ClassLoader info: URL ClassLoader:
-- http://localhost:26712/some/file/path
-- file: '/tmp/flink-url-test2825910185710886145.tmp' (valid JAR)
-- file: '/tmp/flink-url-test4914469684250387785.tmp' (invalid JAR: error in opening zip file)
-- file: '/tmp/flink-url-test5462325335282720117.tmp' (missing)
```
This should help diagnose why classes sometimes cannot be resolved, even though the JAR files seem valid. You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink class_loader_info Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/1008.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1008 commit 00e123709f304bd7f602989f28b111b8d282ebb1 Author: Stephan Ewen se...@apache.org Date: 2015-08-11T14:07:22Z [FLINK-2509] [runtime] Add class loader info into the exception message when user code classes are not found.
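The classification behind this diagnostic output (valid JAR / invalid JAR / missing) can be sketched as follows. This is an illustrative reimplementation, not the actual Flink code; the method name `classLoaderInfo` is hypothetical:

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.jar.JarFile;

public class ClassLoaderDiagnostics {

    // Builds a human-readable summary of a URLClassLoader's class path,
    // flagging entries that are missing or cannot be opened as JAR files.
    static String classLoaderInfo(ClassLoader cl) {
        if (!(cl instanceof URLClassLoader)) {
            return cl == null ? "bootstrap class loader" : cl.getClass().getName();
        }
        StringBuilder sb = new StringBuilder("URL ClassLoader:");
        for (URL url : ((URLClassLoader) cl).getURLs()) {
            sb.append("\n-- ");
            if ("file".equals(url.getProtocol())) {
                File f = new File(url.getPath());
                if (!f.exists()) {
                    sb.append("file: '").append(f).append("' (missing)");
                } else {
                    // Probe the entry by actually opening it as a JAR.
                    try (JarFile jar = new JarFile(f)) {
                        sb.append("file: '").append(f).append("' (valid JAR)");
                    } catch (IOException e) {
                        sb.append("file: '").append(f)
                          .append("' (invalid JAR: ").append(e.getMessage()).append(')');
                    }
                }
            } else {
                sb.append("url: '").append(url).append('\'');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        URLClassLoader cl = new URLClassLoader(new URL[]{
            new File("/tmp/does-not-exist.jar").toURI().toURL()});
        System.out.println("Cannot load user class: com.foo.bar.SomeClass\n"
            + "ClassLoader info: " + classLoaderInfo(cl));
    }
}
```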
[jira] [Comment Edited] (FLINK-2501) [py] Remove the need to specify types for transformations
[ https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681911#comment-14681911 ] Chesnay Schepler edited comment on FLINK-2501 at 8/11/15 3:02 PM: -- Yes. Modified (2) would mean that you always have either a Tuple2<TupleX<byte[], ...>, byte[]> or a byte[]. Which case applies can be deduced from the program plan without user input. was (Author: zentol): Yes. Modified (2) would mean that you have a Tuple2<TupleX<byte[], ...>, byte[]>.
[jira] [Commented] (FLINK-2501) [py] Remove the need to specify types for transformations
[ https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681911#comment-14681911 ] Chesnay Schepler commented on FLINK-2501: - Yes. Modified (2) would mean that you have a Tuple2<TupleX<byte[], ...>, byte[]>.
[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket
[ https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681802#comment-14681802 ] ASF GitHub Bot commented on FLINK-2490: --- Github user HuangWHWHW commented on the pull request: https://github.com/apache/flink/pull/992#issuecomment-129875331 @mxm Ok, I'll add a test. There is a small difficulty: I can't get the retry count in the test, since the retry counter is a local variable. So can I add a function to get the retry count? Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket --- Key: FLINK-2490 URL: https://issues.apache.org/jira/browse/FLINK-2490 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.10 Reporter: Huang Wei Priority: Minor Fix For: 0.10 Original Estimate: 168h Remaining Estimate: 168h
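The refactoring asked about above, promoting the local retry variable to a field with a getter so a test can observe it, could look roughly like this. A sketch with hypothetical names, not the actual SocketTextStreamFunction internals:

```java
import java.util.function.BooleanSupplier;

public class RetryingConnector {
    // Was a local variable; promoted to a field so tests can read it.
    private volatile int retries;

    public int getRetries() {
        return retries;
    }

    // Tries to connect, retrying up to maxRetries times after the first
    // failure; returns true once an attempt succeeds, false when retries
    // are exhausted.
    public boolean connectWithRetry(int maxRetries, BooleanSupplier attempt) {
        retries = 0;
        while (true) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            if (++retries > maxRetries) {
                return false;
            }
        }
    }

    public static void main(String[] args) {
        RetryingConnector c = new RetryingConnector();
        int[] calls = {0};
        // Simulated connection that fails twice, then succeeds.
        boolean ok = c.connectWithRetry(5, () -> ++calls[0] >= 3);
        System.out.println("connected=" + ok + " retries=" + c.getRetries());
    }
}
```

A test can then assert on `getRetries()` directly instead of parsing logs.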
[GitHub] flink pull request: [FLINK-2423] [streaming] ITCase for checkpoint...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/980 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2499) start-cluster.sh can start multiple TaskManager on the same node
[ https://issues.apache.org/jira/browse/FLINK-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681926#comment-14681926 ] Chen He commented on FLINK-2499: +1. Not just a warning: we need to track each started TM so that we can stop the corresponding TM when we run stop-cluster.sh. Another problem is the use of the nohup command. With multiple TMs under nohup, their output interleaves, which makes debugging hard. We should redirect to a TM.out file instead of nohup.out. It also affects other nohup processes, because they all dump stdout to nohup.out. start-cluster.sh can start multiple TaskManager on the same node Key: FLINK-2499 URL: https://issues.apache.org/jira/browse/FLINK-2499 Project: Flink Issue Type: Bug Affects Versions: 0.8.1 Reporter: Chen He 11562 JobHistoryServer 3251 Main 10596 Jps 17934 RunJar 6879 Main 8837 Main 19215 RunJar 28902 DataNode 6627 TaskManager 642 NodeManager 10408 RunJar 10210 TaskManager 5067 TaskManager 357 ApplicationHistoryServer 3540 RunJar 28501 ResourceManager 28572 SecondaryNameNode 17630 QuorumPeerMain 9069 TaskManager If we keep executing start-cluster.sh, it may generate an unbounded number of TaskManagers on a single system. And the nohup command in start-cluster.sh generates a nohup.out file that disturbs any other nohup processes on the system.
[jira] [Closed] (FLINK-2423) Properly test checkpoint notifications
[ https://issues.apache.org/jira/browse/FLINK-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Márton Balassi closed FLINK-2423. - Resolution: Implemented Fix Version/s: 0.10 Via 10ce2e2. Properly test checkpoint notifications -- Key: FLINK-2423 URL: https://issues.apache.org/jira/browse/FLINK-2423 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora Assignee: Márton Balassi Fix For: 0.10 Checkpoint notifications (via the CheckpointNotifier interface) are currently not properly tested. A test should be included to verify that checkpoint notifications are eventually called on successful checkpoints, and that they are only called once per checkpointID.
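The invariant the test above verifies, that notifications eventually arrive and no checkpoint ID is notified twice, can be sketched independently of Flink (illustrative names; this is not the actual CheckpointNotifier interface):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the property to check: every completed checkpoint is
// acknowledged, and no checkpoint ID is acknowledged more than once.
public class CheckpointNotifyCheck {
    private final Set<Long> notified = new HashSet<>();

    void notifyCheckpointComplete(long checkpointId) {
        // Set.add() returns false on a duplicate, which is exactly
        // the "only once per checkpointID" rule.
        if (!notified.add(checkpointId)) {
            throw new IllegalStateException(
                "duplicate notification for checkpoint " + checkpointId);
        }
    }

    public static void main(String[] args) {
        CheckpointNotifyCheck sink = new CheckpointNotifyCheck();
        for (long id = 1; id <= 3; id++) {
            sink.notifyCheckpointComplete(id);
        }
        System.out.println("notified checkpoints: " + sink.notified.size());
    }
}
```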
[jira] [Commented] (FLINK-2510) KafkaConnector should access partition metadata from master/cluster
[ https://issues.apache.org/jira/browse/FLINK-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681994#comment-14681994 ] Stephan Ewen commented on FLINK-2510: - Currently blocked by [FLINK-2386] KafkaConnector should access partition metadata from master/cluster --- Key: FLINK-2510 URL: https://issues.apache.org/jira/browse/FLINK-2510 Project: Flink Issue Type: Improvement Components: Streaming Connectors Affects Versions: 0.10 Reporter: Stephan Ewen Priority: Minor Currently, the Kafka connector assumes that it can access the partition metadata from the client. There may be setups where this is not possible, and where the access needs to happen from within the cluster (due to firewalls).
[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found
[ https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681990#comment-14681990 ] ASF GitHub Bot commented on FLINK-2509: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/1008#issuecomment-129947817 This is great and will be very helpful while debugging! Only the output is very hard to read. How about this? Just some new lines and indentation:
```
Cannot load user class: com.foo.bar.SomeClass
ClassLoader info:
    URL ClassLoader:
        url: 'http://localhost:26712/some/file/path'
        file: '/tmp/flink-url-test2825910185710886145.tmp' (valid JAR)
        file: '/tmp/flink-url-test4914469684250387785.tmp' (invalid JAR: error in opening zip file)
        file: '/tmp/flink-url-test5462325335282720117.tmp' (missing)
```
I think users are less likely to read exception messages when they have to scroll horizontally. We usually don't include newlines in our exceptions.
[jira] [Created] (FLINK-2510) KafkaConnector should access partition metadata from master/cluster
Stephan Ewen created FLINK-2510: --- Summary: KafkaConnector should access partition metadata from master/cluster Key: FLINK-2510 URL: https://issues.apache.org/jira/browse/FLINK-2510 Project: Flink Issue Type: Improvement Components: Streaming Connectors Affects Versions: 0.10 Reporter: Stephan Ewen Priority: Minor Currently, the Kafka connector assumes that it can access the partition metadata from the client. There may be setups where this is not possible, and where the access needs to happen from within the cluster (due to firewalls).
[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...
Github user StephanEwen commented on the pull request: https://github.com/apache/flink/pull/1008#issuecomment-129955067 Good comment, will adjust this.
[GitHub] flink pull request: [FLINK-2451] [gelly] examples and library clea...
Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/1000#issuecomment-130002982 Apart from the minor comment, this looks good to merge.
[GitHub] flink pull request: [FLINK-2451] [gelly] examples and library clea...
Github user andralungu commented on a diff in the pull request: https://github.com/apache/flink/pull/1000#discussion_r36779494
--- Diff: flink-staging/flink-gelly/src/test/java/org/apache/flink/graph/test/GatherSumApplyITCase.java ---
@@ -86,26 +72,35 @@ public void testConnectedComponents() throws Exception {
 	@Test
 	public void testSingleSourceShortestPaths() throws Exception {
-		GSASingleSourceShortestPaths.main(new String[]{"1", edgesPath, resultPath, "16"});
-		expectedResult = "1 0.0\n" +
-				"2 12.0\n" +
-				"3 13.0\n" +
-				"4 47.0\n" +
-				"5 48.0\n" +
-				"6 Infinity\n" +
-				"7 Infinity\n";
+		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+
+		Graph<Long, Double, Double> inputGraph = Graph.fromDataSet(
+				SingleSourceShortestPathsData.getDefaultEdgeDataSet(env),
+				new InitMapperSSSP(), env);
+
+		List<Vertex<Long, Double>> result = inputGraph.run(new GSASingleSourceShortestPaths<Long>(1l, 16))
+				.getVertices().collect();
--- End diff --
Is there a specific reason for which we decided to use `collect()` in the tests? It does not seem to be a consistency issue. The other tests (in Flink) are still using `compareResultsByLinesInMemory()`. Do we gain anything?
[GitHub] flink pull request: [Flink-Gelly] [example] added missing assumpti...
Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/883#issuecomment-130003492 I guess the only request that was not fulfilled for this PR was to squash the commits. Then it should be fine.
[jira] [Commented] (FLINK-2451) Cleanup Gelly examples
[ https://issues.apache.org/jira/browse/FLINK-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682220#comment-14682220 ] ASF GitHub Bot commented on FLINK-2451: --- Github user andralungu commented on a diff in the pull request: https://github.com/apache/flink/pull/1000#discussion_r36779494 Cleanup Gelly examples -- Key: FLINK-2451 URL: https://issues.apache.org/jira/browse/FLINK-2451 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 0.10 Reporter: Vasia Kalavri Assignee: Vasia Kalavri Priority: Minor As per discussion in the dev@ mailing list, this issue proposes the following changes to the Gelly examples and library: 1. Keep the following examples as they are: EuclideanGraphWeighing, GraphMetrics, IncrementalSSSP, JaccardSimilarity, MusicProfiles. 2. Keep only 1 example to show how to use library methods. 3. Add 1 example for vertex-centric iterations. 4.
Keep 1 example for GSA iterations and move the redundant GSA implementations to the library. 5. Improve the examples documentation and refer to the functionality that each of them demonstrates. 6. Port and modify existing example tests accordingly.
[GitHub] flink pull request: [FLINK-1520]Read edges and vertices from CSV f...
Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/847#issuecomment-130014300 Saw this. I will also update the documentation afterwards... Sorry!