[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14687386#comment-14687386
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130078564
  
Okay. I'll hold off on making any changes right now. We should reach a 
consensus first.


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User functions that extend a RichFunction can access a {{RuntimeContext}}, 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.
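
 For illustration, a minimal sketch of the RichFunction-style access that the formats currently lack; the mapper class and the accumulator name below are made up:

{code:java}
import org.apache.flink.api.common.accumulators.LongCounter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

// What a RichFunction can already do -- and what an Input/OutputFormat cannot:
public class CountingMapper extends RichMapFunction<String, String> {
    private LongCounter lineCount;

    @Override
    public void open(Configuration parameters) {
        // The RuntimeContext provides the parallel subtask id,
        // accumulators, and broadcast variables.
        lineCount = getRuntimeContext().getLongCounter("num-lines"); // hypothetical accumulator name
    }

    @Override
    public String map(String value) {
        lineCount.add(1L);
        return value;
    }
}
{code}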



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-130078564
  
Okay. I'll hold off on making any changes right now. We should reach a 
consensus first.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2512) Add client.close() before throw RuntimeException

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692814#comment-14692814
 ] 

ASF GitHub Bot commented on FLINK-2512:
---

GitHub user ffbin opened a pull request:

https://github.com/apache/flink/pull/1009

[FLINK-2512]Add client.close() before throw RuntimeException

At line 129, the client is closed in the finally block before the exception 
is thrown. But at line 105, the exception is thrown without closing the 
client. So I think it is better to close the client before throwing the 
RuntimeException.
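
A minimal sketch of the pattern described above; the types and method names are hypothetical placeholders, not the actual flink-contrib code:

{code:java}
// Sketch with hypothetical placeholder types/names.
interface Client extends AutoCloseable {
    boolean isHealthy();
    void doWork();
    void close(); // no checked exception, for brevity
}

class ClientRunner {
    static void runWithClient(Client client) {
        try {
            if (!client.isHealthy()) {
                // previously: the exception was thrown without closing the client
                throw new RuntimeException("Client failed to initialize");
            }
            client.doWork();
        } finally {
            client.close(); // runs whether we return normally or throw
        }
    }
}
{code}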

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ffbin/flink FLINK-2512

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1009.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1009


commit 7510fdb6a8a082d8be85ba7968e9e2760edf2af0
Author: ffbin 869218...@qq.com
Date:   2015-08-12T02:06:21Z

[FLINK-2512]Add client.close() before throw RuntimeException




 Add client.close() before throw RuntimeException
 

 Key: FLINK-2512
 URL: https://issues.apache.org/jira/browse/FLINK-2512
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2506) HBase connection closing down (table distributed over more than 1 region server - Flink Cluster-Mode)

2015-08-11 Thread Lydia Ickler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lydia Ickler updated FLINK-2506:

Description: 
If I fill a default table (create 'test-table', 'someCf') with the 
HBaseWriteExample.java program from the HBase addon library then a table 
without start/end key is created. 
The data reading works great with the HBaseReadExample.java.

Nevertheless, if I manually create a test-table that is distributed over more 
than one region server (create 'test-table2', 'someCf', 
{NUMREGIONS => 3, SPLITALGO => 'HexStringSplit'}), the run is canceled with the 
following error message: 

grips2
Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=35, exceptions:
Fri Aug 07 11:18:29 CEST 2015, 
org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: 
hconnection-0x47bf79d7 closed
[... the same "hconnection-0x47bf79d7 closed" retry entry repeats, with 
timestamps from 11:18:38 through 11:26:40, where the message is truncated ...]

[jira] [Commented] (FLINK-2077) Rework Path class and add extended support for Windows paths

2015-08-11 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681419#comment-14681419
 ] 

Fabian Hueske commented on FLINK-2077:
--

Hi [~gallenvara_bg], thanks for picking this issue.
I'll assign it to you.

Let me know if you have questions.

 Rework Path class and add extended support for Windows paths
 --

 Key: FLINK-2077
 URL: https://issues.apache.org/jira/browse/FLINK-2077
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.9
Reporter: Fabian Hueske
Priority: Minor
  Labels: starter

 The class {{org.apache.flink.core.fs.Path}} handles paths for Flink's 
 {{FileInputFormat}} and {{FileOutputFormat}}. Over time, this class has 
 become quite hard to read and modify. 
 It would benefit from some cleaning and refactoring. Along with the 
 refactoring, support for Windows paths like {{//host/dir1/dir2}} could be 
 added.
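
 For context, a short sketch of the path forms involved; whether {{Path}} parses the UNC-style form correctly today is exactly what this ticket questions:

{code:java}
import org.apache.flink.core.fs.Path;

public class PathExamples {
    public static void main(String[] args) {
        // Ordinary paths that Path already handles:
        Path local = new Path("file:///tmp/input");
        Path hdfs = new Path("hdfs://namenode:9000/data/input");
        // Windows/UNC-style path from the ticket description:
        Path unc = new Path("//host/dir1/dir2");
        System.out.println(local + " | " + hdfs + " | " + unc);
    }
}
{code}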



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1901) Create sample operator for Dataset

2015-08-11 Thread Theodore Vasiloudis (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Theodore Vasiloudis updated FLINK-1901:
---
Description: 
In order to be able to implement Stochastic Gradient Descent and a number of 
other machine learning algorithms we need to have a way to take a random sample 
from a Dataset.

We need to be able to sample with or without replacement from the Dataset, 
choose the relative or exact size of the sample, set a seed for 
reproducibility, and support sampling within iterations.
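
A minimal sketch of what relative-size sampling without replacement could look like, built only from a seeded filter; the fraction and seed values are illustrative:

{code:java}
import java.util.Random;
import org.apache.flink.api.common.functions.RichFilterFunction;
import org.apache.flink.configuration.Configuration;

// Bernoulli sampling: keep each element independently with probability
// 'fraction'. Seeding with the subtask index keeps parallel runs reproducible.
public class BernoulliSample<T> extends RichFilterFunction<T> {
    private final double fraction;
    private final long seed;
    private transient Random random;

    public BernoulliSample(double fraction, long seed) {
        this.fraction = fraction;
        this.seed = seed;
    }

    @Override
    public void open(Configuration parameters) {
        random = new Random(seed + getRuntimeContext().getIndexOfThisSubtask());
    }

    @Override
    public boolean filter(T value) {
        return random.nextDouble() < fraction;
    }
}

// Usage (illustrative): DataSet<T> sample = input.filter(new BernoulliSample<T>(0.1, 42L));
{code}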

  was:
In order to be able to implement Stochastic Gradient Descent and a number of 
other machine learning algorithms we need to have a way to take a random sample 
from a Dataset.

We need to be able to sample with or without replacement from the Dataset, 
choose the relative size of the sample, and set a seed for reproducibility.


 Create sample operator for Dataset
 --

 Key: FLINK-1901
 URL: https://issues.apache.org/jira/browse/FLINK-1901
 Project: Flink
  Issue Type: Improvement
  Components: Core
Reporter: Theodore Vasiloudis
Assignee: Chengxiang Li

 In order to be able to implement Stochastic Gradient Descent and a number of 
 other machine learning algorithms we need to have a way to take a random 
 sample from a Dataset.
 We need to be able to sample with or without replacement from the Dataset, 
 choose the relative or exact size of the sample, set a seed for 
 reproducibility, and support sampling within iterations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681562#comment-14681562
 ] 

ASF GitHub Bot commented on FLINK-2277:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1005#issuecomment-129805238
  
I like the change, but I would like to change it to an overloaded method, 
rather than adding a parameter with a default value.

As it is now, it breaks binary API compatibility. With an 
overloaded method, it would not.
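
To illustrate the binary-compatibility point, a sketch with a simplified stand-in API (not the actual DataSet method):

{code:java}
// Replacing iterate(int) with iterate(int, boolean) removes the old signature,
// so jars compiled against it fail at runtime with NoSuchMethodError.
// An overload keeps the old signature intact:
public class IterationBuilder {
    public IterationBuilder iterate(int maxIterations) {
        return iterate(maxIterations, false); // delegate with the default value
    }

    public IterationBuilder iterate(int maxIterations, boolean solutionSetUnManaged) {
        // ... build the iteration, honoring the unmanaged flag
        return this;
    }
}
{code}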


 In Scala API delta Iterations can not be set to unmanaged
 -

 Key: FLINK-2277
 URL: https://issues.apache.org/jira/browse/FLINK-2277
 Project: Flink
  Issue Type: Improvement
  Components: Scala API
Reporter: Aljoscha Krettek
Assignee: PJ Van Aeken
  Labels: Starter

 DeltaIteration.java has the method solutionSetUnManaged(). In the Scala API this 
 could be added as an optional parameter on iterateDelta() on the DataSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2277] In Scala API delta Iterations can...

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1005#issuecomment-129805238
  
I like the change, but I would like to change it to an overloaded method, 
rather than adding a parameter with a default value.

As it is now, it breaks binary API compatibility. With an 
overloaded method, it would not.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2499) start-cluster.sh can start multiple TaskManager on the same node

2015-08-11 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681438#comment-14681438
 ] 

Maximilian Michels commented on FLINK-2499:
---

{{start-cluster.sh}} only starts a task manager when executed more than once. 
We should give the user a warning that an existing cluster is still running. 
Currently, it says

{noformat}
Starting jobmanager daemon on host dataslave.
Starting taskmanager daemon on host dataslave.
{noformat}

However, it only starts the task manager because the new job manager fails and 
quits afterwards.
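
A hedged sketch of the kind of guard the script could add; the pid-file path and variable name are made up for illustration:

{noformat}
# Sketch: refuse to start a second JobManager on this host.
PIDFILE="$FLINK_PID_DIR/flink-jobmanager.pid"   # illustrative path
if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo "[WARNING] A JobManager is already running (pid $(cat "$PIDFILE")). Not starting a new one."
    exit 1
fi
{noformat}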

 start-cluster.sh can start multiple TaskManager on the same node
 

 Key: FLINK-2499
 URL: https://issues.apache.org/jira/browse/FLINK-2499
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Chen He

 11562 JobHistoryServer
 3251 Main
 10596 Jps
 17934 RunJar
 6879 Main
 8837 Main
 19215 RunJar
 28902 DataNode
 6627 TaskManager
 642 NodeManager
 10408 RunJar
 10210 TaskManager
 5067 TaskManager
 357 ApplicationHistoryServer
 3540 RunJar
 28501 ResourceManager
 28572 SecondaryNameNode
 17630 QuorumPeerMain
 9069 TaskManager
 If we keep executing start-cluster.sh, it may spawn an unbounded number of 
 TaskManagers on a single system.
 Also, the nohup command in start-cluster.sh generates a nohup.out file 
 that disturbs any other nohup processes on the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2503) Inconsistencies in FileInputFormat hierarchy

2015-08-11 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2503:
--
Description: 
From a thread in the user mailing list (Invalid argument reading a file 
containing a Kryo object).

I think that there are some inconsistencies in the hierarchy of InputFormats.
The BinaryOutputFormat/TypeSerializerInputFormat should somehow inherit the 
behaviour of the FileInputFormat (so respect unsplittable and 
enumerateNestedFiles), but they don't take those flags into account.
Moreover, in the TypeSerializerInputFormat there's a // TODO: fix this shit 
that maybe should be removed or fixed :)

Also, keeping testForUnsplittable and decorateInputStream aligned is somewhat 
dangerous.
And maybe the visibility of getBlockIndexForPosition should be changed to 
protected?

My need was to implement a TypeSerializerInputFormat<RowBundle>, but to achieve 
that I had to make a lot of overrides... am I doing something wrong, or do those 
input formats need improvement? This is my InputFormat code (remark: from the comment 
Copied from FileInputFormat (overriding TypeSerializerInputFormat) onwards, the code 
is copied-and-pasted from FileInputFormat, thus MY code ends there):

{code:java}
public class RowBundleInputFormat extends TypeSerializerInputFormat<RowBundle> {

private static final long serialVersionUID = 1L;
private static final Logger LOG = 
LoggerFactory.getLogger(RowBundleInputFormat.class);

/** The fraction that the last split may be larger than the others. */
private static final float MAX_SPLIT_SIZE_DISCREPANCY = 1.1f;
private boolean objectRead;

public RowBundleInputFormat() {
super(new GenericTypeInfo<RowBundle>(RowBundle.class));
unsplittable = true;
}

@Override
protected FSDataInputStream decorateInputStream(FSDataInputStream 
inputStream, FileInputSplit fileSplit) throws Throwable {
return inputStream;
}

@Override
protected boolean testForUnsplittable(FileStatus pathFile) {
return true;
}

@Override
public void open(FileInputSplit split) throws IOException {
super.open(split);
objectRead = false;
}

@Override
public boolean reachedEnd() throws IOException {
return this.objectRead;
}

@Override
public RowBundle nextRecord(RowBundle reuse) throws IOException {
RowBundle yourObject = super.nextRecord(reuse);
this.objectRead = true; // read only one object
return yourObject;
}

// ---
// Copied from FileInputFormat (overriding TypeSerializerInputFormat)
// ---
@Override
public FileInputSplit[] createInputSplits(int minNumSplits)
throws IOException {...}

private long addNestedFiles(Path path, List<FileStatus> files, long 
length, boolean logExcludedFiles) throws IOException {...}

private int getBlockIndexForPosition(BlockLocation[] blocks, long 
offset, long halfSplitSize, int startIndex) { ... }

}
{code}

  was:
From a thread in the user mailing list (Invalid argument reading a file 
containing a Kryo object).

I think that there are some inconsistencies in the hierarchy of InputFormats.
The BinaryOutputFormat/TypeSerializerInputFormat should somehow inherit the 
behaviour of the FileInputFormat (so respect unsplittable and 
enumerateNestedFiles), but they don't take those flags into account.
Moreover, in the TypeSerializerInputFormat there's a // TODO: fix this shit 
that maybe should be removed or fixed :)

Also, keeping testForUnsplittable and decorateInputStream aligned is somewhat 
dangerous.
And maybe the visibility of getBlockIndexForPosition should be changed to 
protected?

My need was to implement a TypeSerializerInputFormat<RowBundle>, but to achieve 
that I had to make a lot of overrides... am I doing something wrong, or do those 
input formats need improvement? This is my InputFormat code (remark: from the comment 
Copied from FileInputFormat (override TypeSerializerInputFormat) onwards, the code 
is copied-and-pasted from FileInputFormat, thus MY code ends there):

public class RowBundleInputFormat extends TypeSerializerInputFormat<RowBundle> {

private static final long serialVersionUID = 1L;
private static final Logger LOG = 
LoggerFactory.getLogger(RowBundleInputFormat.class);

/** The fraction that the last split may be larger than the others. */
private static final float MAX_SPLIT_SIZE_DISCREPANCY = 1.1f;
private boolean objectRead;

public RowBundleInputFormat() {
super(new 

[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...

2015-08-11 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-129785617
  
@StephanEwen 
Hi,
I fixed all of your review comments except the one about using a single 
thread (for the server and the connection).
I plan to do some more complex tests later.
So, if you think it isn't necessary, I'll change it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2477) Add test for SocketClientSink

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681494#comment-14681494
 ] 

ASF GitHub Bot commented on FLINK-2477:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-129785617
  
@StephanEwen 
Hi,
I fixed all of your review comments except the one about using a single 
thread (for the server and the connection).
I plan to do some more complex tests later.
So, if you think it isn't necessary, I'll change it.


 Add test for SocketClientSink
 -

 Key: FLINK-2477
 URL: https://issues.apache.org/jira/browse/FLINK-2477
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.10
 Environment: win7 32bit;linux
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 Add some tests for SocketClientSink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2357] Update Node installation instruct...

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1006#issuecomment-129802503
  
Super, thank you!

Will merge this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1984] Integrate Flink with Apache Mesos

2015-08-11 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/948#issuecomment-129785986
  
Thanks for the reply at the mailing list. I will try out your PR this week 
and have a look at the code. Sorry for the delay. I needed to clear some more 
time, because it is a big addition. :) Thanks for all your effort.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1984) Integrate Flink with Apache Mesos

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681496#comment-14681496
 ] 

ASF GitHub Bot commented on FLINK-1984:
---

Github user uce commented on the pull request:

https://github.com/apache/flink/pull/948#issuecomment-129785986
  
Thanks for the reply at the mailing list. I will try out your PR this week 
and have a look at the code. Sorry for the delay. I needed to clear some more 
time, because it is a big addition. :) Thanks for all your effort.


 Integrate Flink with Apache Mesos
 -

 Key: FLINK-1984
 URL: https://issues.apache.org/jira/browse/FLINK-1984
 Project: Flink
  Issue Type: New Feature
  Components: New Components
Reporter: Robert Metzger
Priority: Minor
 Attachments: 251.patch


 There are some users asking for an integration of Flink into Mesos.
 There also is a pending pull request for adding Mesos support for Flink: 
 https://github.com/apache/flink/pull/251
 But the PR is insufficiently tested. I'll add the code of the pull request to 
 this JIRA in case somebody wants to pick it up in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector

2015-08-11 Thread fangfengbin (JIRA)
fangfengbin created FLINK-2507:
--

 Summary: Rename the function tansformAndEmit in 
org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
 Key: FLINK-2507
 URL: https://issues.apache.org/jira/browse/FLINK-2507
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2477) Add test for SocketClientSink

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681306#comment-14681306
 ] 

ASF GitHub Bot commented on FLINK-2477:
---

Github user HuangWHWHW commented on a diff in the pull request:

https://github.com/apache/flink/pull/977#discussion_r36714718
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketClientSinkTest.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions;
+
+import java.io.IOException;
+import java.net.Socket;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.SocketClientSink;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+
+import org.junit.Test;
+
+import static java.lang.Thread.sleep;
+import static org.junit.Assert.*;
+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+
+/**
+ * Tests for the {@link 
org.apache.flink.streaming.api.functions.sink.SocketClientSink}.
+ */
+public class SocketClientSinkTest{
+
+   private final String host = "127.0.0.1";
+   private int port = ;
+   private String access;
+   public SocketServer.ServerThread th = null;
+
+   class SocketServer extends Thread {
+
+   private ServerSocket server = null;
+   private Socket sk = null;
+   private BufferedReader rdr = null;
+   private PrintWriter wtr = null;
+
+   private SocketServer(int port) {
+   while (port > 0) {
--- End diff --

Sorry, you may ignore this reply.
I've got the picture.


 Add test for SocketClientSink
 -

 Key: FLINK-2477
 URL: https://issues.apache.org/jira/browse/FLINK-2477
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.10
 Environment: win7 32bit;linux
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 Add some tests for SocketClientSink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...

2015-08-11 Thread HuangWHWHW
Github user HuangWHWHW commented on a diff in the pull request:

https://github.com/apache/flink/pull/977#discussion_r36714718
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketClientSinkTest.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions;
+
+import java.io.IOException;
+import java.net.Socket;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.SocketClientSink;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+
+import org.junit.Test;
+
+import static java.lang.Thread.sleep;
+import static org.junit.Assert.*;
+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+
+/**
+ * Tests for the {@link 
org.apache.flink.streaming.api.functions.sink.SocketClientSink}.
+ */
+public class SocketClientSinkTest{
+
+   private final String host = "127.0.0.1";
+   private int port = ;
+   private String access;
+   public SocketServer.ServerThread th = null;
+
+   class SocketServer extends Thread {
+
+   private ServerSocket server = null;
+   private Socket sk = null;
+   private BufferedReader rdr = null;
+   private PrintWriter wtr = null;
+
+   private SocketServer(int port) {
+   while (port > 0) {
--- End diff --

Sorry, you may ignore this reply.
I've got the picture.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1901] [core] Create sample operator for...

2015-08-11 Thread thvasilo
Github user thvasilo commented on the pull request:

https://github.com/apache/flink/pull/949#issuecomment-129744713
  
Agreed, we can take a look at the optimized algorithm later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---



[jira] [Commented] (FLINK-2504) ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed failed spuriously

2015-08-11 Thread Sachin Goel (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681515#comment-14681515
 ] 

Sachin Goel commented on FLINK-2504:


Yes, I had discovered those before I sent a shout-out on IRC. I wanted to know 
if someone else had gotten a failed build on this test before re-opening the 
ticket.

 ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed failed 
 spuriously
 -

 Key: FLINK-2504
 URL: https://issues.apache.org/jira/browse/FLINK-2504
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann

 The test 
 {{ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed}} 
 failed in one of my Travis builds: 
 https://travis-ci.org/tillrohrmann/flink/jobs/74881883



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2506) HBase table that is distributed over more than 1 region server (Flink Cluster-Mode)

2015-08-11 Thread Lydia Ickler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lydia Ickler updated FLINK-2506:

Fix Version/s: (was: 0.10)
  Description: 
If I fill a default table (create 'test-table', 'someCf') with the 
HBaseWriteExample.java program from the HBase addon library then a table 
without start/end key is created. 
The data reading works great with the HBaseReadExample.java.

Nevertheless, if I manually create a test-table that is distributed over more 
than one region server (create 'test-table2', 'someCf', 
{NUMREGIONS => 3, SPLITALGO => 'HexStringSplit'}), the run is canceled 
with the following error message: 

grips2
Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=35, exceptions:
Fri Aug 07 11:18:29 CEST 2015, 
org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: 
hconnection-0x47bf79d7 closed
[... the same "hconnection-0x47bf79d7 closed" retry entry repeats, with 
timestamps from 11:18:38 through 11:26:40, where the message is truncated ...]

[GitHub] flink pull request: [FLINK-1984] Integrate Flink with Apache Mesos

2015-08-11 Thread ankurcha
Github user ankurcha commented on the pull request:

https://github.com/apache/flink/pull/948#issuecomment-129765378
  
@StephanEwen Thanks for the pointer, I replied on the mailing list thread.

Any code-review comments for this pull request?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-2077) Rework Path class and add extended support for Windows paths

2015-08-11 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-2077:
-
Assignee: GaoLun

 Rework Path class and add extended support for Windows paths
 --

 Key: FLINK-2077
 URL: https://issues.apache.org/jira/browse/FLINK-2077
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: GaoLun
Priority: Minor
  Labels: starter

 The class {{org.apache.flink.core.fs.Path}} handles paths for Flink's 
 {{FileInputFormat}} and {{FileOutputFormat}}. Over time, this class has 
 become quite hard to read and modify. 
 It would benefit from some cleaning and refactoring. Along with the 
 refactoring, support for Windows paths like {{//host/dir1/dir2}} could be 
 added.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1984) Integrate Flink with Apache Mesos

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681423#comment-14681423
 ] 

ASF GitHub Bot commented on FLINK-1984:
---

Github user ankurcha commented on the pull request:

https://github.com/apache/flink/pull/948#issuecomment-129765378
  
@StephanEwen Thanks for the pointer, I replied on the mailing list thread.

Any code-review comments for this pull request?


 Integrate Flink with Apache Mesos
 -

 Key: FLINK-1984
 URL: https://issues.apache.org/jira/browse/FLINK-1984
 Project: Flink
  Issue Type: New Feature
  Components: New Components
Reporter: Robert Metzger
Priority: Minor
 Attachments: 251.patch


 There are some users asking for an integration of Flink into Mesos.
 There also is a pending pull request for adding Mesos support for Flink: 
 https://github.com/apache/flink/pull/251
 But the PR is insufficiently tested. I'll add the code of the pull request to 
 this JIRA in case somebody wants to pick it up in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2504) ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed failed spuriously

2015-08-11 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681475#comment-14681475
 ] 

Till Rohrmann commented on FLINK-2504:
--

[~StephanEwen] the output of travis is all I got. There is apparently a problem 
with the watchdog script for my repository which prevented the uploading.

[~sachingoel0101] your stack trace looks similar to FLINK-1455.

 ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed failed 
 spuriously
 -

 Key: FLINK-2504
 URL: https://issues.apache.org/jira/browse/FLINK-2504
 Project: Flink
  Issue Type: Bug
Reporter: Till Rohrmann

 The test 
 {{ExternalSortLargeRecordsITCase.testSortWithLongAndShortRecordsMixed}} 
 failed in one of my Travis builds: 
 https://travis-ci.org/tillrohrmann/flink/jobs/74881883



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2502) FiniteStormSpout documentation does not render correctly

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681563#comment-14681563
 ] 

ASF GitHub Bot commented on FLINK-2502:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1002#issuecomment-129805634
  
Looks good, will merge this...


 FiniteStormSpout documentation does not render correctly
 ---

 Key: FLINK-2502
 URL: https://issues.apache.org/jira/browse/FLINK-2502
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Trivial
 Attachments: screenshot.png


 The code examples do not render correctly, due to missing empty lines.
 
 !screenshot.png!
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2502] FiniteStormSpout documentation doe...

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1002#issuecomment-129805634
  
Looks good, will merge this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-2506) HBase connection closing down (table distributed over more than 1 region server - Flink Cluster-Mode)

2015-08-11 Thread Lydia Ickler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lydia Ickler updated FLINK-2506:

Description: 
If I fill a default table (create 'test-table', 'someCf') with the 
HBaseWriteExample.java program from the HBase addon library then a table 
without start/end key is created. 
The data reading works great with the HBaseReadExample.java.

Nevertheless, if I manually create a test-table that is distributed over more 
than one region server (create 'test-table2', 'someCf', {NUMREGIONS => 
3, SPLITALGO => 'HexStringSplit'}), the run is canceled with the following error 
message: 

grips2
Error: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=35, exceptions:
Fri Aug 07 11:18:29 CEST 2015, 
org.apache.hadoop.hbase.client.RpcRetryingCaller@28961f68, java.io.IOException: 
hconnection-0x47bf79d7 closed
[... the same "hconnection-0x47bf79d7 closed" retry entry repeats, with 
timestamps from 11:18:38 through 11:26:40, where the message is truncated ...]

[jira] [Commented] (FLINK-2499) start-cluster.sh can start multiple TaskManager on the same node

2015-08-11 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681507#comment-14681507
 ] 

Ufuk Celebi commented on FLINK-2499:


I agree with Max. We can print a warning or fail if the user starts a cluster 
and a JM or TM is already running. And we can also check whether the processes 
started up properly before continuing.

Previously this was detected implicitly, because you couldn't start another TM instance, etc.

 start-cluster.sh can start multiple TaskManager on the same node
 

 Key: FLINK-2499
 URL: https://issues.apache.org/jira/browse/FLINK-2499
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Chen He

 11562 JobHistoryServer
 3251 Main
 10596 Jps
 17934 RunJar
 6879 Main
 8837 Main
 19215 RunJar
 28902 DataNode
 6627 TaskManager
 642 NodeManager
 10408 RunJar
 10210 TaskManager
 5067 TaskManager
 357 ApplicationHistoryServer
 3540 RunJar
 28501 ResourceManager
 28572 SecondaryNameNode
 17630 QuorumPeerMain
 9069 TaskManager
 If we keep executing start-cluster.sh, it may spawn an unbounded number of 
 TaskManagers on a single system.
 And the nohup command in start-cluster.sh generates a nohup.out file 
 that disturbs any other nohup processes on the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2477) Add test for SocketClientSink

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681305#comment-14681305
 ] 

ASF GitHub Bot commented on FLINK-2477:
---

Github user HuangWHWHW commented on a diff in the pull request:

https://github.com/apache/flink/pull/977#discussion_r36714696
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketClientSinkTest.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions;
+
+import java.io.IOException;
+import java.net.Socket;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.SocketClientSink;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+
+import org.junit.Test;
+
+import static java.lang.Thread.sleep;
+import static org.junit.Assert.*;
+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+
+/**
+ * Tests for the {@link 
org.apache.flink.streaming.api.functions.sink.SocketClientSink}.
+ */
+public class SocketClientSinkTest{
+
+   private final String host = "127.0.0.1";
+   private int port = ;
+   private String access;
+   public SocketServer.ServerThread th = null;
+
+   class SocketServer extends Thread {
+
+   private ServerSocket server = null;
+   private Socket sk = null;
+   private BufferedReader rdr = null;
+   private PrintWriter wtr = null;
+
+   private SocketServer(int port) {
+   while (port > 0) {
+   try {
+   this.server = new ServerSocket(port);
+   break;
+   } catch (Exception e) {
+   --port;
+   if (port > 0) {
+   continue;
+   }
+   else {
+   e.printStackTrace();
+   }
+   }
+   }
+   }
+
+   public void run() {
+   System.out.println("Listenning...");
+   try {
+   sk = server.accept();
--- End diff --

Sorry, you may ignore this reply.
I've got the picture.


 Add test for SocketClientSink
 -

 Key: FLINK-2477
 URL: https://issues.apache.org/jira/browse/FLINK-2477
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.10
 Environment: win7 32bit;linux
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 Add some tests for SocketClientSink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...

2015-08-11 Thread HuangWHWHW
Github user HuangWHWHW commented on a diff in the pull request:

https://github.com/apache/flink/pull/977#discussion_r36714696
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/functions/SocketClientSinkTest.java
 ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions;
+
+import java.io.IOException;
+import java.net.Socket;
+
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.SocketClientSink;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+
+import org.junit.Test;
+
+import static java.lang.Thread.sleep;
+import static org.junit.Assert.*;
+
+import java.io.BufferedReader;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+
+/**
+ * Tests for the {@link 
org.apache.flink.streaming.api.functions.sink.SocketClientSink}.
+ */
+public class SocketClientSinkTest{
+
+   private final String host = "127.0.0.1";
+   private int port = ;
+   private String access;
+   public SocketServer.ServerThread th = null;
+
+   class SocketServer extends Thread {
+
+   private ServerSocket server = null;
+   private Socket sk = null;
+   private BufferedReader rdr = null;
+   private PrintWriter wtr = null;
+
+   private SocketServer(int port) {
+   while (port > 0) {
+   try {
+   this.server = new ServerSocket(port);
+   break;
+   } catch (Exception e) {
+   --port;
+   if (port > 0) {
+   continue;
+   }
+   else {
+   e.printStackTrace();
+   }
+   }
+   }
+   }
+
+   public void run() {
+   System.out.println("Listenning...");
+   try {
+   sk = server.accept();
--- End diff --

Sorry, you may ignore this reply.
I've got the picture.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2506) HBase table that is distributed over more than 1 region server

2015-08-11 Thread Lydia Ickler (JIRA)
Lydia Ickler created FLINK-2506:
---

 Summary: HBase table that is distributed over more than 1 region 
server
 Key: FLINK-2506
 URL: https://issues.apache.org/jira/browse/FLINK-2506
 Project: Flink
  Issue Type: Bug
  Components: Examples
Affects Versions: 0.10
Reporter: Lydia Ickler
 Fix For: 0.10






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2502) FiniteStormSpout documentation does not render correctly

2015-08-11 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2502.
---

 FiniteStormSpout documentation does not render correctly
 ---

 Key: FLINK-2502
 URL: https://issues.apache.org/jira/browse/FLINK-2502
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Trivial
 Fix For: 0.10

 Attachments: screenshot.png


 The code examples do not render correctly, due to missing empty lines.
 
 !screenshot.png!
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2500][Streaming]remove a unwanted if ...

2015-08-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1001


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681626#comment-14681626
 ] 

ASF GitHub Bot commented on FLINK-2277:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1005


 In Scala API delta Iterations can not be set to unmanaged
 -

 Key: FLINK-2277
 URL: https://issues.apache.org/jira/browse/FLINK-2277
 Project: Flink
  Issue Type: Improvement
  Components: Scala API
Reporter: Aljoscha Krettek
Assignee: PJ Van Aeken
  Labels: Starter

 DeltaIteration.java has method solutionSetUnManaged(). In the Scala API this 
 could be added as an optional parameter on iterateDelta() on the DataSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-2502) FiniteStormSpout documentation does not render correctly

2015-08-11 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2502.
-
   Resolution: Fixed
Fix Version/s: 0.10

Fixed via 5bb855bac7441701495ce47db7ba03ab0e0c6963

 FiniteStormSpout documentation does not render correctly
 ---

 Key: FLINK-2502
 URL: https://issues.apache.org/jira/browse/FLINK-2502
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Trivial
 Fix For: 0.10

 Attachments: screenshot.png


 The code examples do not render correctly, due to missing empty lines.
 
 !screenshot.png!
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2386) Implement Kafka connector using the new Kafka Consumer API

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681686#comment-14681686
 ] 

ASF GitHub Bot commented on FLINK-2386:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/996#issuecomment-129849082
  
I'll take a stab at checking out this monster ;-)


 Implement Kafka connector using the new Kafka Consumer API
 --

 Key: FLINK-2386
 URL: https://issues.apache.org/jira/browse/FLINK-2386
 Project: Flink
  Issue Type: Improvement
  Components: Kafka Connector
Reporter: Robert Metzger
Assignee: Robert Metzger

 Once Kafka has released its new consumer API, we should provide a connector 
 for that version.
 The release will probably be called 0.9 or 0.8.3.
 The connector will be mostly compatible with Kafka 0.8.2.x, except for 
 committing offsets to the broker (the new connector expects a coordinator to 
 be available on Kafka). To work around that, we can provide a configuration 
 option to commit offsets to zookeeper (managed by flink code).
 For 0.9/0.8.3 it will be fully compatible.
 It will not be compatible with 0.8.1 because of mismatching Kafka messages.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment

2015-08-11 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2508:
---

 Summary: Confusing sharing of StreamExecutionEnvironment
 Key: FLINK-2508
 URL: https://issues.apache.org/jira/browse/FLINK-2508
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Stephan Ewen
 Fix For: 0.10


In the {{StreamExecutionEnvironment}}, the environment is once created and then 
shared with a static variable to all successive calls to 
{{getExecutionEnvironment()}}. But it can be overridden by calls to 
{{createLocalEnvironment()}} and {{createRemoteEnvironment()}}.

This seems a bit un-intuitive, and probably creates confusion when dispatching 
multiple streaming jobs from within the same JVM.

Why is it even necessary to cache the current execution environment?
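
For illustration, here is a minimal, self-contained sketch of the sharing 
pattern described above ({{Env}} stands in for {{StreamExecutionEnvironment}}; 
this is not Flink's actual code):

{code}
// Sketch only: Env stands in for StreamExecutionEnvironment.
public final class EnvSharingSketch {

    static class Env {}

    // the environment is cached in a static field once created
    private static Env currentEnv;

    public static Env getExecutionEnvironment() {
        if (currentEnv == null) {
            currentEnv = new Env();
        }
        return currentEnv; // every later call in the JVM sees this instance
    }

    public static Env createLocalEnvironment() {
        currentEnv = new Env(); // silently replaces the shared instance
        return currentEnv;
    }

    public static void main(String[] args) {
        Env a = getExecutionEnvironment();
        Env b = createLocalEnvironment();  // overrides the cached environment
        Env c = getExecutionEnvironment(); // returns b, not a
        System.out.println(a == c);        // false: a later job sees a different env
    }
}
{code}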



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681580#comment-14681580
 ] 

ASF GitHub Bot commented on FLINK-2277:
---

Github user PieterJanVanAeken commented on the pull request:

https://github.com/apache/flink/pull/1005#issuecomment-129817057
  
I changed the implementation to overloaded methods and I added one for 
integer-based keyFields, a method I overlooked in my previous commit.
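
For reference, the overloading pattern under discussion, sketched in Java 
(the class and method shapes here are illustrative stand-ins, not the actual 
Scala API):

```java
// Sketch of the binary-compatibility pattern: keep the old signature and
// add an overload that exposes the new flag. Names are illustrative only.
class DeltaIterationSketch<T> {

    // existing entry point keeps working for already-compiled callers
    public DeltaIterationSketch<T> iterateDelta(int maxIterations, int... keyFields) {
        return iterateDelta(maxIterations, false, keyFields);
    }

    // new overload exposes the solution-set-unmanaged switch
    public DeltaIterationSketch<T> iterateDelta(
            int maxIterations, boolean solutionSetUnManaged, int... keyFields) {
        // configure the iteration here; sketch only
        return this;
    }
}
```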


 In Scala API delta Iterations can not be set to unmanaged
 -

 Key: FLINK-2277
 URL: https://issues.apache.org/jira/browse/FLINK-2277
 Project: Flink
  Issue Type: Improvement
  Components: Scala API
Reporter: Aljoscha Krettek
Assignee: PJ Van Aeken
  Labels: Starter

 DeltaIteration.java has method solutionSetUnManaged(). In the Scala API this 
 could be added as an optional parameter on iterateDelta() on the DataSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2277] In Scala API delta Iterations can...

2015-08-11 Thread PieterJanVanAeken
Github user PieterJanVanAeken commented on the pull request:

https://github.com/apache/flink/pull/1005#issuecomment-129817057
  
I changed the implementation to overloaded methods and I added one for 
integer-based keyFields, a method I overlooked in my previous commit.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2500) Some reviews to improve code quality

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681629#comment-14681629
 ] 

ASF GitHub Bot commented on FLINK-2500:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1001


 Some reviews to improve code quality
 

 Key: FLINK-2500
 URL: https://issues.apache.org/jira/browse/FLINK-2500
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 672h
  Remaining Estimate: 672h

 I reviewed the Streaming module and have some suggestions to improve the 
 code quality (e.g. reduce loop complexity, fix a memory leak, ...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2357) New JobManager Runtime Web Frontend

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681628#comment-14681628
 ] 

ASF GitHub Bot commented on FLINK-2357:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1006


 New JobManager Runtime Web Frontend
 ---

 Key: FLINK-2357
 URL: https://issues.apache.org/jira/browse/FLINK-2357
 Project: Flink
  Issue Type: New Feature
  Components: Webfrontend
Affects Versions: 0.10
Reporter: Stephan Ewen
 Attachments: Webfrontend Mockup.pdf


 We need to rework the JobManager web frontend.
 The current web frontend is limited and has a lot of design issues:
   - It does not display any progress while operators are running. This is 
 especially problematic for streaming jobs.
   - It has no graph representation of the data flows.
   - It does not allow looking into execution attempts.
   - It has no hook to deal with the upcoming live accumulators.
   - The architecture is not very modular/extensible.
 I propose to add a new JobManager web frontend:
   - Based on Netty HTTP (very lightweight)
   - Using rest-style URLs for jobs and vertices
   - integrating the D3 graph renderer of the previews with the runtime monitor
   - with details on execution attempts
   - first class visualization of records processed and bytes processed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2277] In Scala API delta Iterations can...

2015-08-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1005


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2502] FiniteStormSpout documentation doe...

2015-08-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1002


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2357] Update Node installation instruct...

2015-08-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1006


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2502) FiniteStormSpout documentation does not render correctly

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681627#comment-14681627
 ] 

ASF GitHub Bot commented on FLINK-2502:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1002


 FiniteStormSpout documentation does not render correctly
 ---

 Key: FLINK-2502
 URL: https://issues.apache.org/jira/browse/FLINK-2502
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Trivial
 Attachments: screenshot.png


 The code examples do not render correctly, due to missing empty lines.
 
 !screenshot.png!
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129849477
  
Yes, existing formats should be converted to rich formats. The name 
`Abstract...` makes more sense if it becomes the new default way of interfacing 
with the rich input/output formats. I still think calling it `Rich...` makes 
more sense in terms of consistency with the user-defined functions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681689#comment-14681689
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129849477
  
Yes, existing formats should be converted to rich formats. The name 
`Abstract...` makes more sense if it becomes the new default way of interfacing 
with the rich input/output formats. I still think calling it `Rich...` makes 
more sense in terms of consistency with the user-defined functions.


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2500) Some reviews to improve code quality

2015-08-11 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2500.
---

 Some reviews to improve code quality
 

 Key: FLINK-2500
 URL: https://issues.apache.org/jira/browse/FLINK-2500
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 672h
  Remaining Estimate: 672h

 I reviewed the Streaming module and have some suggestions to improve the 
 code quality (e.g. reduce loop complexity, fix a memory leak, ...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-2500) Some reviews to improve code quality

2015-08-11 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2500.
-
Resolution: Fixed

Fixed via b42fbf7a81c5b57dcf9760825edb175ffd944fb2

Thank you for the contribution!

 Some reviews to improve code quality
 

 Key: FLINK-2500
 URL: https://issues.apache.org/jira/browse/FLINK-2500
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 672h
  Remaining Estimate: 672h

 I reviewed the Streaming module and have some suggestions to improve the 
 code quality (e.g. reduce loop complexity, fix a memory leak, ...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2277] In Scala API delta Iterations can...

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1005#issuecomment-129831692
  
Thanks!

I know it is extra code, but allowing existing code to continue running 
without a recompile is something that people really appreciate.

Merging this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681600#comment-14681600
 ] 

ASF GitHub Bot commented on FLINK-2277:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1005#issuecomment-129831692
  
Thanks!

I know it is extra code, but allowing existing code to continue running 
without a recompile is something that people really appreciate.

Merging this...


 In Scala API delta Iterations can not be set to unmanaged
 -

 Key: FLINK-2277
 URL: https://issues.apache.org/jira/browse/FLINK-2277
 Project: Flink
  Issue Type: Improvement
  Components: Scala API
Reporter: Aljoscha Krettek
Assignee: PJ Van Aeken
  Labels: Starter

 DeltaIteration.java has method solutionSetUnManaged(). In the Scala API this 
 could be added as an optional parameter on iterateDelta() on the DataSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681603#comment-14681603
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129834217
  
Okay. I think calling them `Rich` may be overkill. I will change 
the names to `Abstract`. 
Do we keep the existing formats *non-rich*, or would it be okay to make all 
of them *rich* [which is the case right now]? The problem with not making them 
*rich* is that we would then be limiting any extending classes to never be *rich*.


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged

2015-08-11 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2277.
-
   Resolution: Fixed
Fix Version/s: 0.10

Fixed via f50ae26a2fb4a0c7f5b390e2f0f5528be9f61730

Thank you for the contribution!

 In Scala API delta Iterations can not be set to unmanaged
 -

 Key: FLINK-2277
 URL: https://issues.apache.org/jira/browse/FLINK-2277
 Project: Flink
  Issue Type: Improvement
  Components: Scala API
Reporter: Aljoscha Krettek
Assignee: PJ Van Aeken
  Labels: Starter
 Fix For: 0.10


 DeltaIteration.java has method solutionSetUnManaged(). In the Scala API this 
 could be added as an optional parameter on iterateDelta() on the DataSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681674#comment-14681674
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129847518
  
I'm in favor of making the existing formats rich.


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129847518
  
I'm in favor of making the existing formats rich.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [WIP][FLINK-2386] Add new Kafka Consumers

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/996#issuecomment-129849082
  
I'll take a stab at checking out this monster ;-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...

2015-08-11 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-129854454
  
@StephanEwen 
Hi, I pushed a new fix for the exception.
Would you please check whether it's correct?
And there is another PR (https://github.com/apache/flink/pull/991) where the 
CI failed.
Since I can't see the CI details yet, could you do me a favor and have a 
look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2477) Add test for SocketClientSink

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681711#comment-14681711
 ] 

ASF GitHub Bot commented on FLINK-2477:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-129854454
  
@StephanEwen 
Hi, I pushed a new fix for the exception.
Would you please check whether it's correct?
And there is another PR (https://github.com/apache/flink/pull/991) where the 
CI failed.
Since I can't see the CI details yet, could you do me a favor and have a 
look?


 Add test for SocketClientSink
 -

 Key: FLINK-2477
 URL: https://issues.apache.org/jira/browse/FLINK-2477
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.10
 Environment: win7 32bit;linux
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h

 Add some tests for SocketClientSink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2477][Add]Add tests for SocketClientSin...

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/977#issuecomment-129829151
  
Looks much better. Two more comments:

1. It would be good to print the exception in addition to failing the test, 
because otherwise the test only fails and gives no indication why. A typical 
pattern is:
```java
try {
  // some stuff
} catch (Exception e) {
  e.printStackTrace();
  Assert.fail(e.getMessage());
}
```
Here, the test prints nothing when working properly, but prints the error 
when it fails.

2. `Assert.fail()` does not work when used in spawned threads. The reason 
is that JUnit communicates the failure with a special 
`AssertionFailedError`, which needs to reach the invoking framework. That 
does not happen if the `fail()` method is called in a spawned thread.

Here is how you can do it. It is a bit clumsy, because you need to forward 
the exception to the main thread, but it works well:
```java
final AtomicReference<Throwable> error = new AtomicReference<Throwable>();
Thread thread = new Thread("server thread") {
    @Override
    public void run() {
        try {
            doStuff();
        }
        catch (Throwable t) {
            error.set(t);
        }
    }
};
thread.start();

// do some test logic

thread.join();
if (error.get() != null) {
    Throwable t = error.get();
    t.printStackTrace();
    fail("Error in spawned thread: " + t.getMessage());
}
```




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread sachingoel0101
Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129834217
  
Okay. I think calling them `Rich` may be overkill. I will change 
the names to `Abstract`. 
Do we keep the existing formats *non-rich*, or would it be okay to make all 
of them *rich* [which is the case right now]? The problem with not making them 
*rich* is that we would then be limiting any extending classes to never be *rich*.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681622#comment-14681622
 ] 

ASF GitHub Bot commented on FLINK-2507:
---

GitHub user ffbin opened a pull request:

https://github.com/apache/flink/pull/1007

[FLINK-2507]Rename the function tansformAndEmit

I think the function name tansformAndEmit in 
'org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector' is a 
misspelling; transformAndEmit would be better.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ffbin/flink FLINK-2507

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1007.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1007


commit 39058ceb9dd3e5baf96daf1a3d2a9bd050ebf702
Author: ffbin 869218...@qq.com
Date:   2015-08-11T10:42:39Z

[FLINK-2507]Rename the function tansformAndEmit




 Rename the function tansformAndEmit in 
 org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
 --

 Key: FLINK-2507
 URL: https://issues.apache.org/jira/browse/FLINK-2507
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...

2015-08-11 Thread ffbin
GitHub user ffbin opened a pull request:

https://github.com/apache/flink/pull/1007

[FLINK-2507]Rename the function tansformAndEmit

I think the function name tansformAndEmit in 
'org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector' is a 
misspelling; transformAndEmit would be better.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ffbin/flink FLINK-2507

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1007.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1007


commit 39058ceb9dd3e5baf96daf1a3d2a9bd050ebf702
Author: ffbin 869218...@qq.com
Date:   2015-08-11T10:42:39Z

[FLINK-2507]Rename the function tansformAndEmit




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...

2015-08-11 Thread chiwanpark
Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/1007#issuecomment-129850783
  
Why were the file permissions changed from 644 to 755? The other changes 
seem good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681694#comment-14681694
 ] 

ASF GitHub Bot commented on FLINK-2507:
---

Github user chiwanpark commented on the pull request:

https://github.com/apache/flink/pull/1007#issuecomment-129850783
  
Why were the file permissions changed from 644 to 755? The other changes 
seem good.


 Rename the function tansformAndEmit in 
 org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
 --

 Key: FLINK-2507
 URL: https://issues.apache.org/jira/browse/FLINK-2507
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2507]Rename the function tansformAndEmi...

2015-08-11 Thread ffbin
Github user ffbin commented on the pull request:

https://github.com/apache/flink/pull/1007#issuecomment-129854098
  
@chiwanpark Thank you very much. I submitted the code on Linux and changed the 
permissions incorrectly. I will submit again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-11 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-129864531
  
@HuangWHWHW `retryForever` is just a convenience variable for `maxRetry < 
0`. Your fix is correct because the loop will only execute if `maxRetry > 0` 
and thus not execute at all if it should retry forever. It would be great if 
you added a test that checks for the correct number of retries. In case of 
infinite retries, just check up to a certain number (e.g. 100 retries).
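
For instance, the suggested check could look roughly like this (a sketch 
only: the retry loop is inlined as a stand-in rather than driving the real 
SocketTextStreamFunction, and a negative maxRetry meaning "retry forever" is 
an assumption):

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class RetryCountSketchTest {

    // Stand-in for the source's retry loop: every attempt fails, and the
    // cap bounds the test when retrying forever. Sketch only.
    private int runRetryLoop(int maxRetry, int cap) {
        boolean retryForever = maxRetry < 0; // assumption: negative = forever
        int attempts = 0;
        while ((retryForever || attempts < maxRetry) && attempts < cap) {
            attempts++; // each failed connection attempt counts once
        }
        return attempts;
    }

    @Test
    public void finiteRetriesStopAtMaxRetry() {
        assertEquals(10, runRetryLoop(10, 100));
    }

    @Test
    public void infiniteRetriesCheckedUpToABound() {
        // as suggested above: for infinite retries, just check up to e.g. 100
        assertEquals(100, runRetryLoop(-1, 100));
    }
}
```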


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681771#comment-14681771
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-129864531
  
@HuangWHWHW `retryForever` is just a convenience variable for `maxRetry < 
0`. Your fix is correct because the loop will only execute if `maxRetry > 0` 
and thus not execute at all if it should retry forever. It would be great if 
you added a test that checks for the correct number of retries. In case of 
infinite retries, just check up to a certain number (e.g. 100 retries).


 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681594#comment-14681594
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129830222
  
Okay, so how about calling the abstract base class `AbstractInputFormat`, 
having it implement the runtime context, and leaving the other methods abstract?
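
As a rough sketch of that idea (the InputFormat surface and all names below 
are simplified stand-ins, not the actual Flink interfaces):

```java
// Sketch only: a trimmed-down InputFormat surface, not Flink's real interface.
interface InputFormatSketch<T> {
    void open();
    T nextRecord(T reuse);
    void close();
}

// Hypothetical stand-in for RuntimeContext.
class RuntimeContextSketch {}

// The proposed base class implements only the runtime-context plumbing
// and leaves the actual format methods abstract.
abstract class AbstractInputFormatSketch<T> implements InputFormatSketch<T> {

    private RuntimeContextSketch runtimeContext;

    public void setRuntimeContext(RuntimeContextSketch context) {
        this.runtimeContext = context; // set by the runtime before open()
    }

    public RuntimeContextSketch getRuntimeContext() {
        if (runtimeContext == null) {
            throw new IllegalStateException("RuntimeContext has not been set.");
        }
        return runtimeContext;
    }

    // open(), nextRecord(), and close() stay abstract for concrete formats
}
```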


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129830222
  
Okay, so how about calling the abstract base class `AbstractInputFormat`, 
having it implement the runtime context, and leaving the other methods abstract?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2423) Properly test checkpoint notifications

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681704#comment-14681704
 ] 

ASF GitHub Bot commented on FLINK-2423:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/980#issuecomment-129851909
  
Merging...


 Properly test checkpoint notifications
 --

 Key: FLINK-2423
 URL: https://issues.apache.org/jira/browse/FLINK-2423
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gyula Fora
Assignee: Márton Balassi

 Checkpoint notifications (via the CheckpointNotifier interface) are currently 
 not properly tested. 
 A test should be included to verify that checkpoint notifications are 
 eventually called on successful checkpoints, and that they are only called 
 once per checkpointID.
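 A sketch of the once-per-ID property (a stand-in notifier class, not the 
 actual test; the single-long-argument callback shape is an assumption):
 {code}
 import java.util.HashSet;
 import java.util.Set;

 // Sketch: fail if any checkpoint ID is notified more than once.
 class OncePerIdNotifierSketch {
     private final Set<Long> seen = new HashSet<Long>();

     // stand-in for a notifyCheckpointComplete(long) callback
     public void notifyCheckpointComplete(long checkpointId) {
         if (!seen.add(checkpointId)) {
             throw new AssertionError("duplicate notification for checkpoint " + checkpointId);
         }
     }

     // used to assert that a successful checkpoint was eventually notified
     public boolean wasNotified(long checkpointId) {
         return seen.contains(checkpointId);
     }
 }
 {code}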



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2423] [streaming] ITCase for checkpoint...

2015-08-11 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/980#issuecomment-129851909
  
Merging...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1819][core]Allow access to RuntimeConte...

2015-08-11 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129826022
  
@StephanEwen @fhueske 
The `Rich` prefix might seem a bit odd but it is a naming convention that 
is consistent with the user-defined functions which have the RuntimeContext 
available.
As for `abstract classes vs interface` I agree with @sachingoel0101 that it 
makes sense to implement access to the RuntimeContext once for the user. We may 
also add other rich functions; if we implemented that using an interface we 
would have to change it later on or add another interface.

I think Input/Output formats should be rich by default. However, we might 
not want to break the API for existing external implementations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1819) Allow access to RuntimeContext from Input and OutputFormats

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681584#comment-14681584
 ] 

ASF GitHub Bot commented on FLINK-1819:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/966#issuecomment-129826022
  
@StephanEwen @fhueske 
The `Rich` prefix might seem a bit odd but it is a naming convention that 
is consistent with the user-defined functions which have the RuntimeContext 
available.
As for `abstract classes vs interface` I agree with @sachingoel0101 that it 
makes sense to implement access to the RuntimeContext once for the user. We may 
also add other rich functions; if we implemented that using an interface we 
would have to change it later on or add another interface.

I think Input/Output formats should be rich by default. However, we might 
not want to break the API for existing external implementations.


 Allow access to RuntimeContext from Input and OutputFormats
 ---

 Key: FLINK-1819
 URL: https://issues.apache.org/jira/browse/FLINK-1819
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 0.9, 0.8.1
Reporter: Fabian Hueske
Assignee: Sachin Goel
Priority: Minor
 Fix For: 0.9


 User function that extend a RichFunction can access a {{RuntimeContext}} 
 which gives the parallel id of the task and access to Accumulators and 
 BroadcastVariables. 
 Right now, Input and OutputFormats cannot access their {{RuntimeContext}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2277) In Scala API delta Iterations can not be set to unmanaged

2015-08-11 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2277.
---

 In Scala API delta Iterations can not be set to unmanaged
 -

 Key: FLINK-2277
 URL: https://issues.apache.org/jira/browse/FLINK-2277
 Project: Flink
  Issue Type: Improvement
  Components: Scala API
Reporter: Aljoscha Krettek
Assignee: PJ Van Aeken
  Labels: Starter
 Fix For: 0.10


 DeltaIteration.java has method solutionSetUnManaged(). In the Scala API this 
 could be added as an optional parameter on iterateDelta() on the DataSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681708#comment-14681708
 ] 

ASF GitHub Bot commented on FLINK-2507:
---

Github user ffbin commented on the pull request:

https://github.com/apache/flink/pull/1007#issuecomment-129854098
  
@chiwanpark Thank you very much. I submitted the code on Linux and changed the 
permissions incorrectly. I will submit again.


 Rename the function tansformAndEmit in 
 org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector
 --

 Key: FLINK-2507
 URL: https://issues.apache.org/jira/browse/FLINK-2507
 Project: Flink
  Issue Type: Bug
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: fangfengbin
Assignee: fangfengbin
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2490][FIX]Remove the retryForever check...

2015-08-11 Thread HuangWHWHW
Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-129875331
  
@mxm 
OK, I'll add a test.
There is one small difficulty: I can't get the retry count in the test since 
the retry counter is a local variable.
So can I add a function to get the retry count?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2423) Properly test checkpoint notifications

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681827#comment-14681827
 ] 

ASF GitHub Bot commented on FLINK-2423:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/980


 Properly test checkpoint notifications
 --

 Key: FLINK-2423
 URL: https://issues.apache.org/jira/browse/FLINK-2423
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gyula Fora
Assignee: Márton Balassi

 Checkpoint notifications (via the CheckpointNotifier interface) are currently 
 not properly tested. 
 A test should be included to verify that checkpoint notifications are 
 eventually called on successful checkpoints, and that they are only called 
 once per checkpointID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-11 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2509:
---

 Summary: Improve error messages when user code classes are not 
found
 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


When a job fails because the user code classes are not found, we should add 
some information about the class loader and class path into the exception 
message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-2501) [py] Remove the need to specify types for transformations

2015-08-11 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680732#comment-14680732
 ] 

Chesnay Schepler edited comment on FLINK-2501 at 8/11/15 2:49 PM:
--

Hm.

One question that still remains is how we would tell the Java API what size the 
emitted tuples have. This is the primary reason I didn't list such a solution.

(3) has really nice things going for it: you save bandwidth as you don't send 
separate keys around; you save computation power by not having to extract keys; 
you reduce complexity since you don't have to alter the program plan or 
discard/hide the keys.

But unless a solution for the above issue is brought up, (2) seems like the way 
to go. Unless I'm misunderstanding something.

Regarding your other points:
Projections are the only operation I see right now that wouldn't need a special 
implementation with (3), as (3) allows access to individual fields.

To skip the sort implementation, one could modify (2) to work on a tuple of 
keys. So for grouped operations, instead of Tuple2<byte[], byte[]> you work on 
a Tuple2<TupleX<byte[], ...>, byte[]>. This would make sorts equivalent for (2) 
and (3).

And yes, the binary data would contain type information.
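
To make the record shape of modified (2) concrete, a small sketch (plain 
arrays stand in for the Flink Tuple2/TupleX types; this is an illustration, 
not proposed code):

{code}
import java.util.Arrays;

// Sketch of the Tuple2<TupleX<byte[], ...>, byte[]> layout discussed above.
class GroupedRecordSketch {
    // key fields, each already serialized, so the Java side can group and
    // sort by comparing bytes without ever deserializing the value
    final byte[][] keys;
    // the full record, opaque to the Java side
    final byte[] value;

    GroupedRecordSketch(byte[][] keys, byte[] value) {
        this.keys = keys;
        this.value = value;
    }

    // grouping compares serialized keys byte-wise; the value is never touched
    boolean sameGroup(GroupedRecordSketch other) {
        return Arrays.deepEquals(this.keys, other.keys);
    }
}
{code}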



was (Author: zentol):
Hm.

One question that still remains is how we would tell the Java API whether a UDF 
emits a basic type (which would just be a byte[]) or an arbitrarily nested 
tuple. This is the primary reason I didn't list such a solution.

(3) has really nice things going for it: you save bandwidth as you don't send 
separate keys around; you save computation power by not having to extract keys; 
you reduce complexity since you don't have to alter the program plan or 
discard/hide the keys.

But unless a solution for the above issue is brought up, (2) seems like the way 
to go. Unless I'm misunderstanding something.

Regarding your other points:
Projections are the only operation I see right now that wouldn't need a special 
implementation with (3), as (3) allows access to individual fields.

To skip the sort implementation, one could modify (2) to work on a tuple of 
keys. So for grouped operations, instead of Tuple2<byte[], byte[]> you work on 
a Tuple2<TupleX<byte[], ...>, byte[]>. This would make sorts equivalent for (2) 
and (3).

And yes, the binary data would contain type information.


 [py] Remove the need to specify types for transformations
 -

 Key: FLINK-2501
 URL: https://issues.apache.org/jira/browse/FLINK-2501
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Reporter: Chesnay Schepler

 Currently, users of the Python API have to provide type arguments when using 
 a UDF, like so:
 {code}
 d1.map(Mapper(), (INT, STRING))
 {code}
 Instead, it would be really convenient to be able to do this:
 {code}
 d1.map(Mapper())
 {code}
 The intention behind this issue is convenience, and it's also not really 
 pythonic to specify types.
 Before I'll go into possible solutions, let me summarize the way these type 
 arguments are currently used, and in general how types are handled:
 The type argument passed is actually an object of the type it represents, as 
 INT is a constant int value, whereas STRING is a constant string value. You 
 could as well write the following and it would still work.
 {code}
 d1.map(Mapper(), (1, "ImNotATypInfo"))
 {code}
 This object is transmitted to the java side during the plan binding (and is 
 now an actual Tuple2<Integer, String>), then passed to the type extractor, 
 and the resulting TypeInformation is saved in the Java counterpart of the UDF, 
 which all implement the ResultTypeQueryable interface. 
 The TypeInformation object is only used by the Java API, python never touches 
 it. Instead, at runtime, the serializers used between python and java check 
 the classes of the values passed and are thus generated dynamically.
 This means that, if a UDF does not pass the type it claims to pass, the 
 Python API won't complain, but the underlying Java API will when its 
 serializers fail.
 Now let's talk solutions.
 In discussions on the mailing list, pretty much 2 proposals were made:
 # Add a way to disable/circumvent type checks during the plan phase in the 
 Java API and generate serializers dynamically.
 # Have objects always in serialized form on the java side, stored in a single 
 bytearray or Tuple2 containing a key/value pair.
 These proposals vary wildly in the changes necessary to the system:
 # How can we change the Java API to support this?
 This proposal would hardly change the way the Python API works, or even touch 
 the related source code. It mostly deals with the Java API. Since I'm not too 
 familiar with the Plan processing life-cycle on the java side I 

[jira] [Commented] (FLINK-2501) [py] Remove the need to specify types for transformations

2015-08-11 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681906#comment-14681906
 ] 

Stephan Ewen commented on FLINK-2501:
-

Can the Java API always treat the data as a Tuple2 (key/value) and have the 
Python API interpret it?

I am not too familiar with Python and the Python API, so I am just thinking out 
loud here ;-)

 [py] Remove the need to specify types for transformations
 -

 Key: FLINK-2501
 URL: https://issues.apache.org/jira/browse/FLINK-2501
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Reporter: Chesnay Schepler

 Currently, users of the Python API have to provide type arguments when using 
 a UDF, like so:
 {code}
 d1.map(Mapper(), (INT, STRING))
 {code}
 Instead, it would be really convenient to be able to do this:
 {code}
 d1.map(Mapper())
 {code}
 The intention behind this issue is convenience, and it's also not really 
 pythonic to specify types.
 Before I'll go into possible solutions, let me summarize the way these type 
 arguments are currently used, and in general how types are handled:
 The type argument passed is actually an object of the type it represents, as 
 INT is a constant int value, whereas STRING is a constant string value. You 
 could as well write the following and it would still work.
 {code}
 d1.map(Mapper(), (1, "ImNotATypInfo"))
 {code}
 This object is transmitted to the java side during the plan binding (and is 
 now an actual Tuple2<Integer, String>), then passed to the type extractor, 
 and the resulting TypeInformation is saved in the Java counterpart of the UDF, 
 which all implement the ResultTypeQueryable interface. 
 The TypeInformation object is only used by the Java API, python never touches 
 it. Instead, at runtime, the serializers used between python and java check 
 the classes of the values passed and are thus generated dynamically.
 This means that, if a UDF does not pass the type it claims to pass, the 
 Python API won't complain, but the underlying Java API will when its 
 serializers fail.
 Now let's talk solutions.
 In discussions on the mailing list, pretty much 2 proposals were made:
 # Add a way to disable/circumvent type checks during the plan phase in the 
 Java API and generate serializers dynamically.
 # Have objects always in serialized form on the java side, stored in a single 
 bytearray or Tuple2 containing a key/value pair.
 These proposals vary wildly in the changes necessary to the system:
 # How can we change the Java API to support this?
 This proposal would hardly change the way the Python API works, or even touch 
 the related source code. It mostly deals with the Java API. Since I'm not too 
 familiar with the Plan processing life-cycle on the java side I can't assess 
 which classes would have to be changed.
 # How can we make this work within the limits of the Java API?
 is the exact opposite, it changes nothing in the Java API. Instead, the 
 following issues would have to be solved:
 * Alter the plan to extract keys before keyed operations, while hiding these 
 keys from the UDF. This is exactly how KeySelectors (will) work, and as such 
 is generally solved. In fact, this solution would make a few things easier in 
 regards to KeySelectors.
 * Rework all operations that currently rely on Java API functions, that need 
 deserialized data, for example Projections or the upcoming Aggregations; 
 This generally means implementing them in python, or with special java UDF's 
 (they could de-/serialize data within the udf call, or work on serialized 
 data).
 * Change (De)Serializers accordingly
 * implement a reliable, not all-memory-consuming sorting mechanism on the 
 python side
 Personally i prefer the second option, as it
 # does not modify the Java API; it works within its well-tested limits
 # Plan changes are similar to issues that are already worked on (KeySelectors)
 # Sorting implementation was necessary anyway (for chained reducers)
 # having data in serialized form was a performance-related consideration 
 already
 While the first option could work, and most likely requires less work, I feel 
 like many of the things required for option 2 will be implemented eventually 
 anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681842#comment-14681842
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/1008

[FLINK-2509] Add class loader info to exception message when user code 
classes are not found

This pull request adds the following type of info to the exception message:

```
Cannot load user class: com.foo.bar.SomeClass
ClassLoader info: URL ClassLoader:
    http://localhost:26712/some/file/path
    file: '/tmp/flink-url-test2825910185710886145.tmp' (valid JAR)
    file: '/tmp/flink-url-test4914469684250387785.tmp' (invalid JAR: error in opening zip file)
    file: '/tmp/flink-url-test5462325335282720117.tmp' (missing)
```

This should help diagnosing why sometimes classes cannot be resolved, even 
though the JAR files seem valid.
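
A sketch of how such a line can be produced (an illustration of the idea, 
not the code in this PR):

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.jar.JarFile;

// Sketch: describe each classpath entry of a URLClassLoader, flagging
// missing files and JARs that cannot be opened. Illustration only.
class ClassLoaderInfoSketch {

    static String describe(URLClassLoader loader) {
        StringBuilder sb = new StringBuilder("URL ClassLoader:");
        for (URL url : loader.getURLs()) {
            sb.append("\n    ");
            if (!"file".equals(url.getProtocol())) {
                sb.append(url); // non-file entries are printed as-is
                continue;
            }
            File f = new File(url.getPath());
            sb.append("file: '").append(f).append("'");
            if (!f.exists()) {
                sb.append(" (missing)");
            } else {
                try {
                    new JarFile(f).close(); // probe: can the JAR be opened?
                    sb.append(" (valid JAR)");
                } catch (Exception e) {
                    sb.append(" (invalid JAR: ").append(e.getMessage()).append(')');
                }
            }
        }
        return sb.toString();
    }
}
```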

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink class_loader_info

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1008.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1008


commit 00e123709f304bd7f602989f28b111b8d282ebb1
Author: Stephan Ewen se...@apache.org
Date:   2015-08-11T14:07:22Z

[FLINK-2509] [runtime] Add class loader info into the exception message 
when user code classes are not found.




 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...

2015-08-11 Thread StephanEwen
GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/1008

[FLINK-2509] Add class loader info to exception message when user code 
classes are not found

This pull request adds the following type of info to the exception message:

```
Cannot load user class: com.foo.bar.SomeClass
ClassLoader info: URL ClassLoader:
    http://localhost:26712/some/file/path
    file: '/tmp/flink-url-test2825910185710886145.tmp' (valid JAR)
    file: '/tmp/flink-url-test4914469684250387785.tmp' (invalid JAR: error in opening zip file)
    file: '/tmp/flink-url-test5462325335282720117.tmp' (missing)
```

This should help diagnosing why sometimes classes cannot be resolved, even 
though the JAR files seem valid.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink class_loader_info

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1008.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1008


commit 00e123709f304bd7f602989f28b111b8d282ebb1
Author: Stephan Ewen se...@apache.org
Date:   2015-08-11T14:07:22Z

[FLINK-2509] [runtime] Add class loader info into the exception message 
when user code classes are not found.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-2501) [py] Remove the need to specify types for transformations

2015-08-11 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681911#comment-14681911
 ] 

Chesnay Schepler edited comment on FLINK-2501 at 8/11/15 3:02 PM:
--

yes. Modified (2) would mean that you always have either a 
Tuple2<TupleX<byte[], ...>, byte[]> or a byte[]. Which case applies can be 
deduced from the program plan without user input.


was (Author: zentol):
yes. modified (2) would mean that you have a Tuple2<TupleX<byte[], ...>, byte[]>.

 [py] Remove the need to specify types for transformations
 -

 Key: FLINK-2501
 URL: https://issues.apache.org/jira/browse/FLINK-2501
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Reporter: Chesnay Schepler

 Currently, users of the Python API have to provide type arguments when using 
 a UDF, like so:
 {code}
 d1.map(Mapper(), (INT, STRING))
 {code}
 Instead, it would be really convenient to be able to do this:
 {code}
 d1.map(Mapper())
 {code}
 The intention behind this issue is convenience, and it's also not really 
 pythonic to specify types.
 Before I'll go into possible solutions, let me summarize the way these type 
 arguments are currently used, and in general how types are handled:
 The type argument passed is actually an object of the type it represents: 
 INT is a constant int value, whereas STRING is a constant string value. You 
 could just as well write the following and it would still work.
 {code}
 d1.map(Mapper(), (1, "ImNotATypInfo"))
 {code}
 This object is transmitted to the java side during the plan binding (and is 
 now an actual Tuple2<Integer, String>), then passed to the type extractor, 
 and the resulting TypeInformation is saved in the java counterpart of the 
 udf, which all implement the ResultTypeQueryable interface. 
 The TypeInformation object is only used by the Java API, python never touches 
 it. Instead, at runtime, the serializers used between python and java check 
 the classes of the values passed and are thus generated dynamically.
 This means that, if a UDF does not pass the type it claims to pass, the 
 Python API won't complain, but the underlying java API will when its 
 serializers fail.
 Now let's talk solutions.
 In discussions on the mailing list, pretty much 2 proposals were made:
 # Add a way to disable/circumvent type checks during the plan phase in the 
 Java API and generate serializers dynamically.
 # Have objects always in serialized form on the java side, stored in a single 
 byte array or a Tuple2 containing a key/value pair.
 These proposals vary wildly in the changes necessary to the system:
 # How can we change the Java API to support this?
 This proposal would hardly change the way the Python API works, or even touch 
 the related source code. It mostly deals with the Java API. Since I'm not too 
 familiar with the Plan processing life-cycle on the java side, I can't assess 
 which classes would have to be changed.
 # How can we make this work within the limits of the Java API?
 This proposal is the exact opposite: it changes nothing in the Java API. 
 Instead, the following issues would have to be solved:
 * Alter the plan to extract keys before keyed operations, while hiding these 
 keys from the UDF. This is exactly how KeySelectors (will) work, and as such 
 is generally solved. In fact, this solution would make a few things easier in 
 regards to KeySelectors.
 * Rework all operations that currently rely on Java API functions that need 
 deserialized data, for example Projections or the upcoming Aggregations. 
 This generally means implementing them in python, or with special java UDFs 
 (they could de-/serialize data within the udf call, or work on serialized 
 data).
 * Change (De)Serializers accordingly
 * Implement a reliable sorting mechanism on the python side that does not 
 consume all available memory.
 Personally I prefer the second option, as it
 # does not modify the Java API; it works within its well-tested limits
 # Plan changes are similar to issues that are already worked on (KeySelectors)
 # Sorting implementation was necessary anyway (for chained reducers)
 # having data in serialized form was a performance-related consideration 
 already
 While the first option could work, and would most likely require less work, I 
 feel like many of the things required for option 2 will be implemented 
 eventually anyway.
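 To make the java-side mechanism described above concrete, here is a hedged 
 sketch: the sample object shipped from python is run through the type 
 extractor once, and the result is served via ResultTypeQueryable. The class 
 name PythonUdfTypeHolder is hypothetical; only TypeExtractor and 
 ResultTypeQueryable are real Flink types.
 {code}
 import org.apache.flink.api.common.typeinfo.TypeInformation;
 import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
 import org.apache.flink.api.java.typeutils.TypeExtractor;

 public class PythonUdfTypeHolder<OUT> implements ResultTypeQueryable<OUT> {

     private final TypeInformation<OUT> resultType;

     @SuppressWarnings("unchecked")
     public PythonUdfTypeHolder(Object sampleOutput) {
         // e.g. the python tuple (INT, STRING) arrives as a Tuple2<Integer, String>
         this.resultType = (TypeInformation<OUT>) TypeExtractor.getForObject(sampleOutput);
     }

     @Override
     public TypeInformation<OUT> getProducedType() {
         return resultType;
     }
 }
 {code}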



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2501) [py] Remove the need to specify types for transformations

2015-08-11 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681911#comment-14681911
 ] 

Chesnay Schepler commented on FLINK-2501:
-

yes. modified (2) would mean that you have a Tuple2TupleXbyte[],..., byte[].

 [py] Remove the need to specify types for transformations
 -

 Key: FLINK-2501
 URL: https://issues.apache.org/jira/browse/FLINK-2501
 Project: Flink
  Issue Type: Improvement
  Components: Python API
Reporter: Chesnay Schepler

 Currently, users of the Python API have to provide type arguments when using 
 a UDF, like so:
 {code}
 d1.map(Mapper(), (INT, STRING))
 {code}
 Instead, it would be really convenient to be able to do this:
 {code}
 d1.map(Mapper())
 {code}
 The intention behind this issue is convenience, and it's also not really 
 pythonic to specify types.
 Before I'll go into possible solutions, let me summarize the way these type 
 arguments are currently used, and in general how types are handled:
 The type argument passed is actually an object of the type it represents: 
 INT is a constant int value, whereas STRING is a constant string value. You 
 could just as well write the following and it would still work.
 {code}
 d1.map(Mapper(), (1, "ImNotATypInfo"))
 {code}
 This object is transmitted to the java side during the plan binding (and is 
 now an actual Tuple2<Integer, String>), then passed to the type extractor, 
 and the resulting TypeInformation is saved in the java counterpart of the 
 udf, which all implement the ResultTypeQueryable interface. 
 The TypeInformation object is only used by the Java API, python never touches 
 it. Instead, at runtime, the serializers used between python and java check 
 the classes of the values passed and are thus generated dynamically.
 This means that, if a UDF does not pass the type it claims to pass, the 
 Python API won't complain, but the underlying java API will when its 
 serializers fail.
 Now let's talk solutions.
 In discussions on the mailing list, pretty much 2 proposals were made:
 # Add a way to disable/circumvent type checks during the plan phase in the 
 Java API and generate serializers dynamically.
 # Have objects always in serialized form on the java side, stored in a single 
 byte array or a Tuple2 containing a key/value pair.
 These proposals vary wildly in the changes necessary to the system:
 # How can we change the Java API to support this?
 This proposal would hardly change the way the Python API works, or even touch 
 the related source code. It mostly deals with the Java API. Since I'm not too 
 familiar with the Plan processing life-cycle on the java side, I can't assess 
 which classes would have to be changed.
 # How can we make this work within the limits of the Java API?
 This proposal is the exact opposite: it changes nothing in the Java API. 
 Instead, the following issues would have to be solved:
 * Alter the plan to extract keys before keyed operations, while hiding these 
 keys from the UDF. This is exactly how KeySelectors (will) work, and as such 
 is generally solved. In fact, this solution would make a few things easier in 
 regards to KeySelectors.
 * Rework all operations that currently rely on Java API functions that need 
 deserialized data, for example Projections or the upcoming Aggregations. 
 This generally means implementing them in python, or with special java UDFs 
 (they could de-/serialize data within the udf call, or work on serialized 
 data).
 * Change (De)Serializers accordingly
 * Implement a reliable sorting mechanism on the python side that does not 
 consume all available memory.
 Personally I prefer the second option, as it
 # does not modify the Java API; it works within its well-tested limits
 # Plan changes are similar to issues that are already worked on (KeySelectors)
 # Sorting implementation was necessary anyway (for chained reducers)
 # having data in serialized form was a performance-related consideration 
 already
 While the first option could work, and would most likely require less work, I 
 feel like many of the things required for option 2 will be implemented 
 eventually anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2490) Remove unwanted boolean check in function SocketTextStreamFunction.streamFromSocket

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681802#comment-14681802
 ] 

ASF GitHub Bot commented on FLINK-2490:
---

Github user HuangWHWHW commented on the pull request:

https://github.com/apache/flink/pull/992#issuecomment-129875331
  
@mxm 
Ok, I'll add a test.
There is a small difficulty: I can't get the retry count in the test, since 
the retry counter is a local variable.
So can I add a function to get the retry count?
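
For example, a hedged sketch of what exposing the counter could look like; 
the class and method names below are made up for illustration, not the 
actual SocketTextStreamFunction code:

```
// Hypothetical sketch: promote the local retry counter to a field and
// expose it through a getter so a test can assert on it.
public class RetryCountingConnector {

    private volatile long retries;   // previously a local variable

    public void connectWithRetry(long maxRetries) throws InterruptedException {
        retries = 0;
        while (retries < maxRetries && !tryConnect()) {
            retries++;
            Thread.sleep(500);       // back off before the next attempt
        }
    }

    /** Visible for testing: how many retries the last connect needed. */
    public long getRetries() {
        return retries;
    }

    private boolean tryConnect() {
        return false;                // stub: always fails, forcing retries
    }
}
```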


 Remove unwanted boolean check in function 
 SocketTextStreamFunction.streamFromSocket
 ---

 Key: FLINK-2490
 URL: https://issues.apache.org/jira/browse/FLINK-2490
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.10
Reporter: Huang Wei
Priority: Minor
 Fix For: 0.10

   Original Estimate: 168h
  Remaining Estimate: 168h





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2423] [streaming] ITCase for checkpoint...

2015-08-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/980


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2499) start-cluster.sh can start multiple TaskManager on the same node

2015-08-11 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681926#comment-14681926
 ] 

Chen He commented on FLINK-2499:


+1. Not only a warning: we need to track each started TM so that we can stop 
the corresponding TM when we run stop-cluster.sh.

Another problem is the use of the nohup command. When multiple TMs use nohup, 
their output interleaves, which makes debugging hard. We should redirect to a 
TM.out file instead of nohup.out. This also affects other nohup processes, 
because they all dump stdout to nohup.out.

 start-cluster.sh can start multiple TaskManager on the same node
 

 Key: FLINK-2499
 URL: https://issues.apache.org/jira/browse/FLINK-2499
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Chen He

 11562 JobHistoryServer
 3251 Main
 10596 Jps
 17934 RunJar
 6879 Main
 8837 Main
 19215 RunJar
 28902 DataNode
 6627 TaskManager
 642 NodeManager
 10408 RunJar
 10210 TaskManager
 5067 TaskManager
 357 ApplicationHistoryServer
 3540 RunJar
 28501 ResourceManager
 28572 SecondaryNameNode
 17630 QuorumPeerMain
 9069 TaskManager
 If we keep executing start-cluster.sh, it may spawn an unbounded number of 
 TaskManagers on a single system.
 And the nohup command in start-cluster.sh generates a nohup.out file 
 that disturbs any other nohup processes on the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-2423) Properly test checkpoint notifications

2015-08-11 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/FLINK-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Márton Balassi closed FLINK-2423.
-
   Resolution: Implemented
Fix Version/s: 0.10

Via 10ce2e2.

 Properly test checkpoint notifications
 --

 Key: FLINK-2423
 URL: https://issues.apache.org/jira/browse/FLINK-2423
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gyula Fora
Assignee: Márton Balassi
 Fix For: 0.10


 Checkpoint notifications (via the CheckpointNotifier interface) are currently 
 not properly tested. 
 A test should be included to verify that checkpoint notifications are 
 eventually called on successful checkpoints, and that they are only called 
 once per checkpointID.
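 A hedged sketch of the bookkeeping such a test could use: record every 
 notified checkpoint ID and fail on duplicates. The method name matches the 
 CheckpointNotifier interface; the class itself is hypothetical.
 {code}
 import java.util.Set;
 import java.util.concurrent.ConcurrentHashMap;

 public class OnceOnlyNotifier {

     private final Set<Long> notifiedIds = ConcurrentHashMap.newKeySet();

     // invoked by the runtime via CheckpointNotifier#notifyCheckpointComplete
     public void notifyCheckpointComplete(long checkpointId) {
         if (!notifiedIds.add(checkpointId)) {
             throw new IllegalStateException(
                     "Checkpoint " + checkpointId + " was notified more than once");
         }
     }

     public boolean wasNotified(long checkpointId) {
         return notifiedIds.contains(checkpointId);
     }
 }
 {code}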



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2510) KafkaConnector should access partition metadata from master/cluster

2015-08-11 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681994#comment-14681994
 ] 

Stephan Ewen commented on FLINK-2510:
-

Currently blocked by [FLINK-2386]

 KafkaConnector should access partition metadata from master/cluster
 ---

 Key: FLINK-2510
 URL: https://issues.apache.org/jira/browse/FLINK-2510
 Project: Flink
  Issue Type: Improvement
  Components: Streaming Connectors
Affects Versions: 0.10
Reporter: Stephan Ewen
Priority: Minor

 Currently, the Kafka connector assumes that it can access the partition 
 metadata from the client. There may be setups where this is not possible, and 
 where the access needs to happen from within the cluster (due to firewalls).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14681990#comment-14681990
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-129947817
  
This is great and will be very helpful while debugging! Only the output is 
very hard to read. How about this? Just some new lines and indentation:
```
Cannot load user class: com.foo.bar.SomeClass
ClassLoader info:
URL ClassLoader:
url:  'http://localhost:26712/some/file/path'
file: '/tmp/flink-url-test2825910185710886145.tmp' (valid JAR)
file: '/tmp/flink-url-test4914469684250387785.tmp' (invalid JAR: 
error in opening zip file) 
file: '/tmp/flink-url-test5462325335282720117.tmp' (missing)

```
I think users are less likely to read Exception messages when they have to 
scroll horizontally. We usually don't include newlines in our Exceptions.
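
A hedged sketch of producing that indented layout; the joiner-based 
formatting below is an assumption for illustration, not the code that was 
eventually merged:

```
import java.util.StringJoiner;

public class IndentedMessage {

    /** Joins entries under a header, indenting each entry by four spaces. */
    public static String format(String header, String... entries) {
        StringJoiner joiner = new StringJoiner("\n    ", header + "\n    ", "");
        for (String entry : entries) {
            joiner.add(entry);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(
                "ClassLoader info:\nURL ClassLoader:",
                "url:  'http://localhost:26712/some/file/path'",
                "file: '/tmp/flink-url-test2825910185710886145.tmp' (valid JAR)"));
    }
}
```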


 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...

2015-08-11 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-129947817
  
This is great and will be very helpful while debugging! Only the output is 
very hard to read. How about this? Just some new lines and indentation:
```
Cannot load user class: com.foo.bar.SomeClass
ClassLoader info:
URL ClassLoader:
url:  'http://localhost:26712/some/file/path'
file: '/tmp/flink-url-test2825910185710886145.tmp' (valid JAR)
file: '/tmp/flink-url-test4914469684250387785.tmp' (invalid JAR: 
error in opening zip file) 
file: '/tmp/flink-url-test5462325335282720117.tmp' (missing)

```
I think users are less likely to read Exception messages when they have to 
scroll horizontally. We usually don't include newlines in our Exceptions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2510) KafkaConnector should access partition metadata from master/cluster

2015-08-11 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2510:
---

 Summary: KafkaConnector should access partition metadata from 
master/cluster
 Key: FLINK-2510
 URL: https://issues.apache.org/jira/browse/FLINK-2510
 Project: Flink
  Issue Type: Improvement
  Components: Streaming Connectors
Affects Versions: 0.10
Reporter: Stephan Ewen
Priority: Minor


Currently, the Kafka connector assumes that it can access the partition 
metadata from the client. There may be setups where this is not possible, and 
where the access needs to happen from within the cluster (due to firewalls).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2509] Add class loader info to exceptio...

2015-08-11 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-129955067
  
Good comment, will adjust this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682030#comment-14682030
 ] 

ASF GitHub Bot commented on FLINK-2509:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1008#issuecomment-129955067
  
Good comment, will adjust this.


 Improve error messages when user code classes are not found
 ---

 Key: FLINK-2509
 URL: https://issues.apache.org/jira/browse/FLINK-2509
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.10


 When a job fails because the user code classes are not found, we should add 
 some information about the class loader and class path into the exception 
 message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2451] [gelly] examples and library clea...

2015-08-11 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/1000#issuecomment-130002982
  
Apart from the minor comment, this looks good to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2451] [gelly] examples and library clea...

2015-08-11 Thread andralungu
Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/1000#discussion_r36779494
  
--- Diff: 
flink-staging/flink-gelly/src/test/java/org/apache/flink/graph/test/GatherSumApplyITCase.java
 ---
@@ -86,26 +72,35 @@ public void testConnectedComponents() throws Exception {
 
@Test
public void testSingleSourceShortestPaths() throws Exception {
-   GSASingleSourceShortestPaths.main(new String[]{"1", edgesPath, 
resultPath, "16"});
-   expectedResult = "1 0.0\n" +
-   "2 12.0\n" +
-   "3 13.0\n" +
-   "4 47.0\n" +
-   "5 48.0\n" +
-   "6 Infinity\n" +
-   "7 Infinity\n";
+   final ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+
+   Graph<Long, Double, Double> inputGraph = Graph.fromDataSet(
+   
SingleSourceShortestPathsData.getDefaultEdgeDataSet(env),
+   new InitMapperSSSP(), env);
+
+   List<Vertex<Long, Double>> result = inputGraph.run(new 
GSASingleSourceShortestPaths<Long>(1l, 16))
+   .getVertices().collect();
--- End diff --

Is there a specific reason why we decided to use `collect()` in the 
tests? It does not seem to be a consistency issue. The other tests (in Flink) 
are still using `compareResultsByLinesInMemory()`. Do we gain anything?

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [Flink-Gelly] [example] added missing assumpti...

2015-08-11 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/883#issuecomment-130003492
  
I guess the only request that was not fulfilled for this PR was to squash 
the commits. Then it should be fine. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2451) Cleanup Gelly examples

2015-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682220#comment-14682220
 ] 

ASF GitHub Bot commented on FLINK-2451:
---

Github user andralungu commented on a diff in the pull request:

https://github.com/apache/flink/pull/1000#discussion_r36779494
  
--- Diff: 
flink-staging/flink-gelly/src/test/java/org/apache/flink/graph/test/GatherSumApplyITCase.java
 ---
@@ -86,26 +72,35 @@ public void testConnectedComponents() throws Exception {
 
@Test
public void testSingleSourceShortestPaths() throws Exception {
-   GSASingleSourceShortestPaths.main(new String[]{"1", edgesPath, 
resultPath, "16"});
-   expectedResult = "1 0.0\n" +
-   "2 12.0\n" +
-   "3 13.0\n" +
-   "4 47.0\n" +
-   "5 48.0\n" +
-   "6 Infinity\n" +
-   "7 Infinity\n";
+   final ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+
+   Graph<Long, Double, Double> inputGraph = Graph.fromDataSet(
+   
SingleSourceShortestPathsData.getDefaultEdgeDataSet(env),
+   new InitMapperSSSP(), env);
+
+   List<Vertex<Long, Double>> result = inputGraph.run(new 
GSASingleSourceShortestPaths<Long>(1l, 16))
+   .getVertices().collect();
--- End diff --

Is there a specific reason why we decided to use `collect()` in the 
tests? It does not seem to be a consistency issue. The other tests (in Flink) 
are still using `compareResultsByLinesInMemory()`. Do we gain anything?


 Cleanup Gelly examples
 --

 Key: FLINK-2451
 URL: https://issues.apache.org/jira/browse/FLINK-2451
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 0.10
Reporter: Vasia Kalavri
Assignee: Vasia Kalavri
Priority: Minor

 As per discussion in the dev@ mailing list, this issue proposes the following 
 changes to the Gelly examples and library:
 1. Keep the following examples as they are:
 EuclideanGraphWeighing, GraphMetrics, IncrementalSSSP, JaccardSimilarity,  
 MusicProfiles.
 2. Keep only 1 example to show how to use library methods.
 3. Add 1 example for vertex-centric iterations.
 4. Keep 1 example for GSA iterations and move the redundant GSA 
 implementations to the library.
 5. Improve the examples documentation and refer to the functionality that 
 each of them demonstrates.
 6. Port and modify existing example tests accordingly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1520]Read edges and vertices from CSV f...

2015-08-11 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/847#issuecomment-130014300
  
Saw this. I will also update the documentation afterwards... Sorry!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

