[jira] [Commented] (FLINK-3923) Unify configuration conventions of the Kinesis producer to the same as the consumer

2016-06-06 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317860#comment-15317860
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-3923:


I'm wondering whether it might be a better idea to use dedicated configuration 
classes, i.e.,
FlinkKinesisConsumerConfiguration and FlinkKinesisProducerConfiguration.
Both take the required AWS connection info (region, credentials) as constructor 
args, then use cascading set methods for additional settings.

For example for the consumer,
.setInitialPosition()
.setDescribeStreamBackfireMillis()
.setWatermarkAssigner()
... (any other config we may add in the future)

The configuration classes will be responsible for setting the default values of 
these optional settings (the behaviour of reading default values when they are 
not set is kind of sloppy right now).

What do you think?
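
For illustration, such a class might look like the sketch below (names mirror 
the setters listed above; signatures and defaults are assumptions, not existing 
Flink API):

{code}
// Illustrative sketch only: a dedicated configuration class with cascading
// setters. All names, types, and default values here are assumptions.
public class FlinkKinesisConsumerConfiguration {

    private final String region;
    private final String accessKey;
    private final String secretKey;

    // defaults for optional settings live in one place: this class
    private String initialPosition = "LATEST";
    private long describeStreamBackfireMillis = 1000L;

    public FlinkKinesisConsumerConfiguration(String region, String accessKey, String secretKey) {
        this.region = region;
        this.accessKey = accessKey;
        this.secretKey = secretKey;
    }

    public FlinkKinesisConsumerConfiguration setInitialPosition(String initialPosition) {
        this.initialPosition = initialPosition;
        return this; // returning 'this' enables cascading calls
    }

    public FlinkKinesisConsumerConfiguration setDescribeStreamBackfireMillis(long millis) {
        this.describeStreamBackfireMillis = millis;
        return this;
    }
}
{code}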

> Unify configuration conventions of the Kinesis producer to the same as the 
> consumer
> ---
>
> Key: FLINK-3923
> URL: https://issues.apache.org/jira/browse/FLINK-3923
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kinesis Connector, Streaming Connectors
>Affects Versions: 1.1.0
>Reporter: Robert Metzger
>Assignee: Abdullah Ozturk
> Fix For: 1.1.0
>
>
> Currently, the Kinesis consumer and producer are configured differently.
> The producer expects a list of arguments for the access key, secret, region, 
> stream. The consumer is accepting properties (similar to the Kafka connector).
> The objective of this issue is to change the producer so that it is also 
> using a properties-based configuration (including an input validation step)
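
For illustration, a consumer-style, properties-based setup for the producer 
could look like the sketch below (the property keys and the constructor shown 
are placeholders for the proposed API, not existing constants):

{code}
// Sketch of the proposed properties-based configuration; keys are
// placeholders, and the required-key validation is assumed to happen
// inside the producer's constructor.
Properties producerConfig = new Properties();
producerConfig.setProperty("aws.region", "us-east-1");
producerConfig.setProperty("aws.accesskey", "<access-key>");
producerConfig.setProperty("aws.secretkey", "<secret-key>");

FlinkKinesisProducer<String> producer =
        new FlinkKinesisProducer<>(new SimpleStringSchema(), producerConfig);
{code}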



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2985) Allow different field names for unionAll() in Table API

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317678#comment-15317678
 ] 

ASF GitHub Bot commented on FLINK-2985:
---

Github user gallenvara commented on the issue:

https://github.com/apache/flink/pull/2078
  
@twalthr  @fhueske  can you help with reviewing this PR? Thanks! :)


> Allow different field names for unionAll() in Table API
> ---
>
> Key: FLINK-2985
> URL: https://issues.apache.org/jira/browse/FLINK-2985
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Reporter: Timo Walther
>Priority: Minor
>
> The recently merged `unionAll` operator checks if the field names of the left 
> and right side are equal. Actually, this is not necessary. The union operator 
> in SQL checks only the types and uses the names of the left side.
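
For example (a sketch against the Table API of the time; environment setup is 
omitted and the field names are illustrative):

{code}
// Sketch: unionAll over tables whose field names differ but whose field
// types match; the result keeps the left side's names ("a", "b").
Table left = tableEnv.fromDataSet(leftSet, "a, b");
Table right = tableEnv.fromDataSet(rightSet, "c, d");

Table result = left.unionAll(right); // only the types are checked
{code}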



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2078: [FLINK-2985] Allow different field names for unionAll() i...

2016-06-06 Thread gallenvara
Github user gallenvara commented on the issue:

https://github.com/apache/flink/pull/2078
  
@twalthr  @fhueske  can you help with reviewing this PR? Thanks! :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (FLINK-2985) Allow different field names for unionAll() in Table API

2016-06-06 Thread GaoLun (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317673#comment-15317673
 ] 

GaoLun edited comment on FLINK-2985 at 6/7/16 2:12 AM:
---

The refactoring work has been finished. If nobody is working on this, I will go 
on with it. :)


was (Author: gallenvara_bg):
The refactoring work has been finished and I will go on with this issue. :)

> Allow different field names for unionAll() in Table API
> ---
>
> Key: FLINK-2985
> URL: https://issues.apache.org/jira/browse/FLINK-2985
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Reporter: Timo Walther
>Priority: Minor
>
> The recently merged `unionAll` operator checks if the field names of the left 
> and right side are equal. Actually, this is not necessary. The union operator 
> in SQL checks only the types and uses the names of the left side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2985) Allow different field names for unionAll() in Table API

2016-06-06 Thread GaoLun (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317673#comment-15317673
 ] 

GaoLun commented on FLINK-2985:
---

The refactoring work has been finished and I will go on with this issue. :)

> Allow different field names for unionAll() in Table API
> ---
>
> Key: FLINK-2985
> URL: https://issues.apache.org/jira/browse/FLINK-2985
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Reporter: Timo Walther
>Priority: Minor
>
> The recently merged `unionAll` operator checks if the field names of the left 
> and right side are equal. Actually, this is not necessary. The union operator 
> in SQL checks only the types and uses the names of the left side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2985) Allow different field names for unionAll() in Table API

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317665#comment-15317665
 ] 

ASF GitHub Bot commented on FLINK-2985:
---

GitHub user gallenvara opened a pull request:

https://github.com/apache/flink/pull/2078

[FLINK-2985] Allow different field names for unionAll() in Table API

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed

It is not necessary that the corresponding columns in each SELECT statement 
have the same name, but they do need to have the same data types. This PR 
allows different field names for union/unionAll in the Table API.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gallenvara/flink flink-2985

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2078.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2078


commit e84d836efbb290fb79d14e3d3c6c9e13db567b72
Author: gallenvara 
Date:   2016-06-06T08:30:41Z

Allow different field names for union in Table API.




> Allow different field names for unionAll() in Table API
> ---
>
> Key: FLINK-2985
> URL: https://issues.apache.org/jira/browse/FLINK-2985
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Reporter: Timo Walther
>Priority: Minor
>
> The recently merged `unionAll` operator checks if the field names of the left 
> and right side are equal. Actually, this is not necessary. The union operator 
> in SQL checks only the types and uses the names of the left side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2078: [FLINK-2985] Allow different field names for union...

2016-06-06 Thread gallenvara
GitHub user gallenvara opened a pull request:

https://github.com/apache/flink/pull/2078

[FLINK-2985] Allow different field names for unionAll() in Table API

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed

It is not necessary that the corresponding columns in each SELECT statement 
have the same name, but they do need to have the same data types. This PR 
allows different field names for union/unionAll in the Table API.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gallenvara/flink flink-2985

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2078.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2078


commit e84d836efbb290fb79d14e3d3c6c9e13db567b72
Author: gallenvara 
Date:   2016-06-06T08:30:41Z

Allow different field names for union in Table API.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2076: [FLINK-2832] [tests] Hardens RandomSamplerTest

2016-06-06 Thread chiwanpark
Github user chiwanpark commented on the issue:

https://github.com/apache/flink/pull/2076
  
Failing Travis is not related to this PR. Looks good to me!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2832) Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317663#comment-15317663
 ] 

ASF GitHub Bot commented on FLINK-2832:
---

Github user chiwanpark commented on the issue:

https://github.com/apache/flink/pull/2076
  
Failing Travis is not related to this PR. Looks good to me!


> Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement
> ---
>
> Key: FLINK-2832
> URL: https://issues.apache.org/jira/browse/FLINK-2832
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.10.0
>Reporter: Vasia Kalavri
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.133 sec 
> <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest
> testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest)
>   Time elapsed: 2.534 sec  <<< FAILURE!
> java.lang.AssertionError: KS test result with p value(0.11), d 
> value(0.103090)
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:192)
> Results :
> Failed tests: 
>   
> RandomSamplerTest.testReservoirSamplerWithReplacement:192->verifyReservoirSamplerWithReplacement:289->verifyRandomSamplerWithSampleSize:330->verifyKSTest:342
>  KS test result with p value(0.11), d value(0.103090)
> Full log [here|https://travis-ci.org/apache/flink/jobs/84120131].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2074: [FLINK-2985] [Table API] Allow different field nam...

2016-06-06 Thread gallenvara
Github user gallenvara closed the pull request at:

https://github.com/apache/flink/pull/2074


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2985) Allow different field names for unionAll() in Table API

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317664#comment-15317664
 ] 

ASF GitHub Bot commented on FLINK-2985:
---

Github user gallenvara closed the pull request at:

https://github.com/apache/flink/pull/2074


> Allow different field names for unionAll() in Table API
> ---
>
> Key: FLINK-2985
> URL: https://issues.apache.org/jira/browse/FLINK-2985
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Reporter: Timo Walther
>Priority: Minor
>
> The recently merged `unionAll` operator checks if the field names of the left 
> and right side are equal. Actually, this is not necessary. The union operator 
> in SQL checks only the types and uses the names of the left side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2832) Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317661#comment-15317661
 ] 

ASF GitHub Bot commented on FLINK-2832:
---

Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/2076#discussion_r6573
  
--- Diff: flink-java/src/test/resources/log4j-test.properties ---
@@ -0,0 +1,27 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Set root logger level to OFF to not flood build logs
+# set manually to INFO for debugging purposes
+log4j.rootLogger=OFF, testlogger
+
+# A1 is set to be a ConsoleAppender.
+log4j.appender.testlogger=org.apache.log4j.ConsoleAppender
+log4j.appender.testlogger.target = System.err
--- End diff --

Spaces around `=`


> Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement
> ---
>
> Key: FLINK-2832
> URL: https://issues.apache.org/jira/browse/FLINK-2832
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.10.0
>Reporter: Vasia Kalavri
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.133 sec 
> <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest
> testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest)
>   Time elapsed: 2.534 sec  <<< FAILURE!
> java.lang.AssertionError: KS test result with p value(0.11), d 
> value(0.103090)
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:192)
> Results :
> Failed tests: 
>   
> RandomSamplerTest.testReservoirSamplerWithReplacement:192->verifyReservoirSamplerWithReplacement:289->verifyRandomSamplerWithSampleSize:330->verifyKSTest:342
>  KS test result with p value(0.11), d value(0.103090)
> Full log [here|https://travis-ci.org/apache/flink/jobs/84120131].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2076: [FLINK-2832] [tests] Hardens RandomSamplerTest

2016-06-06 Thread chiwanpark
Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/2076#discussion_r6588
  
--- Diff: flink-java/src/test/resources/logback-test.xml ---
@@ -0,0 +1,34 @@
+[XML markup stripped by the mail renderer; only the log pattern line survives:]
+%d{HH:mm:ss.SSS} [%thread] %-5level %logger{60} %X{sourceThread} - %msg%n
--- End diff --

we need to add a newline


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2832) Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317662#comment-15317662
 ] 

ASF GitHub Bot commented on FLINK-2832:
---

Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/2076#discussion_r6588
  
--- Diff: flink-java/src/test/resources/logback-test.xml ---
@@ -0,0 +1,34 @@
+[XML markup stripped by the mail renderer; only the log pattern line survives:]
+%d{HH:mm:ss.SSS} [%thread] %-5level %logger{60} %X{sourceThread} - %msg%n
--- End diff --

we need to add a newline


> Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement
> ---
>
> Key: FLINK-2832
> URL: https://issues.apache.org/jira/browse/FLINK-2832
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.10.0
>Reporter: Vasia Kalavri
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.133 sec 
> <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest
> testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest)
>   Time elapsed: 2.534 sec  <<< FAILURE!
> java.lang.AssertionError: KS test result with p value(0.11), d 
> value(0.103090)
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:192)
> Results :
> Failed tests: 
>   
> RandomSamplerTest.testReservoirSamplerWithReplacement:192->verifyReservoirSamplerWithReplacement:289->verifyRandomSamplerWithSampleSize:330->verifyKSTest:342
>  KS test result with p value(0.11), d value(0.103090)
> Full log [here|https://travis-ci.org/apache/flink/jobs/84120131].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2076: [FLINK-2832] [tests] Hardens RandomSamplerTest

2016-06-06 Thread chiwanpark
Github user chiwanpark commented on a diff in the pull request:

https://github.com/apache/flink/pull/2076#discussion_r6573
  
--- Diff: flink-java/src/test/resources/log4j-test.properties ---
@@ -0,0 +1,27 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Set root logger level to OFF to not flood build logs
+# set manually to INFO for debugging purposes
+log4j.rootLogger=OFF, testlogger
+
+# A1 is set to be a ConsoleAppender.
+log4j.appender.testlogger=org.apache.log4j.ConsoleAppender
+log4j.appender.testlogger.target = System.err
--- End diff --

Spaces around `=`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2031: FLINK-3967 - Flink Sink for Rethink Db

2016-06-06 Thread mans2singh
Github user mans2singh commented on the issue:

https://github.com/apache/flink/pull/2031
  
Hey @rmetzger, @StephanEwen, Flink Folks:  

Just wanted to let you know that the RethinkDB Java driver that I used for 
Flink integration is available under the Apache License 2.0 
(https://github.com/rethinkdb/rethinkdb/blob/next/drivers/COPYRIGHT).

I've checked the Travis build and the RethinkDB connector is passing, but the 
last failure was in the flink-yarn-tests module.

Please let me know if you have any advice/comments for me.

Thanks for your time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3967) Provide RethinkDB Sink for Flink

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317469#comment-15317469
 ] 

ASF GitHub Bot commented on FLINK-3967:
---

Github user mans2singh commented on the issue:

https://github.com/apache/flink/pull/2031
  
Hey @rmetzger, @StephanEwen, Flink Folks:  

Just wanted to let you know that the RethinkDB Java driver that I used for 
Flink integration is available under the Apache License 2.0 
(https://github.com/rethinkdb/rethinkdb/blob/next/drivers/COPYRIGHT).

I've checked the Travis build and the RethinkDB connector is passing, but the 
last failure was in the flink-yarn-tests module.

Please let me know if you have any advice/comments for me.

Thanks for your time.


> Provide RethinkDB Sink for Flink
> 
>
> Key: FLINK-3967
> URL: https://issues.apache.org/jira/browse/FLINK-3967
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming, Streaming Connectors
>Affects Versions: 1.0.3
> Environment: All
>Reporter: Mans Singh
>Assignee: Mans Singh
>Priority: Minor
>  Labels: features
> Fix For: 1.1.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Provide a sink to stream data from Flink to RethinkDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4027) FlinkKafkaProducer09 sink can lose messages

2016-06-06 Thread Elias Levy (JIRA)
Elias Levy created FLINK-4027:
-

 Summary: FlinkKafkaProducer09 sink can lose messages
 Key: FLINK-4027
 URL: https://issues.apache.org/jira/browse/FLINK-4027
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector
Affects Versions: 1.0.3
Reporter: Elias Levy
Priority: Critical


The FlinkKafkaProducer09 sink appears to not offer at-least-once guarantees.

The producer publishes messages asynchronously. A callback can record 
publishing errors, which will be raised when detected. But as far as I can 
tell, there is no barrier to wait for async errors from the sink when 
checkpointing, or to track the event time of acked messages to inform the 
checkpointing process.

If a checkpoint occurs while there are pending publish requests, and the 
requests return a failure after the checkpoint occurred, those messages will be 
lost, as the checkpoint will consider them processed by the sink.
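
One common at-least-once pattern (a sketch only, not FlinkKafkaProducer09's 
actual code) is to count in-flight records and block the checkpoint until every 
send callback has fired:

{code}
import java.util.concurrent.atomic.AtomicLong;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Illustrative pending-records guard for an asynchronous producer.
public class GuardedKafkaSink {

    private final AtomicLong pendingRecords = new AtomicLong();
    private volatile Exception asyncException;

    public void send(KafkaProducer<byte[], byte[]> producer,
                     ProducerRecord<byte[], byte[]> record) throws Exception {
        checkErroneous();
        pendingRecords.incrementAndGet();
        producer.send(record, (metadata, exception) -> {
            if (exception != null) {
                asyncException = exception; // surfaced on the next call
            }
            pendingRecords.decrementAndGet();
        });
    }

    // Called when a checkpoint is taken: wait until all in-flight sends are
    // acknowledged, then rethrow any asynchronous error, so a failed send
    // can never be silently covered by a completed checkpoint.
    public void flushOnCheckpoint() throws Exception {
        while (pendingRecords.get() > 0) {
            Thread.sleep(10); // polling for brevity; a lock/condition is cleaner
        }
        checkErroneous();
    }

    private void checkErroneous() throws Exception {
        if (asyncException != null) {
            throw asyncException;
        }
    }
}
{code}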




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4002) [py] Improve testing infraestructure

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317087#comment-15317087
 ] 

ASF GitHub Bot commented on FLINK-4002:
---

Github user omaralvarez commented on the issue:

https://github.com/apache/flink/pull/2063
  
Yes, I had the same thought looking at the code; I could not figure out why 
it worked with HDFS...


> [py] Improve testing infraestructure
> 
>
> Key: FLINK-4002
> URL: https://issues.apache.org/jira/browse/FLINK-4002
> Project: Flink
>  Issue Type: Bug
>  Components: Python API
>Affects Versions: 1.0.3
>Reporter: Omar Alvarez
>Priority: Minor
>  Labels: Python, Testing
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The Verify() test function errors out when array elements are missing:
> {code}
> env.generate_sequence(1, 5)\
>  .map(Id()).map_partition(Verify([1,2,3,4], "Sequence")).output()
> {code}
> {quote}
> IndexError: list index out of range
> {quote}
> There should also be more documentation in test functions.
> I am already working on a pull request to fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2063: [FLINK-4002] [py] Improve testing infraestructure

2016-06-06 Thread omaralvarez
Github user omaralvarez commented on the issue:

https://github.com/apache/flink/pull/2063
  
Yes, I had the same thought looking at the code; I could not figure out why 
it worked with HDFS...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4002) [py] Improve testing infraestructure

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317078#comment-15317078
 ] 

ASF GitHub Bot commented on FLINK-4002:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2063
  
Looking at the code right now; I may have figured out why the files aren't 
copied, but I find it odd that it supposedly works with HDFS. It actually 
should never copy additional files if no parameters are given.


> [py] Improve testing infraestructure
> 
>
> Key: FLINK-4002
> URL: https://issues.apache.org/jira/browse/FLINK-4002
> Project: Flink
>  Issue Type: Bug
>  Components: Python API
>Affects Versions: 1.0.3
>Reporter: Omar Alvarez
>Priority: Minor
>  Labels: Python, Testing
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The Verify() test function errors out when array elements are missing:
> {code}
> env.generate_sequence(1, 5)\
>  .map(Id()).map_partition(Verify([1,2,3,4], "Sequence")).output()
> {code}
> {quote}
> IndexError: list index out of range
> {quote}
> There should also be more documentation in test functions.
> I am already working on a pull request to fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2063: [FLINK-4002] [py] Improve testing infraestructure

2016-06-06 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2063
  
Looking at the code right now; I may have figured out why the files aren't 
copied, but I find it odd that it supposedly works with HDFS. It actually 
should never copy additional files if no parameters are given.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-3753) KillerWatchDog should not use kill on toKill thread

2016-06-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-3753:
--
Description: 
{code}
// this is harsh, but this watchdog is a last resort
if (toKill.isAlive()) {
  toKill.stop();
}
{code}
stop() is deprecated.

See:
https://www.securecoding.cert.org/confluence/display/java/THI05-J.+Do+not+use+Thread.stop()+to+terminate+threads

  was:
{code}
// this is harsh, but this watchdog is a last resort
if (toKill.isAlive()) {
  toKill.stop();
}
{code}

stop() is deprecated.

See:
https://www.securecoding.cert.org/confluence/display/java/THI05-J.+Do+not+use+Thread.stop()+to+terminate+threads


> KillerWatchDog should not use kill on toKill thread
> ---
>
> Key: FLINK-3753
> URL: https://issues.apache.org/jira/browse/FLINK-3753
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> // this is harsh, but this watchdog is a last resort
> if (toKill.isAlive()) {
>   toKill.stop();
> }
> {code}
> stop() is deprecated.
> See:
> https://www.securecoding.cert.org/confluence/display/java/THI05-J.+Do+not+use+Thread.stop()+to+terminate+threads
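
A sketch of the cooperative alternative recommended by that rule (illustrative, 
not the actual KillerWatchDog code):

{code}
// Interrupt and wait instead of Thread.stop(): stop() releases every monitor
// the target thread holds mid-update, which can corrupt shared state.
static void shutDownGracefully(Thread toKill, long timeoutMillis) throws InterruptedException {
    toKill.interrupt();          // request cooperative cancellation
    toKill.join(timeoutMillis);  // give the thread a chance to exit
    if (toKill.isAlive()) {
        // last resort: report the stuck thread rather than stop() it
    }
}
{code}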



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3734) Unclosed DataInputView in AbstractAlignedProcessingTimeWindowOperator#restoreState()

2016-06-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-3734:
--
Description: 
{code}
DataInputView in = inputState.getState(getUserCodeClassloader());

final long nextEvaluationTime = in.readLong();
final long nextSlideTime = in.readLong();

AbstractKeyedTimePanes panes = 
createPanes(keySelector, function);
panes.readFromInput(in, keySerializer, stateTypeSerializer);

restoredState = new RestoredState<>(panes, nextEvaluationTime, 
nextSlideTime);
  }
{code}

DataInputView in is not closed upon return.

  was:
{code}
DataInputView in = inputState.getState(getUserCodeClassloader());

final long nextEvaluationTime = in.readLong();
final long nextSlideTime = in.readLong();

AbstractKeyedTimePanes panes = 
createPanes(keySelector, function);
panes.readFromInput(in, keySerializer, stateTypeSerializer);

restoredState = new RestoredState<>(panes, nextEvaluationTime, 
nextSlideTime);
  }
{code}
DataInputView in is not closed upon return.


> Unclosed DataInputView in 
> AbstractAlignedProcessingTimeWindowOperator#restoreState()
> 
>
> Key: FLINK-3734
> URL: https://issues.apache.org/jira/browse/FLINK-3734
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> DataInputView in = inputState.getState(getUserCodeClassloader());
> final long nextEvaluationTime = in.readLong();
> final long nextSlideTime = in.readLong();
> AbstractKeyedTimePanes panes = 
> createPanes(keySelector, function);
> panes.readFromInput(in, keySerializer, stateTypeSerializer);
> restoredState = new RestoredState<>(panes, nextEvaluationTime, 
> nextSlideTime);
>   }
> {code}
> DataInputView in is not closed upon return.
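
A sketch of the fix pattern (the exact close hook depends on the state-handle 
API, so the finally block below is only indicative):

{code}
// Illustrative only: release the input on every exit path.
DataInputView in = inputState.getState(getUserCodeClassloader());
try {
    final long nextEvaluationTime = in.readLong();
    final long nextSlideTime = in.readLong();

    AbstractKeyedTimePanes panes = createPanes(keySelector, function);
    panes.readFromInput(in, keySerializer, stateTypeSerializer);

    restoredState = new RestoredState<>(panes, nextEvaluationTime, nextSlideTime);
} finally {
    // close the stream backing 'in' here (e.g. via the state handle)
}
{code}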



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3801) Upgrade Joda-Time library to 2.9.3

2016-06-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-3801:
--
Description: 
Currently joda-time 2.5 is used, which is very old.

We should upgrade to 2.9.3

  was:
Currently joda-time 2.5 is used, which is very old.


We should upgrade to 2.9.3


> Upgrade Joda-Time library to 2.9.3
> --
>
> Key: FLINK-3801
> URL: https://issues.apache.org/jira/browse/FLINK-3801
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
>
> Currently joda-time 2.5 is used, which is very old.
> We should upgrade to 2.9.3



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4024) FileSourceFunction not adjusted to new IF lifecycle

2016-06-06 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317071#comment-15317071
 ] 

Chesnay Schepler commented on FLINK-4024:
-

It also starts the function regardless of whether splitIterator.hasNext() 
returns true or false. This may cause issues with NonParallelInputFormats when 
the parallelism of the FileSourceFunction is greater than 1.

> FileSourceFunction not adjusted to new IF lifecycle
> ---
>
> Key: FLINK-4024
> URL: https://issues.apache.org/jira/browse/FLINK-4024
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Priority: Critical
> Fix For: 1.1.0
>
>
> The InputFormat lifecycle was extended in 
> ac2137cfa5e63bd4f53a4b7669dc591ab210093f, adding additional 
> open-/closeInputFormat() methods.
> The streaming FileSourceFunction was not adjusted for this change, and thus 
> will fail for every InputFormat that leverages these new methods.
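
For reference, the extended lifecycle that the source function would need to 
drive looks roughly like this (a sketch of the contract, not the runtime's 
actual driver loop):

{code}
// Sketch of the RichInputFormat lifecycle after the referenced commit.
format.setRuntimeContext(context);
format.configure(parameters);
format.openInputFormat();            // new: once per task, before any split
for (InputSplit split : splits) {
    format.open(split);
    while (!format.reachedEnd()) {
        record = format.nextRecord(record);
        // ... emit record ...
    }
    format.close();
}
format.closeInputFormat();           // new: once per task, after all splits
{code}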



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3998) Access to gauges and counters in StatsDReporter#report() is not properly synchronized

2016-06-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-3998:
--
Description: 
{code}
for (Map.Entry entry : gauges.entrySet()) {
  reportGauge(entry.getValue(), entry.getKey());
}

for (Map.Entry entry : counters.entrySet()) {
  reportCounter(entry.getValue(), entry.getKey());
{code}

Access to gauges and counters should be protected by lock on 
AbstractReporter.this

  was:
{code}
for (Map.Entry entry : gauges.entrySet()) {
  reportGauge(entry.getValue(), entry.getKey());
}

for (Map.Entry entry : counters.entrySet()) {
  reportCounter(entry.getValue(), entry.getKey());
{code}
Access to gauges and counters should be protected by lock on 
AbstractReporter.this


> Access to gauges and counters in StatsDReporter#report() is not properly 
> synchronized
> -
>
> Key: FLINK-3998
> URL: https://issues.apache.org/jira/browse/FLINK-3998
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>
> {code}
> for (Map.Entry entry : gauges.entrySet()) {
>   reportGauge(entry.getValue(), entry.getKey());
> }
> for (Map.Entry entry : counters.entrySet()) {
>   reportCounter(entry.getValue(), entry.getKey());
> {code}
> Access to gauges and counters should be protected by lock on 
> AbstractReporter.this
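
A sketch of the suggested fix (the generic types, which the mail renderer 
stripped from the snippet above, are assumptions):

{code}
// Iterate over the metric maps only while holding the lock that guards
// metric registration and removal in AbstractReporter.
synchronized (this) {
    for (Map.Entry<Gauge<?>, String> entry : gauges.entrySet()) {
        reportGauge(entry.getValue(), entry.getKey());
    }
    for (Map.Entry<Counter, String> entry : counters.entrySet()) {
        reportCounter(entry.getValue(), entry.getKey());
    }
}
{code}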



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3958) Access to MetricRegistry doesn't have proper synchronization in some classes

2016-06-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-3958:
--
Description: 
In GraphiteReporter#getReporter():
{code}  com.codahale.metrics.graphite.GraphiteReporter.Builder builder =
  com.codahale.metrics.graphite.GraphiteReporter.forRegistry(registry);
{code}

Access to registry should be protected by lock on 
ScheduledDropwizardReporter.this

Similar issue exists in GangliaReporter#getReporter()

  was:
In GraphiteReporter#getReporter():
{code}  com.codahale.metrics.graphite.GraphiteReporter.Builder builder =
  com.codahale.metrics.graphite.GraphiteReporter.forRegistry(registry);
{code}
Access to registry should be protected by lock on 
ScheduledDropwizardReporter.this

Similar issue exists in GangliaReporter#getReporter()


> Access to MetricRegistry doesn't have proper synchronization in some classes
> 
>
> Key: FLINK-3958
> URL: https://issues.apache.org/jira/browse/FLINK-3958
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>
> In GraphiteReporter#getReporter():
> {code}  com.codahale.metrics.graphite.GraphiteReporter.Builder builder =
>   com.codahale.metrics.graphite.GraphiteReporter.forRegistry(registry);
> {code}
> Access to registry should be protected by lock on 
> ScheduledDropwizardReporter.this
> Similar issue exists in GangliaReporter#getReporter()
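
Analogously, a sketch of the suggested fix here:

{code}
// Build the Dropwizard reporter while holding the lock that guards the
// shared MetricRegistry in ScheduledDropwizardReporter.
synchronized (this) {
    com.codahale.metrics.graphite.GraphiteReporter.Builder builder =
        com.codahale.metrics.graphite.GraphiteReporter.forRegistry(registry);
    // ... configure and build the reporter ...
}
{code}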



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4024) FileSourceFunction not adjusted to new IF lifecycle

2016-06-06 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317064#comment-15317064
 ] 

Fabian Hueske commented on FLINK-4024:
--

It appears that {{FileSourceFunction}} doesn't call 
{{RichInputFormat.setRuntimeContext()}} as well. 

> FileSourceFunction not adjusted to new IF lifecycle
> ---
>
> Key: FLINK-4024
> URL: https://issues.apache.org/jira/browse/FLINK-4024
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Priority: Critical
> Fix For: 1.1.0
>
>
> The InputFormat lifecycle was extended in 
> ac2137cfa5e63bd4f53a4b7669dc591ab210093f, adding additional 
> open-/closeInputFormat() methods.
> The streaming FileSourceFunction was not adjusted for this change, and thus 
> will fail for every InputFormat that leverages these new methods.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4002) [py] Improve testing infraestructure

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317053#comment-15317053
 ] 

ASF GitHub Bot commented on FLINK-4002:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2063
  
you can add it to this PR as a separate commit.


> [py] Improve testing infraestructure
> 
>
> Key: FLINK-4002
> URL: https://issues.apache.org/jira/browse/FLINK-4002
> Project: Flink
>  Issue Type: Bug
>  Components: Python API
>Affects Versions: 1.0.3
>Reporter: Omar Alvarez
>Priority: Minor
>  Labels: Python, Testing
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The Verify() test function errors out when array elements are missing:
> {code}
> env.generate_sequence(1, 5)\
>  .map(Id()).map_partition(Verify([1,2,3,4], "Sequence")).output()
> {code}
> {quote}
> IndexError: list index out of range
> {quote}
> There should also be more documentation in test functions.
> I am already working on a pull request to fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2063: [FLINK-4002] [py] Improve testing infraestructure

2016-06-06 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2063
  
you can add it to this PR as a separate commit.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4016) FoldApplyWindowFunction is not properly initialized

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317012#comment-15317012
 ] 

ASF GitHub Bot commented on FLINK-4016:
---

Github user rvdwenden commented on the issue:

https://github.com/apache/flink/pull/2070
  
I checked in this change with this commit comment:
[FLINK-3977] initialize FoldApplyWindowFunction properly 

Hopefully that's enough; however, on Travis CI I see that my commit still 
triggers the [FLINK-4016] failure.

Regards, RvdWenden

> On 06 Jun 2016, at 09:46, Aljoscha Krettek  
wrote:
> 
> Hi,
> I wrote in the issue that you opened. This is a duplicate of 
https://issues.apache.org/jira/browse/FLINK-3977 
 and the way to fix it is to 
make InternalWindowFunction properly forward calls to setOutputType. If you 
want, could you please change your fix to that and open a PR against the older 
issue?
> 
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub 
, or mute the 
thread 
.
> 




> FoldApplyWindowFunction is not properly initialized
> ---
>
> Key: FLINK-4016
> URL: https://issues.apache.org/jira/browse/FLINK-4016
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 1.0.3
>Reporter: RWenden
>Priority: Blocker
>  Labels: easyfix
> Fix For: 1.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> FoldApplyWindowFunction's output type is not set.
> We're using constructions like (excerpt):
>   .keyBy(0)
>   .countWindow(10, 5)
>   .fold(...)
> Running this stream gives a runtime exception in FoldApplyWindowFunction:
> "No initial value was serialized for the fold window function. Probably the 
> setOutputType method was not called."
> This can be easily fixed in WindowedStream.java by (around line# 449):
> FoldApplyWindowFunction foldApplyWindowFunction = new 
> FoldApplyWindowFunction<>(initialValue, foldFunction, function);
> foldApplyWindowFunction.setOutputType(resultType, 
> input.getExecutionConfig());
> operator = new EvictingWindowOperator<>(windowAssigner,
> 
> windowAssigner.getWindowSerializer(getExecutionEnvironment().getConfig()),
> keySel,
> 
> input.getKeyType().createSerializer(getExecutionEnvironment().getConfig()),
> stateDesc,
> new 
> InternalIterableWindowFunction<>(foldApplyWindowFunction),
> trigger,
> evictor);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2070: [FLINK-4016] initialize FoldApplyWindowFunction properly

2016-06-06 Thread rvdwenden
Github user rvdwenden commented on the issue:

https://github.com/apache/flink/pull/2070
  
I checked in this change with this commit comment:
[FLINK-3977] initialize FoldApplyWindowFunction properly 

Hopefully that's enough; however, on Travis CI I see that my commit still 
triggers the [FLINK-4016] failure.

Regards, RvdWenden

> On 06 Jun 2016, at 09:46, Aljoscha Krettek  
wrote:
> 
> Hi,
> I wrote in the issue that you opened. This is a duplicate of 
https://issues.apache.org/jira/browse/FLINK-3977 
 and the way to fix it is to 
make InternalWindowFunction properly forward calls to setOutputType. If you 
want, could you please change your fix to that and open a PR against the older 
issue?
> 
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub 
, or mute the 
thread 
.
> 




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2063: [FLINK-4002] [py] Improve testing infraestructure

2016-06-06 Thread omaralvarez
Github user omaralvarez commented on the issue:

https://github.com/apache/flink/pull/2063
  
I can confirm that there is a bug in `PythonPlanBinder.java`. When you pass 
packages to the `runPlan()` function, they are not copied to the temp directory 
in which the `plan.py` file is generated. I manually changed the `prepareFiles()` 
call to copy the package files, and everything worked fine. 

Should I open another issue, or just fix it here? This is not only a bug 
related to the testing infrastructure, but to the Python API in general. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4002) [py] Improve testing infraestructure

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317000#comment-15317000
 ] 

ASF GitHub Bot commented on FLINK-4002:
---

Github user omaralvarez commented on the issue:

https://github.com/apache/flink/pull/2063
  
I can confirm that there is a bug in `PythonPlanBinder.java`. When you pass 
packages to the `runPlan()` function, they are not copied to the temp directory 
in which the `plan.py` file is generated. I manually changed the `prepareFiles()` 
call to copy the package files, and everything worked fine. 

Should I open another issue, or just fix it here? This is not only a bug 
related to the testing infrastructure, but to the Python API in general. 


> [py] Improve testing infraestructure
> 
>
> Key: FLINK-4002
> URL: https://issues.apache.org/jira/browse/FLINK-4002
> Project: Flink
>  Issue Type: Bug
>  Components: Python API
>Affects Versions: 1.0.3
>Reporter: Omar Alvarez
>Priority: Minor
>  Labels: Python, Testing
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The Verify() test function errors out when array elements are missing:
> {code}
> env.generate_sequence(1, 5)\
>  .map(Id()).map_partition(Verify([1,2,3,4], "Sequence")).output()
> {code}
> {quote}
> IndexError: list index out of range
> {quote}
> There should also be more documentation in test functions.
> I am already working on a pull request to fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4026) Fix code, grammar, and link issues in the Streaming documentation

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316643#comment-15316643
 ] 

ASF GitHub Bot commented on FLINK-4026:
---

GitHub user dyanarose opened a pull request:

https://github.com/apache/flink/pull/2077

[FLINK-4026] Fix code, grammar, and link issues in the Streaming 
documentation

* fixing grammar issues with the streaming API section of the documentation 
that make it hard to follow in places. 
* fixing an incorrect code example, and places of unnecessary parentheses 
on the Windows page
* adding a missing link for Kineses Streams on the Connectors index page.
* correcting the nav position of the Pre-defined Timestamp Extractors / 
Watermark Emitters page

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dyanarose/flink FLINK-4026

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2077.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2077


commit 9c561fb4825d202cefbe382df36abce7a6d825b3
Author: Dyana Rose 
Date:   2016-05-18T21:14:15Z

[FLINK-4026] Fix code, grammar, and link issues in the Streaming 
documentation




> Fix code, grammar, and link issues in the Streaming documentation
> -
>
> Key: FLINK-4026
> URL: https://issues.apache.org/jira/browse/FLINK-4026
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Dyana Rose
>Priority: Trivial
>
> The streaming API section of the documentation has issues with grammar that 
> make it hard to follow in places. As well as an incorrect code example, and 
> places of unnecessary parentheses on the Windows page, and a missing link for 
> Kineses Streams on the Connectors index page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3777) Add open and close methods to manage IF lifecycle

2016-06-06 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316599#comment-15316599
 ] 

Chesnay Schepler commented on FLINK-3777:
-

small addition: there's some work being done regarding InputFormats in the 
Python API that also leverage these new methods.

> Add open and close methods to manage IF lifecycle
> -
>
> Key: FLINK-3777
> URL: https://issues.apache.org/jira/browse/FLINK-3777
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>  Labels: inputformat, lifecycle
>
> At the moment the opening and closing of an inputFormat are not managed, 
> although open() could be (improperly IMHO) simulated by configure().
> This limits the possibility to reuse expensive resources (like database 
> connections) and manage their release. 
> Probably the best option would be to add 2 methods (i.e. openInputformat() 
> and closeInputFormat() ) to RichInputFormat*
> * NOTE: the best option from a "semantic" point of view would be to rename 
> the current open() and close() to openSplit() and closeSplit() respectively 
> while using open() and close() methods for the IF lifecycle management, but 
> this would cause a backward compatibility issue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-06-06 Thread Subhankar Biswas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subhankar Biswas updated FLINK-3769:

Comment: was deleted

(was: IMO the user should be able to build a different configuration for the 
queue on both the sink and the source side. So if we can build some 
configuration for the queue, like this connection configuration:
https://github.com/subhankarb/flink/blob/670e92d0731652563fd58631107e0f19c5d5d9a1/flink-streaming-connectors/flink-connector-rabbitmq/src/main/java/org/apache/flink/streaming/connectors/rabbitmq/common/RMQConnectionConfig.java
and then pass it to the sink and source and declare the queue based on this 
config, users will have more options for passing configuration.
I will start work on this once https://issues.apache.org/jira/browse/FLINK-3763 
gets merged.)

> RabbitMQ Sink ability to publish to a different exchange
> 
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>Assignee: Subhankar Biswas
>  Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This 
> exchange is a direct exchange, so the routing key will route directly to the 
> queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job 
> to 1 exchange which routes to 1 queue). Additionally, if a user decides to 
> use a different exchange, I think the following can be assumed:
> 1.) The provided exchange exists
> 2.) The user has declared the appropriate mapping and the appropriate queues 
> exist in RabbitMQ (therefore, nothing needs to be created)
> RabbitMQ currently provides four types of exchanges. Three of these will be 
> covered by just enabling exchanges (Direct, Fanout, Topic) because they use 
> the routing key (or nothing). 
> The fourth exchange type relies on the message headers, which are currently 
> set to null by default on the publish. These headers may be set on a 
> per-message level, so the input of this stream will need to take this as 
> input as well. This fourth exchange could very well be outside the scope of 
> this Improvement, and a "RabbitMQ Sink enable headers" Improvement might be 
> the better way to go.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html
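
For reference, publishing to a named (non-default) exchange with the RabbitMQ 
Java client looks like the sketch below; the exchange name and routing key are 
illustrative, and per the assumptions above the exchange and its bindings 
already exist:

{code}
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class ExchangePublishSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // illustrative host
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        // The current sink effectively publishes to the default ("") direct
        // exchange with null properties; a named exchange plus routing key
        // enables Direct/Fanout/Topic routing as described above.
        channel.basicPublish("myTopicExchange", "sensor.temperature",
                null /* AMQP.BasicProperties, incl. headers */, "42.0".getBytes());

        channel.close();
        connection.close();
    }
}
{code}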



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4026) Fix code, grammar, and link issues in the Streaming documentation

2016-06-06 Thread Dyana Rose (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316665#comment-15316665
 ] 

Dyana Rose commented on FLINK-4026:
---

oh the indignity, I fixed my own misspellings in the PR message, but the bot 
was too fast for me.

> Fix code, grammar, and link issues in the Streaming documentation
> -
>
> Key: FLINK-4026
> URL: https://issues.apache.org/jira/browse/FLINK-4026
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Dyana Rose
>Priority: Trivial
>
> The streaming API section of the documentation has issues with grammar that 
> make it hard to follow in places, as well as an incorrect code example, 
> unnecessary parentheses in places on the Windows page, and a missing link for 
> Kineses Streams on the Connectors index page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-4026) Fix code, grammar, and link issues in the Streaming documentation

2016-06-06 Thread Dyana Rose (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dyana Rose updated FLINK-4026:
--
Description: The streaming API section of the documentation has issues with 
grammar that make it hard to follow in places, as well as an incorrect code 
example, unnecessary parentheses in places on the Windows page, and a missing 
link for Kinesis Streams on the Connectors index page.  (was: The streaming API 
section of the documentation has issues with grammar that make it hard to 
follow in places, as well as an incorrect code example, unnecessary parentheses 
in places on the Windows page, and a missing link for Kineses Streams on the 
Connectors index page.)

> Fix code, grammar, and link issues in the Streaming documentation
> -
>
> Key: FLINK-4026
> URL: https://issues.apache.org/jira/browse/FLINK-4026
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Dyana Rose
>Priority: Trivial
>
> The streaming API section of the documentation has issues with grammar that 
> make it hard to follow in places, as well as an incorrect code example, 
> unnecessary parentheses in places on the Windows page, and a missing link for 
> Kinesis Streams on the Connectors index page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2077: [FLINK-4026] Fix code, grammar, and link issues in...

2016-06-06 Thread dyanarose
GitHub user dyanarose opened a pull request:

https://github.com/apache/flink/pull/2077

[FLINK-4026] Fix code, grammar, and link issues in the Streaming 
documentation

* fixing grammar issues with the streaming API section of the documentation 
that make it hard to follow in places. 
* fixing an incorrect code example, and places of unnecessary parentheses 
on the Windows page
* adding a missing link for Kineses Streams on the Connectors index page.
* correcting the nav position of the Pre-defined Timestamp Extractors / 
Watermark Emitters page

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dyanarose/flink FLINK-4026

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2077.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2077


commit 9c561fb4825d202cefbe382df36abce7a6d825b3
Author: Dyana Rose 
Date:   2016-05-18T21:14:15Z

[FLINK-4026] Fix code, grammar, and link issues in the Streaming 
documentation




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-06 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316636#comment-15316636
 ] 

Till Rohrmann commented on FLINK-2733:
--

The problem seems to be a connection loss to the {{ZooKeeper}} testing server. 
Two Curator clients lose their connection to the testing server. I suspect that 
the testing listener is one of the affected components. As a consequence, the 
testing listener is no longer notified about the leader change and the test 
fails.

I assume that this has something to do with the Travis instances and their 
resource consumption, since I couldn't reproduce the problem locally. I propose 
to decrease the number of concurrently connected instances and to increase the 
connection timeout in order to harden the test case.
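
For illustration, a hardened client setup along those lines, assuming Curator's 
test utilities; the concrete timeout and retry values are placeholders:

{code}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.curator.test.TestingServer;

public final class HardenedZooKeeperClientSketch {
    public static void main(String[] args) throws Exception {
        TestingServer testingServer = new TestingServer();

        // Generous timeouts plus retries, so slow Travis machines do not
        // immediately translate into lost connections and missed notifications.
        CuratorFramework client = CuratorFrameworkFactory.builder()
                .connectString(testingServer.getConnectString())
                .sessionTimeoutMs(60000)
                .connectionTimeoutMs(30000)
                .retryPolicy(new ExponentialBackoffRetry(1000, 10))
                .build();
        client.start();
        try {
            client.create().forPath("/probe"); // simple liveness check
        } finally {
            client.close();
            testingServer.close();
        }
    }
}
{code}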

> ZooKeeperLeaderElectionTest.testZooKeeperReelection fails
> -
>
> Key: FLINK-2733
> URL: https://issues.apache.org/jira/browse/FLINK-2733
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>  Labels: test-stability
>
> I observed a test failure in this run: 
> https://travis-ci.org/rmetzger/flink/jobs/81571914
> {code}
> testZooKeeperReelection(org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest)
>   Time elapsed: 109.794 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but 
> was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest.testZooKeeperReelection(ZooKeeperLeaderElectionTest.java:171)
> Results :
> Failed tests: 
>   ZooKeeperLeaderElectionTest.testZooKeeperReelection:171 
> expected: but was:
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4026) Fix code, grammar, and link issues in the Streaming documentation

2016-06-06 Thread Dyana Rose (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316631#comment-15316631
 ] 

Dyana Rose commented on FLINK-4026:
---

I've a branch for this that I'll PR shortly

> Fix code, grammar, and link issues in the Streaming documentation
> -
>
> Key: FLINK-4026
> URL: https://issues.apache.org/jira/browse/FLINK-4026
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Dyana Rose
>Priority: Trivial
>
> The streaming API section of the documentation has issues with grammar that 
> make it hard to follow in places, as well as an incorrect code example, 
> unnecessary parentheses in places on the Windows page, and a missing link for 
> Kineses Streams on the Connectors index page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4026) Fix code, grammar, and link issues in the Streaming documentation

2016-06-06 Thread Dyana Rose (JIRA)
Dyana Rose created FLINK-4026:
-

 Summary: Fix code, grammar, and link issues in the Streaming 
documentation
 Key: FLINK-4026
 URL: https://issues.apache.org/jira/browse/FLINK-4026
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Dyana Rose
Priority: Trivial


The streaming API section of the documentation has issues with grammar that 
make it hard to follow in places, as well as an incorrect code example, 
unnecessary parentheses in places on the Windows page, and a missing link for 
Kineses Streams on the Connectors index page.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4025) Add possibility for the RMQ Streaming Source to customize the queue

2016-06-06 Thread Dominik Bruhn (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316571#comment-15316571
 ] 

Dominik Bruhn commented on FLINK-4025:
--

No, actually they address different problems, although one PR would need to be 
modified if the other one were merged:

FLINK-3763 makes the currently existing configuration parameters more easily 
accessible via a dedicated class.
My PR adds the possibility to customize what happens when the RMQSource creates 
the queue. That is still not possible with the mentioned FLINK-3763.
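
A rough sketch of what a custom {{setupQueue}} override could do, where 
{{setupQueue}} is the hook proposed here and the channel/queue handles are 
assumed to be exposed by the source; the exchange name and argument values are 
illustrative:

{code}
import com.rabbitmq.client.Channel;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Sketch of an overridden setupQueue(): the RabbitMQ client calls are standard,
// the surrounding hook and handles are what the PR proposes to expose.
public final class QueueSetupSketch {

    static void setupQueue(Channel channel, String queueName) throws IOException {
        Map<String, Object> arguments = new HashMap<>();
        arguments.put("x-message-ttl", 60000); // keep messages for 60s, e.g. across a Flink restart
        channel.queueDeclare(queueName, true, false, false, arguments);
        channel.queueBind(queueName, "myExchange", "myRoutingKey"); // optional: bind to an exchange afterwards
    }
}
{code}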

> Add possibility for the RMQ Streaming Source to customize the queue
> --
>
> Key: FLINK-4025
> URL: https://issues.apache.org/jira/browse/FLINK-4025
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.2
>Reporter: Dominik Bruhn
>
> This patch adds the possibility for the user of the RabbitMQ
> Streaming Connector to customize the queue which is used. There
> are use-cases in which you want to set custom parameters for the
> queue (e.g. a TTL for the messages if Flink reboots) or the
> possibility to bind the queue to an exchange afterwards.
> The commit doesn't change the actual behaviour but makes it
> possible for users to override the newly created `setupQueue`
> method and customize their implementation. This was not possible
> before.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-06 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316557#comment-15316557
 ] 

Till Rohrmann commented on FLINK-2733:
--

Another instance: 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/134215896/log.txt

> ZooKeeperLeaderElectionTest.testZooKeeperReelection fails
> -
>
> Key: FLINK-2733
> URL: https://issues.apache.org/jira/browse/FLINK-2733
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>  Labels: test-stability
>
> I observed a test failure in this run: 
> https://travis-ci.org/rmetzger/flink/jobs/81571914
> {code}
> testZooKeeperReelection(org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest)
>   Time elapsed: 109.794 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but 
> was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest.testZooKeeperReelection(ZooKeeperLeaderElectionTest.java:171)
> Results :
> Failed tests: 
>   ZooKeeperLeaderElectionTest.testZooKeeperReelection:171 
> expected: but was:
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-06 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-2733:


Assignee: Till Rohrmann

> ZooKeeperLeaderElectionTest.testZooKeeperReelection fails
> -
>
> Key: FLINK-2733
> URL: https://issues.apache.org/jira/browse/FLINK-2733
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Robert Metzger
>Assignee: Till Rohrmann
>  Labels: test-stability
>
> I observed a test failure in this run: 
> https://travis-ci.org/rmetzger/flink/jobs/81571914
> {code}
> testZooKeeperReelection(org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest)
>   Time elapsed: 109.794 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but 
> was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest.testZooKeeperReelection(ZooKeeperLeaderElectionTest.java:171)
> Results :
> Failed tests: 
>   ZooKeeperLeaderElectionTest.testZooKeeperReelection:171 
> expected: but was:
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2076: [FLINK-2832] [tests] Hardens RandomSamplerTest

2016-06-06 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2076

[FLINK-2832] [tests] Hardens RandomSamplerTest

Tighten the level of significance from p=0.01 to p=0.001 and add retry 
annotations to the random sampler tests. This should decrease the probability 
of spurious random sampler test failures.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixRandomSamplerTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2076.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2076


commit 3bf500eccd41e4094cd7951954bf44de845cb3ce
Author: Till Rohrmann 
Date:   2016-06-06T14:19:30Z

[FLINK-2832] [tests] Hardens RandomSamplerTest

Increase the level of significance from p=0.01 to p=0.001 and add retry 
annotations to random sampler tests. This should decrease the probability 
of failing random sampler tests.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2832) Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316541#comment-15316541
 ] 

ASF GitHub Bot commented on FLINK-2832:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2076

[FLINK-2832] [tests] Hardens RandomSamplerTest

Tighten the level of significance from p=0.01 to p=0.001 and add retry 
annotations to the random sampler tests. This should decrease the probability 
of spurious random sampler test failures.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixRandomSamplerTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2076.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2076


commit 3bf500eccd41e4094cd7951954bf44de845cb3ce
Author: Till Rohrmann 
Date:   2016-06-06T14:19:30Z

[FLINK-2832] [tests] Hardens RandomSamplerTest

Increase the level of significance from p=0.01 to p=0.001 and add retry 
annotations to random sampler tests. This should decrease the probability 
of failing random sampler tests.




> Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement
> ---
>
> Key: FLINK-2832
> URL: https://issues.apache.org/jira/browse/FLINK-2832
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.10.0
>Reporter: Vasia Kalavri
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.133 sec 
> <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest
> testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest)
>   Time elapsed: 2.534 sec  <<< FAILURE!
> java.lang.AssertionError: KS test result with p value(0.11), d 
> value(0.103090)
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:192)
> Results :
> Failed tests: 
>   
> RandomSamplerTest.testReservoirSamplerWithReplacement:192->verifyReservoirSamplerWithReplacement:289->verifyRandomSamplerWithSampleSize:330->verifyKSTest:342
>  KS test result with p value(0.11), d value(0.103090)
> Full log [here|https://travis-ci.org/apache/flink/jobs/84120131].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-2832) Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement

2016-06-06 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reassigned FLINK-2832:


Assignee: Till Rohrmann

> Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement
> ---
>
> Key: FLINK-2832
> URL: https://issues.apache.org/jira/browse/FLINK-2832
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.10.0
>Reporter: Vasia Kalavri
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.133 sec 
> <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest
> testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest)
>   Time elapsed: 2.534 sec  <<< FAILURE!
> java.lang.AssertionError: KS test result with p value(0.11), d 
> value(0.103090)
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:192)
> Results :
> Failed tests: 
>   
> RandomSamplerTest.testReservoirSamplerWithReplacement:192->verifyReservoirSamplerWithReplacement:289->verifyRandomSamplerWithSampleSize:330->verifyKSTest:342
>  KS test result with p value(0.11), d value(0.103090)
> Full log [here|https://travis-ci.org/apache/flink/jobs/84120131].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2832) Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement

2016-06-06 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316536#comment-15316536
 ] 

Till Rohrmann commented on FLINK-2832:
--

Here is another instance of the failure (this time the 
{{testReservoirSamplerWithMultiSourcePartitions1}} test failed): 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/134215943/log.txt

The problem seems to be that the testing condition is not deterministic. Every 
so many test runs, the Kolmogorov-Smirnov test simply fails by chance, because 
the level of significance is not zero.

I would propose to set the level of significance to {{0.001}} instead of 
{{0.01}} and to add a retry annotation to the sampler tests. This should harden 
the test stability.
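
A sketch of what such a hardened test could look like, assuming Flink's 
{{RetryRule}}/{{RetryOnFailure}} test utilities; the random p-value below is 
only a stand-in for the real Kolmogorov-Smirnov result:

{code}
import org.apache.flink.testutils.junit.RetryOnFailure;
import org.apache.flink.testutils.junit.RetryRule;
import org.junit.Rule;
import org.junit.Test;

import java.util.Random;

import static org.junit.Assert.assertTrue;

public class RetriedStatisticalTestSketch {

    @Rule
    public final RetryRule retryRule = new RetryRule();

    // With a per-run false-failure probability of 0.001 and three attempts,
    // a spurious overall failure needs three independent misses (~1e-9).
    @Test
    @RetryOnFailure(times = 3)
    public void statisticalProperty() {
        double pValue = new Random().nextDouble(); // stand-in for the KS test result
        assertTrue("KS test failed at significance level 0.001", pValue > 0.001);
    }
}
{code}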

> Failing test: RandomSamplerTest.testReservoirSamplerWithReplacement
> ---
>
> Key: FLINK-2832
> URL: https://issues.apache.org/jira/browse/FLINK-2832
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.10.0
>Reporter: Vasia Kalavri
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.133 sec 
> <<< FAILURE! - in org.apache.flink.api.java.sampling.RandomSamplerTest
> testReservoirSamplerWithReplacement(org.apache.flink.api.java.sampling.RandomSamplerTest)
>   Time elapsed: 2.534 sec  <<< FAILURE!
> java.lang.AssertionError: KS test result with p value(0.11), d 
> value(0.103090)
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyKSTest(RandomSamplerTest.java:342)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyRandomSamplerWithSampleSize(RandomSamplerTest.java:330)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.verifyReservoirSamplerWithReplacement(RandomSamplerTest.java:289)
>   at 
> org.apache.flink.api.java.sampling.RandomSamplerTest.testReservoirSamplerWithReplacement(RandomSamplerTest.java:192)
> Results :
> Failed tests: 
>   
> RandomSamplerTest.testReservoirSamplerWithReplacement:192->verifyReservoirSamplerWithReplacement:289->verifyRandomSamplerWithSampleSize:330->verifyKSTest:342
>  KS test result with p value(0.11), d value(0.103090)
> Full log [here|https://travis-ci.org/apache/flink/jobs/84120131].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2020: [FLINK-2314] Make Streaming File Sources Persisten...

2016-06-06 Thread kl0u
GitHub user kl0u reopened a pull request:

https://github.com/apache/flink/pull/2020

[FLINK-2314] Make Streaming File Sources Persistent

This PR solves FLINK-2314 and combines a number of sub-tasks. In addition, 
it solves FLINK-3896 which was introduced as part of this task.

The way File Input sources are now processed is the following:
 * One task monitors (parallelism 1) a user-specified path for new 
files/data
 * The above task assigns FileInputSplits to downstream (parallel) 
readers to actually read the data

The monitoring entity scans the path, splits the files to be processed into 
input splits, and assigns them downstream. For now, two modes are supported: 
PROCESS_ONCE, which just processes the current contents of the path and exits, 
and REPROCESS_WITH_APPENDED, which periodically monitors the path and 
reprocesses new files and (the entire contents of) files with new data.

In addition, these sources are checkpointed, i.e. in the case of a task 
failure the job will resume from where it left off.

Finally, some changes were introduced in the way we are handling 
FileInputFormats after discussions with @aljoscha .
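
To make the two modes concrete, a rough usage sketch against a `readFile`-style 
entry point; the mode and method names follow the PR text and may differ in the 
final API, and the HDFS path is a placeholder:

```java
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

public class MonitoredFileSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        TextInputFormat format = new TextInputFormat(new Path("hdfs:///input"));

        // One parallelism-1 monitor assigns splits; parallel readers consume them.
        DataStream<String> lines = env.readFile(
                format,
                "hdfs:///input",
                FileProcessingMode.REPROCESS_WITH_APPENDED, // or PROCESS_ONCE
                10000L);                                    // scan interval in ms

        lines.print();
        env.execute("monitored file source");
    }
}
```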

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kl0u/flink api_ft_files

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2020.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2020


commit 457727b4b2c7bfad914ad9876dd4135355de732f
Author: kl0u 
Date:   2016-04-10T14:56:42Z

[FLINK-3717] Make FileInputFormat checkpointable

This adds a new interface called CheckpointableInputFormat
which describes input formats whose state is queryable,
i.e. getCurrentChannelState() returns where the reader is
in the underlying source, and they can resume reading from
a user-specified position.

This functionality is not yet leveraged by current readers.

commit edac9fea48200e62fc38b96926d9254c93830499
Author: kl0u 
Date:   2016-04-18T14:37:54Z

[FLINK-3889] Refactor File Monitoring Source

This is meant to replace the different file
reading sources in Flink streaming. Now there is
one monitoring source with DOP 1 monitoring a
directory and assigning input split to downstream
readers.

In addition, it makes the new features added by
FLINK-3717 work together with the aforementioned entities
(the monitor and the readers) in order to have
fault tolerant file sources and exactly once guarantees.

This does not replace the old API calls. This
will be done in a future commit.

commit d343d1143514e97b2ce9acabcbcc2fdaf2f89814
Author: kl0u 
Date:   2016-05-10T16:56:58Z

[FLINK-3896] Allow a StreamTask to be Externally Cancelled

It adds a method failExternally() to the StreamTask, so that custom 
Operators
can make their containing task fail when needed.

commit 966244c703012e8674e9786a033f7d779ceb6f73
Author: kl0u 
Date:   2016-05-18T14:44:45Z

[FLINK-2314] Make Streaming File Sources Persistent

This commit takes the changes from the previous
commits and wires them into the API, both Java and Scala.

While doing so, some changes were introduced to the
classes actually doing the work, either as bug fixes, or
as new design choices.

commit f17b5318fb84b3111ac8407ef11e719c1fb9b360
Author: kl0u 
Date:   2016-05-27T11:56:44Z

Integrating the PR Comments.

commit 66e1423ae512f60c955113d1fd564e50663d4ea2
Author: kl0u 
Date:   2016-05-31T23:07:20Z

Final comments

commit 61d2a1a9cc75f0c698e0bca9c1e6adb612336f01
Author: kl0u 
Date:   2016-06-01T12:45:46Z

Final Commnents.

commit e1dac4b506d470299bd9504c344797910394fe59
Author: kl0u 
Date:   2016-06-06T09:19:29Z

Fixing broken test




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2314) Make Streaming File Sources Persistent

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316529#comment-15316529
 ] 

ASF GitHub Bot commented on FLINK-2314:
---

GitHub user kl0u reopened a pull request:

https://github.com/apache/flink/pull/2020

[FLINK-2314] Make Streaming File Sources Persistent

This PR solves FLINK-2314 and combines a number of sub-tasks. In addition, 
it solves FLINK-3896 which was introduced as part of this task.

The way File Input sources are now processed is the following:
 * One task monitors (parallelism 1) a user-specified path for new 
files/data
 * The above task assigns FileInputSplits to downstream (parallel) 
readers to actually read the data

The monitoring entity scans the path, splits the files to be processed into 
input splits, and assigns them downstream. For now, two modes are supported: 
PROCESS_ONCE, which just processes the current contents of the path and exits, 
and REPROCESS_WITH_APPENDED, which periodically monitors the path and 
reprocesses new files and (the entire contents of) files with new data.

In addition, these sources are checkpointed, i.e. in the case of a task 
failure the job will resume from where it left off.

Finally, some changes were introduced in the way we are handling 
FileInputFormats after discussions with @aljoscha .

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kl0u/flink api_ft_files

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2020.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2020


commit 457727b4b2c7bfad914ad9876dd4135355de732f
Author: kl0u 
Date:   2016-04-10T14:56:42Z

[FLINK-3717] Make FileInputFormat checkpointable

This adds a new interface called CheckpointableInputFormat
which describes input formats whose state is queryable,
i.e. getCurrentChannelState() returns where the reader is
in the underlying source, and they can resume reading from
a user-specified position.

This functionality is not yet leveraged by current readers.

commit edac9fea48200e62fc38b96926d9254c93830499
Author: kl0u 
Date:   2016-04-18T14:37:54Z

[FLINK-3889] Refactor File Monitoring Source

This is meant to replace the different file
reading sources in Flink streaming. Now there is
one monitoring source with DOP 1 monitoring a
directory and assigning input split to downstream
readers.

In addition, it makes the new features added by
FLINK-3717 work together with the aforementioned entities
(the monitor and the readers) in order to have
fault tolerant file sources and exactly once guarantees.

This does not replace the old API calls. This
will be done in a future commit.

commit d343d1143514e97b2ce9acabcbcc2fdaf2f89814
Author: kl0u 
Date:   2016-05-10T16:56:58Z

[FLINK-3896] Allow a StreamTask to be Externally Cancelled

It adds a method failExternally() to the StreamTask, so that custom 
Operators
can make their containing task fail when needed.

commit 966244c703012e8674e9786a033f7d779ceb6f73
Author: kl0u 
Date:   2016-05-18T14:44:45Z

[FLINK-2314] Make Streaming File Sources Persistent

This commit takes the changes from the previous
commits and wires them into the API, both Java and Scala.

While doing so, some changes were introduced to the
classes actually doing the work, either as bug fixes, or
as new design choices.

commit f17b5318fb84b3111ac8407ef11e719c1fb9b360
Author: kl0u 
Date:   2016-05-27T11:56:44Z

Integrating the PR Comments.

commit 66e1423ae512f60c955113d1fd564e50663d4ea2
Author: kl0u 
Date:   2016-05-31T23:07:20Z

Final comments

commit 61d2a1a9cc75f0c698e0bca9c1e6adb612336f01
Author: kl0u 
Date:   2016-06-01T12:45:46Z

Final Commnents.

commit e1dac4b506d470299bd9504c344797910394fe59
Author: kl0u 
Date:   2016-06-06T09:19:29Z

Fixing broken test




> Make Streaming File Sources Persistent
> --
>
> Key: FLINK-2314
> URL: https://issues.apache.org/jira/browse/FLINK-2314
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Kostas Kloudas
>
> Streaming File sources should participate in the checkpointing. They should 
> track the bytes they read from the file and checkpoint it.
> One can look at the sequence generating source function for an example of a 
> checkpointed source.

[GitHub] flink pull request #2020: [FLINK-2314] Make Streaming File Sources Persisten...

2016-06-06 Thread kl0u
Github user kl0u closed the pull request at:

https://github.com/apache/flink/pull/2020


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2314) Make Streaming File Sources Persistent

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316527#comment-15316527
 ] 

ASF GitHub Bot commented on FLINK-2314:
---

Github user kl0u closed the pull request at:

https://github.com/apache/flink/pull/2020


> Make Streaming File Sources Persistent
> --
>
> Key: FLINK-2314
> URL: https://issues.apache.org/jira/browse/FLINK-2314
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Kostas Kloudas
>
> Streaming File sources should participate in the checkpointing. They should 
> track the bytes they read from the file and checkpoint it.
> One can look at the sequence generating source function for an example of a 
> checkpointed source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-3960) Disable, fix and re-enable EventTimeWindowCheckpointingITCase

2016-06-06 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek resolved FLINK-3960.
-
Resolution: Fixed

Reverted in 
https://github.com/apache/flink/commit/cfffdc87e3a3ac8aa7e33db87223df7bb7c8aef9

> Disable, fix and re-enable EventTimeWindowCheckpointingITCase
> -
>
> Key: FLINK-3960
> URL: https://issues.apache.org/jira/browse/FLINK-3960
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Aljoscha Krettek
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> As a follow-up issue of FLINK-3909, our tests fail with the following. I 
> believe [~aljoscha] is working on a fix.
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7fae0c62a264, pid=72720, tid=140385528268544
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_76-b13) (build 
> 1.7.0_76-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni78704726610339516..so+0x13c264]  
> rocksdb_iterator_helper(rocksdb::DB*, rocksdb::ReadOptions, 
> rocksdb::ColumnFamilyHandle*)+0x4
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/travis/build/mxm/flink/flink-tests/target/hs_err_pid72720.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> Aborted (core dumped)
> {noformat}
> I propose to disable the test case in the meantime because it is blocking our 
> test execution which we need for pull requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-3948) EventTimeWindowCheckpointingITCase Fails with Core Dump

2016-06-06 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek resolved FLINK-3948.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed in 
https://github.com/apache/flink/commit/61d69a229b40e52460f26804e4a36cf12e150004

> EventTimeWindowCheckpointingITCase Fails with Core Dump
> ---
>
> Key: FLINK-3948
> URL: https://issues.apache.org/jira/browse/FLINK-3948
> Project: Flink
>  Issue Type: Bug
>  Components: state backends
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Critical
> Fix For: 1.1.0
>
>
> It fails because of a core dump in RocksDB. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4025) Add possibility for the RMQ Streaming Source to customize the queue

2016-06-06 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316384#comment-15316384
 ] 

Robert Metzger commented on FLINK-4025:
---

I guess this issue is subsumed by / duplicated by FLINK-3763?

> Add possibility for the RMQ Streaming Source to customize the queue
> --
>
> Key: FLINK-4025
> URL: https://issues.apache.org/jira/browse/FLINK-4025
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.2
>Reporter: Dominik Bruhn
>
> This patch adds the possibility for the user of the RabbitMQ
> Streaming Connector to customize the queue which is used. There
> are use-cases in which you want to set custom parameters for the
> queue (e.g. a TTL for the messages if Flink reboots) or the
> possibility to bind the queue to an exchange afterwards.
> The commit doesn't change the actual behaviour but makes it
> possible for users to override the newly created `setupQueue`
> method and customize their implementation. This was not possible
> before.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-4025) Add possibility for the RMQ Streaming Source to customize the queue

2016-06-06 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316384#comment-15316384
 ] 

Robert Metzger edited comment on FLINK-4025 at 6/6/16 11:51 AM:


I guess this issue is subsumed / duplicated by FLINK-3763?


was (Author: rmetzger):
I guess this issue is subsumed by / duplicated by FLINK-3763?

> Add possibility for the RMQ Streaming Source to customize the queue
> --
>
> Key: FLINK-4025
> URL: https://issues.apache.org/jira/browse/FLINK-4025
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.2
>Reporter: Dominik Bruhn
>
> This patch adds the possibility for the user of the RabbitMQ
> Streaming Connector to customize the queue which is used. There
> are use-cases in which you want to set custom parameters for the
> queue (e.g. a TTL for the messages if Flink reboots) or the
> possibility to bind the queue to an exchange afterwards.
> The commit doesn't change the actual behaviour but makes it
> possible for users to override the newly created `setupQueue`
> method and customize their implementation. This was not possible
> before.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #1856: FLINK-3650 Add maxBy/minBy to Scala DataSet API

2016-06-06 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/1856
  
@zentol , @tillrohrmann , @fhueske 
Any chance of a review here? Sorry for the regular reminders :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3650) Add maxBy/minBy to Scala DataSet API

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316380#comment-15316380
 ] 

ASF GitHub Bot commented on FLINK-3650:
---

Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/1856
  
@zentol , @tillrohrmann , @fhueske 
Any chance of a review here? Sorry for the regular reminders :)


> Add maxBy/minBy to Scala DataSet API
> 
>
> Key: FLINK-3650
> URL: https://issues.apache.org/jira/browse/FLINK-3650
> Project: Flink
>  Issue Type: Improvement
>  Components: Java API, Scala API
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: ramkrishna.s.vasudevan
>
> The stable Java DataSet API contains the API calls {{maxBy}} and {{minBy}}. 
> These methods are not supported by the Scala DataSet API. These methods 
> should be added in order to have a consistent API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2066: Updated ssh configuration in base Dockerfile

2016-06-06 Thread techmaniack
Github user techmaniack commented on a diff in the pull request:

https://github.com/apache/flink/pull/2066#discussion_r65873361
  
--- Diff: flink-contrib/docker-flink/base/Dockerfile ---
@@ -38,12 +38,12 @@ ENV JAVA_HOME /usr/java/default/
 RUN echo 'root:secret' | chpasswd
 
 #SSH as root... probably needs to be revised for security!
-RUN sed -i 's/PermitRootLogin without-password/PermitRootLogin yes/' 
/etc/ssh/sshd_config
--- End diff --

Agreed. It might break the workflow on previous Ubuntu versions.
Changing the sed expression to the following works for precise/trusty/xenial:

`sed -i 's/PermitRootLogin .*/PermitRootLogin yes/' /etc/ssh/sshd_config`

Also, the Flink version is currently hardcoded in a URL in flink/Dockerfile; it 
could be moved to **install-flink.sh**, where the Flink/Scala/Hadoop versions 
can be specified. Something like this:

```
# Versions to install (adjust as needed)
BASE_URL=http://www-us.apache.org/dist/flink
FLINK_VERSION=1.0.3
SCALA_VERSION=2.11
HADOOP_VERSION=27  # Hadoop 2.7.x

FULL_URL="$BASE_URL/flink-$FLINK_VERSION/flink-$FLINK_VERSION-bin-hadoop$HADOOP_VERSION-scala_$SCALA_VERSION.tgz"

# Download and unpack, then create a version-independent symlink
wget -q -O - "$FULL_URL" | tar -zxvf - -C /usr/local/

cd /usr/local && ln -s "./flink-$FLINK_VERSION" flink

```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2074: [FLINK-2985] [Table API] Allow different field names for ...

2016-06-06 Thread gallenvara
Github user gallenvara commented on the issue:

https://github.com/apache/flink/pull/2074
  
@twalthr  @fhueske Can you help with the review? Thanks. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #2074: [FLINK-2985] [Table API] Allow different field nam...

2016-06-06 Thread gallenvara
GitHub user gallenvara opened a pull request:

https://github.com/apache/flink/pull/2074

[FLINK-2985] [Table API] Allow different field names for unionAll() in 
Table API.

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed

It is not necessary that the corresponding columns in each SELECT statement 
have the same name, but they do need to have the same data types. This PR 
allows different field names for union/unionAll in the Table API, as sketched below.
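
A minimal sketch of the intended semantics; `tableEnv`, `ds1`, and `ds2` are 
assumed to be in scope (a Table environment and two DataSets with matching row 
types):

```java
// ds1 and ds2 are assumed to have the same field types, e.g. Tuple2<Integer, String>.
Table left  = tableEnv.fromDataSet(ds1, "id, name");
Table right = tableEnv.fromDataSet(ds2, "key, label"); // different names, same types

// Only the types have to line up; the result uses the left side's
// field names, so 'union' has the fields (id, name).
Table union = left.unionAll(right);
```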



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gallenvara/flink flink-2985

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2074.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2074


commit e84d836efbb290fb79d14e3d3c6c9e13db567b72
Author: gallenvara 
Date:   2016-06-06T08:30:41Z

Allow different field names for union in Table API.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Issue Comment Deleted] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-06-06 Thread Subhankar Biswas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subhankar Biswas updated FLINK-3769:

Comment: was deleted

(was: IMO the user should be able to build a different configuration for the 
queue on both the sink and source side. If we can build a queue configuration 
similar to this connection configuration:
https://github.com/subhankarb/flink/blob/670e92d0731652563fd58631107e0f19c5d5d9a1/flink-streaming-connectors/flink-connector-rabbitmq/src/main/java/org/apache/flink/streaming/connectors/rabbitmq/common/RMQConnectionConfig.java
and then pass it to the sink and source and declare the queue based on this 
config, the user will have more options for passing configuration.
I will start work on this once FLINK-3763 is merged.
)

> RabbitMQ Sink ability to publish to a different exchange
> 
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>Assignee: Subhankar Biswas
>  Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This 
> exchange is a direct exchange, so the routing key will route directly to the 
> queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job 
> to 1 exchange which routes to 1 queue). Additionally, if a user decides to 
> use a different exchange, I think the following can be assumed:
> 1.) The provided exchange exists
> 2.) The user has declared the appropriate mapping and the appropriate queues 
> exist in RabbitMQ (therefore, nothing needs to be created)
> RabbitMQ currently provides four types of exchanges. Three of these will be 
> covered by just enabling exchanges (Direct, Fanout, Topic) because they use 
> the routing key (or nothing). 
> The fourth exchange type relies on the message headers, which are currently 
> set to null by default on the publish. These headers may be set on a 
> per-message basis, so the input of this stream will need to take them as 
> input as well. 
> This fourth exchange could very well be outside the scope of this 
> Improvement, and a "RabbitMQ Sink enable headers" Improvement might be the 
> better way to go with this.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3777) Add open and close methods to manage IF lifecycle

2016-06-06 Thread Flavio Pompermaier (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316320#comment-15316320
 ] 

Flavio Pompermaier commented on FLINK-3777:
---

Hi Stephan,
we decided to introduce the open/close-IF methods to give the possibility, 
where necessary, to properly handle the initialization and destruction of an IF.
I asked for such a feature during the implementation of an efficient JDBC 
connector for a huge table (11 billion rows), where the creation and 
destruction of a JDBC connection became a very expensive operation during the 
job, because a new connection was created millions of times.
In the former implementation, without those methods, I had to write a custom IF 
with a connection pool to overcome this problem.
Since Flink is supposed to be a tool for big data, I thought this was a 
must-have feature rather than a corner case.

BTW, I think you're referring to 
https://issues.apache.org/jira/browse/FLINK-4024. I quickly looked at that code 
and I'm not fully convinced that the main problem is the introduction of those 
two new methods. First of all, the FileSourceFunction actually does not seem to 
be related to files at all; it's something more generic. Second, as I stated at 
the very beginning of this thread, open() and close() actually refer to splits 
and not to the IF (openSplit and closeSplit would help the readability of the 
code). Third, a proper call to open/close-IF not only improves the readability 
of the code but also forces a developer to detect possible bad usage of an IF.

Summarizing, IMHO it's better to fix FLINK-4024 than to revert this whole PR, 
which enhances the flexibility of Flink when dealing with really big data.
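
To illustrate the lifecycle split, a minimal sketch of an input format that 
opens one JDBC connection per task and reuses it across splits; the H2 URL and 
the stubbed split/query logic are placeholders, not the actual JDBC connector:

{code}
import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
import org.apache.flink.api.common.io.RichInputFormat;
import org.apache.flink.api.common.io.statistics.BaseStatistics;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.io.GenericInputSplit;
import org.apache.flink.core.io.InputSplitAssigner;

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

public class SharedConnectionInputFormat extends RichInputFormat<String, GenericInputSplit> {

    private transient Connection connection; // created once per task, shared by all of its splits
    private transient ResultSet rows;

    @Override
    public void openInputFormat() {
        try {
            // The expensive resource is created once per task, not once per split.
            connection = DriverManager.getConnection("jdbc:h2:mem:sketch");
        } catch (SQLException e) {
            throw new RuntimeException("Could not open shared connection", e);
        }
    }

    @Override
    public void closeInputFormat() {
        try {
            if (connection != null) {
                connection.close();
            }
        } catch (SQLException e) {
            throw new RuntimeException("Could not close shared connection", e);
        }
    }

    @Override
    public void open(GenericInputSplit split) throws IOException {
        try {
            // Cheap per-split work only: run the split's query on the shared connection.
            rows = connection.createStatement()
                    .executeQuery("SELECT 'record-of-split-" + split.getSplitNumber() + "'");
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    public boolean reachedEnd() throws IOException {
        try {
            return !rows.next(); // advances the cursor; nextRecord() reads the current row
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    public String nextRecord(String reuse) throws IOException {
        try {
            return rows.getString(1);
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    public void close() throws IOException {
        // Per-split cleanup only; the shared connection stays open for the next split.
    }

    @Override
    public void configure(Configuration parameters) {}

    @Override
    public BaseStatistics getStatistics(BaseStatistics cachedStatistics) {
        return cachedStatistics;
    }

    @Override
    public GenericInputSplit[] createInputSplits(int minNumSplits) {
        GenericInputSplit[] splits = new GenericInputSplit[minNumSplits];
        for (int i = 0; i < minNumSplits; i++) {
            splits[i] = new GenericInputSplit(i, minNumSplits);
        }
        return splits;
    }

    @Override
    public InputSplitAssigner getInputSplitAssigner(GenericInputSplit[] splits) {
        return new DefaultInputSplitAssigner(splits);
    }
}
{code}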

> Add open and close methods to manage IF lifecycle
> -
>
> Key: FLINK-3777
> URL: https://issues.apache.org/jira/browse/FLINK-3777
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>  Labels: inputformat, lifecycle
>
> At the moment the opening and closing of an inputFormat are not managed, 
> although open() could be (improperly IMHO) simulated by configure().
> This limits the possibility to reuse expensive resources (like database 
> connections) and manage their release. 
> Probably the best option would be to add 2 methods (i.e. openInputformat() 
> and closeInputFormat() ) to RichInputFormat*
> * NOTE: the best option from a "semantic" point of view would be to rename 
> the current open() and close() to openSplit() and closeSplit() respectively 
> while using open() and close() methods for the IF lifecycle management, but 
> this would cause a backward compatibility issue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3777) Add open and close methods to manage IF lifecycle

2016-06-06 Thread Flavio Pompermaier (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316318#comment-15316318
 ] 

Flavio Pompermaier commented on FLINK-3777:
---

Hi Stephan,
we decided to introduce the open/close-IF methods to give the possibility, 
where necessary, to properly handle the initialization and destruction of an IF.
I asked for such a feature during the implementation of an efficient JDBC 
connector for a huge table (11 billion rows), where the creation and 
destruction of a JDBC connection became a very expensive operation during the 
job, because a new connection was created millions of times.
In the former implementation, without those methods, I had to write a custom IF 
with a connection pool to overcome this problem.
Since Flink is supposed to be a tool for big data, I thought this was a 
must-have feature rather than a corner case.

BTW, I think you're referring to 
https://issues.apache.org/jira/browse/FLINK-4024. I quickly looked at that code 
and I'm not fully convinced that the main problem is the introduction of those 
two new methods. First of all, the FileSourceFunction actually does not seem to 
be related to files at all; it's something more generic. Second, as I stated at 
the very beginning of this thread, open() and close() actually refer to splits 
and not to the IF (openSplit and closeSplit would help the readability of the 
code). Third, a proper call to open/close-IF not only improves the readability 
of the code but also forces a developer to detect possible bad usage of an IF.

Summarizing, IMHO it's better to fix FLINK-4024 than to revert this whole PR, 
which enhances the flexibility of Flink when dealing with really big data.

> Add open and close methods to manage IF lifecycle
> -
>
> Key: FLINK-3777
> URL: https://issues.apache.org/jira/browse/FLINK-3777
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>  Labels: inputformat, lifecycle
>
> At the moment the opening and closing of an inputFormat are not managed, 
> although open() could be (improperly IMHO) simulated by configure().
> This limits the possibility to reuse expensive resources (like database 
> connections) and manage their release. 
> Probably the best option would be to add 2 methods (i.e. openInputformat() 
> and closeInputFormat() ) to RichInputFormat*
> * NOTE: the best option from a "semantic" point of view would be to rename 
> the current open() and close() to openSplit() and closeSplit() respectively 
> while using open() and close() methods for the IF lifecycle management, but 
> this would cause a backward compatibility issue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3777) Add open and close methods to manage IF lifecycle

2016-06-06 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316302#comment-15316302
 ] 

Chesnay Schepler commented on FLINK-3777:
-

These methods allow sharing resources across splits. This is a very useful 
feature to have when doing anything remotely expensive in open() or close(). 
Dealing with a large number of relatively small splits is one example, as is 
the case for the refactored JDBC IF.

Reverting this most likely also means reverting part of the JDBC changes.

Also, AFAIK we only found a single case where they were not handled correctly.

> Add open and close methods to manage IF lifecycle
> -
>
> Key: FLINK-3777
> URL: https://issues.apache.org/jira/browse/FLINK-3777
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>  Labels: inputformat, lifecycle
>
> At the moment the opening and closing of an inputFormat are not managed, 
> although open() could be (improperly IMHO) simulated by configure().
> This limits the possibility to reuse expensive resources (like database 
> connections) and manage their release. 
> Probably the best option would be to add 2 methods (i.e. openInputformat() 
> and closeInputFormat() ) to RichInputFormat*
> * NOTE: the best option from a "semantic" point of view would be to rename 
> the current open() and close() to openSplit() and closeSplit() respectively 
> while using open() and close() methods for the IF lifecycle management, but 
> this would cause a backward compatibility issue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4016) FoldApplyWindowFunction is not properly initialized

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316289#comment-15316289
 ] 

ASF GitHub Bot commented on FLINK-4016:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2070
  
Hi,
I wrote in the issue that you opened. This is a duplicate of 
https://issues.apache.org/jira/browse/FLINK-3977 and the way to fix it is to 
make `InternalWindowFunction` properly forward calls to `setOutputType`. If you 
want, could you please change your fix to that and open a PR against the older 
issue?
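
The suggested forwarding boils down to something like the following sketch; the 
wrapper class and field names here are illustrative, not the actual 
InternalWindowFunction sources:

{code}
import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.api.operators.OutputTypeConfigurable;

// Illustrative wrapper only, not the actual Flink sources.
public class ForwardingWrapper<OUT> implements OutputTypeConfigurable<OUT> {

    private final Object wrappedFunction;

    public ForwardingWrapper(Object wrappedFunction) {
        this.wrappedFunction = wrappedFunction;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void setOutputType(TypeInformation<OUT> outTypeInfo, ExecutionConfig executionConfig) {
        // Forward the call so that e.g. a wrapped FoldApplyWindowFunction gets its
        // initial-value serializer created before the job runs.
        if (wrappedFunction instanceof OutputTypeConfigurable) {
            ((OutputTypeConfigurable<OUT>) wrappedFunction).setOutputType(outTypeInfo, executionConfig);
        }
    }
}
{code}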


> FoldApplyWindowFunction is not properly initialized
> ---
>
> Key: FLINK-4016
> URL: https://issues.apache.org/jira/browse/FLINK-4016
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 1.0.3
>Reporter: RWenden
>Priority: Blocker
>  Labels: easyfix
> Fix For: 1.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> FoldApplyWindowFunction's output type is not set.
> We're using constructions like (excerpt):
>   .keyBy(0)
>   .countWindow(10, 5)
>   .fold(...)
> Running this stream gives a runtime exception in FoldApplyWindowFunction:
> "No initial value was serialized for the fold window function. Probably the 
> setOutputType method was not called."
> This can be easily fixed in WindowedStream.java (around line 449):
> FoldApplyWindowFunction foldApplyWindowFunction = new 
> FoldApplyWindowFunction<>(initialValue, foldFunction, function);
> foldApplyWindowFunction.setOutputType(resultType, 
> input.getExecutionConfig());
> operator = new EvictingWindowOperator<>(windowAssigner,
> 
> windowAssigner.getWindowSerializer(getExecutionEnvironment().getConfig()),
> keySel,
> 
> input.getKeyType().createSerializer(getExecutionEnvironment().getConfig()),
> stateDesc,
> new 
> InternalIterableWindowFunction<>(foldApplyWindowFunction),
> trigger,
> evictor);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2070: [FLINK-4016] initialize FoldApplyWindowFunction properly

2016-06-06 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2070
  
Hi,
I wrote in the issue that you opened. This is a duplicate of 
https://issues.apache.org/jira/browse/FLINK-3977 and the way to fix it is to 
make `InternalWindowFunction` properly forward calls to `setOutputType`. If you 
want, could you please change your fix to that and open a PR against the older 
issue?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (FLINK-4016) FoldApplyWindowFunction is not properly initialized

2016-06-06 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed FLINK-4016.
---
Resolution: Duplicate

This is a duplicate of FLINK-3977.

> FoldApplyWindowFunction is not properly initialized
> ---
>
> Key: FLINK-4016
> URL: https://issues.apache.org/jira/browse/FLINK-4016
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 1.0.3
>Reporter: RWenden
>Priority: Blocker
>  Labels: easyfix
> Fix For: 1.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> FoldApplyWindowFunction's output type is not set.
> We're using constructions like (excerpt):
>   .keyBy(0)
>   .countWindow(10, 5)
>   .fold(...)
> Running this stream gives a runtime exception in FoldApplyWindowFunction:
> "No initial value was serialized for the fold window function. Probably the 
> setOutputType method was not called."
> This can easily be fixed in WindowedStream.java (around line 449):
> FoldApplyWindowFunction foldApplyWindowFunction =
>     new FoldApplyWindowFunction<>(initialValue, foldFunction, function);
> foldApplyWindowFunction.setOutputType(resultType, input.getExecutionConfig());
> operator = new EvictingWindowOperator<>(windowAssigner,
>     windowAssigner.getWindowSerializer(getExecutionEnvironment().getConfig()),
>     keySel,
>     input.getKeyType().createSerializer(getExecutionEnvironment().getConfig()),
>     stateDesc,
>     new InternalIterableWindowFunction<>(foldApplyWindowFunction),
>     trigger,
>     evictor);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2072: [FLINK-3948] Protect RocksDB cleanup by cleanup lo...

2016-06-06 Thread aljoscha
Github user aljoscha closed the pull request at:

https://github.com/apache/flink/pull/2072


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3769) RabbitMQ Sink ability to publish to a different exchange

2016-06-06 Thread Subhankar Biswas (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316252#comment-15316252
 ] 

Subhankar Biswas commented on FLINK-3769:
-

IMO the user should be able to build a separate queue configuration on both the 
sink and the source side. If we can build a queue configuration similar to this 
connection configuration:
https://github.com/subhankarb/flink/blob/670e92d0731652563fd58631107e0f19c5d5d9a1/flink-streaming-connectors/flink-connector-rabbitmq/src/main/java/org/apache/flink/streaming/connectors/rabbitmq/common/RMQConnectionConfig.java
and then pass it to the sink and source, declaring the queue based on that 
config, users will have more options for passing configuration.
I will start working on this once https://issues.apache.org/jira/browse/FLINK-3763 
is merged.
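
A hypothetical sketch of what such a queue configuration could look like,
mirroring the builder style of the linked RMQConnectionConfig; none of these
class or method names exist in Flink, and the flags simply mirror RabbitMQ's
queueDeclare() parameters:

{code}
// Hypothetical class -- a sketch only, not part of Flink.
public class RMQQueueConfig {
    private final String queueName;
    private final boolean durable;
    private final boolean exclusive;
    private final boolean autoDelete;

    private RMQQueueConfig(Builder builder) {
        this.queueName = builder.queueName;
        this.durable = builder.durable;
        this.exclusive = builder.exclusive;
        this.autoDelete = builder.autoDelete;
    }

    public static class Builder {
        private String queueName;
        private boolean durable;
        private boolean exclusive;
        private boolean autoDelete;

        public Builder setQueueName(String queueName) { this.queueName = queueName; return this; }
        public Builder setDurable(boolean durable) { this.durable = durable; return this; }
        public Builder setExclusive(boolean exclusive) { this.exclusive = exclusive; return this; }
        public Builder setAutoDelete(boolean autoDelete) { this.autoDelete = autoDelete; return this; }

        // Usage (hypothetical):
        //   RMQQueueConfig queueConfig = new RMQQueueConfig.Builder()
        //           .setQueueName("flink-output")
        //           .setDurable(true)
        //           .build();
        public RMQQueueConfig build() { return new RMQQueueConfig(this); }
    }
}
{code}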

> RabbitMQ Sink ability to publish to a different exchange
> 
>
> Key: FLINK-3769
> URL: https://issues.apache.org/jira/browse/FLINK-3769
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming Connectors
>Affects Versions: 1.0.1
>Reporter: Robert Batts
>Assignee: Subhankar Biswas
>  Labels: rabbitmq
>
> The RabbitMQ Sink can currently only publish to the "default" exchange. This 
> exchange is a direct exchange, so the routing key routes directly to the 
> queue name. Because of this, the current sink will only be 1-to-1-to-1 (1 job 
> to 1 exchange, which routes to 1 queue). Additionally, if a user decides to 
> use a different exchange, I believe the following can be assumed:
> 1.) The provided exchange exists
> 2.) The user has declared the appropriate mapping and the appropriate queues 
> exist in RabbitMQ (therefore, nothing needs to be created)
> RabbitMQ currently provides four types of exchanges. Three of these will be 
> covered by just enabling exchanges (Direct, Fanout, Topic) because they use 
> the routing key (or nothing). 
> The fourth exchange type relies on the message headers, which are currently 
> set to null by default on publish. These headers may vary on a per-message 
> basis, so the stream will need to take them as input as well. 
> This fourth exchange type could well be outside the scope of this 
> Improvement, and a "RabbitMQ Sink enable headers" Improvement might be the 
> better way to go.
> Exchange Types: https://www.rabbitmq.com/tutorials/amqp-concepts.html
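
For reference, a minimal standalone example of publishing to a named
(non-default) exchange with the plain RabbitMQ Java client; the exchange name
and routing key here are made up, and the null properties argument corresponds
to the headers the current sink leaves unset:

{code}
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class ExchangePublishExample {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        // Declare a durable direct exchange. With Direct/Fanout/Topic,
        // the routing key (or nothing) decides which bound queues match.
        channel.exchangeDeclare("orders-exchange", "direct", true);

        // Publish to the named exchange instead of the "" default exchange.
        // The third argument carries the per-message properties/headers a
        // Headers exchange would need; it is null here, as in the sink today.
        channel.basicPublish("orders-exchange", "order.created", null,
                "payload".getBytes());

        channel.close();
        connection.close();
    }
}
{code}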



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (FLINK-4002) [py] Improve testing infrastructure

2016-06-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316231#comment-15316231
 ] 

ASF GitHub Bot commented on FLINK-4002:
---

Github user omaralvarez commented on the issue:

https://github.com/apache/flink/pull/2063
  
Yes, I know that the way Python scripts are executed for the tests is different. 
Let me elaborate:

Since running the tests is quite costly on my laptop, I normally test my 
changes by executing them on a local instance of Flink 1.0.3, which is less 
taxing. Once I complete the changes, I run `mvn verify`. The problem is that 
when I call `pyflink2.sh test_main.py utils.py`, the module that I pass to the 
test script is ignored unless I use HDFS, in which case everything works fine.


> [py] Improve testing infrastructure
> 
>
> Key: FLINK-4002
> URL: https://issues.apache.org/jira/browse/FLINK-4002
> Project: Flink
>  Issue Type: Bug
>  Components: Python API
>Affects Versions: 1.0.3
>Reporter: Omar Alvarez
>Priority: Minor
>  Labels: Python, Testing
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The Verify() test function errors out when array elements are missing:
> {code}
> env.generate_sequence(1, 5)\
>  .map(Id()).map_partition(Verify([1,2,3,4], "Sequence")).output()
> {code}
> {quote}
> IndexError: list index out of range
> {quote}
> There should also be more documentation in test functions.
> I am already working on a pull request to fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2063: [FLINK-4002] [py] Improve testing infrastructure

2016-06-06 Thread omaralvarez
Github user omaralvarez commented on the issue:

https://github.com/apache/flink/pull/2063
  
Yes, I know that the way Python scripts are executed for the tests is different. 
Let me elaborate:

Since running the tests is quite costly on my laptop, I normally test my 
changes by executing them on a local instance of Flink 1.0.3, which is less 
taxing. Once I complete the changes, I run `mvn verify`. The problem is that 
when I call `pyflink2.sh test_main.py utils.py`, the module that I pass to the 
test script is ignored unless I use HDFS, in which case everything works fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---