[jira] [Commented] (FLINK-5792) Improve "UDF/UDTF" to support constructor with parameter.

2017-02-16 Thread sunjincheng (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869473#comment-15869473
 ] 

sunjincheng commented on FLINK-5792:


Hi [~fhueske], thanks for your attention to this JIRA. Thanks [~jark] for 
reviewing the PR.
I serialize the UDF object at the code-gen stage and deserialize it in the 
`open()` method. JackWu is right: the current implementation is similar to 
approach 2. What do you think about the current PR?
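
A minimal sketch of the serialize-at-codegen / deserialize-in-open() approach described above (class and member names are illustrative, not the actual PR code):

{code:java}
import java.io.*;
import java.util.Base64;

// Hypothetical UDF with a constructor parameter; Serializable so the
// code-gen phase can embed its state instead of calling a no-arg constructor.
class MyScalarFunction implements Serializable {
    private final int threshold;   // per-instance state
    MyScalarFunction(int threshold) { this.threshold = threshold; }
    public boolean eval(int x) { return x > threshold; }
}

public class UdfSerializationSketch {
    public static void main(String[] args) throws Exception {
        MyScalarFunction udf = new MyScalarFunction(42);

        // Code-gen stage: serialize the configured instance to a string
        // that can be embedded in the generated Java source.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(udf);
        }
        String embedded = Base64.getEncoder().encodeToString(bos.toByteArray());

        // open() stage: the generated operator decodes and deserializes it back.
        byte[] bytes = Base64.getDecoder().decode(embedded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            MyScalarFunction restored = (MyScalarFunction) in.readObject();
            System.out.println(restored.eval(100)); // true: the state survived
        }
    }
}
{code}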

> Improve "UDF/UDTF" to support constructor with parameter.
> -
>
> Key: FLINK-5792
> URL: https://issues.apache.org/jira/browse/FLINK-5792
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> Currently, the UDF/UDTF codegen phase uses a no-argument constructor to 
> create the instance, so users cannot keep state values in the UDF/UDTF. 
> The codegen phase can use a serialization mechanism instead, so that the 
> UDF/UDTF can carry state values.
> 1. Make UserDefinedFunction extend Serializable.
> 2. Modify the UDF/UDTF part of the CodeGenerator.
> 3. Modify the Table API accordingly.
> 4. Add tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5786) Add support GetClusterStatus message for standalone flink cluster

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869450#comment-15869450
 ] 

ASF GitHub Bot commented on FLINK-5786:
---

Github user mylog00 commented on the issue:

https://github.com/apache/flink/pull/3325
  
The TravisCI build passed.
https://travis-ci.org/mylog00/flink/builds/201929462


> Add support GetClusterStatus message for standalone flink cluster
> -
>
> Key: FLINK-5786
> URL: https://issues.apache.org/jira/browse/FLINK-5786
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Dmitrii Kniazev
>Assignee: Dmitrii Kniazev
>Priority: Minor
>
> Currently, invoking {{StandaloneClusterClient#getClusterStatus()}} 
> causes the whole Flink cluster to fail, because the {{JobManager}} has no 
> handler for the {{GetClusterStatus}} message.
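
For orientation, the missing piece is roughly a new case in the JobManager's message handling; the response message and instance-manager getters below are assumptions based on neighboring Flink code, not the actual patch:

{code}
// In the JobManager's Akka message handling (Scala, sketch only):
case GetClusterStatus =>
  sender() ! new GetClusterStatusResponse(
    instanceManager.getNumberOfRegisteredTaskManagers,
    instanceManager.getTotalNumberOfSlots)
{code}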



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5792) Improve "UDF/UDTF" to support constructor with parameter.

2017-02-16 Thread sunjincheng (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869488#comment-15869488
 ] 

sunjincheng commented on FLINK-5792:


BTW, when implementing this issue I also tried using `kryo` to serialize 
the UDTF/UDF object; it makes the resulting byte[] very small.
Unfortunately, it requires the serialized class to have a zero-argument 
constructor, which is not friendly to users.
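
A small sketch of the limitation described above, assuming plain Kryo defaults rather than Flink's actual integration:

{code:java}
import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import java.io.ByteArrayOutputStream;

// Hypothetical UDF without a zero-argument constructor.
class ParamUdf {
    final int threshold;
    ParamUdf(int threshold) { this.threshold = threshold; }
}

public class KryoUdfSketch {
    public static void main(String[] args) {
        Kryo kryo = new Kryo();
        kryo.register(ParamUdf.class);

        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        Output output = new Output(bos);
        kryo.writeObject(output, new ParamUdf(42)); // writing works, and is compact
        output.close();

        // Reading fails with Kryo's default instantiator strategy, because
        // it tries to call a no-arg constructor that does not exist.
        Input input = new Input(bos.toByteArray());
        ParamUdf restored = kryo.readObject(input, ParamUdf.class); // throws KryoException
    }
}
{code}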

> Improve "UDF/UDTF" to support constructor with parameter.
> -
>
> Key: FLINK-5792
> URL: https://issues.apache.org/jira/browse/FLINK-5792
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> Currently, the UDF/UDTF codegen phase uses a no-argument constructor to 
> create the instance, so users cannot keep state values in the UDF/UDTF. 
> The codegen phase can use a serialization mechanism instead, so that the 
> UDF/UDTF can carry state values.
> 1. Make UserDefinedFunction extend Serializable.
> 2. Modify the UDF/UDTF part of the CodeGenerator.
> 3. Modify the Table API accordingly.
> 4. Add tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871046#comment-15871046
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
Hi @greghogan , I'm not sure I understand the relationship between HDFS 
ACLs and this change I proposed. Could you explain more specifically? Thanks.


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Improvement
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without a specific permission, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower it to 700.
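
A minimal sketch of the kind of change proposed, assuming the directory is created through Hadoop's FileSystem API (the path is illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class CheckpointDirSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Illustrative path: the per-job checkpoint directory.
        Path jobCheckpointDir = new Path("/flink/checkpoints/job-id");
        // Create the directory with rwx------ so other users can neither
        // read checkpoint data nor delete it.
        fs.mkdirs(jobCheckpointDir, new FsPermission((short) 0700));
    }
}
{code}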



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-16 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
Hi @greghogan , I'm not sure I understand the relationship between HDFS 
ACLs and this change I proposed. Could you explain more specifically? Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4856) Add MapState for keyed streams

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871118#comment-15871118
 ] 

ASF GitHub Bot commented on FLINK-4856:
---

GitHub user shixiaogang opened a pull request:

https://github.com/apache/flink/pull/3336

[FLINK-4856][state] Add MapState in KeyedState

1. Add `MapState` and `MapStateDescriptor`
2. Implementation of `MapState` in `HeapKeyedStateBackend` and 
`RocksDBKeyedStateBackend`.
3. Add accessors to `MapState` in `RuntimeContext`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alibaba/flink flink-4856

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3336.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3336


commit 430b4f596acbff0a9dfdc20fbb2430a8fad819f9
Author: xiaogang.sxg 
Date:   2017-02-17T03:19:18Z

Add MapState in KeyedState




> Add MapState for keyed streams
> --
>
> Key: FLINK-4856
> URL: https://issues.apache.org/jira/browse/FLINK-4856
> Project: Flink
>  Issue Type: New Feature
>  Components: State Backends, Checkpointing
>Reporter: Xiaogang Shi
>Assignee: Xiaogang Shi
>
> Many states in keyed streams are organized as key-value pairs. Currently, 
> these states are implemented by storing the entire map in a ValueState or a 
> ListState. This implementation, however, is very costly because all entries 
> have to be serialized/deserialized when updating a single entry. To improve 
> the efficiency of these states, MapStates are urgently needed. 
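
For context, a sketch of how the proposed state is meant to be used from a rich function on a keyed stream, following the accessor shape described in this PR:

{code:java}
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Counts word occurrences per key, updating one map entry at a time
// instead of rewriting a whole map serialized into a ValueState.
public class WordCountFunction extends RichFlatMapFunction<String, Tuple2<String, Long>> {

    private transient MapState<String, Long> counts;

    @Override
    public void open(Configuration parameters) throws Exception {
        MapStateDescriptor<String, Long> descriptor =
                new MapStateDescriptor<>("word-counts", String.class, Long.class);
        counts = getRuntimeContext().getMapState(descriptor);
    }

    @Override
    public void flatMap(String word, Collector<Tuple2<String, Long>> out) throws Exception {
        Long current = counts.get(word);   // only this entry is deserialized
        long updated = (current == null ? 0L : current) + 1L;
        counts.put(word, updated);         // only this entry is serialized
        out.collect(Tuple2.of(word, updated));
    }
}
{code}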



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3336: [FLINK-4856][state] Add MapState in KeyedState

2017-02-16 Thread shixiaogang
GitHub user shixiaogang opened a pull request:

https://github.com/apache/flink/pull/3336

[FLINK-4856][state] Add MapState in KeyedState

1. Add `MapState` and `MapStateDescriptor`
2. Implementation of `MapState` in `HeapKeyedStateBackend` and 
`RocksDBKeyedStateBackend`.
3. Add accessors to `MapState` in `RuntimeContext`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alibaba/flink flink-4856

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3336.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3336


commit 430b4f596acbff0a9dfdc20fbb2430a8fad819f9
Author: xiaogang.sxg 
Date:   2017-02-17T03:19:18Z

Add MapState in KeyedState




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3332: [FLINK-5751] [docs] Add link check script

2017-02-16 Thread patricklucas
Github user patricklucas commented on a diff in the pull request:

https://github.com/apache/flink/pull/3332#discussion_r101678631
  
--- Diff: docs/check_links.sh ---
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash

+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+# Don't abort on any non-zero exit code
+#set +e
+
+target=${1:-"http://localhost:4000"}
+
+# Crawl the docs, ignoring robots.txt, storing nothing locally
+wget --spider -r -nd -nv -e robots=off -p -o spider.log $target
+
+# Abort for anything other than 0 and 4 ("Network failure")
+status=$?
+if [ $status -ne 0 ] && [ $status -ne 4 ]; then
+exit $status
+fi
+
+# Fail the build if any broken links are found
+broken_links_str=$(grep -e 'Found [[:digit:]]\+ broken link' spider.log)
+echo -e "$broken_links_str"
--- End diff --

If you're going to get rid of my pretty bold-red formatting you can drop 
the `-e` too. :)

Actually, might as well put the `echo` line inside the `if` too.
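
Taken together, the suggestions in this and the sibling review comments would leave the tail of the script looking roughly like this (a sketch under those suggestions, not the final committed version):

    # Crawl the docs, ignoring robots.txt, storing nothing locally
    wget --spider -r -nd -nv -e robots=off -p -o spider.log "$target"

    # Abort for anything other than 0 and 4 ("Network failure")
    status=$?
    if [ $status -ne 0 ] && [ $status -ne 4 ]; then
        exit $status
    fi

    # Fail the build only if any broken links were found
    broken_links_str=$(grep -e 'Found [[:digit:]]\+ broken link' spider.log)
    if [ -n "$broken_links_str" ]; then
        echo "$broken_links_str"
        exit 1
    fi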


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3332: [FLINK-5751] [docs] Add link check script

2017-02-16 Thread patricklucas
Github user patricklucas commented on a diff in the pull request:

https://github.com/apache/flink/pull/3332#discussion_r101678241
  
--- Diff: docs/check_links.sh ---
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash

+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+# Don't abort on any non-zero exit code
+#set +e
+
+target=${1:-"http://localhost:4000"}
+
+# Crawl the docs, ignoring robots.txt, storing nothing locally
+wget --spider -r -nd -nv -e robots=off -p -o spider.log $target
--- End diff --

$target in quotes


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5751) 404 in documentation

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871107#comment-15871107
 ] 

ASF GitHub Bot commented on FLINK-5751:
---

Github user patricklucas commented on a diff in the pull request:

https://github.com/apache/flink/pull/3332#discussion_r101678631
  
--- Diff: docs/check_links.sh ---
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash

+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+# Don't abort on any non-zero exit code
+#set +e
+
+target=${1:-"http://localhost:4000"}
+
+# Crawl the docs, ignoring robots.txt, storing nothing locally
+wget --spider -r -nd -nv -e robots=off -p -o spider.log $target
+
+# Abort for anything other than 0 and 4 ("Network failure")
+status=$?
+if [ $status -ne 0 ] && [ $status -ne 4 ]; then
+exit $status
+fi
+
+# Fail the build if any broken links are found
+broken_links_str=$(grep -e 'Found [[:digit:]]\+ broken link' spider.log)
+echo -e "$broken_links_str"
--- End diff --

If you're going to get rid of my pretty bold-red formatting you can drop 
the `-e` too. :)

Actually, might as well put the `echo` line inside the `if` too.


> 404 in documentation
> 
>
> Key: FLINK-5751
> URL: https://issues.apache.org/jira/browse/FLINK-5751
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Colin Breame
>Priority: Trivial
> Fix For: 1.3.0, 1.2.1
>
>
> This page:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/quickstart/setup_quickstart.html
> Contains a link with title "Flink on Windows" with URL:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows
> This gives a 404.  It should be:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3332: [FLINK-5751] [docs] Add link check script

2017-02-16 Thread patricklucas
Github user patricklucas commented on a diff in the pull request:

https://github.com/apache/flink/pull/3332#discussion_r101678217
  
--- Diff: docs/check_links.sh ---
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash

+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+# Don't abort on any non-zero exit code
+#set +e
--- End diff --

This is a vestige from me writing this as a Jenkins jobs; you can remove 
this line and the above comment.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5751) 404 in documentation

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871106#comment-15871106
 ] 

ASF GitHub Bot commented on FLINK-5751:
---

Github user patricklucas commented on a diff in the pull request:

https://github.com/apache/flink/pull/3332#discussion_r101678217
  
--- Diff: docs/check_links.sh ---
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash

+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+# Don't abort on any non-zero exit code
+#set +e
--- End diff --

This is a vestige from me writing this as a Jenkins jobs; you can remove 
this line and the above comment.


> 404 in documentation
> 
>
> Key: FLINK-5751
> URL: https://issues.apache.org/jira/browse/FLINK-5751
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Colin Breame
>Priority: Trivial
> Fix For: 1.3.0, 1.2.1
>
>
> This page:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/quickstart/setup_quickstart.html
> Contains a link with title "Flink on Windows" with URL:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows
> This gives a 404.  It should be:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5751) 404 in documentation

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871105#comment-15871105
 ] 

ASF GitHub Bot commented on FLINK-5751:
---

Github user patricklucas commented on a diff in the pull request:

https://github.com/apache/flink/pull/3332#discussion_r101678241
  
--- Diff: docs/check_links.sh ---
@@ -0,0 +1,39 @@
+#!/usr/bin/env bash

+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+# Don't abort on any non-zero exit code
+#set +e
+
+target=${1:-"http://localhost:4000"}
+
+# Crawl the docs, ignoring robots.txt, storing nothing locally
+wget --spider -r -nd -nv -e robots=off -p -o spider.log $target
--- End diff --

$target in quotes


> 404 in documentation
> 
>
> Key: FLINK-5751
> URL: https://issues.apache.org/jira/browse/FLINK-5751
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Colin Breame
>Priority: Trivial
> Fix For: 1.3.0, 1.2.1
>
>
> This page:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/quickstart/setup_quickstart.html
> Contains a link with title "Flink on Windows" with URL:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows
> This gives a 404.  It should be:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5817) Fix test concurrent execution failure by test dir conflicts.

2017-02-16 Thread shijinkui (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871082#comment-15871082
 ] 

shijinkui commented on FLINK-5817:
--

[~StephanEwen] [~wenlong.lwl], there is no overlap between FLINK-5546 and 
FLINK-5817. FLINK-5546 focuses on changing the default java.io.tmpdir system 
property.
I think this issue can find all the temporary directories created via new 
File and replace them with TemporaryFolder, following Stephan's suggestion.
If you search all the *Test.* files for the keyword "new File(", you'll find 
plenty of bad smells that need to be corrected.
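
A minimal sketch of the replacement pattern suggested above, using JUnit 4's TemporaryFolder rule (the test class here is illustrative):

{code:java}
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;
import java.io.File;

public class FileCacheTestSketch {

    // Each test run gets its own unique directory, so concurrent builds
    // by different users no longer collide on a fixed /tmp path.
    @Rule
    public final TemporaryFolder tempFolder = new TemporaryFolder();

    @Test
    public void testUsesIsolatedDir() throws Exception {
        // instead of: new File("/tmp/cacheFile")
        File cacheFile = tempFolder.newFile("cacheFile");
        // ... exercise the code under test with cacheFile ...
    }
}
{code}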

> Fix test concurrent execution failure by test dir conflicts.
> 
>
> Key: FLINK-5817
> URL: https://issues.apache.org/jira/browse/FLINK-5817
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Wenlong Lyu
>
> Currently, when different users build Flink on the same machine, failures may 
> occur because some test utilities create test files with fixed names, which 
> causes file access to fail when different users process the same file at the 
> same time.
> We have seen errors from AbstractTestBase, IOManagerTest, and FileCacheTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5414) Bump up Calcite version to 1.11

2017-02-16 Thread Jark Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871154#comment-15871154
 ] 

Jark Wu commented on FLINK-5414:


Hi [~wheat9], sorry for the delay. Actually, I ran into some problems while 
working on this issue. 

I will try to solve them today and update my progress here. 

> Bump up Calcite version to 1.11
> ---
>
> Key: FLINK-5414
> URL: https://issues.apache.org/jira/browse/FLINK-5414
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> The upcoming Calcite release 1.11 has a lot of stability fixes and new 
> features. We should update it for the Table API.
> E.g. we can hopefully merge FLINK-4864



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5792) Improve "UDF/UDTF" to support constructor with parameter.

2017-02-16 Thread Zhuoluo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871230#comment-15871230
 ] 

Zhuoluo Yang commented on FLINK-5792:
-

Hi [~sunjincheng121], I think this feature is very important to FLINK-5802, 
because we need to pass something (e.g., a Hive UDF) into Flink's UDF/UDTF. 
Serialization would be a good idea, IMHO. 

> Improve "UDF/UDTF" to support constructor with parameter.
> -
>
> Key: FLINK-5792
> URL: https://issues.apache.org/jira/browse/FLINK-5792
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> Currently, the UDF/UDTF codegen phase uses a no-argument constructor to 
> create the instance, so users cannot keep state values in the UDF/UDTF. 
> The codegen phase can use a serialization mechanism instead, so that the 
> UDF/UDTF can carry state values.
> 1. Make UserDefinedFunction extend Serializable.
> 2. Modify the UDF/UDTF part of the CodeGenerator.
> 3. Modify the Table API accordingly.
> 4. Add tests.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (FLINK-5826) UDF/UDTF should support variable types and variable arguments

2017-02-16 Thread Zhuoluo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871338#comment-15871338
 ] 

Zhuoluo Yang edited comment on FLINK-5826 at 2/17/17 7:33 AM:
--

I have already made a small modification in 
{code}UserDefinedFunctionUtils.getSignature(){code}. The basic idea is to let 
"Object..." and "Any*" pass and return the corresponding signature. 
This modification works for Java only; Scala will fail. Since the code 
generator emits Java code, there will be a problem calling {code}eval(Seq){code} 
in the generated Java. However, there is no problem at all calling 
{code}eval(Object... args){code} in the generated Java.


was (Author: clarkyzl):
I have already made a small modification in 
{code}UserDefinedFunctionUtils.getSignature(){code}. The basic idea is that we 
let the "Object..." and "Any*" pass and return the corresponding signature. 
This modification works for Java only. The Scala will fail. Since the code 
generation is to generate a Java codes, There will be some problem call {code} 
eval(Seq) {/code} in generated Java. However, there will be no problem at 
all in calling {code}eval(Object... args}{code} in generated Java.

> UDF/UDTF should support variable types and variable arguments
> -
>
> Key: FLINK-5826
> URL: https://issues.apache.org/jira/browse/FLINK-5826
> Project: Flink
>  Issue Type: Improvement
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>
> In some cases, UDFs/UDTFs should support variable types and variable arguments. 
> Many UDF/UDTF developers wish to keep the number and types of arguments 
> flexible for their users.
> Thus, we should support the following styles of UDF/UDTFs.
> for example 1, in Java
> {code:java}
> public class SimpleUDF extends ScalarFunction {
>   public int eval(Object... args) {
>   // do something
>   }
> }
> {code}
> for example 2, in Scala
> {code}
> class SimpleUDF extends ScalarFunction {
>   def eval(args: Any*): Int = {
> // do something
>   }
> }
> {code}
> If we modify the code in UserDefinedFunctionUtils.getSignature() to make 
> both signatures pass, the first example will work normally. However, the 
> second example will raise an exception.
> {noformat}
> Caused by: org.codehaus.commons.compiler.CompileException: Line 58, Column 0: 
> No applicable constructor/method found for actual parameters 
> "java.lang.String"; candidates are: "public java.lang.Object 
> test.SimpleUDF.eval(scala.collection.Seq)"
>   at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11523) 
> ~[janino-3.0.6.jar:?]
>   at 
> org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8679)
>  ~[janino-3.0.6.jar:?]
>   at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:8539) 
> ~[janino-3.0.6.jar:?]
> {noformat} 
> The reason is that Scala applies a syntactic-sugar transformation to the 
> method signature. The method {code}def eval(args: Any*){code} becomes 
> {code}def eval(args: scala.collection.Seq){code} in the class file. 
> Code generation is done in Java, so if we use the Java-style 
> {code}eval(Object... args){code} to call the Scala method, it raises the 
> exception above.
> However, we can't restrict users to writing UDFs/UDTFs in Java. Any ideas on 
> supporting variable types and variable arguments in Scala UDFs/UDTFs without 
> the compilation failure?
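
One possible way out, not proposed in the issue itself: Scala's standard {{@varargs}} annotation makes the compiler emit an additional Java-style varargs bridge method, so generated Java code can call {{eval(Object...)}} directly. A sketch (assuming Flink's ScalarFunction import path):

{code}
import scala.annotation.varargs
import org.apache.flink.table.functions.ScalarFunction

class SimpleUDF extends ScalarFunction {
  // The compiler now also emits `public int eval(Object... args)`
  // alongside the Seq-based Scala signature.
  @varargs
  def eval(args: Any*): Int = {
    args.length // do something
  }
}
{code}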



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3337: [FLINK-5825][UI]make the cite path relative to sho...

2017-02-16 Thread WangTaoTheTonic
GitHub user WangTaoTheTonic opened a pull request:

https://github.com/apache/flink/pull/3337

[FLINK-5825][UI]make the cite path relative to show it correctly

In yarn mode, the web frontend URL is accessed through the YARN proxy, in a 
format like 
"http://spark-91-206:8088/proxy/application_1487122678902_0015/", and the 
running job page's URL is 
"http://spark-91-206:8088/proxy/application_1487122678902_0015/#/jobs/9440a129ea5899c16e7c1a7e8f2897b3".
One small .png file, "horizontal.png", cannot be loaded in that mode, 
because "index.styl" references it with an absolute path.
We should make the path relative.

- [ ] Tests & Build
I will paste the before/after difference later.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WangTaoTheTonic/flink FLINK-5825

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3337.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3337


commit c2e2711b2f815a4aa1b9c8be2478c963c41da245
Author: WangTaoTheTonic 
Date:   2017-02-17T06:48:14Z

make the cite path relative to show it correctly




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5825) In yarn mode, a small pic can not be loaded

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871299#comment-15871299
 ] 

ASF GitHub Bot commented on FLINK-5825:
---

GitHub user WangTaoTheTonic opened a pull request:

https://github.com/apache/flink/pull/3337

[FLINK-5825][UI]make the cite path relative to show it correctly

In yarn mode, the web frontend URL is accessed through the YARN proxy, in a 
format like 
"http://spark-91-206:8088/proxy/application_1487122678902_0015/", and the 
running job page's URL is 
"http://spark-91-206:8088/proxy/application_1487122678902_0015/#/jobs/9440a129ea5899c16e7c1a7e8f2897b3".
One small .png file, "horizontal.png", cannot be loaded in that mode, 
because "index.styl" references it with an absolute path.
We should make the path relative.

- [ ] Tests & Build
I will paste the before/after difference later.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WangTaoTheTonic/flink FLINK-5825

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3337.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3337


commit c2e2711b2f815a4aa1b9c8be2478c963c41da245
Author: WangTaoTheTonic 
Date:   2017-02-17T06:48:14Z

make the cite path relative to show it correctly




> In yarn mode, a small pic can not be loaded
> ---
>
> Key: FLINK-5825
> URL: https://issues.apache.org/jira/browse/FLINK-5825
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend, YARN
>Reporter: Tao Wang
>Priority: Minor
>
> In yarn mode, the web frontend URL is accessed through the YARN proxy, in a 
> format like "http://spark-91-206:8088/proxy/application_1487122678902_0015/", 
> and the running job page's URL is 
> "http://spark-91-206:8088/proxy/application_1487122678902_0015/#/jobs/9440a129ea5899c16e7c1a7e8f2897b3".
> One small .png file, "horizontal.png", cannot be loaded in that mode, because 
> "index.styl" references it with an absolute path.
> We should make the path relative.
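
For illustration only (file contents and paths here are hypothetical, not the actual diff), the fix amounts to dropping the leading slash so the URL resolves relative to the proxied page:

{code}
/* index.styl -- before: an absolute path breaks behind the YARN proxy,
   which serves the UI under /proxy/<application-id>/ */
background-image: url('/images/horizontal.png')

/* after: a relative path resolves correctly under any URL prefix */
background-image: url('images/horizontal.png')
{code}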



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5826) UDF/UDTF should support variable types and variable arguments

2017-02-16 Thread Zhuoluo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871338#comment-15871338
 ] 

Zhuoluo Yang commented on FLINK-5826:
-

I have already made a small modification in 
{code}UserDefinedFunctionUtils.getSignature(){code}. The basic idea is to let 
"Object..." and "Any*" pass and return the corresponding signature. 
This modification works for Java only; Scala will fail. Since the code 
generator emits Java code, there will be a problem calling {code}eval(Seq){code} 
in the generated Java. However, there is no problem at all calling 
{code}eval(Object... args){code} in the generated Java.

> UDF/UDTF should support variable types and variable arguments
> -
>
> Key: FLINK-5826
> URL: https://issues.apache.org/jira/browse/FLINK-5826
> Project: Flink
>  Issue Type: Improvement
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>
> In some cases, UDFs/UDTFs should support variable types and variable arguments. 
> Many UDF/UDTF developers wish to keep the number and types of arguments 
> flexible for their users.
> Thus, we should support the following styles of UDF/UDTFs.
> for example 1, in Java
> {code:java}
> public class SimpleUDF extends ScalarFunction {
>   public int eval(Object... args) {
>   // do something
>   }
> }
> {code}
> for example 2, in Scala
> {code}
> class SimpleUDF extends ScalarFunction {
>   def eval(args: Any*): Int = {
> // do something
>   }
> }
> {code}
> If we modify the code in UserDefinedFunctionUtils.getSignature() to make 
> both signatures pass, the first example will work normally. However, the 
> second example will raise an exception.
> {noformat}
> Caused by: org.codehaus.commons.compiler.CompileException: Line 58, Column 0: 
> No applicable constructor/method found for actual parameters 
> "java.lang.String"; candidates are: "public java.lang.Object 
> test.SimpleUDF.eval(scala.collection.Seq)"
>   at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11523) 
> ~[janino-3.0.6.jar:?]
>   at 
> org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8679)
>  ~[janino-3.0.6.jar:?]
>   at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:8539) 
> ~[janino-3.0.6.jar:?]
> {noformat} 
> The reason is that Scala applies a syntactic-sugar transformation to the 
> method signature. The method {code}def eval(args: Any*){code} becomes 
> {code}def eval(args: scala.collection.Seq){code} in the class file. 
> Code generation is done in Java, so if we use the Java-style 
> {code}eval(Object... args){code} to call the Scala method, it raises the 
> exception above.
> However, we can't restrict users to writing UDFs/UDTFs in Java. Any ideas on 
> supporting variable types and variable arguments in Scala UDFs/UDTFs without 
> the compilation failure?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5819) Improve metrics reporting

2017-02-16 Thread Paul Nelligan (JIRA)
Paul Nelligan created FLINK-5819:


 Summary: Improve metrics reporting
 Key: FLINK-5819
 URL: https://issues.apache.org/jira/browse/FLINK-5819
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Affects Versions: 1.3.0
Reporter: Paul Nelligan


When displaying individual metrics for a vertex/node of a job in the web UI, 
it is desirable to add an option to display each metric as a numeric value or 
as a chart.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5823) Store Checkpoint Root Metadata in StateBackend (not in HA custom store)

2017-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-5823:
---

 Summary: Store Checkpoint Root Metadata in StateBackend (not in HA 
custom store)
 Key: FLINK-5823
 URL: https://issues.apache.org/jira/browse/FLINK-5823
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.3.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-16 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3335
  
Thank you for the contribution. I see the idea behind the fix.

I am unsure whether we should let Flink manage the permissions of these 
directories. My gut feeling is that this is something the operational 
setup should take care of. For example, some setups may want to keep 
whatever the pre-defined permission is, to make checkpoints accessible to 
other groups for the purpose of cloning/migrating jobs to other clusters.

That is probably worth a mailing list discussion.
I would put this on hold until we have a decision there.

If we want this change, we still need to do it a bit differently, as we are 
trying to make Flink work without any dependencies on Hadoop (Hadoop is still 
supported perfectly well, but is an optional dependency).
Adding new hard Hadoop dependencies (as here) is not possible because of that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-5824) Fix String/byte conversions without explicit encoding

2017-02-16 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-5824:

Priority: Blocker  (was: Major)

> Fix String/byte conversions without explicit encoding
> -
>
> Key: FLINK-5824
> URL: https://issues.apache.org/jira/browse/FLINK-5824
> Project: Flink
>  Issue Type: Bug
>  Components: Python API, Queryable State, State Backends, 
> Checkpointing, Webfrontend
>Reporter: Ufuk Celebi
>Priority: Blocker
>
> In a couple of places we convert Strings to bytes and bytes back to Strings 
> without explicitly specifying an encoding. This can lead to problems when 
> client and server default encodings differ.
> The task of this JIRA is to go over the whole project, look for 
> conversions where we don't specify an encoding, and fix them to specify UTF-8 
> explicitly.
> For starters, we can {{grep -R 'getBytes()' .}}, which already reveals many 
> problematic places.
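
The fix pattern itself is mechanical; a short sketch:

{code:java}
import java.nio.charset.StandardCharsets;

public class EncodingSketch {
    public static void main(String[] args) {
        String s = "caf\u00e9";

        // Problematic: uses the JVM's platform default charset, which can
        // differ between client and server.
        byte[] implicitBytes = s.getBytes();

        // Fixed: the encoding is explicit and identical everywhere.
        byte[] explicitBytes = s.getBytes(StandardCharsets.UTF_8);
        String back = new String(explicitBytes, StandardCharsets.UTF_8);

        System.out.println(back.equals(s)); // true
    }
}
{code}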



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-16 Thread Tao Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Wang updated FLINK-5818:


Hi Stephan,


You may have a slight misunderstanding about this change. It only controls 
the directories named by job id (generated using UUID), not the configured root 
checkpoint directory. I agree with you that the root directory should be 
created, or have its permission changed, at setup time, but setup cannot be 
aware of these per-job-id directories, which are created at runtime.


About the Hadoop dependency: I admit I am using a convenient (let's say hacky) 
way to do the transition, as it needs a bit more code to do it standalone. I 
will change it if it's a problem :)


-
Regards.
On 02/16/2017 21:17, ASF GitHub Bot (JIRA) wrote:

   [ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869904#comment-15869904
 ]

ASF GitHub Bot commented on FLINK-5818:
---

Github user StephanEwen commented on the issue:

   https://github.com/apache/flink/pull/3335
 
   Thank you for the contribution. I see the idea behind the fix.
   
   I am unsure whether we should let Flink manage the permissions of these 
directories. My gut feeling is that this is something the operational 
setup should take care of. For example, some setups may want to keep 
whatever the pre-defined permission is, to make checkpoints accessible to 
other groups for the purpose of cloning/migrating jobs to other clusters.
   
   That is probably worth a mailing list discussion.
   I would put this on hold until we have a decision there.
   
   If we want this change, we still need to do it a bit differently, as we are 
trying to make Flink work without any dependencies on Hadoop (Hadoop is still 
supported perfectly well, but is an optional dependency).
   Adding new hard Hadoop dependencies (as here) is not possible because of that.





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Improvement
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without a specific permission, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower it to 700.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5588) Add a unit scaler based on different norms

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869974#comment-15869974
 ] 

ASF GitHub Bot commented on FLINK-5588:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3313
  
@skonto another option, if the master branch has a newer commit, is to 
rebase and force push.


> Add a unit scaler based on different norms
> --
>
> Key: FLINK-5588
> URL: https://issues.apache.org/jira/browse/FLINK-5588
> Project: Flink
>  Issue Type: New Feature
>  Components: Machine Learning Library
>Reporter: Stavros Kontopoulos
>Assignee: Stavros Kontopoulos
>Priority: Minor
>
> So far, ML has two scalers: min-max and the standard scaler.
> A third one, frequently used, is the unit scaler.
> We could implement a transformer for this type of scaling with different 
> norms available to the user.
> I will make a separate class for the per-sample normalization procedure using 
> the Transformer API, because it is easy to add; the fit method does nothing 
> in this case.
> Scikit-learn also has some calls available outside the Transformer API; we 
> might want to add those in the future.
> These calls work on any axis, but they are not re-usable in a pipeline [4].
> Right now the existing scalers in Flink ML support per-feature normalization 
> via the Transformer API. 
> Resources
> [1] https://en.wikipedia.org/wiki/Feature_scaling
> [2] 
> http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
> [3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
> [4] http://scikit-learn.org/stable/modules/preprocessing.html
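
For intuition, per-sample unit scaling divides each sample by its norm; a generic sketch for an Lp norm (independent of the eventual FlinkML API):

{code:java}
public class UnitScalerSketch {

    // Scales a sample to unit Lp norm: x / ||x||_p.
    static double[] scaleToUnit(double[] x, double p) {
        double sum = 0.0;
        for (double xi : x) {
            sum += Math.pow(Math.abs(xi), p);
        }
        double norm = Math.pow(sum, 1.0 / p);
        if (norm == 0.0) {
            return x.clone(); // a zero vector stays unchanged
        }
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = x[i] / norm;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] scaled = scaleToUnit(new double[]{3.0, 4.0}, 2.0);
        System.out.println(scaled[0] + ", " + scaled[1]); // 0.6, 0.8
    }
}
{code}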



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3252: [FLINK-5624] Support tumbling window on streaming tables ...

2017-02-16 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3252
  
Thanks for the update @haohui.
PR is good to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5824) Fix String/byte conversions without explicit encoding

2017-02-16 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869989#comment-15869989
 ] 

Stephan Ewen commented on FLINK-5824:
-

We should add a checkstyle rule that forbids {{String.getBytes()}} and {{new 
String(byte[])}}.
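
A sketch of such a rule, assuming Checkstyle's {{RegexpSinglelineJava}} module (the exact module choice is an assumption, and the second pattern is deliberately rough and would need tuning):

{code:xml}
<!-- Forbid charset-implicit conversions: String.getBytes() with no
     arguments, and new String(bytes) without a charset argument. -->
<module name="RegexpSinglelineJava">
  <property name="format" value="\.getBytes\(\)"/>
  <property name="message" value="Use getBytes(StandardCharsets.UTF_8) instead of getBytes()."/>
</module>
<module name="RegexpSinglelineJava">
  <property name="format" value="new String\([^,)]+\)"/>
  <property name="message" value="Specify a charset: new String(bytes, StandardCharsets.UTF_8)."/>
</module>
{code}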

> Fix String/byte conversions without explicit encoding
> -
>
> Key: FLINK-5824
> URL: https://issues.apache.org/jira/browse/FLINK-5824
> Project: Flink
>  Issue Type: Bug
>  Components: Python API, Queryable State, State Backends, 
> Checkpointing, Webfrontend
>Reporter: Ufuk Celebi
>Priority: Blocker
>
> In a couple of places we convert Strings to bytes and bytes back to Strings 
> without explicitly specifying an encoding. This can lead to problems when 
> client and server default encodings differ.
> The task of this JIRA is to go over the whole project, look for 
> conversions where we don't specify an encoding, and fix them to specify UTF-8 
> explicitly.
> For starters, we can {{grep -R 'getBytes()' .}}, which already reveals many 
> problematic places.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869993#comment-15869993
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3335
  
@WangTaoTheTonic HDFS supports posix ACLs 
([link](http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#ACLs_Access_Control_Lists)).
 These are per-file/directory and inherited. Does this meet your requirements?


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Improvement
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without a specific permission, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower it to 700.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4813) Having flink-test-utils as a dependency outside Flink fails the build

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869907#comment-15869907
 ] 

ASF GitHub Bot commented on FLINK-4813:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3322
  
Very good, I had stumbled over that problem as well.

Could you also remove the bundle plugin from the root pom (this should 
speed up dependency resolutions in IDEs by a lot) and instead add the above 
described dependency to the test projects that use the `SecureTestEnvironment`?


> Having flink-test-utils as a dependency outside Flink fails the build
> -
>
> Key: FLINK-4813
> URL: https://issues.apache.org/jira/browse/FLINK-4813
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>Assignee: Nico Kruber
>
> The {{flink-test-utils}} module depends on {{hadoop-minikdc}}, which has a 
> dependency that is only resolvable if the {{maven-bundle-plugin}} is 
> loaded.
> This is the error message
> {code}
> [ERROR] Failed to execute goal on project quickstart-1.2-tests: Could not 
> resolve dependencies for project 
> com.dataartisans:quickstart-1.2-tests:jar:1.0-SNAPSHOT: Failure to find 
> org.apache.directory.jdbm:apacheds-jdbm1:bundle:2.0.0-M2 in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced -> [Help 1]
> {code}
> {{flink-parent}} loads that plugin, so all "internal" dependents of the 
> test utils can resolve it.
> Right now, users have to use the maven-bundle-plugin to use our test utils 
> externally.
> By making the hadoop-minikdc dependency optional, we can probably resolve the 
> issue. Then only users who want to use the security-related tools in the 
> test utils need to manually add the hadoop-minikdc dependency plus the plugin.
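
Sketched as it might look in the flink-test-utils POM (the version property here is hypothetical):

{code:xml}
<!-- Marked optional so downstream users of flink-test-utils do not
     inherit the minikdc dependency (and its bundle-packaged transitive
     dependency) unless they explicitly ask for it. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-minikdc</artifactId>
  <version>${minikdc.version}</version>
  <optional>true</optional>
</dependency>
{code}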



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5824) Fix String/byte conversion without explicit encodings

2017-02-16 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5824:
--

 Summary: Fix String/byte conversion without explicit encodings
 Key: FLINK-5824
 URL: https://issues.apache.org/jira/browse/FLINK-5824
 Project: Flink
  Issue Type: Bug
  Components: Python API, Queryable State, State Backends, 
Checkpointing, Webfrontend
Reporter: Ufuk Celebi


In a couple of places we convert Strings to bytes and bytes back to Strings 
without explicitly specifying an encoding. This can lead to problems when 
client and server default encodings differ.

The task of this JIRA is to go over the whole project, look for conversions 
where we don't specify an encoding, and fix them to specify UTF-8 explicitly.

For starters, we can {{grep -R 'getBytes()' .}}, which already reveals many 
problematic places.





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3257: [FLINK-5705] [WebMonitor] request/response use UTF-8 expl...

2017-02-16 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3257
  
The fix looks good, thanks!
Merging this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #3190: [FLINK-5546][build] java.io.tmpdir setted as project buil...

2017-02-16 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3190
  
Here is a related issue: https://issues.apache.org/jira/browse/FLINK-5817


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5817) Fix test concurrent execution failure by test dir conflicts.

2017-02-16 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869939#comment-15869939
 ] 

Stephan Ewen commented on FLINK-5817:
-

There is also a pull request that tries to change the tmp directory for tests: 
https://github.com/apache/flink/pull/3190


> Fix test concurrent execution failure by test dir conflicts.
> 
>
> Key: FLINK-5817
> URL: https://issues.apache.org/jira/browse/FLINK-5817
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Wenlong Lyu
>
> Currently, when different users build Flink on the same machine, failures may 
> occur because some test utilities create test files with fixed names, which 
> causes file access to fail when different users process the same file at the 
> same time.
> We have seen errors from AbstractTestBase, IOManagerTest, and FileCacheTest.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5546) java.io.tmpdir setted as project build directory in surefire plugin

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869936#comment-15869936
 ] 

ASF GitHub Bot commented on FLINK-5546:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3190
  
Here is a related issue: https://issues.apache.org/jira/browse/FLINK-5817


> java.io.tmpdir setted as project build directory in surefire plugin
> ---
>
> Key: FLINK-5546
> URL: https://issues.apache.org/jira/browse/FLINK-5546
> Project: Flink
>  Issue Type: Test
>  Components: Build System
> Environment: CentOS 7.2
>Reporter: Syinchwun Leo
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> When multiple Linux users run tests at the same time, the flink-runtime module 
> may fail: User A creates /tmp/cacheFile, and User B then has no permission to 
> access the folder.
> Failed tests: 
> FileCacheDeleteValidationTest.setup:79 Error initializing the test: 
> /tmp/cacheFile (Permission denied)
> Tests in error: 
> IOManagerTest.channelEnumerator:54 » Runtime Could not create storage 
> director...
> Tests run: 1385, Failures: 1, Errors: 1, Skipped: 8
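
The approach under review points the tests' temporary directory at the per-module build directory; sketched as a surefire configuration (the exact placement in the Flink build may differ):

{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>
      <!-- Each user's build writes temp files under its own target/,
           so concurrent builds no longer collide in /tmp. -->
      <java.io.tmpdir>${project.build.directory}</java.io.tmpdir>
    </systemPropertyVariables>
  </configuration>
</plugin>
{code}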



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3313: [FLINK-5588][ml] add a data normalizer to ml library

2017-02-16 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3313
  
But please don't do a force push on a branch which has been opened as a PR. 
If there was a review ongoing, then the force push might invalidate it (e.g. if 
you changed something in the relevant files).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #3302: [FLINK-5710] Add ProcTime() function to indicate StreamSQ...

2017-02-16 Thread huawei-flink
Github user huawei-flink commented on the issue:

https://github.com/apache/flink/pull/3302
  
@fhueske I've addressed most of the points; however, one thing is still not 
clear to me. So far, the procTime() function generates a timestamp. 
My understanding is that this is not correct and it should be something else. 
Could it be a default timestamp (e.g. epoch)? The actual timestamp normalized 
to the second? What is the best option in your opinion?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-5820) Extend State Backend Abstraction to support Global Cleanup Hooks

2017-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-5820:
---

 Summary: Extend State Backend Abstraction to support Global 
Cleanup Hooks
 Key: FLINK-5820
 URL: https://issues.apache.org/jira/browse/FLINK-5820
 Project: Flink
  Issue Type: Improvement
  Components: State Backends, Checkpointing
Affects Versions: 1.2.0
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.3.0


The current state backend abstraction has the limitation that each piece of 
state is only meaningful in the context of its state handle. There is no 
possibility of a view onto "all state associated with checkpoint X".

That causes several issues
  - State might not be cleaned up in the process of failures. When a 
TaskManager hands over a state handle to the JobManager and either of them has 
a failure, the state handle may be lost and state lingers.
  - State might also linger if a cleanup operation failed temporarily, and the 
checkpoint metadata was already disposed
  - State cleanup is more expensive than necessary in many cases. Each state 
handle is individually released. For large jobs, this means 1000s of release 
operations (typically file deletes) per checkpoint, which can be expensive on 
some file systems.
  - It is hard to guarantee cleanup of parent directories with the current 
architecture.

The core changes proposed here are:

  1. Each job has one core {{StateBackend}}. In the future, operators may have 
different {{KeyedStateBackends}} and {{OperatorStateBackends}} to mix and match, 
for example, RocksDB storage and in-memory storage.

  2. The JobManager needs to be aware of the {{StateBackend}}.

  3. Storing checkpoint metadata becomes the responsibility of the state backend, 
not the "completed checkpoint store". The latter only stores pointers to the 
latest available checkpoints (either in process or in ZooKeeper).

  4. The StateBackend may optionally have a hook to drop all checkpointed state 
that belongs to only one specific checkpoint (shared state comes as part of 
incremental checkpointing).

  5.  The StateBackend needs to have a hook to drop all checkpointed state up 
to a specific checkpoint (for all previously discarded checkpoints).

  6. In the future, this must support periodic cleanup hooks that track 
orphaned shared state from incremental checkpoints.

For the {{FsStateBackend}}, which currently stores most of the checkpointed 
state (transitively for RocksDB as well), this means restructuring the storage 
directories as follows:

{code}
..//job1-id/
  /shared/    <-- shared checkpoint data
  /chk-1/...  <-- data exclusive to checkpoint 1
  /chk-2/...  <-- data exclusive to checkpoint 2
  /chk-3/...  <-- data exclusive to checkpoint 3

..//job2-id/
  /shared/...
  /chk-1/...
  /chk-2/...
  /chk-3/...

..//savepoint-1/savepoint-root
 /file-1-uid
 /file-2-uid
 /file-3-uid
 /savepoint-2/savepoint-root
 /file-1-uid
 /file-2-uid
 /file-3-uid
{code}


This is the umbrella issue for the individual steps needed to address this.
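
As an illustration of points 4 and 5, a minimal sketch of what such cleanup 
hooks could look like (the interface name and signatures are ours, not actual 
Flink API):

{code}
// Hypothetical sketch only; names and signatures are illustrative.
public interface CheckpointCleanupHooks {

    /** Drops all checkpointed state that belongs exclusively to one checkpoint. */
    void disposeCheckpoint(long checkpointId) throws Exception;

    /** Drops all checkpointed state up to the given checkpoint,
     *  covering previously discarded checkpoints. */
    void disposeCheckpointsUpTo(long checkpointId) throws Exception;
}
{code}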



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869904#comment-15869904
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3335
  
Thank you for the contribution. I see the idea behind the fix.

I am unsure whether we should let Flink manage the permissions of these 
directories. My gut feeling is that this is something the operational setup 
should take care of. For example, some setups may want to keep whatever 
permission is pre-defined, to make checkpoints accessible to other groups for 
the purpose of cloning/migrating jobs to other clusters.

That is something probably worth of a mailing list discussion.
I would put this on hold until we have a decision there.

If we want this change, we still need to do it a bit differently, as we are 
trying to make Flink work without any dependencies on Hadoop (Hadoop is still 
supported perfectly, but as an optional dependency).
Adding new hard Hadoop dependencies (as done here) is therefore not possible.


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Improvement
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Now the checkpoint directory is created without a specific permission, so it 
> is easy for another user to delete or read files under it, which can cause 
> restore failures or information leaks.
> It's better to lower it to 700.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5710) Add ProcTime() function to indicate StreamSQL

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869902#comment-15869902
 ] 

ASF GitHub Bot commented on FLINK-5710:
---

Github user huawei-flink commented on the issue:

https://github.com/apache/flink/pull/3302
  
@fhueske I've addressed most of the points; however, one thing is not clear to 
me yet. So far, the procTime() function generates a timestamp. My understanding 
is that this is not correct and it should be something else. Could it be a 
default timestamp (e.g., epoch)? The actual timestamp normalized to the second? 
What is the best option in your opinion?


> Add ProcTime() function to indicate StreamSQL
> -
>
> Key: FLINK-5710
> URL: https://issues.apache.org/jira/browse/FLINK-5710
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Stefano Bortoli
>Assignee: Stefano Bortoli
>Priority: Minor
>
> procTime() is a parameterless scalar function that just indicates processing 
> time mode



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869944#comment-15869944
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
Hi Stephan,

You may have a little misunderstanding about this change. It only affects the 
directories named by job id (generated using a UUID), not the configured root 
checkpoint directory. I agree with you that the root directory's permissions 
should be set at setup time, but the setup cannot be aware of the per-job-id 
directories, which are created at runtime.

About the Hadoop dependency: I admit I am using a convenient (let's say hacky) 
way to do the transition, as it needs a bit more code to do it standalone. I 
will change it if it's a problem :)
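
For reference, a minimal sketch of the discussed change for the per-job 
directory, using Hadoop's FileSystem API (which, per the review above, may 
itself have to be replaced; {{rootCheckpointDir}} and {{jobId}} are 
illustrative):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

Path jobCheckpointDir = new Path(rootCheckpointDir, jobId.toString());
FileSystem fs = jobCheckpointDir.getFileSystem(new Configuration());

// rwx------ : only the owning user may read, write, or traverse the directory.
// mkdirs applies the process umask, so setPermission makes the mode explicit.
fs.mkdirs(jobCheckpointDir, new FsPermission((short) 0700));
fs.setPermission(jobCheckpointDir, new FsPermission((short) 0700));
{code}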


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Improvement
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Now the checkpoint directory is created without a specific permission, so it 
> is easy for another user to delete or read files under it, which can cause 
> restore failures or information leaks.
> It's better to lower it to 700.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5819) Improve metrics reporting

2017-02-16 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870011#comment-15870011
 ] 

Greg Hogan commented on FLINK-5819:
---

Some charts were added in FLINK-2730 but had to be removed due to the 
dependency on an incompatibly licensed library.

We should consider whether charts are best left to external software using 
Flink's metrics reporters.

> Improve metrics reporting
> -
>
> Key: FLINK-5819
> URL: https://issues.apache.org/jira/browse/FLINK-5819
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Affects Versions: 1.3.0
>Reporter: Paul Nelligan
>  Labels: web-ui
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When displaying individual metrics for a vertex / node of a job in the webui, 
> it is desirable to add an option to display metrics as a numeric or as a 
> chart.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3322: [FLINK-4813][flink-test-utils] make the hadoop-minikdc de...

2017-02-16 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3322
  
Very good, I had stumbled over that problem as well.

Could you also remove the bundle plugin from the root pom (this should 
speed up dependency resolutions in IDEs by a lot) and instead add the above 
described dependency to the test projects that use the `SecureTestEnvironment`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #3211: [FLINK-5640][test] configure the explicit Unit Test file su...

2017-02-16 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3211
  
Can you briefly explain what this fixes? Currently, it behaves exactly like 
you describe: The "test" phase executes the `*Test.java` and `*Test.scala` 
classes, and the "verify" phase executes the `*ITCase.java` and `*ITCase.scala` 
classes.

Also, the changes to the compiler plugin - what is the purpose and effect 
of setting it to "verbose"?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5640) configure the explicit Unit Test file suffix

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869928#comment-15869928
 ] 

ASF GitHub Bot commented on FLINK-5640:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3211
  
Can you briefly explain what this fixes? Currently, it behaves exactly like 
you describe: The "test" phase executes the `*Test.java` and `*Test.scala` 
classes, and the "verify" phase executes the `*ITCase.java` and `*ITCase.scala` 
classes.

Also, the changes to the compiler plugin - what is the purpose and effect 
of setting it to "verbose"?


> configure the explicit Unit Test file suffix
> 
>
> Key: FLINK-5640
> URL: https://issues.apache.org/jira/browse/FLINK-5640
> Project: Flink
>  Issue Type: Test
>  Components: Tests
>Reporter: shijinkui
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> There are four types of Unit Test file: *ITCase.java, *Test.java, 
> *ITSuite.scala, *Suite.scala
> A file name ending with "ITCase.java" is an integration test. A file name 
> ending with "Test.java" is a unit test.
> It's clear for Surefire plugin of default-test execution to declare that 
> "*Test.*" is Java Unit Test.
> The test file statistics below:
> * Suite  total: 10
> * ITCase  total: 378
> * Test  total: 1008
> * ITSuite  total: 14



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3327: [FLINK-5616] [tests] Harden YarnIntraNonHaMasterServicesT...

2017-02-16 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3327
  
True, what I've done here does not make much sense. When I wrote it, I was 
somehow under the assumption that `verify` would block until the call happened. 
Will correct it. Thanks for the review :-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5616) YarnPreConfiguredMasterHaServicesTest fails sometimes

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869973#comment-15869973
 ] 

ASF GitHub Bot commented on FLINK-5616:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3327
  
True, what I've done here does not make much sense. When I wrote it, I was 
somehow under the assumption that `verify` would block until the call happened. 
Will correct it. Thanks for the review :-)


> YarnPreConfiguredMasterHaServicesTest fails sometimes
> -
>
> Key: FLINK-5616
> URL: https://issues.apache.org/jira/browse/FLINK-5616
> Project: Flink
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Aljoscha Krettek
>Assignee: Till Rohrmann
>  Labels: test-stability
>
> This is the relevant part from the log:
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.flink.yarn.highavailability.YarnPreConfiguredMasterHaServicesTest
> Formatting using clusterid: testClusterID
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.407 sec - 
> in 
> org.apache.flink.yarn.highavailability.YarnPreConfiguredMasterHaServicesTest
> Running 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest
> Formatting using clusterid: testClusterID
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.479 sec <<< 
> FAILURE! - in 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest
> testClosingReportsToLeader(org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest)
>   Time elapsed: 0.836 sec  <<< FAILURE!
> org.mockito.exceptions.verification.WantedButNotInvoked: 
> Wanted but not invoked:
> leaderContender.handleError();
> -> at 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest.testClosingReportsToLeader(YarnIntraNonHaMasterServicesTest.java:120)
> Actually, there were zero interactions with this mock.
>   at 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest.testClosingReportsToLeader(YarnIntraNonHaMasterServicesTest.java:120)
> Running org.apache.flink.yarn.YarnFlinkResourceManagerTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.82 sec - in 
> org.apache.flink.yarn.YarnFlinkResourceManagerTest
> Running org.apache.flink.yarn.YarnClusterDescriptorTest
> java.lang.RuntimeException: Couldn't deploy Yarn cluster
>   at 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:425)
>   at 
> org.apache.flink.yarn.YarnClusterDescriptorTest.testConfigOverwrite(YarnClusterDescriptorTest.java:90)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:483)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> 

[GitHub] flink issue #3313: [FLINK-5588][ml] add a data normalizer to ml library

2017-02-16 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3313
  
@skonto another option, if the master branch has a newer commit, is to 
rebase and force push.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-16 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3335
  
@WangTaoTheTonic HDFS supports posix ACLs 
([link](http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html#ACLs_Access_Control_Lists)).
 These are per-file/directory and inherited. Does this meet your requirements?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5588) Add a unit scaler based on different norms

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1587#comment-1587
 ] 

ASF GitHub Bot commented on FLINK-5588:
---

Github user skonto commented on the issue:

https://github.com/apache/flink/pull/3313
  
OK, sorry for that; I did squash the commits. I am also used to it from other 
projects, where the comments are invalidated.


> Add a unit scaler based on different norms
> --
>
> Key: FLINK-5588
> URL: https://issues.apache.org/jira/browse/FLINK-5588
> Project: Flink
>  Issue Type: New Feature
>  Components: Machine Learning Library
>Reporter: Stavros Kontopoulos
>Assignee: Stavros Kontopoulos
>Priority: Minor
>
> So far, ML has two scalers: min-max and the standard scaler.
> A third one, frequently used, is the unit scaler.
> We could implement a transformer for this type of scaling, for the different 
> norms available to the user.
> I will make a separate class for the per-sample normalization procedure using 
> the Transformer API, because it is easy to add; the fit method does nothing 
> in this case.
> Scikit-learn also has some calls available outside the Transform API; we 
> might want to add that in the future.
> These calls work on any axis, but they are not re-usable in a pipeline [4].
> Right now, the existing scalers in Flink ML support per-feature normalization 
> using the Transformer API. 
> Resources
> [1] https://en.wikipedia.org/wiki/Feature_scaling
> [2] 
> http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
> [3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
> [4] http://scikit-learn.org/stable/modules/preprocessing.html
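
To illustrate what per-sample scaling to unit norm means, a generic sketch (not 
the PR's actual code):

{code}
/** Scales one sample to unit L2 norm; other p-norms work analogously. */
static double[] normalizeToUnit(double[] x) {
    double norm = 0.0;
    for (double v : x) {
        norm += v * v;
    }
    norm = Math.sqrt(norm);
    if (norm == 0.0) {
        return x.clone(); // leave the zero vector unchanged
    }
    double[] out = new double[x.length];
    for (int i = 0; i < x.length; i++) {
        out[i] = x[i] / norm;
    }
    return out;
}
{code}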



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5616) YarnPreConfiguredMasterHaServicesTest fails sometimes

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869569#comment-15869569
 ] 

ASF GitHub Bot commented on FLINK-5616:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3327
  
I am wondering if this preserves the original meaning of the timeout.

I thought that using `verify` with a timeout will wait for a while until the 
call must have happened. Without the timeout, the verification will fail if the 
call has not happened by the time the verification runs.

Adding the timeout to the test as a whole only means that the test may not take 
longer than that timeout.
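
For reference, a minimal sketch of the difference (Mockito's actual 
`verify`/`timeout` API; the mock is illustrative):

{code}
import static org.mockito.Mockito.*;

Runnable task = mock(Runnable.class);

// waits up to 1000 ms for run() to be invoked, failing only if it never happens
verify(task, timeout(1000)).run();

// fails immediately if run() has not been invoked by the time this line runs
verify(task).run();
{code}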


> YarnPreConfiguredMasterHaServicesTest fails sometimes
> -
>
> Key: FLINK-5616
> URL: https://issues.apache.org/jira/browse/FLINK-5616
> Project: Flink
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Aljoscha Krettek
>Assignee: Till Rohrmann
>  Labels: test-stability
>
> This is the relevant part from the log:
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.flink.yarn.highavailability.YarnPreConfiguredMasterHaServicesTest
> Formatting using clusterid: testClusterID
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.407 sec - 
> in 
> org.apache.flink.yarn.highavailability.YarnPreConfiguredMasterHaServicesTest
> Running 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest
> Formatting using clusterid: testClusterID
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.479 sec <<< 
> FAILURE! - in 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest
> testClosingReportsToLeader(org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest)
>   Time elapsed: 0.836 sec  <<< FAILURE!
> org.mockito.exceptions.verification.WantedButNotInvoked: 
> Wanted but not invoked:
> leaderContender.handleError();
> -> at 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest.testClosingReportsToLeader(YarnIntraNonHaMasterServicesTest.java:120)
> Actually, there were zero interactions with this mock.
>   at 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest.testClosingReportsToLeader(YarnIntraNonHaMasterServicesTest.java:120)
> Running org.apache.flink.yarn.YarnFlinkResourceManagerTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.82 sec - in 
> org.apache.flink.yarn.YarnFlinkResourceManagerTest
> Running org.apache.flink.yarn.YarnClusterDescriptorTest
> java.lang.RuntimeException: Couldn't deploy Yarn cluster
>   at 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:425)
>   at 
> org.apache.flink.yarn.YarnClusterDescriptorTest.testConfigOverwrite(YarnClusterDescriptorTest.java:90)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:483)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> 

[jira] [Commented] (FLINK-3687) org.apache.flink.runtime.net.ConnectionUtilsTest fails

2017-02-16 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869604#comment-15869604
 ] 

Till Rohrmann commented on FLINK-3687:
--

Another instance: https://api.travis-ci.org/jobs/201937905/log.txt?deansi=true

> org.apache.flink.runtime.net.ConnectionUtilsTest fails
> --
>
> Key: FLINK-3687
> URL: https://issues.apache.org/jira/browse/FLINK-3687
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime, Tests
>Reporter: Nikolaas Steenbergen
>  Labels: test-stability
>
> {code}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.51 sec <<< 
> FAILURE! - in org.apache.flink.runtime.net.ConnectionUtilsTest
> testReturnLocalHostAddressUsingHeuristics(org.apache.flink.runtime.net.ConnectionUtilsTest)
>   Time elapsed: 0.504 sec  <<< FAILURE!
> java.lang.AssertionError: 
> expected: 
> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.net.ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics(ConnectionUtilsTest.java:59)
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.554 sec - 
> in org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest
> Results :
> Failed tests: 
>   ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics:59 
> expected: 
> but was:
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-3687) org.apache.flink.runtime.net.ConnectionUtilsTest fails

2017-02-16 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-3687:
-
Priority: Critical  (was: Major)

> org.apache.flink.runtime.net.ConnectionUtilsTest fails
> --
>
> Key: FLINK-3687
> URL: https://issues.apache.org/jira/browse/FLINK-3687
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime, Tests
>Reporter: Nikolaas Steenbergen
>Priority: Critical
>  Labels: test-stability
>
> {code}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.51 sec <<< 
> FAILURE! - in org.apache.flink.runtime.net.ConnectionUtilsTest
> testReturnLocalHostAddressUsingHeuristics(org.apache.flink.runtime.net.ConnectionUtilsTest)
>   Time elapsed: 0.504 sec  <<< FAILURE!
> java.lang.AssertionError: 
> expected: 
> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.net.ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics(ConnectionUtilsTest.java:59)
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.554 sec - 
> in org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest
> Results :
> Failed tests: 
>   ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics:59 
> expected: 
> but was:
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-3687) org.apache.flink.runtime.net.ConnectionUtilsTest fails

2017-02-16 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-3687:
-
Labels: test-stability  (was: )

> org.apache.flink.runtime.net.ConnectionUtilsTest fails
> --
>
> Key: FLINK-3687
> URL: https://issues.apache.org/jira/browse/FLINK-3687
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime, Tests
>Reporter: Nikolaas Steenbergen
>  Labels: test-stability
>
> {code}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.51 sec <<< 
> FAILURE! - in org.apache.flink.runtime.net.ConnectionUtilsTest
> testReturnLocalHostAddressUsingHeuristics(org.apache.flink.runtime.net.ConnectionUtilsTest)
>   Time elapsed: 0.504 sec  <<< FAILURE!
> java.lang.AssertionError: 
> expected: 
> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.net.ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics(ConnectionUtilsTest.java:59)
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.554 sec - 
> in org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest
> Results :
> Failed tests: 
>   ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics:59 
> expected: 
> but was:
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3324: [backport] [FLINK-5773] Use akka.actor.Status.Failure cla...

2017-02-16 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3324
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5811) Harden YarnClusterDescriptorTest

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869646#comment-15869646
 ] 

ASF GitHub Bot commented on FLINK-5811:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3326
  
I think this looks good!


> Harden YarnClusterDescriptorTest
> 
>
> Key: FLINK-5811
> URL: https://issues.apache.org/jira/browse/FLINK-5811
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Trivial
>
> The {{YarnClusterDescriptorTest}} is expecting in some test cases an 
> {{Exception}} but does not fail if the exception is not thrown. Moreover it 
> prints the stack trace of the expected exception.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3326: [FLINK-5811] [tests] Harden YarnClusterDescriptorTest

2017-02-16 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3326
  
I think this looks good!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5817) Fix test concurrent execution failure by test dir conflicts.

2017-02-16 Thread Wenlong Lyu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869704#comment-15869704
 ] 

Wenlong Lyu commented on FLINK-5817:


OK, it seems good. I will try it instead of adding a random suffix.

> Fix test concurrent execution failure by test dir conflicts.
> 
>
> Key: FLINK-5817
> URL: https://issues.apache.org/jira/browse/FLINK-5817
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Wenlong Lyu
>
> Currently, when different users build Flink on the same machine, failures may 
> occur because some test utilities create test files using fixed names, which 
> causes file access to fail when different users process the same file at the 
> same time.
> We have found errors from AbstractTestBase, IOManagerTest, FileCacheTest.
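
Presumably the fix refers to per-test unique directories; a minimal sketch with 
JUnit's TemporaryFolder rule (class and method names are illustrative):

{code}
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

public class IsolatedDirTest {

    // creates a fresh, uniquely named directory per test method,
    // so concurrent builds by different users cannot collide
    @Rule
    public final TemporaryFolder tempFolder = new TemporaryFolder();

    @Test
    public void testWithIsolatedDir() throws Exception {
        java.io.File cacheDir = tempFolder.newFolder("cache");
        // ... run the test against cacheDir instead of a fixed /tmp path
    }
}
{code}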



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-1725) New Partitioner for better load balancing for skewed data

2017-02-16 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869741#comment-15869741
 ] 

Aljoscha Krettek commented on FLINK-1725:
-

I think this issue is a bit more complicated than it looks on the surface: 
Flink uses the partitioner to partition both state and incoming elements. It is 
a strict requirement that this partitioning is always the same, especially 
across versions and when restoring savepoints with a different parallelism, 
which causes state to be reshuffled according to the partitioner.

Simply adding a new partitioner is easy, but if you cannot use keyed state, what 
do you use it for? And if you don't have keyed state, then a simple round-robin 
or random reshuffle will do the trick.

What do you think, [~srichter]?

> New Partitioner for better load balancing for skewed data
> -
>
> Key: FLINK-1725
> URL: https://issues.apache.org/jira/browse/FLINK-1725
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 0.8.1
>Reporter: Anis Nasir
>Assignee: Anis Nasir
>  Labels: LoadBalancing, Partitioner
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Hi,
> We have recently studied the problem of load balancing in Storm [1].
> In particular, we focused on key distribution of the stream for skewed data.
> We developed a new stream partitioning scheme (which we call Partial Key 
> Grouping). It achieves better load balancing than key grouping while being 
> more scalable than shuffle grouping in terms of memory.
> In the paper we show a number of mining algorithms that are easy to implement 
> with partial key grouping, and whose performance can benefit from it. We 
> think that it might also be useful for a larger class of algorithms.
> Partial key grouping is very easy to implement: it requires just a few lines 
> of code in Java when implemented as a custom grouping in Storm [2].
> For all these reasons, we believe it will be a nice addition to the standard 
> Partitioners available in Flink. If the community thinks it's a good idea, we 
> will be happy to offer support in the porting.
> References:
> [1]. 
> https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf
> [2]. https://github.com/gdfm/partial-key-grouping



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-1725) New Partitioner for better load balancing for skewed data

2017-02-16 Thread Anis Nasir (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anis Nasir updated FLINK-1725:
--

I guess the idea is to avoid shuffle grouping. For instance, if you have a
cluster of 1000 machines, you might not want to broadcast a message to all of
them and reduce the key from all the workers. This scheme allows you to split
each key over only two workers and still get similar performance. The same
intuition applies to the memory requirements of shuffle grouping as well.

Cheers,
Anis
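
To make the scheme concrete, a rough sketch of the "power of both choices" 
selection (illustrative only, not the paper's or Flink's actual code):

{code}
/** Each key hashes to two candidate channels; each record is sent to the
 *  currently less loaded of the two. The hash functions are illustrative. */
class PartialKeyGrouping {
    private final long[] load; // records sent per downstream channel so far

    PartialKeyGrouping(int numChannels) {
        this.load = new long[numChannels];
    }

    int selectChannel(Object key) {
        int h1 = Math.floorMod(key.hashCode(), load.length);
        int h2 = Math.floorMod(key.hashCode() * 0x9E3779B9, load.length);
        int target = load[h1] <= load[h2] ? h1 : h2;
        load[target]++;
        return target;
    }
}
{code}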






> New Partitioner for better load balancing for skewed data
> -
>
> Key: FLINK-1725
> URL: https://issues.apache.org/jira/browse/FLINK-1725
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 0.8.1
>Reporter: Anis Nasir
>Assignee: Anis Nasir
>  Labels: LoadBalancing, Partitioner
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Hi,
> We have recently studied the problem of load balancing in Storm [1].
> In particular, we focused on key distribution of the stream for skewed data.
> We developed a new stream partitioning scheme (which we call Partial Key 
> Grouping). It achieves better load balancing than key grouping while being 
> more scalable than shuffle grouping in terms of memory.
> In the paper we show a number of mining algorithms that are easy to implement 
> with partial key grouping, and whose performance can benefit from it. We 
> think that it might also be useful for a larger class of algorithms.
> Partial key grouping is very easy to implement: it requires just a few lines 
> of code in Java when implemented as a custom grouping in Storm [2].
> For all these reasons, we believe it will be a nice addition to the standard 
> Partitioners available in Flink. If the community thinks it's a good idea, we 
> will be happy to offer support in the porting.
> References:
> [1]. 
> https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf
> [2]. https://github.com/gdfm/partial-key-grouping



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-16 Thread Tao Wang (JIRA)
Tao Wang created FLINK-5818:
---

 Summary: change checkpoint dir permission to 700 for security 
reason
 Key: FLINK-5818
 URL: https://issues.apache.org/jira/browse/FLINK-5818
 Project: Flink
  Issue Type: Improvement
  Components: Security, State Backends, Checkpointing
Reporter: Tao Wang


Now the checkpoint directory is created without a specific permission, so it is 
easy for another user to delete or read files under it, which can cause restore 
failures or information leaks.

It's better to lower it to 700.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5821) Create StateBackend root interface

2017-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-5821:
---

 Summary: Create StateBackend root interface
 Key: FLINK-5821
 URL: https://issues.apache.org/jira/browse/FLINK-5821
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.3.0


This requires renaming the {{StateBackend}} interface to {{StateBinder}} (as 
previously suggested) and creating a new {{StateBackend}} interface as the 
superinterface of {{AbstractStateBackend}}.
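
A minimal sketch of the resulting hierarchy (illustrative; the issue text fixes 
only the names, not the members):

{code}
// The old StateBackend interface (state creation) is renamed:
public interface StateBinder { /* formerly StateBackend */ }

// A new root interface takes over the old name:
public interface StateBackend { }

// The existing abstract class plugs in under the new root interface:
public abstract class AbstractStateBackend implements StateBackend, java.io.Serializable {
    // existing functionality remains here
}
{code}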



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5822) Make Checkpoint Coordinator aware of State Backend

2017-02-16 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-5822:
---

 Summary: Make Checkpoint Coordinator aware of State Backend
 Key: FLINK-5822
 URL: https://issues.apache.org/jira/browse/FLINK-5822
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.3.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-5824) Fix String/byte conversions without explicit encoding

2017-02-16 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi updated FLINK-5824:
---
Summary: Fix String/byte conversions without explicit encoding  (was: Fix 
String/byte conversion without explicit encodings)

> Fix String/byte conversions without explicit encoding
> -
>
> Key: FLINK-5824
> URL: https://issues.apache.org/jira/browse/FLINK-5824
> Project: Flink
>  Issue Type: Bug
>  Components: Python API, Queryable State, State Backends, 
> Checkpointing, Webfrontend
>Reporter: Ufuk Celebi
>
> In a couple of places we convert Strings to bytes and bytes back to Strings 
> without explicitly specifying an encoding. This can lead to problems when 
> client and server default encodings differ.
> The task of this JIRA is to go over the whole project and look for 
> conversions where we don't specify an encoding and fix it to specify UTF-8 
> explicitly.
> For starters, we can {{grep -R 'getBytes()' .}}, which already reveals many 
> problematic places.
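
For illustration, the problematic pattern and its fix (standard JDK API):

{code}
import java.nio.charset.StandardCharsets;

String s = "example";
byte[] bad  = s.getBytes();                        // uses the platform default encoding
byte[] good = s.getBytes(StandardCharsets.UTF_8);  // explicit UTF-8

String roundTrip = new String(good, StandardCharsets.UTF_8); // explicit on decode, too
{code}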



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5705) webmonitor's request/response use UTF-8 explicitly

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869923#comment-15869923
 ] 

ASF GitHub Bot commented on FLINK-5705:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3257
  
The fix looks good, thanks!
Merging this...


> webmonitor's request/response use UTF-8 explicitly
> --
>
> Key: FLINK-5705
> URL: https://issues.apache.org/jira/browse/FLINK-5705
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Reporter: shijinkui
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> QueryStringDecoder and HttpPostRequestDecoder use UTF-8 defined in flink.
> Response set content-encoding header with utf-8
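
A minimal sketch of the decoder side (Netty 4 API; {{uri}} stands for the 
incoming request URI):

{code}
import io.netty.handler.codec.http.QueryStringDecoder;
import java.nio.charset.StandardCharsets;

// decode request parameters with an explicit charset instead of the platform default
QueryStringDecoder decoder = new QueryStringDecoder(uri, StandardCharsets.UTF_8);
{code}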



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-16 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
Hi Stephan,

You may have a little misunderstanding about this change. It only affects the 
directories named by job id (generated using a UUID), not the configured root 
checkpoint directory. I agree with you that the root directory's permissions 
should be set at setup time, but the setup cannot be aware of the per-job-id 
directories, which are created at runtime.

About the Hadoop dependency: I admit I am using a convenient (let's say hacky) 
way to do the transition, as it needs a bit more code to do it standalone. I 
will change it if it's a problem :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5624) Support tumbling window on streaming tables in the SQL API

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869977#comment-15869977
 ] 

ASF GitHub Bot commented on FLINK-5624:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3252
  
Thanks for the update @haohui.
PR is good to merge.


> Support tumbling window on streaming tables in the SQL API
> --
>
> Key: FLINK-5624
> URL: https://issues.apache.org/jira/browse/FLINK-5624
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> This is a follow up of FLINK-4691.
> FLINK-4691 adds supports for group-windows for streaming tables. This jira 
> proposes to expose the functionality in the SQL layer via the {{GROUP BY}} 
> clauses, as described in 
> http://calcite.apache.org/docs/stream.html#tumbling-windows.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5588) Add a unit scaler based on different norms

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869984#comment-15869984
 ] 

ASF GitHub Bot commented on FLINK-5588:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3313
  
But please don't do a force push on a branch which has been opened as a PR. 
If there was a review ongoing, then the force push might invalidate it (e.g. if 
you changed something in the relevant files).


> Add a unit scaler based on different norms
> --
>
> Key: FLINK-5588
> URL: https://issues.apache.org/jira/browse/FLINK-5588
> Project: Flink
>  Issue Type: New Feature
>  Components: Machine Learning Library
>Reporter: Stavros Kontopoulos
>Assignee: Stavros Kontopoulos
>Priority: Minor
>
> So far, ML has two scalers: min-max and the standard scaler.
> A third one, frequently used, is the unit scaler.
> We could implement a transformer for this type of scaling, for the different 
> norms available to the user.
> I will make a separate class for the per-sample normalization procedure using 
> the Transformer API, because it is easy to add; the fit method does nothing 
> in this case.
> Scikit-learn also has some calls available outside the Transform API; we 
> might want to add that in the future.
> These calls work on any axis, but they are not re-usable in a pipeline [4].
> Right now, the existing scalers in Flink ML support per-feature normalization 
> using the Transformer API. 
> Resources
> [1] https://en.wikipedia.org/wiki/Feature_scaling
> [2] 
> http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
> [3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
> [4] http://scikit-learn.org/stable/modules/preprocessing.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3313: [FLINK-5588][ml] add a data normalizer to ml library

2017-02-16 Thread skonto
Github user skonto commented on the issue:

https://github.com/apache/flink/pull/3313
  
OK, sorry for that; I did squash the commits. I am also used to it from other 
projects, where the comments are invalidated.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-5814) flink-dist creates wrong symlink when not used with cleaned before

2017-02-16 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-5814:
--

 Summary: flink-dist creates wrong symlink when not used with 
cleaned before
 Key: FLINK-5814
 URL: https://issues.apache.org/jira/browse/FLINK-5814
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.2.0
Reporter: Nico Kruber
Assignee: Nico Kruber
Priority: Minor


If {{/build-target}} already exists, 'mvn package' for flink-dist  
will create a symbolic link *inside* {{/build-target}} instead of 
replacing that symlink. This is due to the behaviour of {{ln \-sf}} for target 
links that point to directories and may be solved by adding the 
{{--no-dereference}} parameter.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3327: [FLINK-5616] [tests] Harden YarnIntraNonHaMasterServicesT...

2017-02-16 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3327
  
I am wondering if this preserves the original meaning of the timeout.

I thought that using `verify` with a timeout will wait for a while until the 
call must have happened. Without the timeout, the verification will fail if the 
call has not happened by the time the verification runs.

Adding the timeout to the test as a whole only means that the test may not take 
longer than that timeout.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (FLINK-5751) 404 in documentation

2017-02-16 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi closed FLINK-5751.
--
   Resolution: Fixed
Fix Version/s: 1.2.1
   1.3.0

67e3d9d (release-1.2), 67e3d9d (master).

> 404 in documentation
> 
>
> Key: FLINK-5751
> URL: https://issues.apache.org/jira/browse/FLINK-5751
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Colin Breame
>Priority: Trivial
> Fix For: 1.3.0, 1.2.1
>
>
> This page:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/quickstart/setup_quickstart.html
> Contains a link with title "Flink on Windows" with URL:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows
> This gives a 404.  It should be:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-1725) New Partitioner for better load balancing for skewed data

2017-02-16 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-1725:

Component/s: (was: Local Runtime)
 DataStream API

> New Partitioner for better load balancing for skewed data
> -
>
> Key: FLINK-1725
> URL: https://issues.apache.org/jira/browse/FLINK-1725
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 0.8.1
>Reporter: Anis Nasir
>Assignee: Anis Nasir
>  Labels: LoadBalancing, Partitioner
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Hi,
> We have recently studied the problem of load balancing in Storm [1].
> In particular, we focused on key distribution of the stream for skewed data.
> We developed a new stream partitioning scheme (which we call Partial Key 
> Grouping). It achieves better load balancing than key grouping while being 
> more scalable than shuffle grouping in terms of memory.
> In the paper we show a number of mining algorithms that are easy to implement 
> with partial key grouping, and whose performance can benefit from it. We 
> think that it might also be useful for a larger class of algorithms.
> Partial key grouping is very easy to implement: it requires just a few lines 
> of code in Java when implemented as a custom grouping in Storm [2].
> For all these reasons, we believe it will be a nice addition to the standard 
> Partitioners available in Flink. If the community thinks it's a good idea, we 
> will be happy to offer support in the porting.
> References:
> [1]. 
> https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf
> [2]. https://github.com/gdfm/partial-key-grouping



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3333: Webui/watermarks

2017-02-16 Thread nellboy
GitHub user nellboy opened a pull request:

https://github.com/apache/flink/pull/

Webui/watermarks

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [ ] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/nellboy/flink webui/watermarks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #


commit 6e7f5fb6778f3bcf81953b6ddd4c78b07da849d0
Author: paul 
Date:   2017-02-16T10:25:45Z

[FLINK-3427] [webui] implements watermarks tab

commit 8a7afaf6e32389d2617ce832d973a83b948d6732
Author: paul 
Date:   2017-02-16T10:57:26Z

[FLINK-3427] [webui] display watermarks index only instead of full id




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5751) 404 in documentation

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869734#comment-15869734
 ] 

ASF GitHub Bot commented on FLINK-5751:
---

GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/3332

[FLINK-5751] [docs] Add link check script

This is a slightly adjusted script from @patricklucas as posted in 
FLINK-5751.

I was wondering whether it makes sense to have it inside the repo; the 
local workflow would be:
```sh
./build_docs.sh -p
./check_links.sh
```

You can overwrite the URL to check like `./check_links.sh 
https://ci.apache.org/projects/flink/flink-docs-master/`.

@rmetzger @patricklucas Should we include this?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/flink check_links

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3332.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3332


commit 50a879d8d18d009787fb6c67d78e0e5b79e010a9
Author: Ufuk Celebi 
Date:   2017-02-16T10:58:35Z

[FLINK-5751] [docs] Add link check script




> 404 in documentation
> 
>
> Key: FLINK-5751
> URL: https://issues.apache.org/jira/browse/FLINK-5751
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Colin Breame
>Priority: Trivial
> Fix For: 1.3.0, 1.2.1
>
>
> This page:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/quickstart/setup_quickstart.html
> Contains a link with title "Flink on Windows" with URL:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows
> This gives a 404.  It should be:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3332: [FLINK-5751] [docs] Add link check script

2017-02-16 Thread uce
GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/3332

[FLINK-5751] [docs] Add link check script

This is a slightly adjusted script from @patricklucas as posted in 
FLINK-5751.

I was wondering whether it makes sense to have it inside the repo; the 
local workflow would be:
```sh
./build_docs.sh -p
./check_links.sh
```

You can overwrite the URL to check like `./check_links.sh 
https://ci.apache.org/projects/flink/flink-docs-master/`.

@rmetzger @patricklucas Should we include this?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/flink check_links

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3332.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3332


commit 50a879d8d18d009787fb6c67d78e0e5b79e010a9
Author: Ufuk Celebi 
Date:   2017-02-16T10:58:35Z

[FLINK-5751] [docs] Add link check script




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3331: [FLINK-5814] fix packaging flink-dist in unclean s...

2017-02-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3331


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5814) flink-dist creates wrong symlink when not used with cleaned before

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869824#comment-15869824
 ] 

ASF GitHub Bot commented on FLINK-5814:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3331


> flink-dist creates wrong symlink when not used with cleaned before
> --
>
> Key: FLINK-5814
> URL: https://issues.apache.org/jira/browse/FLINK-5814
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.2.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> If {{/build-target}} already exists, 'mvn package' for flink-dist  
> will create a symbolic link *inside* {{/build-target}} instead of 
> replacing that symlink. This is due to the behaviour of {{ln \-sf}} when the 
> link target is an existing symlink to a directory, and may be solved by 
> adding the {{--no-dereference}} parameter.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5588) Add a unit scaler based on different norms

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869518#comment-15869518
 ] 

ASF GitHub Bot commented on FLINK-5588:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3313
  
Hi @skonto, if you have activated the Travis integration for your own repo, 
then you can restart the testing job there. For the apache account that is not 
possible. However, I've seen that only the last build ran out of time. We're 
currently at the very limit of the runtime Travis allows. That is a known 
issue, and we will try to address it soon.


> Add a unit scaler based on different norms
> --
>
> Key: FLINK-5588
> URL: https://issues.apache.org/jira/browse/FLINK-5588
> Project: Flink
>  Issue Type: New Feature
>  Components: Machine Learning Library
>Reporter: Stavros Kontopoulos
>Assignee: Stavros Kontopoulos
>Priority: Minor
>
> So far ML has two scalers: min-max and the standard scaler.
> A third one, frequently used, is the scaler to unit norm.
> We could implement a transformer for this type of scaling, with different 
> norms available to the user.
> I will make a separate class for the per-sample normalization procedure using 
> the Transformer API, because it is easy to add; the fit method does nothing 
> in this case.
> Scikit-learn also has some calls available outside the Transformer API; we 
> might want to add those in the future.
> These calls work on any axis, but they are not re-usable in a pipeline [4].
> Right now the existing scalers in Flink ML support per-feature normalization 
> via the Transformer API. 
> Resources
> [1] https://en.wikipedia.org/wiki/Feature_scaling
> [2] 
> http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
> [3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
> [4] http://scikit-learn.org/stable/modules/preprocessing.html
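
For a concrete picture of the per-sample scaling described above, here is a 
minimal, self-contained sketch (illustrative only; it is independent of the 
Transformer API and all names are made up): each vector is divided by its 
p-norm so that the result has unit norm.

{code}
// Illustrative per-sample unit scaling: divide a vector by its p-norm so
// that the scaled vector has norm 1 (zero vectors are returned unchanged).
public class UnitNormScaler {

    public static double[] scaleToUnit(double[] vector, double p) {
        double norm = 0.0;
        for (double v : vector) {
            norm += Math.pow(Math.abs(v), p);
        }
        norm = Math.pow(norm, 1.0 / p);
        if (norm == 0.0) {
            return vector.clone();  // nothing to scale
        }
        double[] scaled = new double[vector.length];
        for (int i = 0; i < vector.length; i++) {
            scaled[i] = vector[i] / norm;
        }
        return scaled;
    }

    public static void main(String[] args) {
        // L2 example: (3, 4) has norm 5, so the result is (0.6, 0.8)
        double[] x = scaleToUnit(new double[] {3.0, 4.0}, 2.0);
        System.out.println(x[0] + ", " + x[1]);
    }
}
{code}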



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3313: [FLINK-5588][ml] add a data normalizer to ml library

2017-02-16 Thread tillrohrmann
Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3313
  
Hi @skonto, if you have activated the Travis integration for your own repo, 
then you can restart the testing job there. For the apache account that is not 
possible. However, I've seen that only the last build ran out of time. We're 
currently at the very limit of the runtime Travis allows. That is a known 
issue, and we will try to address it soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-5815) Add resource files configuration for Yarn Mode

2017-02-16 Thread Wenlong Lyu (JIRA)
Wenlong Lyu created FLINK-5815:
--

 Summary: Add resource files configuration for Yarn Mode
 Key: FLINK-5815
 URL: https://issues.apache.org/jira/browse/FLINK-5815
 Project: Flink
  Issue Type: Improvement
  Components: Client, Distributed Coordination
Reporter: Wenlong Lyu
Assignee: Wenlong Lyu


Currently in Flink, when we want to register a resource file with the 
distributed cache, we need to make the file accessible remotely via a URL, and 
maintaining a service for that is often difficult. What's more, when we want to 
add extra jar files to the job classpath, we need to copy them to the blob 
server when submitting the job graph. On YARN, and especially in FLIP-6, the 
blob server is not yet running when we try to start a Flink job.
YARN has an efficient distributed cache implementation for the applications 
running on it; moreover, files stored in HDFS can easily be shared across 
applications through that cache without extra IO operations.
I suggest introducing -yfiles, -ylibjars and -yarchives options in FlinkYarnCLI 
so that YARN users can set up their job resource files via the YARN distributed 
cache. The options are compatible with those used in MapReduce, which makes 
them easy to pick up for YARN users, who generally have MapReduce experience.
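
To make the proposal concrete, below is a hedged sketch of what the client 
would do per file under the hood, using the Hadoop YARN API (the surrounding 
wiring is omitted and the method and variable names are made up):

{code}
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public class YarnCacheSketch {

    // Registers one HDFS file in a container's local resources, which is what
    // YARN's distributed cache builds on: the NodeManager downloads the file
    // once per node and symlinks it into each container's working directory.
    static void addToLocalResources(
            Map<String, LocalResource> localResources,
            Path hdfsPath,
            String linkName,
            Configuration conf) throws IOException {

        FileStatus status = FileSystem.get(conf).getFileStatus(hdfsPath);

        LocalResource resource = Records.newRecord(LocalResource.class);
        resource.setResource(ConverterUtils.getYarnUrlFromPath(hdfsPath));
        resource.setSize(status.getLen());
        resource.setTimestamp(status.getModificationTime());
        resource.setType(LocalResourceType.FILE);  // ARCHIVE for -yarchives
        resource.setVisibility(LocalResourceVisibility.APPLICATION);

        localResources.put(linkName, resource);
    }
}
{code}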



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5814) flink-dist creates wrong symlink when not used with cleaned before

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869565#comment-15869565
 ] 

ASF GitHub Bot commented on FLINK-5814:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3331
  
+1 merging this...


> flink-dist creates wrong symlink when not used with cleaned before
> --
>
> Key: FLINK-5814
> URL: https://issues.apache.org/jira/browse/FLINK-5814
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.2.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> If {{/build-target}} already exists, 'mvn package' for flink-dist  
> will create a symbolic link *inside* {{/build-target}} instead of 
> replacing that symlink. This is due to the behaviour of {{ln \-sf}} when the 
> link target is an existing symlink to a directory, and may be solved by 
> adding the {{--no-dereference}} parameter.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5817) Fix test concurrent execution failure by test dir conflicts.

2017-02-16 Thread Wenlong Lyu (JIRA)
Wenlong Lyu created FLINK-5817:
--

 Summary: Fix test concurrent execution failure by test dir 
conflicts.
 Key: FLINK-5817
 URL: https://issues.apache.org/jira/browse/FLINK-5817
 Project: Flink
  Issue Type: Bug
Reporter: Wenlong Lyu
Assignee: Wenlong Lyu


Currently, when different users build Flink on the same machine, failures may 
happen because some test utilities create test files with fixed names, which 
makes file access fail when different users process the same file at the same 
time.
We have found such errors in AbstractTestBase, IOManagerTest and FileCacheTest.
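
A minimal sketch of the usual remedy (illustrative only, names made up): derive 
each test directory from a unique temporary path per run instead of a fixed 
name, so concurrent builds by different users cannot collide.

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TestDirs {

    // Instead of a fixed path such as /tmp/flink-test, create a fresh
    // directory per test run; the JDK guarantees a unique name.
    public static Path newTestDir() throws IOException {
        return Files.createTempDirectory("flink-test-");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(newTestDir());  // e.g. /tmp/flink-test-8123550359
    }
}
{code}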



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (FLINK-5816) I would like to stream data from MongoDB using Flink's source function. Is this possible? Thank you.

2017-02-16 Thread Pedro Lima Monteiro (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869597#comment-15869597
 ] 

Pedro Lima Monteiro edited comment on FLINK-5816 at 2/16/17 9:24 AM:
-

Hello Fabian,

Thank you for the fast response. I will send an e-mail and close this issue as 
requested.

Pedro


was (Author: pedromlm):
Hello Fabian,

Thank you for the fast response. I will send an e-mail and close this issue as 
requested.

> I would like to stream data from MongoDB using Flink's source function. Is 
> this possible? Thank you.
> 
>
> Key: FLINK-5816
> URL: https://issues.apache.org/jira/browse/FLINK-5816
> Project: Flink
>  Issue Type: Wish
>  Components: DataStream API
> Environment: Flink, MongoDB, Kafka
>Reporter: Pedro Lima Monteiro
>  Labels: mongodb, stream
>
> I am trying to get data from MongoDB to be analysed in Flink.
> I would like to know if it is possible to stream data from MongoDB into 
> Flink. I have looked into implementing Flink's source function and passing it 
> to the addSource method of the StreamExecutionEnvironment, but I had no luck.
> Can anyone help me out? 
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (FLINK-5816) I would like to stream data from MongoDB using Flink's source function. Is this possible? Thank you.

2017-02-16 Thread Pedro Lima Monteiro (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pedro Lima Monteiro closed FLINK-5816.
--
Resolution: Invalid

> I would like to stream data from MongoDB using Flink's source function. Is 
> this possible? Thank you.
> 
>
> Key: FLINK-5816
> URL: https://issues.apache.org/jira/browse/FLINK-5816
> Project: Flink
>  Issue Type: Wish
>  Components: DataStream API
> Environment: Flink, MongoDB, Kafka
>Reporter: Pedro Lima Monteiro
>  Labels: mongodb, stream
>
> I am trying to get data from MongoDB to be analysed in Flink.
> I would like to know if it is possible to stream data from MongoDB into 
> Flink. I have looked into implementing Flink's source function and passing it 
> to the addSource method of the StreamExecutionEnvironment, but I had no luck.
> Can anyone help me out? 
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5816) I would like to stream data from MongoDB using Flink's source function. Is this possible? Thank you.

2017-02-16 Thread Pedro Lima Monteiro (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869597#comment-15869597
 ] 

Pedro Lima Monteiro commented on FLINK-5816:


Hello Fabian,

Thank you for the fast response. I will send an e-mail and close this issue as 
requested.

> I would like to stream data from MongoDB using Flink's source function. Is 
> this possible? Thank you.
> 
>
> Key: FLINK-5816
> URL: https://issues.apache.org/jira/browse/FLINK-5816
> Project: Flink
>  Issue Type: Wish
>  Components: DataStream API
> Environment: Flink, MongoDB, Kafka
>Reporter: Pedro Lima Monteiro
>  Labels: mongodb, stream
>
> I am trying to get data from MongoDB to be analysed in Flink.
> I would like to know if it is possible to stream data from MongoDB into 
> Flink. I have looked into implementing Flink's source function and passing it 
> to the addSource method of the StreamExecutionEnvironment, but I had no luck.
> Can anyone help me out? 
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-3687) org.apache.flink.runtime.net.ConnectionUtilsTest fails

2017-02-16 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-3687.
--
Resolution: Duplicate

Duplicate of FLINK-4052. Closing this issue in favor of the other one, which 
has a PR associated with it.

> org.apache.flink.runtime.net.ConnectionUtilsTest fails
> --
>
> Key: FLINK-3687
> URL: https://issues.apache.org/jira/browse/FLINK-3687
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime, Tests
>Reporter: Nikolaas Steenbergen
>Priority: Critical
>  Labels: test-stability
>
> {code}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.51 sec <<< 
> FAILURE! - in org.apache.flink.runtime.net.ConnectionUtilsTest
> testReturnLocalHostAddressUsingHeuristics(org.apache.flink.runtime.net.ConnectionUtilsTest)
>   Time elapsed: 0.504 sec  <<< FAILURE!
> java.lang.AssertionError: 
> expected: 
> but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.net.ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics(ConnectionUtilsTest.java:59)
> Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.554 sec - 
> in org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest
> Results :
> Failed tests: 
>   ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics:59 
> expected: 
> but was:
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-3273) Remove Scala dependency from flink-streaming-java

2017-02-16 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-3273:

Component/s: (was: Scala API)
 DataStream API

> Remove Scala dependency from flink-streaming-java
> -
>
> Key: FLINK-3273
> URL: https://issues.apache.org/jira/browse/FLINK-3273
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API, Streaming
>Reporter: Maximilian Michels
> Fix For: 1.0.0
>
>
> {{flink-streaming-java}} depends on Scala through {{flink-clients}}, 
> {{flink-runtime}}, and {{flink-testing-utils}}. We should get rid of the 
> Scala dependency just like we did for {{flink-java}}. Integration tests and 
> utilities which depend on Scala should be moved to {{flink-tests}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4810) Checkpoint Coordinator should fail ExecutionGraph after "n" unsuccessful checkpoints

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869773#comment-15869773
 ] 

ASF GitHub Bot commented on FLINK-4810:
---

GitHub user ramkrish86 opened a pull request:

https://github.com/apache/flink/pull/3334

FLINK-4810 Checkpoint Coordinator should fail ExecutionGraph after "n" 
unsuccessful checkpoints


Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [ ] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


Ran mvn clean verify. I have not added test cases yet; I'd first like to get 
initial feedback.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ramkrish86/flink FLINK-4810

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3334.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3334


commit 6e0fb38272e6bb59528065461c6ec6fdd43689ad
Author: Ramkrishna 
Date:   2017-02-16T11:29:37Z

FLINK-4810 Checkpoint Coordinator should fail ExecutionGraph after "n"
unsuccessful checkpoints




> Checkpoint Coordinator should fail ExecutionGraph after "n" unsuccessful 
> checkpoints
> 
>
> Key: FLINK-4810
> URL: https://issues.apache.org/jira/browse/FLINK-4810
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Reporter: Stephan Ewen
>
> The Checkpoint coordinator should track the number of consecutive 
> unsuccessful checkpoints.
> If more than {{n}} (configured value) checkpoints fail in a row, it should 
> call {{fail()}} on the execution graph to trigger a recovery.
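
For illustration, a hedged sketch of the counting logic the issue asks for (the 
class and its wiring are hypothetical, not the coordinator's actual API): reset 
on success, count on failure, and signal once the configured limit is exceeded.

{code}
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical tracker for consecutive checkpoint failures; in the real
// coordinator this would be driven by the completion and decline/expiry
// paths, and a `true` return value would trigger fail() on the execution
// graph.
public class CheckpointFailureTracker {

    private final int maxConsecutiveFailures;
    private final AtomicInteger consecutiveFailures = new AtomicInteger(0);

    public CheckpointFailureTracker(int maxConsecutiveFailures) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    public void onCheckpointSuccess() {
        consecutiveFailures.set(0);  // any success resets the streak
    }

    /** Returns true once more than the allowed number of checkpoints failed in a row. */
    public boolean onCheckpointFailure() {
        return consecutiveFailures.incrementAndGet() > maxConsecutiveFailures;
    }
}
{code}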



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5616) YarnPreConfiguredMasterHaServicesTest fails sometimes

2017-02-16 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869560#comment-15869560
 ] 

Till Rohrmann commented on FLINK-5616:
--

Forget what I've written. It is simply a race condition between the grant 
leadership call (executed asynchronously) and the check. I'll fix this.
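
Illustrative only (not necessarily the actual fix): the standard pattern for 
such races is to block on a signal from the asynchronous callback before 
checking its effects, e.g. with a latch:

{code}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AsyncAssertPattern {

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch granted = new CountDownLatch(1);

        // Simulates the asynchronous grant-leadership call; a real test would
        // count down inside the leadership callback itself.
        new Thread(granted::countDown).start();

        // Wait for the async call before verifying, instead of racing with it.
        if (!granted.await(30, TimeUnit.SECONDS)) {
            throw new AssertionError("leadership was never granted");
        }
        // ... safe to verify the expected interactions here ...
    }
}
{code}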

> YarnPreConfiguredMasterHaServicesTest fails sometimes
> -
>
> Key: FLINK-5616
> URL: https://issues.apache.org/jira/browse/FLINK-5616
> Project: Flink
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.2.0, 1.3.0
>Reporter: Aljoscha Krettek
>Assignee: Till Rohrmann
>  Labels: test-stability
>
> This is the relevant part from the log:
> {code}
> ---
>  T E S T S
> ---
> Running 
> org.apache.flink.yarn.highavailability.YarnPreConfiguredMasterHaServicesTest
> Formatting using clusterid: testClusterID
> Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.407 sec - 
> in 
> org.apache.flink.yarn.highavailability.YarnPreConfiguredMasterHaServicesTest
> Running 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest
> Formatting using clusterid: testClusterID
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.479 sec <<< 
> FAILURE! - in 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest
> testClosingReportsToLeader(org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest)
>   Time elapsed: 0.836 sec  <<< FAILURE!
> org.mockito.exceptions.verification.WantedButNotInvoked: 
> Wanted but not invoked:
> leaderContender.handleError();
> -> at 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest.testClosingReportsToLeader(YarnIntraNonHaMasterServicesTest.java:120)
> Actually, there were zero interactions with this mock.
>   at 
> org.apache.flink.yarn.highavailability.YarnIntraNonHaMasterServicesTest.testClosingReportsToLeader(YarnIntraNonHaMasterServicesTest.java:120)
> Running org.apache.flink.yarn.YarnFlinkResourceManagerTest
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.82 sec - in 
> org.apache.flink.yarn.YarnFlinkResourceManagerTest
> Running org.apache.flink.yarn.YarnClusterDescriptorTest
> java.lang.RuntimeException: Couldn't deploy Yarn cluster
>   at 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:425)
>   at 
> org.apache.flink.yarn.YarnClusterDescriptorTest.testConfigOverwrite(YarnClusterDescriptorTest.java:90)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:483)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>   at 
> 

[jira] [Created] (FLINK-5816) I would like to stream data from MongoDB using Flink's source function. Is this possible? Thank you.

2017-02-16 Thread Pedro Lima Monteiro (JIRA)
Pedro Lima Monteiro created FLINK-5816:
--

 Summary: I would like to stream data from MongoDB using Flink's 
source function. Is this possible? Thank you.
 Key: FLINK-5816
 URL: https://issues.apache.org/jira/browse/FLINK-5816
 Project: Flink
  Issue Type: Wish
  Components: DataStream API
 Environment: Flink, MongoDB, Kafka
Reporter: Pedro Lima Monteiro


I am trying to get data from MongoDB to be analysed in Flink.
I would like to know if it is possible to stream data from MongoDB into Flink. 
I have looked into implementing Flink's source function and passing it to the 
addSource method of the StreamExecutionEnvironment, but I had no luck.
Can anyone help me out? 
Thanks.
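
For reference, a deliberately naive sketch of what a polling source could look 
like (assuming the MongoDB Java driver; no checkpointing and no exactly-once 
guarantees, so this is an illustration rather than a connector; the host, 
database and collection names are made up):

{code}
import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.bson.Document;

// Naive polling source: periodically re-reads the collection and emits each
// document as JSON. No state, no deduplication, no failure handling.
public class MongoPollingSource implements SourceFunction<String> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        MongoClient client = new MongoClient("localhost", 27017);
        MongoCollection<Document> coll =
                client.getDatabase("mydb").getCollection("events");
        try {
            while (running) {
                for (Document doc : coll.find()) {
                    ctx.collect(doc.toJson());
                }
                Thread.sleep(5000L);  // poll interval
            }
        } finally {
            client.close();
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}
{code}

Such a source would be attached with 
{{env.addSource(new MongoPollingSource())}}.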



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5816) I would like to stream data from MongoDB using Flink's source function. Is this possible? Thank you.

2017-02-16 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869589#comment-15869589
 ] 

Fabian Hueske commented on FLINK-5816:
--

Hi [~pedromlm], we use JIRA to track bugs and discuss new features. Not as a 
forum for questions.
The right place for your question would be the u...@apache.flink.org mailing 
list (see http://flink.apache.org/community.html#mailing-lists)

Could you repost your question there and close this issue?
Thank you,
Fabian

> I would like to stream data from MongoDB using Flink's source function. Is 
> this possible? Thank you.
> 
>
> Key: FLINK-5816
> URL: https://issues.apache.org/jira/browse/FLINK-5816
> Project: Flink
>  Issue Type: Wish
>  Components: DataStream API
> Environment: Flink, MongoDB, Kafka
>Reporter: Pedro Lima Monteiro
>  Labels: mongodb, stream
>
> I am trying to get data from MongoDB to be analysed in Flink.
> I would like to know if it is possible to stream data from MongoDB into 
> Flink. I have looked into implementing Flink's source function and passing it 
> to the addSource method of the StreamExecutionEnvironment, but I had no luck.
> Can anyone help me out? 
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-4052) Unstable test ConnectionUtilsTest

2017-02-16 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-4052:
-
Affects Version/s: 1.3.0

> Unstable test ConnectionUtilsTest
> -
>
> Key: FLINK-4052
> URL: https://issues.apache.org/jira/browse/FLINK-4052
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.0.2, 1.3.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The error is the following:
> {code}
> ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics:59 
> expected: but 
> was:
> {code}
> The probable cause for the failure is that the test tries to select an unused 
> closed port (from the ephemeral range), and then assumes that all connections 
> to that port fail.
> If a concurrent test actually uses that port, connections to the port will 
> succeed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Reopened] (FLINK-4052) Unstable test ConnectionUtilsTest

2017-02-16 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann reopened FLINK-4052:
--

The problem still seems to persist: 

https://api.travis-ci.org/jobs/201937905/log.txt?deansi=true

> Unstable test ConnectionUtilsTest
> -
>
> Key: FLINK-4052
> URL: https://issues.apache.org/jira/browse/FLINK-4052
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.0.2, 1.3.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> The error is the following:
> {code}
> ConnectionUtilsTest.testReturnLocalHostAddressUsingHeuristics:59 
> expected: but 
> was:
> {code}
> The probable cause for the failure is that the test tries to select an unused 
> closed port (from the ephemeral range), and then assumes that all connections 
> to that port fail.
> If a concurrent test actually uses that port, connections to the port will 
> succeed.
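
To make the race concrete (illustrative sketch): the common way to find a 
"free" port is shown below, and nothing stops another process from binding that 
port the moment the socket is closed, which is exactly the situation the test 
runs into.

{code}
import java.io.IOException;
import java.net.ServerSocket;

public class FreePortIsARace {

    public static void main(String[] args) throws IOException {
        int port;
        try (ServerSocket socket = new ServerSocket(0)) {
            port = socket.getLocalPort();  // OS-assigned free port
        }
        // The socket is closed here; a concurrent test may bind `port` right
        // away, so assuming that connections to `port` always fail is racy.
        System.out.println("supposedly unused port: " + port);
    }
}
{code}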



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3335: [FLINK-5818][Security]change checkpoint dir permis...

2017-02-16 Thread WangTaoTheTonic
GitHub user WangTaoTheTonic opened a pull request:

https://github.com/apache/flink/pull/3335

[FLINK-5818][Security]change checkpoint dir permission to 700

Currently the checkpoint directory is created without a specified 
permission, so it is easy for another user to delete or read files under it, 
which can cause restore failures or information leaks.

It's better to lock it down to 700.

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests

![chp-filesystem-session](https://cloud.githubusercontent.com/assets/5276001/23019741/d753e8e0-f47e-11e6-9f2e-2cd35de35ef1.JPG)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WangTaoTheTonic/flink FLINK-5818

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3335.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3335


commit 02eef87dc2bfaa6737efad023916898719d34fe2
Author: WangTaoTheTonic 
Date:   2017-02-16T11:24:43Z

change checkpoint dir permission to 700




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869776#comment-15869776
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

GitHub user WangTaoTheTonic opened a pull request:

https://github.com/apache/flink/pull/3335

[FLINK-5818][Security]change checkpoint dir permission to 700

Currently the checkpoint directory is created without a specified 
permission, so it is easy for another user to delete or read files under it, 
which can cause restore failures or information leaks.

It's better to lock it down to 700.

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests

![chp-filesystem-session](https://cloud.githubusercontent.com/assets/5276001/23019741/d753e8e0-f47e-11e6-9f2e-2cd35de35ef1.JPG)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/WangTaoTheTonic/flink FLINK-5818

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3335.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3335


commit 02eef87dc2bfaa6737efad023916898719d34fe2
Author: WangTaoTheTonic 
Date:   2017-02-16T11:24:43Z

change checkpoint dir permission to 700




> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Improvement
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without a specified permission, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lock it down to 700.
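
A minimal sketch of the idea against the Hadoop FileSystem API (illustrative; 
the checkpoint path is made up and the PR's actual code path may differ):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;

public class CheckpointDirPermission {

    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // rwx for the owner, nothing for group and others -> 700
        FsPermission ownerOnly =
                new FsPermission(FsAction.ALL, FsAction.NONE, FsAction.NONE);

        Path dir = new Path("/flink/checkpoints/job-42");  // hypothetical path
        fs.mkdirs(dir, ownerOnly);
        fs.setPermission(dir, ownerOnly);  // mkdirs applies the umask, so set 700 explicitly
    }
}
{code}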



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3176: [FLINK-5571] [table] add open and close methods for UserD...

2017-02-16 Thread godfreyhe
Github user godfreyhe commented on the issue:

https://github.com/apache/flink/pull/3176
  
Thanks for the suggestions @twalthr, I have updated the PR.
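
For context, the PR adds lifecycle hooks to table UDFs. A hedged sketch of how 
a function might use them (the signatures are those proposed in the PR, so 
treat them as tentative; the factor logic is made up):

```java
import org.apache.flink.table.functions.FunctionContext;
import org.apache.flink.table.functions.ScalarFunction;

// Sketch of a scalar UDF with lifecycle methods: open() can acquire a
// resource (connection, cache, metric handle), close() releases it.
public class HashCode extends ScalarFunction {

    private transient int factor;

    @Override
    public void open(FunctionContext context) {
        factor = 12;  // e.g. read from configuration or an external system
    }

    public int eval(String s) {
        return s.hashCode() * factor;
    }

    @Override
    public void close() {
        // release anything acquired in open()
    }
}
```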


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3307: [FLINK-5420] Make CEP operators rescalable

2017-02-16 Thread kl0u
Github user kl0u closed the pull request at:

https://github.com/apache/flink/pull/3307


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5420) Make CEP operators rescalable

2017-02-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869534#comment-15869534
 ] 

ASF GitHub Bot commented on FLINK-5420:
---

Github user kl0u closed the pull request at:

https://github.com/apache/flink/pull/3307


> Make CEP operators rescalable
> -
>
> Key: FLINK-5420
> URL: https://issues.apache.org/jira/browse/FLINK-5420
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.2.0
>Reporter: Kostas Kloudas
>Assignee: Kostas Kloudas
>
> This issue targets making the operators in the CEP library re-scalable. After 
> this is implemented, the user will be able to take a savepoint and restart 
> the job with a different parallelism.
> The way to do it is to transform the CEP operators into the newly introduced 
> {{ProcessFunction}} and to use only managed keyed state to store their state. 
> With this transformation, rescalability comes out-of-the-box. In addition, 
> for the keyed operator in event time, we no longer have to keep the already 
> seen keys in a list; instead, we set a timer for each incoming element (in 
> {{ProcessFunction#processElement()}}) with a timestamp equal to that of the 
> element itself, so that it fires at the next watermark. When such a timer 
> fires, it re-registers itself for the following watermark (in 
> {{ProcessFunction#onTimer()}}, with a timestamp equal to 
> {{currentWatermark() + 1}}). This preserves the previous behavior of the 
> operators.
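
A hedged sketch of the timer pattern described above (heavily simplified; the 
actual CEP/NFA logic is omitted, and it assumes timestamped records):

{code}
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Simplified sketch: every element registers an event-time timer at its own
// timestamp; once a timer fires, it keeps re-registering itself at the next
// watermark (currentWatermark() + 1), mirroring the behavior described above.
public class WatermarkDrivenFunction extends ProcessFunction<String, String> {

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) {
        // fires at the first watermark >= the element's timestamp
        ctx.timerService().registerEventTimeTimer(ctx.timestamp());
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
        // here the operator would process its buffered state up to `timestamp`
        ctx.timerService().registerEventTimeTimer(
                ctx.timerService().currentWatermark() + 1);
    }
}
{code}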



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (FLINK-5657) Add processing time OVER RANGE BETWEEN UNBOUNDED PRECEDING aggregation to SQL

2017-02-16 Thread sunjincheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunjincheng closed FLINK-5657.
--
Resolution: Won't Fix

Move to FLINK-5803 & FLINK-5804.

> Add processing time OVER RANGE BETWEEN UNBOUNDED PRECEDING aggregation to SQL
> -
>
> Key: FLINK-5657
> URL: https://issues.apache.org/jira/browse/FLINK-5657
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Assignee: sunjincheng
>
> The goal of this issue is to add support for OVER RANGE aggregations on 
> processing time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT 
>   a, 
>   SUM(b) OVER (PARTITION BY c ORDER BY procTime() RANGE BETWEEN UNBOUNDED 
> PRECEDING AND CURRENT ROW) AS sumB,
>   MIN(b) OVER (PARTITION BY c ORDER BY procTime() RANGE BETWEEN UNBOUNDED 
> PRECEDING AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The PARTITION BY clause is optional (no partitioning results in single 
> threaded execution).
> - The ORDER BY clause may only have procTime() as parameter. procTime() is a 
> parameterless scalar function that just indicates processing time mode.
> - bounded PRECEDING is not supported (see FLINK-5654)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow up issues. If we find that some 
> of the restrictions are trivial to address, we can add the functionality in 
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER ROW aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with 
> RexOver expression).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

