[jira] [Commented] (FLINK-5815) Add resource files configuration for Yarn Mode

2017-02-19 Thread Wenlong Lyu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874104#comment-15874104
 ] 

Wenlong Lyu commented on FLINK-5815:


 Hi Till, we have finished an implementation. I was just blocked by some other 
work last week; I think I can submit the pull request this week.

> Add resource files configuration for Yarn Mode
> --
>
> Key: FLINK-5815
> URL: https://issues.apache.org/jira/browse/FLINK-5815
> Project: Flink
>  Issue Type: Improvement
>  Components: Client, YARN
>Affects Versions: 1.3.0
>Reporter: Wenlong Lyu
>Assignee: Wenlong Lyu
>
> Currently in Flink, when we want to register a resource file in the distributed 
> cache, we need to make the file remotely accessible via a URL, and it is often 
> difficult to maintain a service for that. What's more, when we want to add 
> some extra jar files to the job classpath, we need to copy the jar files to the 
> blob server when submitting the job graph. In YARN, especially in FLIP-6, the 
> blob server is not running yet when we try to start a Flink job. 
> YARN has an efficient distributed cache implementation for applications running 
> on it; what's more, files stored in HDFS can easily be shared across 
> applications through the distributed cache without extra IO operations. 
> I suggest introducing -yfiles, -ylibjars and -yarchives options in FlinkYarnCLI 
> so that YARN users can set up their job resource files via the YARN distributed 
> cache. The options are compatible with those used in MapReduce, which makes 
> them easy to use for YARN users, who generally have experience with MapReduce.
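
As a sketch of the mechanism these options would build on, the following shows how a file in HDFS can be registered in YARN's distributed cache through the standard YARN client API; the jar path and symlink name are hypothetical.

{code:java}
// Hedged sketch: register one HDFS file as a YARN LocalResource so every
// container of the application gets it localized without extra IO.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

import java.util.HashMap;
import java.util.Map;

public class DistributedCacheSketch {
    public static Map<String, LocalResource> register(Configuration conf) throws Exception {
        Path remote = new Path("hdfs:///user/flink/libs/my-udfs.jar"); // hypothetical path
        FileSystem fs = remote.getFileSystem(conf);
        FileStatus status = fs.getFileStatus(remote);

        LocalResource resource = Records.newRecord(LocalResource.class);
        resource.setResource(ConverterUtils.getYarnUrlFromPath(remote));
        resource.setSize(status.getLen());
        resource.setTimestamp(status.getModificationTime());
        resource.setType(LocalResourceType.FILE);
        resource.setVisibility(LocalResourceVisibility.APPLICATION);

        // YARN localizes the file into each container under this symlink name.
        Map<String, LocalResource> localResources = new HashMap<>();
        localResources.put("my-udfs.jar", resource);
        return localResources;
    }
}
{code}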





[jira] [Commented] (FLINK-5566) Introduce structure to hold table and column level statistics

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874084#comment-15874084
 ] 

ASF GitHub Bot commented on FLINK-5566:
---

GitHub user kaibozhou opened a pull request:

https://github.com/apache/flink/pull/3357

[FLINK-5566] [table] Exception when do filter after join a UDTF which 
returns a POJO type

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


This PR fixes the case of applying a filter after joining a UDTF that 
returns a POJO type.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kaibozhou/flink flink-5827

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3357.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3357


commit 0abc4f471f93a5f68ec67c27e8e94f531c36de94
Author: 宝牛 
Date:   2017-02-20T05:43:07Z

[FLINK-5566] [table] Exception when do filter after join a udtf which 
returns a POJO type




> Introduce structure to hold table and column level statistics
> -
>
> Key: FLINK-5566
> URL: https://issues.apache.org/jira/browse/FLINK-5566
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Kurt Young
>Assignee: zhangjing
> Fix For: 1.3.0
>
>
> We define two structures to hold statistics:
> 1. TableStats: contains table-level stats; for now only one element: rowCount
> 2. ColumnStats: contains column-level stats. 
> for numeric column types: ndv, nullCount, max, min, histogram
> for string types: ndv, nullCount, avgLen, maxLen
> for boolean: ndv, nullCount, trueCount, falseCount
> for date/time/timestamp: ndv, nullCount, max, min, histogram 
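
A minimal sketch of the two holders described above, assuming plain immutable Java classes; field names follow the issue description, histogram support is omitted, and this is not necessarily the final Flink API.

{code:java}
class TableStats {
    final long rowCount;

    TableStats(long rowCount) {
        this.rowCount = rowCount;
    }
}

class ColumnStats {
    final Long ndv;         // number of distinct values (all types)
    final Long nullCount;   // all types
    final Double avgLen;    // string columns only
    final Integer maxLen;   // string columns only
    final Object max;       // numeric and date/time/timestamp columns
    final Object min;       // numeric and date/time/timestamp columns
    final Long trueCount;   // boolean columns only
    final Long falseCount;  // boolean columns only

    ColumnStats(Long ndv, Long nullCount, Double avgLen, Integer maxLen,
                Object max, Object min, Long trueCount, Long falseCount) {
        this.ndv = ndv;
        this.nullCount = nullCount;
        this.avgLen = avgLen;
        this.maxLen = maxLen;
        this.max = max;
        this.min = min;
        this.trueCount = trueCount;
        this.falseCount = falseCount;
    }
}
{code}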





[GitHub] flink pull request #3357: [FLINK-5566] [table] Exception when do filter afte...

2017-02-19 Thread kaibozhou
GitHub user kaibozhou opened a pull request:

https://github.com/apache/flink/pull/3357

[FLINK-5566] [table] Exception when do filter after join a UDTF which 
returns a POJO type

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


This PR fixes the case of applying a filter after joining a UDTF that 
returns a POJO type.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kaibozhou/flink flink-5827

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3357.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3357


commit 0abc4f471f93a5f68ec67c27e8e94f531c36de94
Author: 宝牛 
Date:   2017-02-20T05:43:07Z

[FLINK-5566] [table] Exception when do filter after join a udtf which 
returns a POJO type






[jira] [Commented] (FLINK-5767) New aggregate function interface and built-in aggregate functions

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874067#comment-15874067
 ] 

ASF GitHub Bot commented on FLINK-5767:
---

Github user shaoxuan-wang commented on the issue:

https://github.com/apache/flink/pull/3354
  
The Travis CI build failed because it hit the Travis 50-minute limit: "The 
job exceeded the maximum time limit for jobs, and has been terminated". Robert 
has a fix for this in FLINK-5731. I hope this won't block the review process. 


> New aggregate function interface and built-in aggregate functions
> -
>
> Key: FLINK-5767
> URL: https://issues.apache.org/jira/browse/FLINK-5767
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shaoxuan Wang
>
> Add a new aggregate function interface. This includes implementing the 
> aggregate interface, migrating the existing aggregation functions to this 
> interface, and adding the unit tests for these functions.
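
A hedged sketch of the accumulator-based aggregate design, using COUNT as the example; the method names are illustrative of the intended interface, not guaranteed to match the pull request exactly.

{code:java}
class CountAccumulator {
    long count;
}

class CountAggFunction {
    CountAccumulator createAccumulator() {
        return new CountAccumulator();
    }

    // Called once per input row; null values are ignored, as COUNT(x) requires.
    void accumulate(CountAccumulator acc, Object value) {
        if (value != null) {
            acc.count++;
        }
    }

    // Extracts the final result from the accumulator.
    Long getValue(CountAccumulator acc) {
        return acc.count;
    }
}
{code}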





[GitHub] flink issue #3354: [FLINK-5767] [Table] New aggregate function interface and...

2017-02-19 Thread shaoxuan-wang
Github user shaoxuan-wang commented on the issue:

https://github.com/apache/flink/pull/3354
  
The Travis CI build failed because it hit the Travis 50-minute limit: "The 
job exceeded the maximum time limit for jobs, and has been terminated". Robert 
has a fix for this in FLINK-5731. I hope this won't block the review process. 




[GitHub] flink pull request #3330: [FLINK-5795][TableAPI] Improve UDTF to support...

2017-02-19 Thread wuchong
Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101946735
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/datastream/DataStreamUserDefinedFunctionITCase.scala
 ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.datastream
+
+import org.apache.flink.api.scala._
+import org.apache.flink.types.Row
+import org.apache.flink.table.api.scala.stream.utils.StreamITCase
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.utils.TableFunc3
+import org.apache.flink.streaming.api.scala.{DataStream, 
StreamExecutionEnvironment}
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.apache.flink.table.api.TableEnvironment
+import org.junit.Assert._
+import org.junit.Test
+
+import scala.collection.mutable
+
+class DataStreamUserDefinedFunctionITCase extends 
StreamingMultipleProgramsTestBase {
--- End diff --

IMO, the scalar UDF tests can be put into `UserDefinedScalarFunctionTest`. 
Scalar function tests do not need to set up a cluster environment, so a unit 
test is enough.




[jira] [Commented] (FLINK-5795) Improve “UDTF" to support constructor with parameter.

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15874017#comment-15874017
 ] 

ASF GitHub Bot commented on FLINK-5795:
---

Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101946735
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/datastream/DataStreamUserDefinedFunctionITCase.scala
 ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.datastream
+
+import org.apache.flink.api.scala._
+import org.apache.flink.types.Row
+import org.apache.flink.table.api.scala.stream.utils.StreamITCase
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.utils.TableFunc3
+import org.apache.flink.streaming.api.scala.{DataStream, 
StreamExecutionEnvironment}
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.apache.flink.table.api.TableEnvironment
+import org.junit.Assert._
+import org.junit.Test
+
+import scala.collection.mutable
+
+class DataStreamUserDefinedFunctionITCase extends 
StreamingMultipleProgramsTestBase {
--- End diff --

IMO, the scalar UDF tests can be put into `UserDefinedScalarFunctionTest`. 
Scalar function tests do not need to set up a cluster environment, so a unit 
test is enough.


> Improve “UDTF" to support constructor with parameter.
> -
>
> Key: FLINK-5795
> URL: https://issues.apache.org/jira/browse/FLINK-5795
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>
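
For illustration, a hedged sketch of a UDTF that takes a constructor parameter: the separator is supplied when the function is constructed instead of being hard-coded. `TableFunction` and `collect()` follow Flink's Table API; the `Split` class itself is hypothetical.

{code:java}
import org.apache.flink.table.functions.TableFunction;

public class Split extends TableFunction<String> {
    private final String separator;

    // The configuration is passed at construction time, per this issue.
    public Split(String separator) {
        this.separator = separator;
    }

    // One input line is expanded into zero or more output rows.
    public void eval(String line) {
        for (String s : line.split(separator)) {
            collect(s);
        }
    }
}
{code}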






[jira] [Updated] (FLINK-5844) jobmanager was killed when disk less 10% and restart fail

2017-02-19 Thread jing lining (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jing lining updated FLINK-5844:
---
Environment: 


Description: 
The JobManager was killed.

Submit command: /bin/flink run -m yarn-cluster -yn 6 -yjm 1024 -ytm 2048
The log is:
{quote}
2017-02-19 03:20:37,087 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
- RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2017-02-19 03:20:37,088 INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping 
checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,088 INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping 
checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache   
- Shutting down BlobCache
2017-02-19 03:20:37,089 INFO  
org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Removing web 
dashboard root cache directory 
/tmp/flink-web-dfa2b369-44ea-4e35-8011-672a1e627a10
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache   
- Shutting down BlobCache
2017-02-19 03:20:37,137 INFO  
org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Removing web 
dashboard jar upload directory 
/tmp/flink-web-upload-d6edb5ea-5894-489b-89f7-f2972fc9433d
2017-02-19 03:20:37,138 INFO  org.apache.flink.runtime.blob.BlobServer  
- Stopped BLOB server at 0.0.0.0:54513
End of LogType:jobmanager.log
{quote}

Then YARN restarted the JobManager on a new node, but it always fails.

The log:

{quote}
2017-02-19 03:20:39,166 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
- 

2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  Starting YARN ApplicationMaster / JobManager (Version: 1.1.3, 
Rev:8e8d454, Date:10.10.2016 @ 13:26:32 UTC)
2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  Current user: appweb
2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 
1.8/25.65-b01
2017-02-19 03:20:39,167 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  Maximum heap size: 1840 MiBytes
2017-02-19 03:20:39,168 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  JAVA_HOME: /data/program/java
2017-02-19 03:20:39,168 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  Hadoop version: 2.7.2
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  JVM Options:
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
- -Xmx1920M
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
- -verbose:gc -Xloggc:/tmp/gc.log -XX:+PrintGCDetails 
-XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 
-XX:GCLogFileSize=500m -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/tmp/java.hprof
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
- 
-Dlog.file=/data/logs/hadoop/containers/application_1482390799413_0053/container_1482390799413_0053_02_01/jobmanager.log
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
- -Dlogback.configurationFile=file:logback.xml
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
- -Dlog4j.configuration=file:log4j.properties
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  Program Arguments: (none)
2017-02-19 03:20:39,169 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
-  Classpath: 

[jira] [Created] (FLINK-5844) jobmanager was killed when disk less 10% and restart fail

2017-02-19 Thread jing lining (JIRA)
jing lining created FLINK-5844:
--

 Summary: jobmanager was killed when disk less 10% and restart fail
 Key: FLINK-5844
 URL: https://issues.apache.org/jira/browse/FLINK-5844
 Project: Flink
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.1.3
Reporter: jing lining


The JobManager was killed.

The log is:
{quote}
2017-02-19 03:20:37,087 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner 
- RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2017-02-19 03:20:37,088 INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping 
checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,088 INFO  
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping 
checkpoint coordinator for job 1b45608e30808183913eeffbb4d855da
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache   
- Shutting down BlobCache
2017-02-19 03:20:37,089 INFO  
org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Removing web 
dashboard root cache directory 
/tmp/flink-web-dfa2b369-44ea-4e35-8011-672a1e627a10
2017-02-19 03:20:37,089 INFO  org.apache.flink.runtime.blob.BlobCache   
- Shutting down BlobCache
2017-02-19 03:20:37,137 INFO  
org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Removing web 
dashboard jar upload directory 
/tmp/flink-web-upload-d6edb5ea-5894-489b-89f7-f2972fc9433d
2017-02-19 03:20:37,138 INFO  org.apache.flink.runtime.blob.BlobServer  
- Stopped BLOB server at 0.0.0.0:54513
End of LogType:jobmanager.log
{quote}

Then YARN restarted the JobManager on a new node, but it always fails.

The log:

{quote}
2017-02-19 03:20:44,244 WARN  
org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler - Error while 
handling request
org.apache.flink.runtime.webmonitor.NotFoundException: Could not find job with 
id 1b45608e30808183913eeffbb4d855da
at 
org.apache.flink.runtime.webmonitor.handlers.AbstractExecutionGraphRequestHandler.handleRequest(AbstractExecutionGraphRequestHandler.java:58)
at 
org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler.respondAsLeader(RuntimeMonitorHandler.java:88)
at 
org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:84)
at 
org.apache.flink.runtime.webmonitor.RuntimeMonitorHandlerBase.channelRead0(RuntimeMonitorHandlerBase.java:44)
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at io.netty.handler.codec.http.router.Handler.routed(Handler.java:62)
at 
io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:57)
at 
io.netty.handler.codec.http.router.DualAbstractHandler.channelRead0(DualAbstractHandler.java:20)
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:105)
at 
org.apache.flink.runtime.webmonitor.HttpRequestHandler.channelRead0(HttpRequestHandler.java:65)
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
at 
io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
 

[GitHub] flink pull request #3330: [FLINK-5795][TableAPI] Improve UDTF to support...

2017-02-19 Thread sunjincheng121
Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101944209
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/datastream/DataStreamUserDefinedFunctionITCase.scala
 ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.datastream
+
+import org.apache.flink.api.scala._
+import org.apache.flink.types.Row
+import org.apache.flink.table.api.scala.stream.utils.StreamITCase
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.utils.TableFunc3
+import org.apache.flink.streaming.api.scala.{DataStream, 
StreamExecutionEnvironment}
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.apache.flink.table.api.TableEnvironment
+import org.junit.Assert._
+import org.junit.Test
+
+import scala.collection.mutable
+
+class DataStreamUserDefinedFunctionITCase extends 
StreamingMultipleProgramsTestBase {
--- End diff --

Where should we put the UDF test if we put the UDTF test into 
DataStreamCorrelateITCase? Do we need to create a separate ITCase? @fhueske 
@wuchong 




[jira] [Commented] (FLINK-5795) Improve “UDTF" to support constructor with parameter.

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873998#comment-15873998
 ] 

ASF GitHub Bot commented on FLINK-5795:
---

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101944209
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/datastream/DataStreamUserDefinedFunctionITCase.scala
 ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.datastream
+
+import org.apache.flink.api.scala._
+import org.apache.flink.types.Row
+import org.apache.flink.table.api.scala.stream.utils.StreamITCase
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.utils.TableFunc3
+import org.apache.flink.streaming.api.scala.{DataStream, 
StreamExecutionEnvironment}
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.apache.flink.table.api.TableEnvironment
+import org.junit.Assert._
+import org.junit.Test
+
+import scala.collection.mutable
+
+class DataStreamUserDefinedFunctionITCase extends 
StreamingMultipleProgramsTestBase {
--- End diff --

Where should we put the UDF test if we put the UDTF test into 
DataStreamCorrelateITCase? Do we need to create a separate ITCase? @fhueske 
@wuchong 


> Improve “UDTF" to support constructor with parameter.
> -
>
> Key: FLINK-5795
> URL: https://issues.apache.org/jira/browse/FLINK-5795
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>






[jira] [Commented] (FLINK-5795) Improve “UDTF" to support constructor with parameter.

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873992#comment-15873992
 ] 

ASF GitHub Bot commented on FLINK-5795:
---

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101943644
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/dataset/DataSetUserDefinedFunctionITCase.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.dataset
+
+import org.apache.flink.api.scala._
+import org.apache.flink.types.Row
+import 
org.apache.flink.table.api.scala.batch.utils.TableProgramsClusterTestBase
+import 
org.apache.flink.table.api.scala.batch.utils.TableProgramsTestBase.TableConfigMode
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.TableEnvironment
+import org.apache.flink.table.utils._
+import org.apache.flink.test.util.TestBaseUtils
+import org.junit.Test
+import org.junit.runner.RunWith
+import org.junit.runners.Parameterized
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+@RunWith(classOf[Parameterized])
+class DataSetUserDefinedFunctionITCase (
--- End diff --

We also need to add the UDF tests, and adding them to `DataSetCorrelateITCase` 
is not appropriate. So I think extending `TableProgramsCollectionTestBase` is a 
good way.


> Improve “UDTF" to support constructor with parameter.
> -
>
> Key: FLINK-5795
> URL: https://issues.apache.org/jira/browse/FLINK-5795
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>






[GitHub] flink pull request #3330: [FLINK-5795][TableAPI] Improve UDTF to support...

2017-02-19 Thread sunjincheng121
Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101943644
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/dataset/DataSetUserDefinedFunctionITCase.scala
 ---
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.dataset
+
+import org.apache.flink.api.scala._
+import org.apache.flink.types.Row
+import 
org.apache.flink.table.api.scala.batch.utils.TableProgramsClusterTestBase
+import 
org.apache.flink.table.api.scala.batch.utils.TableProgramsTestBase.TableConfigMode
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.TableEnvironment
+import org.apache.flink.table.utils._
+import org.apache.flink.test.util.TestBaseUtils
+import org.junit.Test
+import org.junit.runner.RunWith
+import org.junit.runners.Parameterized
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+@RunWith(classOf[Parameterized])
+class DataSetUserDefinedFunctionITCase (
--- End diff --

We also need to add the UDF tests, and adding them to `DataSetCorrelateITCase` 
is not appropriate. So I think extending `TableProgramsCollectionTestBase` is a 
good way.




[GitHub] flink pull request #3330: [FLINK-5795][TableAPI] Improve UDTF to support...

2017-02-19 Thread sunjincheng121
Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101942319
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/CodeGenerator.scala
 ---
@@ -1463,21 +1465,23 @@ class CodeGenerator(
 */
   def addReusableFunction(function: UserDefinedFunction): String = {
 val classQualifier = function.getClass.getCanonicalName
-val fieldTerm = s"function_${classQualifier.replace('.', '$')}"
+val functionSerializedData = serialize(function)
+val fieldTerm =
+  s"""
+ |function_${classQualifier.replace('.', '$')}_${DigestUtils.md5Hex(functionSerializedData)}
--- End diff --

Hi @wuchong, 
No, scalar UDFs in the Scala Table API do not work well.  
The reason is that when we create `ScalarSqlFunction`, we use 
`scalarFunction.getClass.getCanonicalName` as the SQL identifier, which produces 
the wrong result.

Hi @fhueske, 
So far we have discussed a lot of the UDF implementation in this PR, so I 
agree with merging FLINK-5794 into this PR.
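
To make the naming scheme in the diff above concrete, here is a self-contained sketch (the UDF class is hypothetical): two instances of the same class configured with different constructor parameters share one class qualifier but yield different digests of their serialized state, so the generated field terms no longer collide.

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FieldTermDemo {
    // Hypothetical UDF carrying a constructor parameter.
    static class SplitUdtf implements Serializable {
        final String separator;
        SplitUdtf(String separator) { this.separator = separator; }
    }

    static byte[] serialize(Serializable f) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(f);
        }
        return bytes.toByteArray();
    }

    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        StringBuilder sb = new StringBuilder();
        for (byte b : MessageDigest.getInstance("MD5").digest(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        SplitUdtf a = new SplitUdtf(",");
        SplitUdtf b = new SplitUdtf("#");
        String qualifier = SplitUdtf.class.getCanonicalName().replace('.', '$');
        // Same class qualifier, different digests -> distinct generated field names.
        System.out.println("function_" + qualifier + "_" + md5Hex(serialize(a)));
        System.out.println("function_" + qualifier + "_" + md5Hex(serialize(b)));
    }
}
{code}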





[jira] [Commented] (FLINK-5795) Improve “UDTF" to support constructor with parameter.

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873983#comment-15873983
 ] 

ASF GitHub Bot commented on FLINK-5795:
---

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101942319
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/CodeGenerator.scala
 ---
@@ -1463,21 +1465,23 @@ class CodeGenerator(
 */
   def addReusableFunction(function: UserDefinedFunction): String = {
 val classQualifier = function.getClass.getCanonicalName
-val fieldTerm = s"function_${classQualifier.replace('.', '$')}"
+val functionSerializedData = serialize(function)
+val fieldTerm =
+  s"""
+ |function_${classQualifier.replace('.', '$')}_${DigestUtils.md5Hex(functionSerializedData)}
--- End diff --

Hi @wuchong, 
No, scalar UDFs in the Scala Table API do not work well.  
The reason is that when we create `ScalarSqlFunction`, we use 
`scalarFunction.getClass.getCanonicalName` as the SQL identifier, which produces 
the wrong result.

Hi @fhueske, 
So far we have discussed a lot of the UDF implementation in this PR, so I 
agree with merging FLINK-5794 into this PR.



> Improve “UDTF" to support constructor with parameter.
> -
>
> Key: FLINK-5795
> URL: https://issues.apache.org/jira/browse/FLINK-5795
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>






[GitHub] flink issue #3190: [FLINK-5546][build] java.io.tmpdir setted as project buil...

2017-02-19 Thread shijinkui
Github user shijinkui commented on the issue:

https://github.com/apache/flink/pull/3190
  
@StephanEwen The FileCacheDeleteValidationTest was fixed in FLINK-5817. This 
PR has rolled that change back.




[jira] [Commented] (FLINK-5546) java.io.tmpdir setted as project build directory in surefire plugin

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873979#comment-15873979
 ] 

ASF GitHub Bot commented on FLINK-5546:
---

Github user shijinkui commented on the issue:

https://github.com/apache/flink/pull/3190
  
@StephanEwen The FileCacheDeleteValidationTest was fixed in FLINK-5817. This 
PR has rolled that change back.


> java.io.tmpdir setted as project build directory in surefire plugin
> ---
>
> Key: FLINK-5546
> URL: https://issues.apache.org/jira/browse/FLINK-5546
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
> Environment: CentOS 7.2
>Reporter: Syinchwun Leo
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> When multiple Linux users run tests at the same time, the flink-runtime module 
> may fail. User A creates /tmp/cacheFile, and User B will have no permission to 
> access the folder.  
> Failed tests: 
> FileCacheDeleteValidationTest.setup:79 Error initializing the test: 
> /tmp/cacheFile (Permission denied)
> Tests in error: 
> IOManagerTest.channelEnumerator:54 » Runtime Could not create storage 
> director...
> Tests run: 1385, Failures: 1, Errors: 1, Skipped: 8
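
A hedged sketch of the underlying remedy: give each run its own unique temporary directory instead of a fixed /tmp path, so concurrent users never race on the same file.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TempDirSketch {
    public static Path createCacheDir() throws IOException {
        // Unique per run and owned by the current user, so concurrent users on
        // the same machine no longer collide on a fixed /tmp/cacheFile path.
        return Files.createTempDirectory("flink-cache-");
    }
}
{code}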





[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873976#comment-15873976
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
@greghogan I'm aware of that, but my concern is that when lots of users store 
their checkpoint files under the same root directory, it would be a burden for 
the admin to set different ACLs for different needs: user1 can read user2's and 
user3's files, while user2 can only read user1's files, user3 would like to 
read user4's files, and so on.

Setting only one ACL (like flink_admin) to allow one group to access everything 
is not fine-grained, as there are cases where some user (like user1) should 
only be allowed to access some, not all, of the sub-directories (like the 
sub-directories created by user2 and user3).


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Sub-task
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Right now the checkpoint directory is created without a specified permission, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower the permission to 700.
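
A minimal sketch of what lowering the permission could look like with the Hadoop FileSystem API; the checkpoint path is hypothetical.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class CheckpointDirSketch {
    public static void createPrivateDir(Configuration conf) throws Exception {
        Path checkpointDir = new Path("hdfs:///flink/checkpoints/job-42"); // hypothetical
        FileSystem fs = checkpointDir.getFileSystem(conf);
        // rwx------ : only the owning user can list, read or delete checkpoints.
        fs.mkdirs(checkpointDir, new FsPermission((short) 0700));
    }
}
{code}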





[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-19 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
@greghogan I'm aware of that, but my concern is that when lots of users store 
their checkpoint files under the same root directory, it would be a burden for 
the admin to set different ACLs for different needs: user1 can read user2's and 
user3's files, while user2 can only read user1's files, user3 would like to 
read user4's files, and so on.

Setting only one ACL (like flink_admin) to allow one group to access everything 
is not fine-grained, as there are cases where some user (like user1) should 
only be allowed to access some, not all, of the sub-directories (like the 
sub-directories created by user2 and user3).




[jira] [Commented] (FLINK-5791) Resource should be strictly matched when allocating for yarn

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873964#comment-15873964
 ] 

ASF GitHub Bot commented on FLINK-5791:
---

Github user shuai-xu commented on the issue:

https://github.com/apache/flink/pull/3304
  
@tillrohrmann, OK, thank you.


> Resource should be strictly matched when allocating for yarn
> 
>
> Key: FLINK-5791
> URL: https://issues.apache.org/jira/browse/FLINK-5791
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: shuai.xu
>Assignee: shuai.xu
>  Labels: flip-6
>
> For YARN mode, resources should be assigned as requested to avoid resource 
> waste and OOM.
> 1. The YarnResourceManager will request containers according to the 
> ResourceProfile in slot requests from the JM.
> 2. The RM will pass the ResourceProfile to the TM for initializing its slots.
> 3. The RM should strictly match the slots offered by the TM with the 
> SlotRequests from the JM.
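
A hedged sketch of the strict matching in step 3: the offered slot must equal the requested profile exactly, rather than merely satisfy it, so oversized containers are not silently accepted. Field names are illustrative.

{code:java}
public class ResourceProfileSketch {
    final double cpuCores;
    final int memoryMB;

    ResourceProfileSketch(double cpuCores, int memoryMB) {
        this.cpuCores = cpuCores;
        this.memoryMB = memoryMB;
    }

    // Strict matching: equality instead of "offered >= requested", so a
    // larger-than-requested slot is rejected rather than wasted.
    boolean strictlyMatches(ResourceProfileSketch requested) {
        return this.cpuCores == requested.cpuCores
            && this.memoryMB == requested.memoryMB;
    }
}
{code}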





[GitHub] flink issue #3304: [FLINK-5791] [runtime] Resource should be strictly matche...

2017-02-19 Thread shuai-xu
Github user shuai-xu commented on the issue:

https://github.com/apache/flink/pull/3304
  
@tillrohrmann, OK, thank you.




[jira] [Closed] (FLINK-5739) NullPointerException in CliFrontend

2017-02-19 Thread Zhuoluo Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo Yang closed FLINK-5739.
---

> NullPointerException in CliFrontend
> ---
>
> Key: FLINK-5739
> URL: https://issues.apache.org/jira/browse/FLINK-5739
> Project: Flink
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.3.0
> Environment: Mac OS X 10.12.2, Java 1.8.0_92-b14
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>  Labels: newbie, starter
> Fix For: 1.3.0
>
>
> I've run a simple program on a local cluster. It always fails with code 
> version 1.3-SNAPSHOT, commit e24a866. 
> {quote}
> Zhuoluos-MacBook-Pro:build-target zhuoluo.yzl$ bin/flink run -c 
> com.alibaba.blink.TableApp ~/gitlab/tableapp/target/tableapp-1.0-SNAPSHOT.jar 
> Cluster configuration: Standalone cluster with JobManager at 
> localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
> 
>  The program finished with the following exception:
> java.lang.NullPointerException
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:845)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1076)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120)
> {quote}
> I don't think there should be a NullPointerException here, even if you forgot 
> the "execute()" call.
> The reproducing code looks like the following:
> {code:java}
> public static void main(String[] args) throws Exception {
> ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
> DataSource<String> customer = 
> env.readTextFile("/Users/zhuoluo.yzl/customer.tbl");
> customer.filter(new FilterFunction<String>() {
> public boolean filter(String value) throws Exception {
> return true;
> }
> })
> .writeAsText("/Users/zhuoluo.yzl/customer.txt");
> //env.execute();
> }
> {code}
> We can use *start-cluster.sh* on a *local* computer to reproduce the problem.





[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-19 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3335
  
@WangTaoTheTonic, ACLs combine with the standard file permissions 
(`user-group-other`). Only one ACL is necessary to implement this PR. A second 
ACL would allow, for example, a `flink_admin` group to access all checkpoint 
directories.




[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873962#comment-15873962
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3335
  
@WangTaoTheTonic, ACLs combine with the standard file permissions 
(`user-group-other`). Only one ACL is necessary to implement this PR. A second 
ACL would allow, for example, a `flink_admin` group to access all checkpoint 
directories.


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Sub-task
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Right now the checkpoint directory is created without a specified permission, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower the permission to 700.





[jira] [Commented] (FLINK-5739) NullPointerException in CliFrontend

2017-02-19 Thread Zhuoluo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873961#comment-15873961
 ] 

Zhuoluo Yang commented on FLINK-5739:
-

Fixed via 5e32eb549d3bc2195548620005fcf54437e75f48

> NullPointerException in CliFrontend
> ---
>
> Key: FLINK-5739
> URL: https://issues.apache.org/jira/browse/FLINK-5739
> Project: Flink
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.3.0
> Environment: Mac OS X 10.12.2, Java 1.8.0_92-b14
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>  Labels: newbie, starter
> Fix For: 1.3.0
>
>
> I've run a simple program on a local cluster. It always fails with code 
> version 1.3-SNAPSHOT, commit e24a866. 
> {quote}
> Zhuoluos-MacBook-Pro:build-target zhuoluo.yzl$ bin/flink run -c 
> com.alibaba.blink.TableApp ~/gitlab/tableapp/target/tableapp-1.0-SNAPSHOT.jar 
> Cluster configuration: Standalone cluster with JobManager at 
> localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
> 
>  The program finished with the following exception:
> java.lang.NullPointerException
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:845)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1076)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120)
> {quote}
> I don't think there should be a NullPointerException here, even if you forgot 
> the "execute()" call.
> The reproducing code looks like the following:
> {code:java}
> public static void main(String[] args) throws Exception {
> ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
> DataSource<String> customer = 
> env.readTextFile("/Users/zhuoluo.yzl/customer.tbl");
> customer.filter(new FilterFunction<String>() {
> public boolean filter(String value) throws Exception {
> return true;
> }
> })
> .writeAsText("/Users/zhuoluo.yzl/customer.txt");
> //env.execute();
> }
> {code}
> We can use *start-cluster.sh* on a *local* computer to reproduce the problem.





[jira] [Resolved] (FLINK-5739) NullPointerException in CliFrontend

2017-02-19 Thread Zhuoluo Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo Yang resolved FLINK-5739.
-
Resolution: Fixed

Many thanks to [~StephanEwen] for merging.

> NullPointerException in CliFrontend
> ---
>
> Key: FLINK-5739
> URL: https://issues.apache.org/jira/browse/FLINK-5739
> Project: Flink
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.3.0
> Environment: Mac OS X 10.12.2, Java 1.8.0_92-b14
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>  Labels: newbie, starter
> Fix For: 1.3.0
>
>
> I've run a simple program on a local cluster. It always fails with code 
> version 1.3-SNAPSHOT, commit e24a866. 
> {quote}
> Zhuoluos-MacBook-Pro:build-target zhuoluo.yzl$ bin/flink run -c 
> com.alibaba.blink.TableApp ~/gitlab/tableapp/target/tableapp-1.0-SNAPSHOT.jar 
> Cluster configuration: Standalone cluster with JobManager at 
> localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
> 
>  The program finished with the following exception:
> java.lang.NullPointerException
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:845)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1076)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120)
> {quote}
> I don't think there should be a NullPointerException here, even if you forgot 
> the "execute()" call.
> The reproducing code looks like the following:
> {code:java}
> public static void main(String[] args) throws Exception {
> ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
> DataSource<String> customer = 
> env.readTextFile("/Users/zhuoluo.yzl/customer.tbl");
> customer.filter(new FilterFunction<String>() {
> public boolean filter(String value) throws Exception {
> return true;
> }
> })
> .writeAsText("/Users/zhuoluo.yzl/customer.txt");
> //env.execute();
> }
> {code}
> We can use *start-cluster.sh* on a *local* computer to reproduce the problem.





[jira] [Commented] (FLINK-5826) UDF/UDTF should support variable types and variable arguments

2017-02-19 Thread Zhuoluo Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873947#comment-15873947
 ] 

Zhuoluo Yang commented on FLINK-5826:
-

Thanks [~jark], I'd like to do some tests on this. It seems a concise design. 
IMHO, a meaningful exception would also be useful to users where necessary. I 
will try this annotation.

> UDF/UDTF should support variable types and variable arguments
> -
>
> Key: FLINK-5826
> URL: https://issues.apache.org/jira/browse/FLINK-5826
> Project: Flink
>  Issue Type: Improvement
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>
> In some cases, UDFs/UDTFs should support variable types and variable arguments. 
> Many UDF/UDTF developers wish to keep the number and types of arguments 
> flexible for their users.
> Thus, we should support the following styles of UDF/UDTFs.
> for example 1, in Java
> {code:java}
> public class SimpleUDF extends ScalarFunction {
>   public int eval(Object... args) {
>   // do something
>   }
> }
> {code}
> for example 2, in Scala
> {code}
> class SimpleUDF extends ScalarFunction {
>   def eval(args: Any*): Int = {
> // do something
>   }
> }
> {code}
> If we modify the code in UserDefinedFunctionUtils.getSignature() and make 
> both signatures pass, the first example will work normally. However, the 
> second example will raise an exception.
> {noformat}
> Caused by: org.codehaus.commons.compiler.CompileException: Line 58, Column 0: 
> No applicable constructor/method found for actual parameters 
> "java.lang.String"; candidates are: "public java.lang.Object 
> test.SimpleUDF.eval(scala.collection.Seq)"
>   at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11523) 
> ~[janino-3.0.6.jar:?]
>   at 
> org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8679)
>  ~[janino-3.0.6.jar:?]
>   at org.codehaus.janino.UnitCompiler.findIMethod(UnitCompiler.java:8539) 
> ~[janino-3.0.6.jar:?]
> {noformat} 
> The reason is that Scala applies a *sugary* modification to the signature of 
> the method. The method {code}def eval(args: Any*){code} will become 
> {code}def eval(args: scala.collection.Seq){code} in the class file. 
> The code generation is done in Java. If we use the Java-style 
> {code}eval(Object... args){code} call for the Scala method, it will raise the 
> above exception.
> However, I can't restrict users to writing UDFs/UDTFs in Java. Any ideas on 
> supporting variable types and variable arguments in Scala UDFs/UDTFs without 
> this compilation failure?





[jira] [Commented] (FLINK-4856) Add MapState for keyed streams

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873931#comment-15873931
 ] 

ASF GitHub Bot commented on FLINK-4856:
---

Github user shixiaogang commented on a diff in the pull request:

https://github.com/apache/flink/pull/3336#discussion_r101936792
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/state/MapState.java ---
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.common.state;
+
+import org.apache.flink.annotation.PublicEvolving;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.Map;
+
+/**
+ * {@link State} interface for partitioned key-value state. The key-value 
pair can be
+ * added, updated and retrieved.
+ *
+ * The state is accessed and modified by user functions, and 
checkpointed consistently
+ * by the system as part of the distributed snapshots.
+ *
+ * The state is only accessible by functions applied on a 
KeyedDataStream. The key is
+ * automatically supplied by the system, so the function always sees the 
value mapped to the
+ * key of the current element. That way, the system can handle stream and 
state partitioning
+ * consistently together.
+ *
+ * @param <K> Type of the keys in the state.
+ * @param <V> Type of the values in the state.
+ */
+@PublicEvolving
+public interface MapState<K, V> extends AppendingState<Map<K, V>, 
Iterable<Map.Entry<K, V>>>, Iterable<Map.Entry<K, V>> {
--- End diff --

`MapState` provides the `add` method which puts a collection of key-value 
pairs into the state. Though the semantics may be a little different in 
existing `AppendingState`s, I think it's okay for `MapState` to be an 
`AppendingState` because the interface does not enforce any restriction on the 
modification of previous data.
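
To make the shape under discussion concrete, a self-contained sketch (the names 
and the backing {{HashMap}} are illustrative only; the real state lives in the 
state backend, not on the heap of user code):

{code:java}
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative sketch of a map-like keyed state: single entries can be read and
// written without serializing the whole map, and add() appends a batch of pairs.
interface MapStateSketch<K, V> extends Iterable<Map.Entry<K, V>> {
    V get(K key);
    void put(K key, V value);
    void add(Map<K, V> entries);
}

class HeapMapStateSketch<K, V> implements MapStateSketch<K, V> {
    private final Map<K, V> map = new HashMap<>();

    @Override public V get(K key) { return map.get(key); }
    @Override public void put(K key, V value) { map.put(key, value); }
    @Override public void add(Map<K, V> entries) { map.putAll(entries); }
    @Override public Iterator<Map.Entry<K, V>> iterator() { return map.entrySet().iterator(); }
}
{code}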


> Add MapState for keyed streams
> --
>
> Key: FLINK-4856
> URL: https://issues.apache.org/jira/browse/FLINK-4856
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API, State Backends, Checkpointing
>Reporter: Xiaogang Shi
>Assignee: Xiaogang Shi
>
> Many states in keyed streams are organized as key-value pairs. Currently, 
> these states are implemented by storing the entire map into a ValueState or a 
> ListState. The implementation however is very costly because all entries have 
> to be serialized/deserialized when updating a single entry. To improve the 
> efficiency of these states, MapStates are urgently needed. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3336: [FLINK-4856][state] Add MapState in KeyedState

2017-02-19 Thread shixiaogang
Github user shixiaogang commented on a diff in the pull request:

https://github.com/apache/flink/pull/3336#discussion_r101936792
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/api/common/state/MapState.java ---
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.common.state;
+
+import org.apache.flink.annotation.PublicEvolving;
+
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.Map;
+
+/**
+ * {@link State} interface for partitioned key-value state. The key-value 
pair can be
+ * added, updated and retrieved.
+ *
+ * The state is accessed and modified by user functions, and 
checkpointed consistently
+ * by the system as part of the distributed snapshots.
+ *
+ * The state is only accessible by functions applied on a 
KeyedDataStream. The key is
+ * automatically supplied by the system, so the function always sees the 
value mapped to the
+ * key of the current element. That way, the system can handle stream and 
state partitioning
+ * consistently together.
+ *
+ * @param <K> Type of the keys in the state.
+ * @param <V> Type of the values in the state.
+ */
+@PublicEvolving
+public interface MapState<K, V> extends AppendingState<Map<K, V>, 
Iterable<Map.Entry<K, V>>>, Iterable<Map.Entry<K, V>> {
--- End diff --

`MapState` provides the `add` method which puts a collection of key-value 
pairs into the state. Though the semantics may be a little different in 
existing `AppendingState`s, I think it's okay for `MapState` to be an 
`AppendingState` because the interface does not enforce any restriction on the 
modification of previous data.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-19 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
We are very close in scenario :)

My point is that multiple users would use the same root directory to store 
their checkpoint files (creating a separate directory for each job is complex), 
which makes it very hard for an admin to set proper permissions on it.

Adding a configuration item is a very good idea. Would it be better if this 
configuration were also applied to the subdirectories each job creates? That would 
isolate access between different users' checkpoint files and could also 
be customized for migration. @StephanEwen 
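
A rough sketch of what the per-job variant could look like (assuming the Hadoop 
FileSystem API; the method and directory layout are illustrative, not an actual 
patch):

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class CheckpointDirSketch {
    // Create a job's checkpoint subdirectory with owner-only (700) permissions,
    // so other users of the shared root directory can neither read nor delete it.
    static Path createJobCheckpointDir(FileSystem fs, Path root, String jobId) throws IOException {
        Path jobDir = new Path(root, jobId);
        if (!fs.mkdirs(jobDir, new FsPermission((short) 0700))) {
            throw new IOException("Could not create checkpoint directory " + jobDir);
        }
        return jobDir;
    }
}
```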


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873926#comment-15873926
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
We are very close in scenario :)

My point is that multiple users would use the same root directory to store 
their checkpoint files (creating a separate directory for each job is complex), 
which makes it very hard for an admin to set proper permissions on it.

Adding a configuration item is a very good idea. Would it be better if this 
configuration were also applied to the subdirectories each job creates? That would 
isolate access between different users' checkpoint files and could also 
be customized for migration. @StephanEwen 


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Sub-task
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without a specified permission, so it 
> is easy for another user to delete or read files under it, which can cause restore 
> failures or information leaks.
> It's better to restrict it to 700.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5828) BlobServer create cache dir has concurrency safety problem

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5828.
-
   Resolution: Fixed
Fix Version/s: (was: 1.2.0)
   1.2.1
   1.3.0

Fixed in
  - 1.2.1 via 8a5d56d448db752c9779a32d5a6f907b0232b489
  - 1.3.0 via 20420fc6ee153c7171265dda7bf7d593c17fb375
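
For context, the usual race-tolerant pattern looks roughly like this (a sketch of 
the general technique, not the literal change in the commits above):

{code:java}
import java.io.File;
import java.io.IOException;

public class DirsSketch {
    // mkdirs() may return false simply because a concurrent caller created the
    // directory first, so only fail if the directory still does not exist.
    static File ensureDirectory(File dir) throws IOException {
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Could not create directory '" + dir + "'.");
        }
        return dir;
    }
}
{code}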

> BlobServer create cache dir has concurrency safety problem
> --
>
> Key: FLINK-5828
> URL: https://issues.apache.org/jira/browse/FLINK-5828
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Reporter: ZhengBowen
> Fix For: 1.3.0, 1.2.1
>
>
> Caused by: java.lang.RuntimeException: 
> org.apache.flink.client.program.ProgramInvocationException: The program 
> execution failed: Could not upload the jar files to the job manager.
> at 
> FlinkJob_20170217_161058_04.bind(FlinkJob_20170217_161058_04.java:45) 
> at 
> com.aliyun.kepler.rc.query.schedule.FlinkQueryJob.call(FlinkQueryJob.java:53) 
> at 
> com.aliyun.kepler.rc.query.schedule.FlinkQueryJob.call(FlinkQueryJob.java:13) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> at 
> java.util.concurrent.AbstractExecutorService$2.run(AbstractExecutorService.java:120)
>  
> ... 3 common frames omitted
> Caused by: org.apache.flink.client.program.ProgramInvocationException: The 
> program execution failed: Could not upload the jar files to the job manager.
> at com.aliyun.kepler.rc.flink.client.Client.runBlocking(Client.java:178) 
> at 
> org.apache.flink.api.java.ClientEnvironment.execute(ClientEnvironment.java:169)
>  
> at 
> org.apache.flink.api.java.ClientEnvironment.execute(ClientEnvironment.java:225)
>  
> at 
> FlinkJob_20170217_161058_04.bind(FlinkJob_20170217_161058_04.java:42) 
> ... 7 common frames omitted
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not 
> upload the jar files to the job manager.
> at 
> org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:359)
>  
> at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94) 
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>  
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) 
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) 
> ... 3 common frames omitted
> Caused by: java.io.IOException: Could not retrieve the JobManager's blob port.
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:706) 
> at 
> org.apache.flink.runtime.jobgraph.JobGraph.uploadUserJars(JobGraph.java:556) 
> at 
> org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:357)
>  
> ... 7 common frames omitted
> Caused by: java.io.IOException: PUT operation failed: Server side error: 
> Could not create cache directory 
> '/home/kepler/kepler3012/data/work/blobs/blobStore-c3566cb2-b3d6-40ae-bdcf-594a81c8881b/cache'.
> at 
> org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:476) 
> at org.apache.flink.runtime.blob.BlobClient.put(BlobClient.java:338) 
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:730) 
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:701) 
> ... 9 common frames omitted



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (FLINK-5828) BlobServer create cache dir has concurrency safety problem

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-5828.
---

> BlobServer create cache dir has concurrency safety problem
> --
>
> Key: FLINK-5828
> URL: https://issues.apache.org/jira/browse/FLINK-5828
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Reporter: ZhengBowen
> Fix For: 1.3.0, 1.2.1
>
>
> Caused by: java.lang.RuntimeException: 
> org.apache.flink.client.program.ProgramInvocationException: The program 
> execution failed: Could not upload the jar files to the job manager.
> at 
> FlinkJob_20170217_161058_04.bind(FlinkJob_20170217_161058_04.java:45) 
> at 
> com.aliyun.kepler.rc.query.schedule.FlinkQueryJob.call(FlinkQueryJob.java:53) 
> at 
> com.aliyun.kepler.rc.query.schedule.FlinkQueryJob.call(FlinkQueryJob.java:13) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> at 
> java.util.concurrent.AbstractExecutorService$2.run(AbstractExecutorService.java:120)
>  
> ... 3 common frames omitted
> Caused by: org.apache.flink.client.program.ProgramInvocationException: The 
> program execution failed: Could not upload the jar files to the job manager.
> at com.aliyun.kepler.rc.flink.client.Client.runBlocking(Client.java:178) 
> at 
> org.apache.flink.api.java.ClientEnvironment.execute(ClientEnvironment.java:169)
>  
> at 
> org.apache.flink.api.java.ClientEnvironment.execute(ClientEnvironment.java:225)
>  
> at 
> FlinkJob_20170217_161058_04.bind(FlinkJob_20170217_161058_04.java:42) 
> ... 7 common frames omitted
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not 
> upload the jar files to the job manager.
> at 
> org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:359)
>  
> at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94) 
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>  
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) 
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) 
> ... 3 common frames omitted
> Caused by: java.io.IOException: Could not retrieve the JobManager's blob port.
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:706) 
> at 
> org.apache.flink.runtime.jobgraph.JobGraph.uploadUserJars(JobGraph.java:556) 
> at 
> org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:357)
>  
> ... 7 common frames omitted
> Caused by: java.io.IOException: PUT operation failed: Server side error: 
> Could not create cache directory 
> '/home/kepler/kepler3012/data/work/blobs/blobStore-c3566cb2-b3d6-40ae-bdcf-594a81c8881b/cache'.
> at 
> org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:476) 
> at org.apache.flink.runtime.blob.BlobClient.put(BlobClient.java:338) 
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:730) 
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:701) 
> ... 9 common frames omitted



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5129) make the BlobServer use a distributed file system

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5129.
-
   Resolution: Fixed
Fix Version/s: 1.3.0

Fixed via 9f544d83b3443cf33f5890efdb956678847d445f

> make the BlobServer use a distributed file system
> -
>
> Key: FLINK-5129
> URL: https://issues.apache.org/jira/browse/FLINK-5129
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Reporter: Nico Kruber
>Assignee: Nico Kruber
> Fix For: 1.3.0
>
>
> Currently, the BlobServer uses local storage and, in addition, when HA 
> mode is set, a distributed file system, e.g. HDFS. The latter, however, is only 
> used by the JobManager, and all TaskManager instances request blobs from the 
> JobManager. By using the distributed file system there as well, we would 
> lower the load on the JobManager and increase scalability.
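
A sketch of the direction (the DFS layout and helper below are hypothetical, 
assuming the Hadoop FileSystem API; not the actual commit):

{code:java}
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlobFetchSketch {
    // Hypothetical layout: blobs stored under <root>/<jobId>/<blobKey>, so a
    // TaskManager can read them straight from the DFS instead of asking the JM.
    static InputStream openBlob(FileSystem fs, Path root, String jobId, String blobKey) throws IOException {
        return fs.open(new Path(new Path(root, jobId), blobKey));
    }
}
{code}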



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (FLINK-5497) remove duplicated tests

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-5497.
---

> remove duplicated tests
> ---
>
> Key: FLINK-5497
> URL: https://issues.apache.org/jira/browse/FLINK-5497
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Alexey Diomin
>Priority: Minor
> Fix For: 1.3.0
>
>
> Now we have a test which runs the same code 4 times, each run taking 17+ seconds.
> We need a small refactoring to remove the duplicated code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (FLINK-5129) make the BlobServer use a distributed file system

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-5129.
---

> make the BlobServer use a distributed file system
> -
>
> Key: FLINK-5129
> URL: https://issues.apache.org/jira/browse/FLINK-5129
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Reporter: Nico Kruber
>Assignee: Nico Kruber
> Fix For: 1.3.0
>
>
> Currently, the BlobServer uses local storage and, in addition, when HA 
> mode is set, a distributed file system, e.g. HDFS. The latter, however, is only 
> used by the JobManager, and all TaskManager instances request blobs from the 
> JobManager. By using the distributed file system there as well, we would 
> lower the load on the JobManager and increase scalability.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (FLINK-5522) Storm LocalCluster can't run with powermock

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-5522.
---

> Storm LocalCluster can't run with powermock
> ---
>
> Key: FLINK-5522
> URL: https://issues.apache.org/jira/browse/FLINK-5522
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.3.0
>Reporter: liuyuzhong7
> Fix For: 1.3.0
>
>
> Storm LocalCluster can't run with powermock. For example, see
> the code commented out in WrapperSetupHelperTest.testCreateTopologyContext.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5497) remove duplicated tests

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5497.
-
   Resolution: Fixed
Fix Version/s: 1.3.0

Fixed via 53134594644407d0a3cd691b0e93ae09ff6c8102

> remove duplicated tests
> ---
>
> Key: FLINK-5497
> URL: https://issues.apache.org/jira/browse/FLINK-5497
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Alexey Diomin
>Priority: Minor
> Fix For: 1.3.0
>
>
> Now we have a test which runs the same code 4 times, each run taking 17+ seconds.
> We need a small refactoring to remove the duplicated code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5522) Storm LocalCluster can't run with powermock

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5522.
-
Resolution: Fixed

Fixed via d05fc377ee688b231fb1b0daeb8a34fd054f3ca1

> Storm LocalCluster can't run with powermock
> ---
>
> Key: FLINK-5522
> URL: https://issues.apache.org/jira/browse/FLINK-5522
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.3.0
>Reporter: liuyuzhong7
> Fix For: 1.3.0
>
>
> Storm LocalCluster can't run with powermock. For example, see
> the code commented out in WrapperSetupHelperTest.testCreateTopologyContext.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (FLINK-5812) Clean up FileSystem

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-5812.
---

> Clean up FileSystem
> ---
>
> Key: FLINK-5812
> URL: https://issues.apache.org/jira/browse/FLINK-5812
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Minor
> Fix For: 1.3.0
>
>
> The {{FileSystem}} class is overloaded and has methods that are not well 
> supported. I suggest to do the following cleanups:
>   - Pull the safety net into a separate class
>   - Use the {{WriteMode}} to indicate overwriting behavior. Right now, the 
> {{FileSystem}} class defines that enum and never uses it. It feels weird.
>   - Remove the {{create(path, overwrite, blocksize, replication, ...)}} 
> method, which is not really supported across file system implementations. For 
> HDFS, behavior should be set via the configuration anyway.
> All changes have to be made in a non-API-breaking fashion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5812) Clean up FileSystem

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5812.
-
Resolution: Fixed

Fixed via 5902ea0e88c70f330c23b9ace94033ae34c84445

> Clean up FileSystem
> ---
>
> Key: FLINK-5812
> URL: https://issues.apache.org/jira/browse/FLINK-5812
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Minor
> Fix For: 1.3.0
>
>
> The {{FileSystem}} class is overloaded and has methods that are not well 
> supported. I suggest to do the following cleanups:
>   - Pull the safety net into a separate class
>   - Use the {{WriteMode}} to indicate overwriting behavior. Right now, the 
> {{FileSystem}} class defines that enum and never uses it. It feels weird.
>   - Remove the {{create(path, overwrite, blocksize, replication, ...)}} 
> method, which is not really supported across file system implementations. For 
> HDFS, behavior should be set via the configuration anyway.
> All changes have to be made in a non-API-breaking fashion.
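
To illustrate the {{WriteMode}} bullet, the cleaned-up create() could look roughly 
like this (a sketch of one possible shape with hypothetical names, not the 
committed signature):

{code:java}
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch: overwrite behavior becomes an explicit argument instead
// of a boolean flag next to an otherwise unused enum.
interface FileSystemSketch {
    enum WriteMode { NO_OVERWRITE, OVERWRITE }

    OutputStream create(String path, WriteMode mode) throws IOException;
}
{code}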



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5277) missing unit test for ensuring ResultPartition#add always recycles buffers

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873913#comment-15873913
 ] 

ASF GitHub Bot commented on FLINK-5277:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3309


> missing unit test for ensuring ResultPartition#add always recycles buffers
> --
>
> Key: FLINK-5277
> URL: https://issues.apache.org/jira/browse/FLINK-5277
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> We rely on ResultPartition to recycle the buffer if the add call fails.
> It makes sense to add a special test (to ResultPartitionTest or 
> RecordWriterTest) where we ensure that this actually happens to guard against 
> future behaviour changes in ResultPartition.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5497) remove duplicated tests

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873909#comment-15873909
 ] 

ASF GitHub Bot commented on FLINK-5497:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3089


> remove duplicated tests
> ---
>
> Key: FLINK-5497
> URL: https://issues.apache.org/jira/browse/FLINK-5497
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Alexey Diomin
>Priority: Minor
>
> Now we have a test which runs the same code 4 times, each run taking 17+ seconds.
> We need a small refactoring to remove the duplicated code.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (FLINK-5747) Eager Scheduling should deploy all Tasks together

2017-02-19 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5747.
-
Resolution: Fixed

Fixed via f113d79451ba88c487358861cc3e20aac3d19257
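
For background, the gist of "allocate all resources, then deploy" can be sketched 
with plain {{CompletableFuture}} composition (the names below are illustrative 
stand-ins, not Flink's scheduler API):

{code:java}
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class GangDeploySketch {

    // "Slot" and deploy() are illustrative stand-ins, not Flink's actual types.
    static class Slot {
        final int id;
        Slot(int id) { this.id = id; }
    }

    static void deploy(Slot slot) {
        System.out.println("deploying task in slot " + slot.id);
    }

    // Wait until *all* slot futures have completed, then deploy in one go,
    // instead of deploying each task as soon as its slot arrives.
    static CompletableFuture<Void> deployAllTogether(List<CompletableFuture<Slot>> slots) {
        return CompletableFuture
                .allOf(slots.toArray(new CompletableFuture[0]))
                .thenRun(() -> slots.forEach(f -> deploy(f.join())));
    }

    public static void main(String[] args) {
        List<CompletableFuture<Slot>> slots = Arrays.asList(
                CompletableFuture.completedFuture(new Slot(1)),
                CompletableFuture.completedFuture(new Slot(2)));
        deployAllTogether(slots).join();
    }
}
{code}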

> Eager Scheduling should deploy all Tasks together
> -
>
> Key: FLINK-5747
> URL: https://issues.apache.org/jira/browse/FLINK-5747
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.3.0
>
>
> Currently, eager scheduling immediately triggers the scheduling for all 
> vertices and their subtasks in topological order. 
> This has two problems:
>   - This works only as long as resource acquisition is "synchronous". With 
> dynamic resource acquisition in FLIP-6, the resources are returned as Futures 
> which may complete out of order. This results in out-of-order (not in 
> topological order) scheduling of tasks which does not work for streaming.
>   - Deploying some tasks that depend on other tasks before it is clear that 
> the other tasks have resources as well leads to situations where many 
> deploy/recovery cycles happen before enough resources are available to get 
> the job running fully.
> For eager scheduling, we should allocate all resources in one chunk and then 
> deploy once we know that all are available.
> As a follow-up, the same should be done per pipelined component in lazy batch 
> scheduling as well. That way we get lazy scheduling across blocking 
> boundaries, and bulk (gang) scheduling in pipelined subgroups.
> This also does not apply to fine-grained recovery efforts, where 
> individual tasks request replacement resources.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3211: [FLINK-5640][build]configure the explicit Unit Tes...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3211


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3089: [FLINK-5497] remove duplicated tests

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3089


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5640) configure the explicit Unit Test file suffix

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873906#comment-15873906
 ] 

ASF GitHub Bot commented on FLINK-5640:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3211


> configure the explicit Unit Test file suffix
> 
>
> Key: FLINK-5640
> URL: https://issues.apache.org/jira/browse/FLINK-5640
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: shijinkui
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> There are four types of Unit Test file: *ITCase.java, *Test.java, 
> *ITSuite.scala, *Suite.scala
> A file name ending with "IT.java" is an integration test; a file name ending with 
> "Test.java" is a unit test.
> The Surefire plugin's default-test execution should explicitly declare that 
> "*Test.*" is a Java unit test.
> The test file statistics below:
> * Suite  total: 10
> * ITCase  total: 378
> * Test  total: 1008
> * ITSuite  total: 14



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5828) BlobServer create cache dir has concurrency safety problem

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873907#comment-15873907
 ] 

ASF GitHub Bot commented on FLINK-5828:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3342


> BlobServer create cache dir has concurrency safety problem
> --
>
> Key: FLINK-5828
> URL: https://issues.apache.org/jira/browse/FLINK-5828
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Reporter: ZhengBowen
> Fix For: 1.2.0
>
>
> Caused by: java.lang.RuntimeException: 
> org.apache.flink.client.program.ProgramInvocationException: The program 
> execution failed: Could not upload the jar files to the job manager.
> at 
> FlinkJob_20170217_161058_04.bind(FlinkJob_20170217_161058_04.java:45) 
> at 
> com.aliyun.kepler.rc.query.schedule.FlinkQueryJob.call(FlinkQueryJob.java:53) 
> at 
> com.aliyun.kepler.rc.query.schedule.FlinkQueryJob.call(FlinkQueryJob.java:13) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> at 
> java.util.concurrent.AbstractExecutorService$2.run(AbstractExecutorService.java:120)
>  
> ... 3 common frames omitted
> Caused by: org.apache.flink.client.program.ProgramInvocationException: The 
> program execution failed: Could not upload the jar files to the job manager.
> at com.aliyun.kepler.rc.flink.client.Client.runBlocking(Client.java:178) 
> at 
> org.apache.flink.api.java.ClientEnvironment.execute(ClientEnvironment.java:169)
>  
> at 
> org.apache.flink.api.java.ClientEnvironment.execute(ClientEnvironment.java:225)
>  
> at 
> FlinkJob_20170217_161058_04.bind(FlinkJob_20170217_161058_04.java:42) 
> ... 7 common frames omitted
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not 
> upload the jar files to the job manager.
> at 
> org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:359)
>  
> at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94) 
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>  
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) 
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) 
> ... 3 common frames omitted
> Caused by: java.io.IOException: Could not retrieve the JobManager's blob port.
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:706) 
> at 
> org.apache.flink.runtime.jobgraph.JobGraph.uploadUserJars(JobGraph.java:556) 
> at 
> org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:357)
>  
> ... 7 common frames omitted
> Caused by: java.io.IOException: PUT operation failed: Server side error: 
> Could not create cache directory 
> '/home/kepler/kepler3012/data/work/blobs/blobStore-c3566cb2-b3d6-40ae-bdcf-594a81c8881b/cache'.
> at 
> org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:476) 
> at org.apache.flink.runtime.blob.BlobClient.put(BlobClient.java:338) 
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:730) 
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:701) 
> ... 9 common frames omitted



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3223: [FLINK-5669] Change DataStreamUtils to use the loo...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3223


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3295: [FLINK-5747] [distributed coordination] Eager sche...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3295


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3349: Updated DC/OS setup instructions.

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3349


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4813) Having flink-test-utils as a dependency outside Flink fails the build

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873915#comment-15873915
 ] 

ASF GitHub Bot commented on FLINK-4813:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3322


> Having flink-test-utils as a dependency outside Flink fails the build
> -
>
> Key: FLINK-4813
> URL: https://issues.apache.org/jira/browse/FLINK-4813
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>Assignee: Nico Kruber
>
> The {{flink-test-utils}} depend on {{hadoop-minikdc}}, which has a 
> dependency that is only resolvable if the {{maven-bundle-plugin}} is 
> loaded.
> This is the error message
> {code}
> [ERROR] Failed to execute goal on project quickstart-1.2-tests: Could not 
> resolve dependencies for project 
> com.dataartisans:quickstart-1.2-tests:jar:1.0-SNAPSHOT: Failure to find 
> org.apache.directory.jdbm:apacheds-jdbm1:bundle:2.0.0-M2 in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced -> [Help 1]
> {code}
> {{flink-parent}} loads that plugin, so all "internal" dependencies to the 
> test utils can resolve the plugin.
> Right now, users have to use the maven bundle plugin to use our test utils 
> externally.
> By making the hadoop minikdc dependency optional, we can probably resolve the 
> issues. Then, only users who want to use the security-related tools in the 
> test utils need to manually add the hadoop minikdc dependency + the plugin.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3292: [FLINK-5739] [client] fix NullPointerException in ...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3292


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5817) Fix test concurrent execution failure by test dir conflicts.

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873908#comment-15873908
 ] 

ASF GitHub Bot commented on FLINK-5817:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3341


> Fix test concurrent execution failure by test dir conflicts.
> 
>
> Key: FLINK-5817
> URL: https://issues.apache.org/jira/browse/FLINK-5817
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Wenlong Lyu
>
> Currently, when different users build Flink on the same machine, failures may 
> happen because some test utilities create test files using fixed names, 
> which causes file access to fail when different users process the same 
> file at the same time.
> We have found errors from AbstractTestBase, IOManagerTest, FileCacheTest.
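
The general fix pattern is small (a sketch, assuming the JDK's temp-directory 
facilities rather than any specific Flink utility):

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TestDirSketch {
    // A unique per-run directory avoids collisions between users who would
    // otherwise race on a fixed path such as /tmp/cacheFile.
    static Path uniqueTestDir() throws IOException {
        return Files.createTempDirectory("flink-test-");
    }
}
{code}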



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3084: [FLINK-5129] make the BlobServer use a distributed...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3084


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5739) NullPointerException in CliFrontend

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873912#comment-15873912
 ] 

ASF GitHub Bot commented on FLINK-5739:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3292


> NullPointerException in CliFrontend
> ---
>
> Key: FLINK-5739
> URL: https://issues.apache.org/jira/browse/FLINK-5739
> Project: Flink
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.3.0
> Environment: Mac OS X 10.12.2, Java 1.8.0_92-b14
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>  Labels: newbie, starter
> Fix For: 1.3.0
>
>
> I've run a simple program on a local cluster. It always fails with code 
> Version: 1.3-SNAPSHOT, Commit: e24a866. 
> {quote}
> Zhuoluos-MacBook-Pro:build-target zhuoluo.yzl$ bin/flink run -c 
> com.alibaba.blink.TableApp ~/gitlab/tableapp/target/tableapp-1.0-SNAPSHOT.jar 
> Cluster configuration: Standalone cluster with JobManager at 
> localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
> 
>  The program finished with the following exception:
> java.lang.NullPointerException
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:845)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1076)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120)
> {quote}
> I don't think there should be a NullPointerException here, even if you forgot 
> the "execute()" call.
> The reproducing code looks like following:
> {code:java}
> public static void main(String[] args) throws Exception {
> ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
> DataSource<String> customer = 
> env.readTextFile("/Users/zhuoluo.yzl/customer.tbl");
> customer.filter(new FilterFunction<String>() {
> public boolean filter(String value) throws Exception {
> return true;
> }
> })
> .writeAsText("/Users/zhuoluo.yzl/customer.txt");
> //env.execute();
> }
> {code}
> We can use *start-cluster.sh* on a *local* computer to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3138: #Flink-5522 Storm Local Cluster can't work with po...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3138


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3342: TO FLINK-5828

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3342


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5669) flink-streaming-contrib DataStreamUtils.collect in local environment mode fails when offline

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873911#comment-15873911
 ] 

ASF GitHub Bot commented on FLINK-5669:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3223


> flink-streaming-contrib DataStreamUtils.collect in local environment mode 
> fails when offline
> 
>
> Key: FLINK-5669
> URL: https://issues.apache.org/jira/browse/FLINK-5669
> Project: Flink
>  Issue Type: Bug
>  Components: flink-contrib
>Reporter: Rick Cox
>Priority: Minor
>
> {{DataStreamUtils.collect()}} needs to obtain the local machine's IP so that 
> the job can send the results back. In the case of local 
> {{StreamEnvironments}}, it uses {{InetAddress.getLocalHost()}}, which 
> attempts to resolve the local hostname using DNS.
> If DNS is not available (for example, when offline) or if DNS is available 
> but cannot resolve the hostname (for example, if the hostname is an intranet 
> name but the machine is not currently on that network), an 
> {{UnknownHostException}} will be thrown (and wrapped in an {{IOException}}).
> If the resolved IP is not reachable for some reason, streaming results will 
> fail.
> Since this case is for local execution only, it seems that using 
> {{InetAddress.getLoopbackAddress()}} would work just as well, and avoid the 
> assumptions made by {{getLocalHost()}}.
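
To make the contrast concrete, a minimal self-contained sketch (plain JDK, no 
Flink involved):

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocalAddressSketch {
    public static void main(String[] args) {
        // No DNS involved: always succeeds and returns 127.0.0.1 or ::1.
        System.out.println("loopback: " + InetAddress.getLoopbackAddress());
        try {
            // Resolves the local hostname via DNS; may fail when offline or
            // when the hostname is not resolvable on the current network.
            System.out.println("getLocalHost: " + InetAddress.getLocalHost());
        } catch (UnknownHostException e) {
            System.out.println("getLocalHost failed: " + e.getMessage());
        }
    }
}
{code}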



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3309: [FLINK-5277] add unit tests for ResultPartition#ad...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3309


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3322: [FLINK-4813][flink-test-utils] make the hadoop-min...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3322


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5747) Eager Scheduling should deploy all Tasks together

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873914#comment-15873914
 ] 

ASF GitHub Bot commented on FLINK-5747:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3295


> Eager Scheduling should deploy all Tasks together
> -
>
> Key: FLINK-5747
> URL: https://issues.apache.org/jira/browse/FLINK-5747
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.3.0
>
>
> Currently, eager scheduling immediately triggers the scheduling for all 
> vertices and their subtasks in topological order. 
> This has two problems:
>   - This works only as long as resource acquisition is "synchronous". With 
> dynamic resource acquisition in FLIP-6, the resources are returned as Futures 
> which may complete out of order. This results in out-of-order (not in 
> topological order) scheduling of tasks which does not work for streaming.
>   - Deploying some tasks that depend on other tasks before it is clear that 
> the other tasks have resources as well leads to situations where many 
> deploy/recovery cycles happen before enough resources are available to get 
> the job running fully.
> For eager scheduling, we should allocate all resources in one chunk and then 
> deploy once we know that all are available.
> As a follow-up, the same should be done per pipelined component in lazy batch 
> scheduling as well. That way we get lazy scheduling across blocking 
> boundaries, and bulk (gang) scheduling in pipelined subgroups.
> This also does not apply to fine-grained recovery efforts, where 
> individual tasks request replacement resources.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3341: [FLINK-5817]Fix test concurrent execution failure ...

2017-02-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3341


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5129) make the BlobServer use a distributed file system

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873910#comment-15873910
 ] 

ASF GitHub Bot commented on FLINK-5129:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3084


> make the BlobServer use a distributed file system
> -
>
> Key: FLINK-5129
> URL: https://issues.apache.org/jira/browse/FLINK-5129
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> Currently, the BlobServer uses local storage and, in addition, when HA 
> mode is set, a distributed file system, e.g. HDFS. The latter, however, is only 
> used by the JobManager, and all TaskManager instances request blobs from the 
> JobManager. By using the distributed file system there as well, we would 
> lower the load on the JobManager and increase scalability.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5843) Website/docs missing Cache-Control HTTP header, can serve stale data

2017-02-19 Thread Patrick Lucas (JIRA)
Patrick Lucas created FLINK-5843:


 Summary: Website/docs missing Cache-Control HTTP header, can serve 
stale data
 Key: FLINK-5843
 URL: https://issues.apache.org/jira/browse/FLINK-5843
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Reporter: Patrick Lucas


When Flink 1.2.0 was released, I found that the [Flink downloads 
page|https://flink.apache.org/downloads.html] was out-of-date until I forced my 
browser to refresh the page. Upon investigation, I found that the principal 
pages of the website are served with only the following headers that relate to 
caching: Date, Last-Modified, and ETag.

Since there is no Cache-Control header (or the older Expires or Pragma 
headers), browsers are left to their own heuristics as to how long to cache 
this content, which varies browser to browser. In some browsers, this heuristic 
is 10% of the difference between Date and Last-Modified headers. I take this to 
mean that, if the content were last modified 90 days ago, and I last accessed 
it 5 days ago, my browser will serve a cached response for a further 3.5 days 
(10% * (90 days - 5 days) = 8.5 days, 5 days have elapsed leaving 3.5 days).
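
The arithmetic of that heuristic, as a runnable sketch (the 10% figure is the 
common browser heuristic described above, not something the server controls):

{code:java}
import java.time.Duration;

public class HeuristicFreshness {

    // Common browser heuristic: freshness lifetime = 10% of (Date - Last-Modified).
    static Duration freshnessLifetime(Duration dateMinusLastModified) {
        return dateMinusLastModified.dividedBy(10);
    }

    public static void main(String[] args) {
        // Worked example from above: at fetch time (5 days ago) the content was
        // 85 days old, so the heuristic lifetime is 8.5 days; 5 days have
        // elapsed, leaving 3.5 days during which the stale copy is served.
        Duration lifetime = freshnessLifetime(Duration.ofDays(85));
        Duration remaining = lifetime.minus(Duration.ofDays(5));
        System.out.println("lifetime=" + lifetime + ", remaining=" + remaining);
    }
}
{code}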

I'm not sure who at the ASF we should talk to about this, but I recommend we 
add the following header to any responses served from the Flink project website 
or official documentation website\[1]:

{code}Cache-Control: max-age=0, must-revalidate{code}

(Note this will only make browsers revalidate their caches; if the ETag of the 
cached content matches what the server still has, the server will return 304 
Not Modified and omit the actual content.)

\[1] Both the website hosted at flink.apache.org and the documentation hosted 
at ci.apache.org are affected.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5253) Remove special treatment of "dynamic properties"

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873805#comment-15873805
 ] 

ASF GitHub Bot commented on FLINK-5253:
---

GitHub user mariusz89016 opened a pull request:

https://github.com/apache/flink/pull/3356

[FLINK-5253] Remove special treatment of "dynamic properties"

This PR removes the special treatment of 'dynamic properties' (i.e., the special 
way of encoding them as an environment variable). Instead, these properties are 
appended to the _flink-conf.yaml_ file.
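
For illustration, the appending could look roughly like this (a sketch of the 
approach described above, not the actual patch; it assumes the config loader 
lets the appended entries take effect):

{code:java}
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Map;

public class DynamicPropertiesSketch {
    // Append CLI-provided config entries to the flink-conf.yaml that is shipped
    // to the YARN containers (JM / TM).
    static void appendDynamicProperties(Path confFile, Map<String, String> props) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(confFile, StandardOpenOption.APPEND)) {
            for (Map.Entry<String, String> e : props.entrySet()) {
                out.write(e.getKey() + ": " + e.getValue());
                out.newLine();
            }
        }
    }
}
{code}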

Link to the issue: https://issues.apache.org/jira/browse/FLINK-5253

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mariusz89016/flink 
flink-5253-dynamic-properties

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3356.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3356


commit 1467b86525d4b2ce07ba9dbe42a5fae7fff1bf22
Author: Mariusz Wojakowski 
Date:   2017-02-18T14:15:07Z

[FLINK-5253] Remove special treatment of "dynamic properties"




> Remove special treatment of "dynamic properties"
> 
>
> Key: FLINK-5253
> URL: https://issues.apache.org/jira/browse/FLINK-5253
> Project: Flink
>  Issue Type: Sub-task
>  Components: YARN
> Environment: {{flip-6}} feature branch
>Reporter: Stephan Ewen
>  Labels: flip-6
>
> The YARN client accepts configuration keys as command line parameters.
> Currently these are sent to the AppMaster and TaskManager as "dynamic 
> properties", encoded in a special way via environment variables.
> The mechanism is quite fragile. We should simplify it:
>   - The YARN client takes the local {{flink-conf.yaml}} as the base.
>   - It overwrites config entries with command-line properties when preparing 
> the configuration to be shipped to YARN container processes (JM / TM).
>   - No additional handling is necessary.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3356: [FLINK-5253] Remove special treatment of "dynamic ...

2017-02-19 Thread mariusz89016
GitHub user mariusz89016 opened a pull request:

https://github.com/apache/flink/pull/3356

[FLINK-5253] Remove special treatment of "dynamic properties"

This PR removes the special treatment of 'dynamic properties' (i.e., the special 
way of encoding them as an environment variable). Instead, these properties are 
appended to the _flink-conf.yaml_ file.

Link to the issue: https://issues.apache.org/jira/browse/FLINK-5253

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mariusz89016/flink 
flink-5253-dynamic-properties

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3356.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3356


commit 1467b86525d4b2ce07ba9dbe42a5fae7fff1bf22
Author: Mariusz Wojakowski 
Date:   2017-02-18T14:15:07Z

[FLINK-5253] Remove special treatment of "dynamic properties"




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5842) Wrong 'since' version for ElasticSearch 5.x connector

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873796#comment-15873796
 ] 

ASF GitHub Bot commented on FLINK-5842:
---

GitHub user patricklucas opened a pull request:

https://github.com/apache/flink/pull/3355

[FLINK-5842] [docs] Fix ES5 "since" version



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/patricklucas/flink 
FLINK-5842_fix_es5_since_version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3355.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3355


commit a671b6eb87109ba874bd81af6ee62ffaffe06f25
Author: Patrick Lucas 
Date:   2017-02-19T18:56:30Z

[FLINK-5842] [docs] Fix ES5 "since" version




> Wrong 'since' version for ElasticSearch 5.x connector
> -
>
> Key: FLINK-5842
> URL: https://issues.apache.org/jira/browse/FLINK-5842
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Streaming Connectors
>Affects Versions: 1.3.0
>Reporter: Dawid Wysakowicz
>Assignee: Patrick Lucas
>
> The documentation claims that ElasticSearch 5.x is supported since Flink 
> 1.2.0, which is not true, as the support was merged after 1.2.0.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3355: [FLINK-5842] [docs] Fix ES5 "since" version

2017-02-19 Thread patricklucas
GitHub user patricklucas opened a pull request:

https://github.com/apache/flink/pull/3355

[FLINK-5842] [docs] Fix ES5 "since" version



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/patricklucas/flink 
FLINK-5842_fix_es5_since_version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3355.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3355


commit a671b6eb87109ba874bd81af6ee62ffaffe06f25
Author: Patrick Lucas 
Date:   2017-02-19T18:56:30Z

[FLINK-5842] [docs] Fix ES5 "since" version




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (FLINK-5842) Wrong 'since' version for ElasticSearch 5.x connector

2017-02-19 Thread Patrick Lucas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Lucas reassigned FLINK-5842:


Assignee: Patrick Lucas

> Wrong 'since' version for ElasticSearch 5.x connector
> -
>
> Key: FLINK-5842
> URL: https://issues.apache.org/jira/browse/FLINK-5842
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Streaming Connectors
>Affects Versions: 1.3.0
>Reporter: Dawid Wysakowicz
>Assignee: Patrick Lucas
>
> The documentation claims that ElasticSearch 5.x is supported since Flink 
> 1.2.0, which is not true, as the support was merged after 1.2.0.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-5842) Wrong 'since' version for ElasticSearch 5.x connector

2017-02-19 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-5842:
---

 Summary: Wrong 'since' version for ElasticSearch 5.x connector
 Key: FLINK-5842
 URL: https://issues.apache.org/jira/browse/FLINK-5842
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Streaming Connectors
Affects Versions: 1.3.0
Reporter: Dawid Wysakowicz


The documentation claims that ElasticSearch 5.x is supported since Flink 1.2.0, 
which is not true, as the support was merged after 1.2.0.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Issue Comment Deleted] (FLINK-4499) Introduce findbugs maven plugin

2017-02-19 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-4499:
--
Comment: was deleted

(was: This should be assigned to [~smarthi], right?)

> Introduce findbugs maven plugin
> ---
>
> Key: FLINK-4499
> URL: https://issues.apache.org/jira/browse/FLINK-4499
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Ted Yu
>Assignee: Suneel Marthi
>
> As suggested by Stephan in FLINK-4482, this issue is to add 
> findbugs-maven-plugin to the build process so that we can detect lack of 
> proper locking and other defects automatically.
> We can begin with a small set of rules.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5546) java.io.tmpdir set as project build directory in surefire plugin

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873721#comment-15873721
 ] 

ASF GitHub Bot commented on FLINK-5546:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3190
  
Sorry, I actually have to step back on this one.

I merged it into a feature branch and played around a bit with it, and it 
turns out it is no longer possible to execute tests from within the IDE. 
That is something we cannot have.

I don't know how to make this work seamlessly across both CLI builds and 
manual IDE tests. I think we are back to fixing the individual tests...
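
One way to fix an individual test along those lines is to give every run its 
own directory instead of the shared /tmp/cacheFile. A minimal sketch using 
JUnit 4's `TemporaryFolder` rule follows; the class and test names are invented 
for illustration, and this is not the actual FileCacheDeleteValidationTest code:

    import java.io.File;

    import org.junit.Assert;
    import org.junit.Before;
    import org.junit.Rule;
    import org.junit.Test;
    import org.junit.rules.TemporaryFolder;

    public class FileCacheTempDirSketch {

        // JUnit creates a fresh directory per test method and deletes it
        // afterwards, so concurrent users never collide on a shared path.
        @Rule
        public final TemporaryFolder tempFolder = new TemporaryFolder();

        private File cacheFile;

        @Before
        public void setup() throws Exception {
            // replaces a hard-coded new File("/tmp/cacheFile")
            cacheFile = tempFolder.newFile("cacheFile");
        }

        @Test
        public void cacheFileIsWritableByCurrentUser() {
            Assert.assertTrue(cacheFile.canWrite());
        }
    }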


> java.io.tmpdir set as project build directory in surefire plugin
> ---
>
> Key: FLINK-5546
> URL: https://issues.apache.org/jira/browse/FLINK-5546
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
> Environment: CentOS 7.2
>Reporter: Syinchwun Leo
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> When multiple Linux users run tests at the same time, the flink-runtime module 
> may fail: User A creates /tmp/cacheFile, and User B then has no permission to 
> access the folder.
> Failed tests: 
> FileCacheDeleteValidationTest.setup:79 Error initializing the test: 
> /tmp/cacheFile (Permission denied)
> Tests in error: 
> IOManagerTest.channelEnumerator:54 » Runtime Could not create storage 
> director...
> Tests run: 1385, Failures: 1, Errors: 1, Skipped: 8



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3190: [FLINK-5546][build] java.io.tmpdir set as project buil...

2017-02-19 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3190
  
Sorry, I actually have to step back on this one.

I merged it into a feature branch and played around a bit with it, and it 
turns out it is no longer possible to execute tests from within the IDE. 
That is something we cannot have.

I don't know how to make this work seamlessly across both CLI builds and 
manual IDE tests. I think we are back to fixing the individual tests...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-19 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3335
  
@WangTaoTheTonic Am I right in assuming that in your scenario multiple 
different users submit Flink jobs, and that these jobs cannot be "prepared" 
by a script that sets up a dedicated checkpoint directory with the respective 
permissions?

If we see that as a use case we want to support, then I could see this as 
an optional feature of the `FsStateBackend`. The configuration for that backend 
could take an optional parameter `state.backend.fs.permissions`. If that 
parameter is non-null, the state backend applies it to the root directory. 
That way we keep the change local to the `FsStateBackend` (which is implicitly 
also used by the RocksDBStateBackend) and optional.

What do you all think about that proposal?
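
For illustration, a rough sketch of how that optional parameter could be 
applied when the checkpoint root directory is created. The parameter name is 
taken from the proposal above; the helper class is hypothetical, and Hadoop's 
FileSystem API stands in for whatever abstraction the real change would use:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;

    // Hypothetical helper, not actual Flink code.
    public final class CheckpointDirSketch {

        public static void createCheckpointRoot(Configuration conf, Path root)
                throws Exception {
            FileSystem fs = root.getFileSystem(conf);

            // proposed optional parameter; absent means today's behavior
            String permissions = conf.get("state.backend.fs.permissions");

            if (permissions != null) {
                // interpret the value as an octal mode, e.g. "700" -> rwx------
                short mode = Short.parseShort(permissions, 8);
                fs.mkdirs(root, new FsPermission(mode));
            } else {
                fs.mkdirs(root);
            }
        }
    }

Keeping the parameter null by default would preserve existing behavior for 
everyone who does not opt in.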


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reasons

2017-02-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873718#comment-15873718
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3335
  
@WangTaoTheTonic Am I right in assuming that in your scenario multiple 
different users submit Flink jobs, and that these jobs cannot be "prepared" 
by a script that sets up a dedicated checkpoint directory with the respective 
permissions?

If we see that as a use case we want to support, then I could see this as 
an optional feature of the `FsStateBackend`. The configuration for that backend 
could take an optional parameter `state.backend.fs.permissions`. If that 
parameter is non-null, the state backend applies it to the root directory. 
That way we keep the change local to the `FsStateBackend` (which is implicitly 
also used by the RocksDBStateBackend) and optional.

What do you all think about that proposal?


> change checkpoint dir permission to 700 for security reasons
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Sub-task
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without a specified permission, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower it to 700.
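
For reference, tightening an already-existing directory to 700 with the Hadoop 
FileSystem API would look roughly like the snippet below. This is an 
illustration of the intent, not the actual Flink change; `fs` and 
`checkpointDir` are assumed to be in scope:

    // assumes: org.apache.hadoop.fs.FileSystem fs;
    //          org.apache.hadoop.fs.Path checkpointDir;
    // 0700 = rwx------ : only the owner may list, read, or delete
    // the checkpoint files under this directory
    fs.setPermission(checkpointDir,
        new org.apache.hadoop.fs.permission.FsPermission((short) 0700));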



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)