[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873048#comment-15873048
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user WangTaoTheTonic commented on the issue:

https://github.com/apache/flink/pull/3335
  
I've read through everyone's comments and will list the preconditions and 
solutions for this directory permission setting.

## Preconditions
1. Every Flink job (session or single) can specify a directory for storing 
checkpoints, configured via `state.backend.fs.checkpointdir`.
2. Different jobs can set the same or different directories, which means their 
checkpoint files may be stored in one shared directory or in different ones, 
with a **sub-directory** created per job-id.
3. Jobs can be run by different users, and users require that one user cannot 
read checkpoint files written by another, as that would leak information.
4. In some conditions (which are relatively rare, I think), as @StephanEwen 
said, users need to access other users' checkpoint files for cloning/migrating 
jobs.
5. A checkpoint file path looks like: 
`hdfs://namenode:port/flink-checkpoints/<job-id>/chk-17/6ba7b810-9dad-11d1-80b4-00c04fd430c8`

## Solutions 
### Solution #1 (would not require changes)
1. Admins control the permissions of the root directory via HDFS ACLs (set 
them like: user1 can read, user2 can only read, ...).
2. This has two disadvantages: a) it is a huge burden for admins to set 
different permissions for a large number of users/groups; and b) sub-directories 
inherit permissions from the root directory, which means they are basically the 
same, making fine-grained control hard.
### Solution #2 (this proposal)
1. We don't care what the permissions of the root directory are. It can be 
created at setup time or while a job is running, as long as it is available 
to use.
2. We control every sub-directory created by the different jobs (which are 
submitted by different users, in most cases) and set it to a restrictive value 
(like `700`) to prevent it from being read by others; a sketch follows below.
3. If someone wants to migrate or clone jobs across users (again, this 
scenario is rare in my view), they should ask the admins (normally the HDFS 
admin) to add ACLs (or whatever mechanism fits) for this purpose.
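
For what it's worth, the core of Solution #2 boils down to something like the 
following (a minimal sketch against the Hadoop `FileSystem` API; the path and 
class name are made up):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class CheckpointDirPermissionSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical per-job checkpoint sub-directory under the shared root.
        Path jobDir = new Path("hdfs://namenode:8020/flink-checkpoints/job-0001");
        FileSystem fs = jobDir.getFileSystem(new Configuration());
        // mkdirs() is subject to the fs umask, so set the bits explicitly afterwards.
        fs.mkdirs(jobDir, new FsPermission("700"));
        fs.setPermission(jobDir, new FsPermission("700")); // rwx------ for the owner only
    }
}
```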


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Sub-task
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without specified permissions, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower the permissions to 700.





[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873045#comment-15873045
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r101887738
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/util/RexProgramProjectExtractor.scala
 ---
@@ -84,6 +105,39 @@ object RexProgramProjectExtractor {
 }
 
 /**
+  * A RexVisitor to extract used nested input fields
+  */
+class RefFieldAccessorVisitor(
+    usedFields: Array[Int],
+    names: List[String])
+  extends RexVisitorImpl[Unit](true) {
+
+  private val group = usedFields.toList
+  private var nestedFields = mutable.LinkedHashSet[String]()
+
+  def getNestedFields: Array[String] = nestedFields.toArray
+
+  override def visitFieldAccess(fieldAccess: RexFieldAccess): Unit = {
+    fieldAccess.getReferenceExpr match {
+      case ref: RexInputRef =>
+        nestedFields += s"${names(ref.getIndex)}.${fieldAccess.getField.getName}"
--- End diff --

Yes, the parent of a `RexFieldAccess` can also be a `RexFieldAccess`. We 
should take care of that.
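
For illustration, handling arbitrarily deep chains could look roughly like 
this (a Java sketch against Calcite's `RexFieldAccess`/`RexInputRef`; the 
helper name is made up):

```java
import java.util.List;

import org.apache.calcite.rex.RexFieldAccess;
import org.apache.calcite.rex.RexInputRef;
import org.apache.calcite.rex.RexNode;

class NestedFieldNames {
    // Flattens accesses like student.school.city by walking up the chain of
    // RexFieldAccess nodes until the underlying RexInputRef is reached.
    static String qualifiedName(RexFieldAccess access, List<String> inputNames) {
        StringBuilder suffix = new StringBuilder(access.getField().getName());
        RexNode ref = access.getReferenceExpr();
        while (ref instanceof RexFieldAccess) { // the parent may itself be a field access
            RexFieldAccess parent = (RexFieldAccess) ref;
            suffix.insert(0, parent.getField().getName() + ".");
            ref = parent.getReferenceExpr();
        }
        if (ref instanceof RexInputRef) {
            return inputNames.get(((RexInputRef) ref).getIndex()) + "." + suffix;
        }
        throw new IllegalArgumentException("unexpected reference: " + ref);
    }
}
```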


> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows:
> {code}
> trait NestedFieldsProjectableTableSource[T] {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.





[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873044#comment-15873044
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r101887737
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/NestedFieldsProjectableTableSource.scala
 ---
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.sources
+
+/**
+  * Adds support for projection push-down to a [[TableSource]] with nested fields.
+  * A [[TableSource]] extending this interface is able
+  * to project the nested fields of the return table.
+  *
+  * @tparam T The return type of the [[NestedFieldsProjectableTableSource]].
+  */
+trait NestedFieldsProjectableTableSource[T] extends ProjectableTableSource[T] {
+
+  /**
+    * Creates a copy of the [[NestedFieldsProjectableTableSource]]
+    * that projects its output on the specified nested fields.
+    *
+    * @param fields The indexes of the fields to return.
+    * @return A copy of the [[NestedFieldsProjectableTableSource]] that projects its output.
+    */
+  def projectNestedFields(fields: Array[String]): NestedFieldsProjectableTableSource[T]
--- End diff --

@fhueske, I'm fine with this, but I have some questions to make sure I 
understand it right.

Say we have a complex table schema as shown below:

```
id,
student<school<city, tuition>, age, name>,
teacher
```

Here `id, student, teacher` are the first-level columns; `student` has nested 
`school, age, name` columns, and `school` in turn has nested `city, tuition` 
columns.

If a user selects `id, student.school.city, student.age, teacher`, what 
should the actual arguments be?
`field = [0, 1, 2]` and `nestedFields = [ [], ["school.city", "age"], ["age", "name"] ]`?



> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows:
> {code}
> trait NestedFieldsProjectableTableSource[T] {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.





[jira] [Commented] (FLINK-5441) Directly allow SQL queries on a Table

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873043#comment-15873043
 ] 

ASF GitHub Bot commented on FLINK-5441:
---

Github user KurtYoung commented on the issue:

https://github.com/apache/flink/pull/3107
  
Hi @fhueske, both solutions are fine with me.


> Directly allow SQL queries on a Table
> -
>
> Key: FLINK-5441
> URL: https://issues.apache.org/jira/browse/FLINK-5441
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> Right now a user has to register a table before it can be used in SQL 
> queries. In order to allow more fluent programming we propose calling SQL 
> directly on a table. An underscore can be used to reference the current table:
> {code}
> myTable.sql("SELECT a, b, c FROM _ WHERE d = 12")
> {code}
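
For comparison, today the table must be registered under a name before it can 
be queried. A sketch of the two flows (assuming the current 
{{TableEnvironment.registerTable}} and {{sql}} methods):
{code}
// today: register the table, then reference it by name
tableEnv.registerTable("myTable", myTable);
Table result = tableEnv.sql("SELECT a, b, c FROM myTable WHERE d = 12");

// proposed: query the table directly, with '_' referencing it
Table result2 = myTable.sql("SELECT a, b, c FROM _ WHERE d = 12");
{code}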





[jira] [Commented] (FLINK-5414) Bump up Calcite version to 1.11

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873024#comment-15873024
 ] 

ASF GitHub Bot commented on FLINK-5414:
---

Github user wuchong commented on the issue:

https://github.com/apache/flink/pull/3338
  
Hi @fhueske, yes, Calcite forces a Calc after each aggregate that only 
renames fields, because we rename every aggregate in the Table API, which is 
unnecessary. I changed the logic for getting projections on aggregates to only 
rename the duplicate aggregates. That works well; no extra Calc is appended 
anymore.

Hi @haohui, the ArrayRelDataType is still NOT NULL. I reverted [that 
line](https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/FlinkTypeFactory.scala#L121),
 which does not need to be changed in this PR.

Cheers,
Jark Wu



> Bump up Calcite version to 1.11
> ---
>
> Key: FLINK-5414
> URL: https://issues.apache.org/jira/browse/FLINK-5414
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> The upcoming Calcite release 1.11 has a lot of stability fixes and new 
> features. We should update it for the Table API.
> E.g. we can hopefully merge FLINK-4864





[jira] [Commented] (FLINK-5795) Improve "UDTF" to support constructor with parameter.

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873010#comment-15873010
 ] 

ASF GitHub Bot commented on FLINK-5795:
---

Github user wuchong commented on a diff in the pull request:

https://github.com/apache/flink/pull/3330#discussion_r101885634
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/CodeGenerator.scala
 ---
@@ -1463,21 +1465,23 @@ class CodeGenerator(
    */
   def addReusableFunction(function: UserDefinedFunction): String = {
     val classQualifier = function.getClass.getCanonicalName
-    val fieldTerm = s"function_${classQualifier.replace('.', '$')}"
+    val functionSerializedData = serialize(function)
+    val fieldTerm =
+      s"""
+         |function_${classQualifier.replace('.', '$')}_${DigestUtils.md5Hex(functionSerializedData)}
--- End diff --

I find that the md5Hex string in the fieldTerm is never used. What about 
using `CodeGenUtils.newName` to generate a new function field name (as shown 
below)? It is a common usage in `CodeGenerator`, it guarantees there are no 
naming collisions, and the generated name is more readable. What do you think 
@sunjincheng121 @fhueske?

```
CodeGenUtils.newName(s"function_${classQualifier.replace('.', '$')}")
```

Regarding the other PR for scalar UDFs, I think you are right. We can do that 
in this PR.
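
For reference, the idea behind a `newName`-style helper is just a 
monotonically increasing counter appended to the prefix (an illustrative Java 
sketch, not Flink's actual `CodeGenUtils` code):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Each call returns the prefix plus a strictly increasing counter,
// so two generated field names can never collide.
class NameGenerator {
    private static final AtomicInteger COUNTER = new AtomicInteger();

    static String newName(String prefix) {
        return prefix + "$" + COUNTER.getAndIncrement(); // e.g. function_org$example$MyUdf$0
    }
}
```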


> Improve "UDTF" to support constructor with parameter.
> -
>
> Key: FLINK-5795
> URL: https://issues.apache.org/jira/browse/FLINK-5795
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>






[jira] [Commented] (FLINK-3679) DeserializationSchema should handle zero or more outputs for every input

2017-02-17 Thread Tzu-Li (Gordon) Tai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873009#comment-15873009
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-3679:


Hi [~wheat9], sure! I've noticed your PR, and will schedule some time next week 
to review it ;-)
Thank you for the reminder.

> DeserializationSchema should handle zero or more outputs for every input
> 
>
> Key: FLINK-3679
> URL: https://issues.apache.org/jira/browse/FLINK-3679
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API, Kafka Connector
>Reporter: Jamie Grier
>Assignee: Haohui Mai
>
> There are a couple of issues with the DeserializationSchema API that I think 
> should be improved.  This request has come to me via an existing Flink user.
> The main issue is simply that the API assumes that there is a one-to-one 
> mapping between input and outputs.  In reality there are scenarios where one 
> input message (say from Kafka) might actually map to zero or more logical 
> elements in the pipeline.
> Particularly important here is the case where you receive a message from a 
> source (such as Kafka) and say the raw bytes don't deserialize properly.  
> Right now the only recourse is to throw IOException and therefore fail the 
> job.  
> This is definitely not good since bad data is a reality and failing the job 
> is not the right option.  If the job fails we'll just end up replaying the 
> bad data and the whole thing will start again.
> Instead in this case it would be best if the user could just return the empty 
> set.
> The other case is where one input message should logically be multiple output 
> messages.  This case is probably less important since there are other ways to 
> do this but in general it might be good to make the 
> DeserializationSchema.deserialize() method return a collection rather than a 
> single element.
> Maybe we need to support a DeserializationSchema variant that has semantics 
> more like that of FlatMap.
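
A collector-based variant would make the zero-or-more mapping explicit. A 
sketch (a hypothetical interface for illustration, not the current Flink API):
{code}
import java.io.IOException;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

import org.apache.flink.util.Collector;

// Hypothetical flatMap-style schema: emit zero, one, or many records per input.
interface FlatDeserializationSchema<T> extends Serializable {
    void deserialize(byte[] message, Collector<T> out) throws IOException;
}

// Example: skip corrupt records instead of failing the job.
class SafeStringSchema implements FlatDeserializationSchema<String> {
    @Override
    public void deserialize(byte[] message, Collector<String> out) {
        if (message == null || message.length == 0) {
            return; // bad data: emit nothing rather than throwing and replaying forever
        }
        out.collect(new String(message, StandardCharsets.UTF_8));
    }
}
{code}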





[jira] [Commented] (FLINK-5353) Elasticsearch Sink loses well-formed documents when there are malformed documents

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873008#comment-15873008
 ] 

ASF GitHub Bot commented on FLINK-5353:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/3246
  
Hi @fpompermaier, sorry I was busy with other stuff over the last week.
I hope to work towards merging this by the end of next week!


> Elasticsearch Sink loses well-formed documents when there are malformed 
> documents
> -
>
> Key: FLINK-5353
> URL: https://issues.apache.org/jira/browse/FLINK-5353
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.1.3
>Reporter: Flavio Pompermaier
>Assignee: Tzu-Li (Gordon) Tai
>






[jira] [Commented] (FLINK-5682) Fix scala version in flink-streaming-scala POM file

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872986#comment-15872986
 ] 

ASF GitHub Bot commented on FLINK-5682:
---

Github user billliuatuber closed the pull request at:

https://github.com/apache/flink/pull/3231


> Fix scala version in  flink-streaming-scala POM file
> 
>
> Key: FLINK-5682
> URL: https://issues.apache.org/jira/browse/FLINK-5682
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Bill Liu
>  Labels: build, easyfix
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> In flink-streaming-scala, the Scala library version is not defined, so when 
> building Flink for Scala 2.10 it may still include Scala 2.11.
> {quote}
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-reflect</artifactId>
> </dependency>
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-library</artifactId>
> </dependency>
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-compiler</artifactId>
> </dependency>
> {quote}
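
The fix would be to pin the Scala version explicitly for these dependencies, 
e.g. (a sketch, assuming the build's {{scala.version}} property; not 
necessarily the exact committed change):
{code}
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>${scala.version}</version>
</dependency>
{code}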





[jira] [Updated] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-17 Thread shijinkui (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shijinkui updated FLINK-5818:
-
Issue Type: Sub-task  (was: Improvement)
Parent: FLINK-5839

> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Sub-task
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without specified permissions, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower the permissions to 700.





[jira] [Updated] (FLINK-5546) java.io.tmpdir setted as project build directory in surefire plugin

2017-02-17 Thread shijinkui (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shijinkui updated FLINK-5546:
-
Issue Type: Sub-task  (was: Test)
Parent: FLINK-5839

> java.io.tmpdir setted as project build directory in surefire plugin
> ---
>
> Key: FLINK-5546
> URL: https://issues.apache.org/jira/browse/FLINK-5546
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
> Environment: CentOS 7.2
>Reporter: Syinchwun Leo
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> When multiple Linux users run tests at the same time, the flink-runtime module 
> may fail: User A creates /tmp/cacheFile, and User B then has no permission to 
> access the folder.
> Failed tests: 
> FileCacheDeleteValidationTest.setup:79 Error initializing the test: 
> /tmp/cacheFile (Permission denied)
> Tests in error: 
> IOManagerTest.channelEnumerator:54 » Runtime Could not create storage 
> director...
> Tests run: 1385, Failures: 1, Errors: 1, Skipped: 8
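
One remedy along the lines of the title is to point {{java.io.tmpdir}} at the 
per-module build directory in Surefire, so concurrent users never share /tmp 
(an illustrative sketch, not necessarily the exact committed configuration):
{code}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>
      <java.io.tmpdir>${project.build.directory}</java.io.tmpdir>
    </systemPropertyVariables>
  </configuration>
</plugin>
{code}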





[jira] [Updated] (FLINK-5640) configure the explicit Unit Test file suffix

2017-02-17 Thread shijinkui (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shijinkui updated FLINK-5640:
-
Issue Type: Sub-task  (was: Test)
Parent: FLINK-5839

> configure the explicit Unit Test file suffix
> 
>
> Key: FLINK-5640
> URL: https://issues.apache.org/jira/browse/FLINK-5640
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: shijinkui
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> There are four types of unit test files: *ITCase.java, *Test.java, 
> *ITSuite.scala, *Suite.scala.
> A file name ending with "IT.java" is an integration test; a file name ending 
> with "Test.java" is a unit test.
> It would be clearer for the Surefire plugin's default-test execution to 
> declare explicitly that "*Test.*" are the Java unit tests (a configuration 
> sketch follows the statistics below).
> The test file statistics below:
> * Suite  total: 10
> * ITCase  total: 378
> * Test  total: 1008
> * ITSuite  total: 14
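
An explicit declaration in the Surefire default-test execution could look 
like this (a sketch, not the exact committed configuration):
{code}
<includes>
  <include>**/*Test.*</include>
</includes>
{code}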





[jira] [Updated] (FLINK-5839) Flink Security problem collection

2017-02-17 Thread shijinkui (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shijinkui updated FLINK-5839:
-
Summary: Flink Security problem collection  (was: Flink Security in 
Huawei's use case)

> Flink Security problem collection
> -
>
> Key: FLINK-5839
> URL: https://issues.apache.org/jira/browse/FLINK-5839
> Project: Flink
>  Issue Type: Improvement
>Reporter: shijinkui
>
> This issue collects some security problems found in Huawei's use cases.





[jira] [Created] (FLINK-5839) Flink Security in Huawei's use case

2017-02-17 Thread shijinkui (JIRA)
shijinkui created FLINK-5839:


 Summary: Flink Security in Huawei's use case
 Key: FLINK-5839
 URL: https://issues.apache.org/jira/browse/FLINK-5839
 Project: Flink
  Issue Type: Improvement
Reporter: shijinkui


This issue collects some security problems found in Huawei's use cases.





[jira] [Commented] (FLINK-5739) NullPointerException in CliFrontend

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872914#comment-15872914
 ] 

ASF GitHub Bot commented on FLINK-5739:
---

Github user clarkyzl commented on the issue:

https://github.com/apache/flink/pull/3292
  
Thanks a lot


> NullPointerException in CliFrontend
> ---
>
> Key: FLINK-5739
> URL: https://issues.apache.org/jira/browse/FLINK-5739
> Project: Flink
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.3.0
> Environment: Mac OS X 10.12.2, Java 1.8.0_92-b14
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>  Labels: newbie, starter
> Fix For: 1.3.0
>
>
> I've run a simple program on a local cluster. It always fails with code 
> version 1.3-SNAPSHOT, commit e24a866.
> {quote}
> Zhuoluos-MacBook-Pro:build-target zhuoluo.yzl$ bin/flink run -c 
> com.alibaba.blink.TableApp ~/gitlab/tableapp/target/tableapp-1.0-SNAPSHOT.jar 
> Cluster configuration: Standalone cluster with JobManager at 
> localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
> 
>  The program finished with the following exception:
> java.lang.NullPointerException
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:845)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1076)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120)
> {quote}
> I don't think there should be a NullPointerException here, even if you forgot 
> the "execute()" call.
> The reproducing code looks like following:
> {code:java}
> public static void main(String[] args) throws Exception {
>     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     DataSource<String> customer = env.readTextFile("/Users/zhuoluo.yzl/customer.tbl");
>     customer.filter(new FilterFunction<String>() {
>         public boolean filter(String value) throws Exception {
>             return true;
>         }
>     })
>     .writeAsText("/Users/zhuoluo.yzl/customer.txt");
>     //env.execute();
> }
> {code}
> We can use *start-cluster.sh* on a *local* computer to reproduce the problem.





[jira] [Updated] (FLINK-4819) Checkpoint metadata+data inspection tool (view / update)

2017-02-17 Thread Zhenzhong Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenzhong Xu updated FLINK-4819:

Description: 
Checkpoint inspection tool for operationalization, troubleshooting, 
diagnostics, etc., or performing brain surgery.
If the tool can be done in a way that's programmatically accessible, that'll be 
great for automating some of the validations.

  was:Checkpoint inspection tool for operationalization, troubleshooting, 
diagnostics, etc., or performing brain surgery.


> Checkpoint metadata+data inspection tool (view / update)
> 
>
> Key: FLINK-4819
> URL: https://issues.apache.org/jira/browse/FLINK-4819
> Project: Flink
>  Issue Type: Task
>  Components: State Backends, Checkpointing
>Reporter: Zhenzhong Xu
>
> Checkpoint inspection tool for operationalization, troubleshooting, 
> diagnostics, etc., or performing brain surgery.
> If the tool can be done in a way that's programmatically accessible, that'll 
> be great for automating some of the validations.





[jira] [Commented] (FLINK-5414) Bump up Calcite version to 1.11

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872765#comment-15872765
 ] 

ASF GitHub Bot commented on FLINK-5414:
---

Github user haohui commented on the issue:

https://github.com/apache/flink/pull/3338
  
Looks good to me overall. One question -- I wonder, does it mean that all 
array types become nullable after this change?


> Bump up Calcite version to 1.11
> ---
>
> Key: FLINK-5414
> URL: https://issues.apache.org/jira/browse/FLINK-5414
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> The upcoming Calcite release 1.11 has a lot of stability fixes and new 
> features. We should update it for the Table API.
> E.g. we can hopefully merge FLINK-4864





[jira] [Resolved] (FLINK-4660) HadoopFileSystem (with S3A) may leak connections, which cause job to stuck in a restarting loop

2017-02-17 Thread Zhenzhong Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenzhong Xu resolved FLINK-4660.
-
Resolution: Fixed

> HadoopFileSystem (with S3A) may leak connections, which cause job to stuck in 
> a restarting loop
> ---
>
> Key: FLINK-4660
> URL: https://issues.apache.org/jira/browse/FLINK-4660
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Reporter: Zhenzhong Xu
>Priority: Critical
> Attachments: Screen Shot 2016-09-20 at 2.49.14 PM.png, Screen Shot 
> 2016-09-20 at 2.49.32 PM.png
>
>
> A Flink job with checkpoints enabled and configured to use the S3A file system 
> backend sometimes experiences checkpointing failures due to S3 consistency 
> issues. This behavior is also reported by other people and documented in 
> https://issues.apache.org/jira/browse/FLINK-4218.
> This problem gets magnified by the current HadoopFileSystem implementation, 
> which can potentially leak S3 client connections and eventually get into a 
> restarting loop with a "Timeout waiting for a connection from pool" exception 
> thrown from the AWS client.
> I looked at the code; it seems HadoopFileSystem.java never invokes the close() 
> method on the fs object upon failure, but the FileSystem may be re-initialized 
> every time the job gets restarted.
> Some evidence I observed:
> 1. When I set the connection pool limit to 128, the command below shows 128 
> connections stuck in the CLOSE_WAIT state.
> !Screen Shot 2016-09-20 at 2.49.14 PM.png|align=left, vspace=5! 
> 2. Task manager logs indicate that the state backend file system is 
> consistently re-initialized upon job restart.
> !Screen Shot 2016-09-20 at 2.49.32 PM.png!
> 3. Logs indicate an NPE during cleanup of a stream task, which was caused by a 
> "Timeout waiting for connection from pool" exception when trying to create a 
> directory in the S3 bucket.
> 2016-09-02 08:17:50,886 ERROR 
> org.apache.flink.streaming.runtime.tasks.StreamTask - Error during cleanup of 
> stream task
> java.lang.NullPointerException
> at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.cleanup(OneInputStreamTask.java:73)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:323)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:589)
> at java.lang.Thread.run(Thread.java:745)
> 4. It appears that from invoking the checkpointing operation to handling 
> failure, StreamTask has no logic associated with closing the Hadoop FileSystem 
> object (which internally includes the S3 AWS client object), which resides in 
> HadoopFileSystem.java.
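
One way to avoid the leak would be to use an uncached instance and close it in 
a finally block (an illustrative sketch with made-up names, not Flink's actual 
fix):
{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class UncachedFsSketch {
    static void checkpointIo() throws java.io.IOException {
        // newInstance() bypasses Hadoop's FileSystem cache, so closing this
        // instance cannot break other users of the same URI.
        FileSystem fs = FileSystem.newInstance(URI.create("s3a://bucket/checkpoints"),
                new Configuration());
        try {
            fs.mkdirs(new Path("/checkpoints/job-0001")); // any checkpoint I/O
        } finally {
            fs.close(); // releases pooled S3 client connections even on failure
        }
    }
}
{code}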





[jira] [Closed] (FLINK-4660) HadoopFileSystem (with S3A) may leak connections, which cause job to stuck in a restarting loop

2017-02-17 Thread Zhenzhong Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenzhong Xu closed FLINK-4660.
---

> HadoopFileSystem (with S3A) may leak connections, which cause job to stuck in 
> a restarting loop
> ---
>
> Key: FLINK-4660
> URL: https://issues.apache.org/jira/browse/FLINK-4660
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Reporter: Zhenzhong Xu
>Priority: Critical
> Attachments: Screen Shot 2016-09-20 at 2.49.14 PM.png, Screen Shot 
> 2016-09-20 at 2.49.32 PM.png
>





[jira] [Commented] (FLINK-4660) HadoopFileSystem (with S3A) may leak connections, which cause job to stuck in a restarting loop

2017-02-17 Thread Zhenzhong Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872727#comment-15872727
 ] 

Zhenzhong Xu commented on FLINK-4660:
-

This is fixed, closing.

> HadoopFileSystem (with S3A) may leak connections, which cause job to stuck in 
> a restarting loop
> ---
>
> Key: FLINK-4660
> URL: https://issues.apache.org/jira/browse/FLINK-4660
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Reporter: Zhenzhong Xu
>Priority: Critical
> Attachments: Screen Shot 2016-09-20 at 2.49.14 PM.png, Screen Shot 
> 2016-09-20 at 2.49.32 PM.png
>





[GitHub] flink pull request #3353: Multithreaded `DataSet#flatMap` function

2017-02-17 Thread mohamagdy
GitHub user mohamagdy opened a pull request:

https://github.com/apache/flink/pull/3353

Multithreaded `DataSet#flatMap` function

# Multi-Threaded FlatMap

## Overview

The DataStream#flatMap function takes a FlatMapFunction interface that has 
a method named flatMap that gets called by DataStream#flatMap.

The FlatMapFunction#flatMap method takes an element (a DataStream record) and 
does some transformation on that element. For example, if the element is a 
string, the flatMap function can implement the logic for splitting the element 
by spaces or converting it to upper case.

The current implementation of DataStream#flatMap uses a single thread to 
transform elements from one form to another. The idea of this change is to 
introduce a new DataStream#flatMap API method that takes a parallelism value 
for transforming the input elements.

## Implementation Details

The following diagram shows the multithreaded `flatMap` function. Assume in 
the diagram that the parallelism (maximum thread pool size) is set to `3` (3 
threads can run transformations on input elements).

Briefly, when the `flatMap` function receives an element it pushes it to a 
_buffer_, then a thread is spawned per element (up to the pool limit) to do 
the transformation and write back to the _output_.

The _output_ is thread-safe and only 1 thread can write at a time; an 
`Object` is used as a lock guarding the output.

The _buffer_ is used to accumulate elements so that when the _snapshot_ job 
runs while an element transformation is not yet finished, the _buffer_ writes 
all its elements, serialized, into an _output stream_. When a _restore_ is 
called, the buffered elements are deserialized and the transformation is run 
again, because their output state was not taken into account when the snapshot 
was taken.
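
A condensed sketch of the described design (class and method names are 
illustrative and the snapshot/restore buffering is omitted; this is not the 
exact code in the PR):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class MultiThreadedFlatMapSketch extends RichFlatMapFunction<String, String> {
    private final int parallelism;
    private transient ExecutorService pool;
    private transient Object outputLock;

    public MultiThreadedFlatMapSketch(int parallelism) {
        this.parallelism = parallelism;
    }

    @Override
    public void open(Configuration parameters) {
        pool = Executors.newFixedThreadPool(parallelism); // e.g. 3 worker threads
        outputLock = new Object();
    }

    @Override
    public void flatMap(String value, Collector<String> out) {
        pool.submit(() -> {
            String transformed = value.toUpperCase(); // the per-element transformation
            synchronized (outputLock) {               // only one thread writes at a time
                out.collect(transformed);
            }
        });
    }

    @Override
    public void close() throws Exception {
        pool.shutdown();                              // stop accepting new elements
        pool.awaitTermination(1, TimeUnit.MINUTES);   // block until in-flight work finishes
    }
}
```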

![multithreaded flatmap](https://cloud.githubusercontent.com/assets/1228432/23085859/699a2926-f56a-11e6-9146-1be213c7.png)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mohamagdy/flink parallel-dataset-flatmap

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3353.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3353


commit bf2c710fe8af2be3475d66221f9e6b8e2090bbfc
Author: Mohamed Magdy 
Date:   2017-02-12T21:35:06Z

[FLINK-] Fix JavaDoc class name

commit bbee6c580815190139f289d0309bfdc2f4ca83c6
Author: Mohamed Magdy 
Date:   2017-02-12T21:36:09Z

[FLINK-] Override `close()` method

In order to introduce threads for processing `flatMap` elements,
the `close` method in `StreamFlatMap` is overridden to shut down
the thread pool.

commit e0aff5e241aff97426d8c10e84ba5a932668d815
Author: Mohamed Magdy 
Date:   2017-02-13T20:09:15Z

[FLINK-] Organize imports in test files

Ran Intellij organize imports

commit 73803bc8fc89f48691b4c0270b0061377a3e7e38
Author: Mohamed Magdy 
Date:   2017-02-14T15:08:30Z

[FLINK-] Fix typo in tests

commit b9343317eb35ef41aba82c70648dc8fd8767273e
Author: Mohamed Magdy 
Date:   2017-02-14T22:21:46Z

[FLINK-] Call `close` after `processElement` in tests

In order to follow the flow described in `StreamOperator` interface for the 
`close` method
which says the following:

```
This method is called after all records have been added to the operators 
via the methods
{@link 
org.apache.flink.streaming.api.operators.OneInputStreamOperator#processElement(StreamRecord)},
 or
{@link 
org.apache.flink.streaming.api.operators.TwoInputStreamOperator#processElement1(StreamRecord)}
 and
{@link 
org.apache.flink.streaming.api.operators.TwoInputStreamOperator#processElement2(StreamRecord)}.
```

commit fd0c28adceca9fbe5fc9e3ffd93679495edca02b
Author: Mohamed Magdy 
Date:   2017-02-14T23:46:01Z

[FLINK-] Add tests for multi-threaded `DataStream` `flatMap`

Tweak tests to check results of `flatMap` when processing elements of 
`DataStream`
in multiple threads.

commit cbe2fa8059272d8cb49e50056466564d6d8c322d
Author: Mohamed Magdy 
Date:   2017-02-14T23:50:45Z

[FLINK-] Add multiple threads for processing `flatMap` elements

Add an option to the `DataStream` `flatMap` function that sets the parallelism
of processing the `DataStream` elements. This helps in cases where processing
an element of the `DataStream` would block the main thread at the expense of
other elements.

The main thread is blocked until all the elements are processed and all the 
threads
finish.



[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872581#comment-15872581
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/3335
  
I agree with @StephanEwen that people probably manage the directory 
permissions directly when configuring the Flink jobs. It would be quite 
annoying if the Flink job changed the permissions you set from somewhere else.


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Improvement
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Currently the checkpoint directory is created without specified permissions, 
> so it is easy for another user to delete or read files under it, which can 
> cause restore failures or information leaks.
> It's better to lower the permissions to 700.





[jira] [Commented] (FLINK-5763) Make savepoints self-contained and relocatable

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872532#comment-15872532
 ] 

ASF GitHub Bot commented on FLINK-5763:
---

Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/3346
  
So will the same savepoint logic apply to externalized checkpoints? I think 
it would be good to have a similar way of restoring from checkpoints and 
savepoints from a usability perspective. 

I usually set the externalized checkpoint dir the same as the savepoint 
dir, to make it easy to write scripts to get the latest and restart.

Otherwise I think the changes make a lot of sense and the directory 
structure looks very reasonable. 

One question, in the directory name could we maybe use the checkpoint id 
instead of the random suffix or something more predictable? Maybe the 
checkpoint date?


> Make savepoints self-contained and relocatable
> --
>
> Key: FLINK-5763
> URL: https://issues.apache.org/jira/browse/FLINK-5763
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>
> After a user has triggered a savepoint, a single savepoint file will be 
> returned as a handle to the savepoint. A savepoint to {{}} creates a 
> savepoint file like {{/savepoint-}}.
> This file contains the metadata of the corresponding checkpoint, but not the 
> actual program state. While this works well for short term management 
> (pause-and-resume a job), it makes it hard to manage savepoints over longer 
> periods of time.
> h4. Problems
> h5. Scattered Checkpoint Files
> For file system based checkpoints (FsStateBackend, RocksDBStateBackend) this 
> results in the savepoint referencing files from the checkpoint directory 
> (usually different than ). For users, it is virtually impossible to 
> tell which checkpoint files belong to a savepoint and which are lingering 
> around. This can easily lead to accidentally invalidating a savepoint by 
> deleting checkpoint files.
> h5. Savepoints Not Relocatable
> Even if a user is able to figure out which checkpoint files belong to a 
> savepoint, moving these files will invalidate the savepoint as well, because 
> the metadata file references absolute file paths.
> h5. Forced to Use CLI for Disposal
> Because of the scattered files, the user is in practice forced to use Flink’s 
> CLI to dispose a savepoint. This should be possible to handle in the scope of 
> the user’s environment via a file system delete operation.
> h4. Proposal
> In order to solve the described problems, savepoints should contain all their 
> state, both metadata and program state, inside a single directory. 
> Furthermore the metadata must only hold relative references to the checkpoint 
> files. This makes it obvious which files make up the state of a savepoint and 
> it is possible to move savepoints around by moving the savepoint directory.
> h5. Desired File Layout
> Triggering a savepoint to {{}} creates a directory as follows:
> {code}
> /savepoint--
>   +-- _metadata
>   +-- data- [1 or more]
> {code}
> We include the JobID in the savepoint directory name in order to give some 
> hints about which job a savepoint belongs to.
> h5. CLI
> - Trigger: When triggering a savepoint to {{<target-directory>}}, the savepoint 
> directory will be returned as the handle to the savepoint.
> - Restore: Users can restore by pointing to the directory or the _metadata 
> file. The data files should be required to be in the same directory as the 
> _metadata file.
> - Dispose: The disposal command should be deprecated and eventually removed. 
> While deprecated, disposal can happen by specifying the directory or the 
> _metadata file (same as restore).
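
To make the intended workflow concrete, a short sketch with illustrative 
paths and names:

{code}
# trigger: the savepoint directory itself is returned as the handle
flink savepoint <job-id> hdfs:///savepoints

# restore: point to the directory (or to its _metadata file)
flink run -s hdfs:///savepoints/savepoint-<job-id>-<random-suffix> my-job.jar

# dispose: with a self-contained savepoint, a plain filesystem delete suffices
hdfs dfs -rm -r hdfs:///savepoints/savepoint-<job-id>-<random-suffix>
{code}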



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872516#comment-15872516
 ] 

ASF GitHub Bot commented on FLINK-5818:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3335
  
The HDFS administrator can configure the parent directory for checkpoints 
with user and/or group ACL permissions. A default ACL is then inherited by the 
newly created files and subdirectories therein. If you create an ACL which 
blocks access for `group` and `other`, the effective permissions are the 
requested `700`.
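
For reference, a minimal sketch of that setup with the HDFS CLI (the path is 
illustrative, and ACLs must be enabled on the NameNode):

{code}
# block group/other on everything created under the checkpoint root
hdfs dfs -setfacl -m default:group::---,default:other::--- /flink-checkpoints

# verify; newly created job sub-directories inherit the default entries
hdfs dfs -getfacl /flink-checkpoints
{code}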


> change checkpoint dir permission to 700 for security reason
> ---
>
> Key: FLINK-5818
> URL: https://issues.apache.org/jira/browse/FLINK-5818
> Project: Flink
>  Issue Type: Improvement
>  Components: Security, State Backends, Checkpointing
>Reporter: Tao Wang
>
> Now the checkpoint directory is created without a specified permission, so it 
> is easy for another user to delete or read files under it, which can cause 
> restore failures or information leaks.
> It's better to lower it to 700.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...

2017-02-17 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3335
  
The HDFS administrator can configure the parent directory for checkpoints 
with user and/or group ACL permissions. A default ACL is then inherited by the 
newly created files and subdirectories therein. If you create an ACL which 
blocks access for `group` and `other`, the effective permissions are the 
requested `700`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3679) DeserializationSchema should handle zero or more outputs for every input

2017-02-17 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872506#comment-15872506
 ] 

Haohui Mai commented on FLINK-3679:
---

[~tzulitai] -- would you mind taking a look?

> DeserializationSchema should handle zero or more outputs for every input
> 
>
> Key: FLINK-3679
> URL: https://issues.apache.org/jira/browse/FLINK-3679
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API, Kafka Connector
>Reporter: Jamie Grier
>Assignee: Haohui Mai
>
> There are a couple of issues with the DeserializationSchema API that I think 
> should be improved.  This request has come to me via an existing Flink user.
> The main issue is simply that the API assumes that there is a one-to-one 
> mapping between input and outputs.  In reality there are scenarios where one 
> input message (say from Kafka) might actually map to zero or more logical 
> elements in the pipeline.
> Particularly important here is the case where you receive a message from a 
> source (such as Kafka) and say the raw bytes don't deserialize properly.  
> Right now the only recourse is to throw IOException and therefore fail the 
> job.  
> This is definitely not good since bad data is a reality and failing the job 
> is not the right option.  If the job fails we'll just end up replaying the 
> bad data and the whole thing will start again.
> Instead, in this case, it would be best if the user could just return an empty 
> set.
> The other case is where one input message should logically be multiple output 
> messages.  This case is probably less important, since there are other ways to 
> do this, but in general it might be good to make the 
> DeserializationSchema.deserialize() method return a collection rather than a 
> single element.
> Maybe we need to support a DeserializationSchema variant that has semantics 
> more like that of FlatMap.
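
A minimal sketch of what such a variant could look like; the interface name, 
method shape, and validity check are illustrative assumptions, not the actual 
Flink API:

{code:java}
import java.io.IOException;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Collection;

// Hypothetical FlatMap-style schema: zero or more records per raw message.
interface FlatDeserializationSchema<T> extends Serializable {
    void deserialize(byte[] message, Collection<T> out) throws IOException;
}

// Example: drop malformed records instead of throwing and failing the job.
class SkipInvalidRecordsSchema implements FlatDeserializationSchema<String> {
    @Override
    public void deserialize(byte[] message, Collection<String> out) {
        String record = new String(message, StandardCharsets.UTF_8);
        if (record.startsWith("{")) {   // crude well-formedness check, illustration only
            out.add(record);            // good record: exactly one element
        }                               // bad record: emit nothing, job keeps running
    }
}
{code}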



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5838) Print shell script usage

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872400#comment-15872400
 ] 

ASF GitHub Bot commented on FLINK-5838:
---

GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/3352

[FLINK-5838] [scripts] Print shell script usage

If jobmanager.sh, taskmanager.sh, or zookeeper.sh are called without 
arguments, the argument list for the call to flink-daemon.sh is misaligned and 
the usage for flink-daemon.sh is displayed to the user.

Adds a check to each script for a valid action; otherwise the proper usage 
string is displayed.

Note: this PR conflicts with the PR for FLINK-4326 which adds a 
"start-foreground" action.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 5838_print_shell_script_usage

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3352.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3352


commit 0441164935a34032523040864ec18ee01b4adb5b
Author: Greg Hogan 
Date:   2017-02-17T19:38:31Z

[FLINK-5838] [scripts] Print shell script usage

If jobmanager.sh, taskmanager.sh, or zookeeper.sh are called without
arguments, the argument list for the call to flink-daemon.sh is
misaligned and the usage for flink-daemon.sh is displayed to the user.

Adds a check to each script for a valid action; otherwise the proper
usage string is displayed.




> Print shell script usage
> 
>
> Key: FLINK-5838
> URL: https://issues.apache.org/jira/browse/FLINK-5838
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.3.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
> Fix For: 1.3.0
>
>
> {code}
> $ ./bin/jobmanager.sh 
> Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) 
> (jobmanager|taskmanager|zookeeper) [args].
> {code}
> The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are 
> misaligned when {{$STARTSTOP}} is the null string.
> {code}
> "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}"
> {code}
> Same issue in {{taskmanager.sh}} and {{zookeeper.sh}}.
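
A minimal sketch of such a guard; the accepted actions and the usage text 
vary per script:

{code}
STARTSTOP=$1

case $STARTSTOP in
    (start|stop|stop-all)
        ;;
    (*)
        echo "Usage: jobmanager.sh (start|stop|stop-all) [args]"
        exit 1
        ;;
esac
{code}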



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3352: [FLINK-5838] [scripts] Print shell script usage

2017-02-17 Thread greghogan
GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/3352

[FLINK-5838] [scripts] Print shell script usage

If jobmanager.sh, taskmanager.sh, or zookeeper.sh are called without 
arguments, the argument list for the call to flink-daemon.sh is misaligned and 
the usage for flink-daemon.sh is displayed to the user.

Adds a check to each script for a valid action; otherwise the proper usage 
string is displayed.

Note: this PR conflicts with the PR for FLINK-4326 which adds a 
"start-foreground" action.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 5838_print_shell_script_usage

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3352.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3352


commit 0441164935a34032523040864ec18ee01b4adb5b
Author: Greg Hogan 
Date:   2017-02-17T19:38:31Z

[FLINK-5838] [scripts] Print shell script usage

If jobmanager.sh, taskmanager.sh, or zookeeper.sh are called without
arguments, the argument list for the call to flink-daemon.sh is
misaligned and the usage for flink-daemon.sh is displayed to the user.

Adds a check to each script for a valid action; otherwise the proper
usage string is displayed.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-5838) Print shell script usage

2017-02-17 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan updated FLINK-5838:
--
Description: 
{code}
$ ./bin/jobmanager.sh 
Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) 
(jobmanager|taskmanager|zookeeper) [args].
{code}

The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are misaligned 
when {{$STARTSTOP}} is the null string.

{code}
"${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}"
{code}

Same issue in {{taskmanager.sh}} and {{zookeeper.sh}}.

  was:
{code}
$ ./bin/jobmanager.sh 
Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) 
(jobmanager|taskmanager|zookeeper) [args].
{code}

The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are misaligned 
when {{$STARTSTOP}} is the null string.

{code}
"${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}"
{code}


> Print shell script usage
> 
>
> Key: FLINK-5838
> URL: https://issues.apache.org/jira/browse/FLINK-5838
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.3.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
> Fix For: 1.3.0
>
>
> {code}
> $ ./bin/jobmanager.sh 
> Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) 
> (jobmanager|taskmanager|zookeeper) [args].
> {code}
> The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are 
> misaligned when {{$STARTSTOP}} is the null string.
> {code}
> "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}"
> {code}
> Same issue in {{taskmanager.sh}} and {{zookeeper.sh}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-5838) Print shell script usage

2017-02-17 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan updated FLINK-5838:
--
Summary: Print shell script usage  (was: Fix jobmanager.sh usage)

> Print shell script usage
> 
>
> Key: FLINK-5838
> URL: https://issues.apache.org/jira/browse/FLINK-5838
> Project: Flink
>  Issue Type: Bug
>  Components: Startup Shell Scripts
>Affects Versions: 1.3.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
> Fix For: 1.3.0
>
>
> {code}
> $ ./bin/jobmanager.sh 
> Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) 
> (jobmanager|taskmanager|zookeeper) [args].
> {code}
> The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are 
> misaligned when {{$STARTSTOP}} is the null string.
> {code}
> "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}"
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #2982: [FLINK-4460] Side Outputs in Flink

2017-02-17 Thread chenqin
Github user chenqin commented on the issue:

https://github.com/apache/flink/pull/2982
  
@aljoscha Looks good to me 👍

I briefly looked at your git branch; a minor comment would be adding 
comments to `sideOutputLateData` so users get a better idea of what happens 
when they opt in to the late-arriving event stream. 

Whether an event is late is initially decided by comparing the watermark and 
the event time. Do you think there is a need to let the user pass some kind 
of `Evaluator` and side-output any kind of elements? (A rough sketch of such 
a hook follows below.)

`window.sideOutput(OutputTag, Evaluator)`

`interface Evaluator{ MergedWindows, key, watermark}`

- Regarding `split`/`select`, I think there is a chance to consolidate 
`select` and build upon `OutputTag`, but that might be out of this PR's scope.
- Regarding `WindowStream`, I am a bit confused about what happens if I use 
`allowedLateness` and `sideOutputLateData` at the same time.

Thanks,
Chen
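
A minimal sketch of the kind of `Evaluator` hook being proposed; every name 
here is hypothetical, not part of the PR:

{code:java}
import java.io.Serializable;

// Hypothetical user-supplied predicate: decide, per element, whether it
// should go to the given side output.
interface Evaluator<T, W> extends Serializable {
    boolean shouldSideOutput(T element, W window, long elementTimestamp, long currentWatermark);
}

// Example: side-output everything more than one minute behind the watermark.
class VeryLateEvaluator<T, W> implements Evaluator<T, W> {
    @Override
    public boolean shouldSideOutput(T element, W window, long elementTimestamp, long currentWatermark) {
        return elementTimestamp < currentWatermark - 60_000;
    }
}
{code}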





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5747) Eager Scheduling should deploy all Tasks together

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872366#comment-15872366
 ] 

ASF GitHub Bot commented on FLINK-5747:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/3295#discussion_r101829659
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ---
@@ -754,6 +759,139 @@ public void scheduleForExecution(SlotProvider 
slotProvider) throws JobException
}
}
 
+   private void scheduleLazy(SlotProvider slotProvider) throws 
NoResourceAvailableException {
+   // simply take the vertices without inputs.
+   for (ExecutionJobVertex ejv : this.tasks.values()) {
+   if (ejv.getJobVertex().isInputVertex()) {
+   ejv.scheduleAll(slotProvider, 
allowQueuedScheduling);
+   }
+   }
+   }
+
+   /**
+* 
+* 
+* @param slotProvider  The resource provider from which the slots are 
allocated
+* @param timeout   The maximum time that the deployment may take, 
before a
+*  TimeoutException is thrown.
+*/
+   private void scheduleEager(SlotProvider slotProvider, final Time 
timeout) {
+   checkState(state == JobStatus.RUNNING, "job is not running 
currently");
+
+   // Important: reserve all the space we need up front.
+   // that way we do not have any operation that can fail between 
allocating the slots
+   // and adding them to the list. If we had a failure in between 
there, that would
+   // cause the slots to get lost
+   final ArrayList<ExecutionAndSlot[]> resources = new 
ArrayList<>(getNumberOfExecutionJobVertices());
+   final boolean queued = allowQueuedScheduling;
+
+   // we use this flag to handle failures in a 'finally' clause
+   // that allows us to not go through clumsy cast-and-rethrow 
logic
+   boolean successful = false;
+
+   try {
+   // collecting all the slots may resize and fail in that 
operation without slots getting lost
+   final ArrayList<Future<SimpleSlot>> slotFutures = new 
ArrayList<>(getNumberOfExecutionJobVertices());
+
+   // allocate the slots (obtain all their futures
+   for (ExecutionJobVertex ejv : 
getVerticesTopologically()) {
+   // these calls are not blocking, they only 
return futures
+   ExecutionAndSlot[] slots = 
ejv.allocateResourcesForAll(slotProvider, queued);
+
+   // we need to first add the slots to this list, 
to be safe on release
+   resources.add(slots);
+
+   for (ExecutionAndSlot ens : slots) {
+   slotFutures.add(ens.slotFuture);
+   }
+   }
+
+   // this future is complete once all slot futures are 
complete.
+   // the future fails once one slot future fails.
+   final ConjunctFuture allAllocationsComplete = 
FutureUtils.combineAll(slotFutures);
--- End diff --

True, it is not incorrect. But some tasks would already be deployed if we 
start as soon as some futures are ready. They would need to be canceled again, 
which leads to these not-so-nice fast 
deploy/out-of-resource/cancel/wait-for-cancellation/retry/etc. loops.
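
For illustration, the deploy-only-when-all-slots-are-ready pattern discussed 
here can be sketched with plain java.util.concurrent futures standing in for 
Flink's slot futures; this is a toy, not the PR's code:

{code:java}
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class BulkDeploySketch {
    public static void main(String[] args) {
        List<CompletableFuture<String>> slotFutures = List.of(
                CompletableFuture.supplyAsync(() -> "slot-1"),
                CompletableFuture.supplyAsync(() -> "slot-2"));

        // Deploy only once *all* slots are allocated. If any allocation failed,
        // nothing would have been deployed, so no task needs to be canceled again.
        CompletableFuture.allOf(slotFutures.toArray(new CompletableFuture[0]))
                .thenRun(() -> slotFutures.forEach(f ->
                        System.out.println("deploying to " + f.join())))
                .join();
    }
}
{code}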


> Eager Scheduling should deploy all Tasks together
> -
>
> Key: FLINK-5747
> URL: https://issues.apache.org/jira/browse/FLINK-5747
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.3.0
>
>
> Currently, eager scheduling immediately triggers the scheduling for all 
> vertices and their subtasks in topological order. 
> This has two problems:
>   - This works only as long as resource acquisition is "synchronous". With 
> dynamic resource acquisition in FLIP-6, the resources are returned as Futures 
> which may complete out of order. This results in out-of-order (not in 
> topological order) scheduling of tasks which does not work for streaming.
>   - Deploying some tasks that depend on other tasks before it is clear that 
> the other tasks have resources as well leads to situations where many 
> deploy/recovery cycles happen before enough resources are available to get 
> the job running fully.
> For eager scheduling, we should allocate all resources in one chunk and then 
> deploy once we know that all are available.

[GitHub] flink pull request #3295: [FLINK-5747] [distributed coordination] Eager sche...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/3295#discussion_r101829659
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java
 ---
@@ -754,6 +759,139 @@ public void scheduleForExecution(SlotProvider 
slotProvider) throws JobException
}
}
 
+   private void scheduleLazy(SlotProvider slotProvider) throws 
NoResourceAvailableException {
+   // simply take the vertices without inputs.
+   for (ExecutionJobVertex ejv : this.tasks.values()) {
+   if (ejv.getJobVertex().isInputVertex()) {
+   ejv.scheduleAll(slotProvider, 
allowQueuedScheduling);
+   }
+   }
+   }
+
+   /**
+* 
+* 
+* @param slotProvider  The resource provider from which the slots are 
allocated
+* @param timeout   The maximum time that the deployment may take, 
before a
+*  TimeoutException is thrown.
+*/
+   private void scheduleEager(SlotProvider slotProvider, final Time 
timeout) {
+   checkState(state == JobStatus.RUNNING, "job is not running 
currently");
+
+   // Important: reserve all the space we need up front.
+   // that way we do not have any operation that can fail between 
allocating the slots
+   // and adding them to the list. If we had a failure in between 
there, that would
+   // cause the slots to get lost
+   final ArrayList<ExecutionAndSlot[]> resources = new 
ArrayList<>(getNumberOfExecutionJobVertices());
+   final boolean queued = allowQueuedScheduling;
+
+   // we use this flag to handle failures in a 'finally' clause
+   // that allows us to not go through clumsy cast-and-rethrow 
logic
+   boolean successful = false;
+
+   try {
+   // collecting all the slots may resize and fail in that 
operation without slots getting lost
+   final ArrayList<Future<SimpleSlot>> slotFutures = new 
ArrayList<>(getNumberOfExecutionJobVertices());
+
+   // allocate the slots (obtain all their futures
+   for (ExecutionJobVertex ejv : 
getVerticesTopologically()) {
+   // these calls are not blocking, they only 
return futures
+   ExecutionAndSlot[] slots = 
ejv.allocateResourcesForAll(slotProvider, queued);
+
+   // we need to first add the slots to this list, 
to be safe on release
+   resources.add(slots);
+
+   for (ExecutionAndSlot ens : slots) {
+   slotFutures.add(ens.slotFuture);
+   }
+   }
+
+   // this future is complete once all slot futures are 
complete.
+   // the future fails once one slot future fails.
+   final ConjunctFuture allAllocationsComplete = 
FutureUtils.combineAll(slotFutures);
--- End diff --

True, it is not incorrect. But some tasks would already be deployed if we 
start as soon as some futures are ready. They would need to be canceled again, 
which leads to these not-so-nice fast 
deploy/out-of-resource/cancel/wait-for-cancellation/retry/etc. loops.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3295: [FLINK-5747] [distributed coordination] Eager sche...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/3295#discussion_r101829019
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java
 ---
@@ -88,4 +100,104 @@ public RetryException(Throwable cause) {
super(cause);
}
}
+
+   // ------------------------------------------------------------------------
+   //  composing futures
+   // ------------------------------------------------------------------------
+
+   /**
+* Creates a future that is complete once multiple other futures 
completed. 
+* The ConjunctFuture fails (completes exceptionally) once one of the 
Futures in the
+* conjunction fails.
+*
+* The ConjunctFuture gives access to how many Futures in the 
conjunction have already
+* completed successfully, via {@link 
ConjunctFuture#getNumFuturesCompleted()}. 
+* 
+* @param futures The futures that make up the conjunction. No null 
entries are allowed.
+* @return The ConjunctFuture that completes once all given futures are 
complete (or one fails).
+*/
+   public static ConjunctFuture combineAll(Collection<Future<?>> 
futures) {
+   checkNotNull(futures, "futures");
+   checkArgument(!futures.isEmpty(), "futures is empty");
+
+   final ConjunctFutureImpl conjunct = new 
ConjunctFutureImpl(futures.size());
+
+   for (Future future : futures) {
+   future.handle(conjunct.completionHandler);
+   }
+
+   return conjunct;
+   }
+
+   /**
+* A future that is complete once multiple other futures completed. The 
futures are not
+* necessarily of the same type, which is why the type of this Future 
is {@code Void}.
+* The ConjunctFuture fails (completes exceptionally) once one of the 
Futures in the
+* conjunction fails.
+* 
+* The advantage of using the ConjunctFuture over chaining all the 
futures (such as via
+* {@link Future#thenCombine(Future, BiFunction)}) is that 
ConjunctFuture also tracks how
+* many of the Futures are already complete.
+*/
+   public interface ConjunctFuture extends CompletableFuture<Void> {
+
+   /**
+* Gets the total number of Futures in the conjunction.
+* @return The total number of Futures in the conjunction.
+*/
+   int getNumFuturesTotal();
+
+   /**
+* Gets the number of Futures in the conjunction that are 
already complete.
+* @return The number of Futures in the conjunction that are 
already complete
+*/
+   int getNumFuturesCompleted();
+   }
+
+   /**
+* The implementation of the {@link ConjunctFuture}.
+* 
+* Implementation notice: The member fields all have package-private 
access, because they are
+* either accessed by an inner subclass or by the enclosing class.
+*/
+   private static class ConjunctFutureImpl extends 
FlinkCompletableFuture<Void> implements ConjunctFuture {
--- End diff --

Yes, with `set` rather than `add` it should work. Since the list gets 
initialized with an array, I would actually just use an array in the first 
place.

Followup ;-)
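
As an aside, the completion tracking that ConjunctFuture adds over plain 
future chaining can be sketched with standard library types; this is an 
illustration, not the PR's implementation:

{code:java}
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressTrackingSketch {
    public static void main(String[] args) {
        List<CompletableFuture<Void>> futures = List.of(
                CompletableFuture.runAsync(() -> {}),
                CompletableFuture.runAsync(() -> {}));

        // Count completions as they happen, analogous to getNumFuturesCompleted().
        AtomicInteger completed = new AtomicInteger();
        futures.forEach(f -> f.whenComplete((v, t) -> completed.incrementAndGet()));

        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        System.out.println(completed.get() + " of " + futures.size() + " futures complete");
    }
}
{code}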


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5747) Eager Scheduling should deploy all Tasks together

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872360#comment-15872360
 ] 

ASF GitHub Bot commented on FLINK-5747:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/3295#discussion_r101829019
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java
 ---
@@ -88,4 +100,104 @@ public RetryException(Throwable cause) {
super(cause);
}
}
+
+   // ------------------------------------------------------------------------
+   //  composing futures
+   // ------------------------------------------------------------------------
+
+   /**
+* Creates a future that is complete once multiple other futures 
completed. 
+* The ConjunctFuture fails (completes exceptionally) once one of the 
Futures in the
+* conjunction fails.
+*
+* The ConjunctFuture gives access to how many Futures in the 
conjunction have already
+* completed successfully, via {@link 
ConjunctFuture#getNumFuturesCompleted()}. 
+* 
+* @param futures The futures that make up the conjunction. No null 
entries are allowed.
+* @return The ConjunctFuture that completes once all given futures are 
complete (or one fails).
+*/
+   public static ConjunctFuture combineAll(Collection<Future<?>> 
futures) {
+   checkNotNull(futures, "futures");
+   checkArgument(!futures.isEmpty(), "futures is empty");
+
+   final ConjunctFutureImpl conjunct = new 
ConjunctFutureImpl(futures.size());
+
+   for (Future future : futures) {
+   future.handle(conjunct.completionHandler);
+   }
+
+   return conjunct;
+   }
+
+   /**
+* A future that is complete once multiple other futures completed. The 
futures are not
+* necessarily of the same type, which is why the type of this Future 
is {@code Void}.
+* The ConjunctFuture fails (completes exceptionally) once one of the 
Futures in the
+* conjunction fails.
+* 
+* The advantage of using the ConjunctFuture over chaining all the 
futures (such as via
+* {@link Future#thenCombine(Future, BiFunction)}) is that 
ConjunctFuture also tracks how
+* many of the Futures are already complete.
+*/
+   public interface ConjunctFuture extends CompletableFuture<Void> {
+
+   /**
+* Gets the total number of Futures in the conjunction.
+* @return The total number of Futures in the conjunction.
+*/
+   int getNumFuturesTotal();
+
+   /**
+* Gets the number of Futures in the conjunction that are 
already complete.
+* @return The number of Futures in the conjunction that are 
already complete
+*/
+   int getNumFuturesCompleted();
+   }
+
+   /**
+* The implementation of the {@link ConjunctFuture}.
+* 
+* Implementation notice: The member fields all have package-private 
access, because they are
+* either accessed by an inner subclass or by the enclosing class.
+*/
+   private static class ConjunctFutureImpl extends 
FlinkCompletableFuture<Void> implements ConjunctFuture {
--- End diff --

Yes, with `set` rather than `add` it should work. Since the list gets 
initialized with an array, I would actually just use an array in the first 
place.

Followup ;-)


> Eager Scheduling should deploy all Tasks together
> -
>
> Key: FLINK-5747
> URL: https://issues.apache.org/jira/browse/FLINK-5747
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.3.0
>
>
> Currently, eager scheduling immediately triggers the scheduling for all 
> vertices and their subtasks in topological order. 
> This has two problems:
>   - This works only as long as resource acquisition is "synchronous". With 
> dynamic resource acquisition in FLIP-6, the resources are returned as Futures 
> which may complete out of order. This results in out-of-order (not in 
> topological order) scheduling of tasks which does not work for streaming.
>   - Deploying some tasks that depend on other tasks before it is clear that 
> the other tasks have resources as well leads to situations where many 
> deploy/recovery cycles happen before enough resources are available to get 
> the job running fully.
> For eager scheduling, we should allocate all resources in one chunk and then 
> deploy once we know that all are available.

[GitHub] flink pull request #3351: [FLINK-4326] [scripts] Flink foreground services

2017-02-17 Thread greghogan
GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/3351

[FLINK-4326] [scripts] Flink foreground services

Add a "start-foreground" option to the Flink service scripts which does not 
daemonize the service nor redirect output.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 
4326_flink_startup_scripts_should_optionally_start_services_on_the_foreground

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3351.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3351


commit c9a84fb189ef8342f3673137c2137bcb3fb54df7
Author: Greg Hogan 
Date:   2016-10-07T20:06:48Z

[FLINK-4326] [scripts] Flink foreground services

Add a "start-foreground" option to the Flink service scripts which does
not daemonize the service nor redirect output.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4460) Side Outputs in Flink

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872353#comment-15872353
 ] 

ASF GitHub Bot commented on FLINK-4460:
---

Github user chenqin commented on the issue:

https://github.com/apache/flink/pull/2982
  
@aljoscha Looks good to me 

I briefly looked at your git branch; a minor comment would be adding 
comments to `sideOutputLateData` so users get a better idea of what happens 
when they opt in to the late-arriving event stream. 

Whether an event is late is initially decided by comparing the watermark and 
the event time. Do you think there is a need to let the user pass some kind 
of `Evaluator` and side-output any kind of elements?

`window.sideOutput(OutputTag, Evaluator)`

`interface Evaluator{ MergedWindows, key, watermark}`

- Regarding `split`/`select`, I think there is a chance to consolidate 
`select` and build upon `OutputTag`, but that might be out of this PR's scope.
- Regarding `WindowStream`, I am a bit confused about what happens if I use 
`allowedLateness` and `sideOutputLateData` at the same time.

Thanks,
Chen





> Side Outputs in Flink
> -
>
> Key: FLINK-4460
> URL: https://issues.apache.org/jira/browse/FLINK-4460
> Project: Flink
>  Issue Type: New Feature
>  Components: Core, DataStream API
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Chen Qin
>Assignee: Chen Qin
>  Labels: latearrivingevents, sideoutput
>
> https://docs.google.com/document/d/1vg1gpR8JL4dM07Yu4NyerQhhVvBlde5qdqnuJv4LcV4/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872352#comment-15872352
 ] 

ASF GitHub Bot commented on FLINK-4326:
---

GitHub user greghogan opened a pull request:

https://github.com/apache/flink/pull/3351

[FLINK-4326] [scripts] Flink foreground services

Add a "start-foreground" option to the Flink service scripts which does not 
daemonize the service nor redirect output.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/greghogan/flink 
4326_flink_startup_scripts_should_optionally_start_services_on_the_foreground

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3351.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3351


commit c9a84fb189ef8342f3673137c2137bcb3fb54df7
Author: Greg Hogan 
Date:   2016-10-07T20:06:48Z

[FLINK-4326] [scripts] Flink foreground services

Add a "start-foreground" option to the Flink service scripts which does
not daemonize the service nor redirect output.




> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most process 
> supervisory tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start-up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.
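
A minimal sketch of the requested exec chain; the variable names are 
illustrative, not necessarily the scripts' actual ones:

{code}
# jobmanager.sh: replace the shell with flink-daemon.sh instead of forking it
exec "${FLINK_BIN_DIR}"/flink-daemon.sh start-foreground jobmanager "${args[@]}"

# flink-daemon.sh: replace the shell with the JVM, so the supervised PID is java's
exec "$JAVA_RUN" $JVM_ARGS -classpath "$CLASSPATH" "$CLASS_TO_RUN" "${ARGS[@]}"
{code}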



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink pull request #3295: [FLINK-5747] [distributed coordination] Eager sche...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/3295#discussion_r101827724
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java
 ---
@@ -88,4 +100,104 @@ public RetryException(Throwable cause) {
super(cause);
}
}
+
+   // ------------------------------------------------------------------------
+   //  composing futures
+   // ------------------------------------------------------------------------
+
+   /**
+* Creates a future that is complete once multiple other futures 
completed. 
+* The ConjunctFuture fails (completes exceptionally) once one of the 
Futures in the
+* conjunction fails.
+*
+* The ConjunctFuture gives access to how many Futures in the 
conjunction have already
+* completed successfully, via {@link 
ConjunctFuture#getNumFuturesCompleted()}. 
+* 
+* @param futures The futures that make up the conjunction. No null 
entries are allowed.
+* @return The ConjunctFuture that completes once all given futures are 
complete (or one fails).
+*/
+   public static ConjunctFuture combineAll(Collection<Future<?>> 
futures) {
+   checkNotNull(futures, "futures");
+   checkArgument(!futures.isEmpty(), "futures is empty");
--- End diff --

Yes, will change that...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5747) Eager Scheduling should deploy all Tasks together

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872348#comment-15872348
 ] 

ASF GitHub Bot commented on FLINK-5747:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/3295#discussion_r101827724
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java
 ---
@@ -88,4 +100,104 @@ public RetryException(Throwable cause) {
super(cause);
}
}
+
+   // ------------------------------------------------------------------------
+   //  composing futures
+   // ------------------------------------------------------------------------
+
+   /**
+* Creates a future that is complete once multiple other futures 
completed. 
+* The ConjunctFuture fails (completes exceptionally) once one of the 
Futures in the
+* conjunction fails.
+*
+* The ConjunctFuture gives access to how many Futures in the 
conjunction have already
+* completed successfully, via {@link 
ConjunctFuture#getNumFuturesCompleted()}. 
+* 
+* @param futures The futures that make up the conjunction. No null 
entries are allowed.
+* @return The ConjunctFuture that completes once all given futures are 
complete (or one fails).
+*/
+   public static ConjunctFuture combineAll(Collection<Future<?>> 
futures) {
+   checkNotNull(futures, "futures");
+   checkArgument(!futures.isEmpty(), "futures is empty");
--- End diff --

Yes, will change that...


> Eager Scheduling should deploy all Tasks together
> -
>
> Key: FLINK-5747
> URL: https://issues.apache.org/jira/browse/FLINK-5747
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Affects Versions: 1.2.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.3.0
>
>
> Currently, eager scheduling immediately triggers the scheduling for all 
> vertices and their subtasks in topological order. 
> This has two problems:
>   - This works only as long as resource acquisition is "synchronous". With 
> dynamic resource acquisition in FLIP-6, the resources are returned as Futures 
> which may complete out of order. This results in out-of-order (not in 
> topological order) scheduling of tasks which does not work for streaming.
>   - Deploying some tasks that depend on other tasks before it is clear that 
> the other tasks have resources as well leads to situations where many 
> deploy/recovery cycles happen before enough resources are available to get 
> the job running fully.
> For eager scheduling, we should allocate all resources in one chunk and then 
> deploy once we know that all are available.
> As a follow-up, the same should be done per pipelined component in lazy batch 
> scheduling as well. That way we get lazy scheduling across blocking 
> boundaries, and bulk (gang) scheduling in pipelined subgroups.
> This also does not apply for efforts of fine grained recovery, where 
> individual tasks request replacement resources.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3246: [FLINK-5353] [elasticsearch] User-provided failure handle...

2017-02-17 Thread fpompermaier
Github user fpompermaier commented on the issue:

https://github.com/apache/flink/pull/3246
  
Any news on this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5353) Elasticsearch Sink loses well-formed documents when there are malformed documents

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872297#comment-15872297
 ] 

ASF GitHub Bot commented on FLINK-5353:
---

Github user fpompermaier commented on the issue:

https://github.com/apache/flink/pull/3246
  
Any news on this?


> Elasticsearch Sink loses well-formed documents when there are malformed 
> documents
> -
>
> Key: FLINK-5353
> URL: https://issues.apache.org/jira/browse/FLINK-5353
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.1.3
>Reporter: Flavio Pompermaier
>Assignee: Tzu-Li (Gordon) Tai
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5634) Flink should not always redirect stdout to a file.

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872294#comment-15872294
 ] 

ASF GitHub Bot commented on FLINK-5634:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3204
  
This is now looking like a game of hot potato :) I'm happy to let @iemejia 
create the PR, but he had offered the same to me. If I don't hear otherwise 
first, I'll create a PR for FLINK-4326, while sitting down so I don't step on 
anyone's toes.


> Flink should not always redirect stdout to a file.
> --
>
> Key: FLINK-5634
> URL: https://issues.apache.org/jira/browse/FLINK-5634
> Project: Flink
>  Issue Type: Bug
>  Components: Docker
>Affects Versions: 1.2.0
>Reporter: Jamie Grier
>Assignee: Jamie Grier
>
> Flink always redirects stdout to a file.  While often convenient, this isn't 
> always what people want.  The most obvious case of this is a Docker 
> deployment.
> It should be possible to have Flink log to stdout.
> Here is a PR for this:  https://github.com/apache/flink/pull/3204



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3204: [FLINK-5634] Flink should not always redirect stdout to a...

2017-02-17 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3204
  
This is now looking like a game of hot potato :) I'm happy to let @iemejia 
create the PR, but he had offered the same to me. If I don't hear otherwise 
first, I'll create a PR for FLINK-4326, while sitting down so I don't step on 
anyone's toes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3163) Configure Flink for NUMA systems

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872261#comment-15872261
 ] 

ASF GitHub Bot commented on FLINK-3163:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3249
  
@StephanEwen thanks for the review. I'll verify, test, and merge.


> Configure Flink for NUMA systems
> 
>
> Key: FLINK-3163
> URL: https://issues.apache.org/jira/browse/FLINK-3163
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.3.0
>
>
> On NUMA systems Flink can be pinned to a single physical processor ("node") 
> using {{numactl --membind=$node --cpunodebind=$node <command>}}. Commonly 
> available NUMA systems include the largest AWS and Google Compute instances.
> For example, on an AWS c4.8xlarge system with 36 hyperthreads the user could 
> configure a single TaskManager with 36 slots or have Flink create two 
> TaskManagers bound to each of the NUMA nodes, each with 18 slots.
> There may be some extra overhead in transferring network buffers between 
> TaskManagers on the same system, though the fraction of data shuffled in this 
> manner decreases with the size of the cluster. The performance improvement 
> from only accessing local memory looks to be significant though difficult to 
> benchmark.
> The JobManagers may fit into NUMA nodes rather than requiring full systems.
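
A rough sketch of the per-node start-up described above; the node-count 
parsing is illustrative:

{code}
# start one TaskManager pinned to each NUMA node
nodes=$(numactl --hardware | awk '/^available:/ {print $2}')
for node in $(seq 0 $((nodes - 1))); do
    numactl --membind=$node --cpunodebind=$node ./bin/taskmanager.sh start
done
{code}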



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-4810) Checkpoint Coordinator should fail ExecutionGraph after "n" unsuccessful checkpoints

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872260#comment-15872260
 ] 

ASF GitHub Bot commented on FLINK-4810:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3334
  
Thank you for opening this pull request.
I'll try to review it in the coming days...


> Checkpoint Coordinator should fail ExecutionGraph after "n" unsuccessful 
> checkpoints
> 
>
> Key: FLINK-4810
> URL: https://issues.apache.org/jira/browse/FLINK-4810
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing
>Reporter: Stephan Ewen
>
> The Checkpoint coordinator should track the number of consecutive 
> unsuccessful checkpoints.
> If more than {{n}} (configured value) checkpoints fail in a row, it should 
> call {{fail()}} on the execution graph to trigger a recovery.
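
A minimal sketch of the proposed bookkeeping; the class and method names are 
illustrative, not the CheckpointCoordinator's actual API:

{code:java}
import java.util.concurrent.atomic.AtomicInteger;

class CheckpointFailureCounterSketch {
    private final int maxConsecutiveFailures;                 // the configured "n"
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    CheckpointFailureCounterSketch(int maxConsecutiveFailures) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    void onCheckpointComplete() {
        consecutiveFailures.set(0);                           // any success resets the streak
    }

    void onCheckpointFailure(Runnable failJob) {
        if (consecutiveFailures.incrementAndGet() > maxConsecutiveFailures) {
            failJob.run();                                    // e.g. executionGraph.fail(cause)
        }
    }
}
{code}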



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3249: [FLINK-3163] [scripts] Configure Flink for NUMA systems

2017-02-17 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3249
  
@StephanEwen thanks for the review. I'll verify, test, and merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #3334: FLINK-4810 Checkpoint Coordinator should fail ExecutionGr...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3334
  
Thank you for opening this pull request.
I'll try to review it in the coming days...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #3323: [hotfix] [core] Add missing stability annotations for cla...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3323
  
Good cleanup, thanks a lot!

Merging this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4813) Having flink-test-utils as a dependency outside Flink fails the build

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872250#comment-15872250
 ] 

ASF GitHub Bot commented on FLINK-4813:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3322
  
The change looks good, thank you!
Merging this...


> Having flink-test-utils as a dependency outside Flink fails the build
> -
>
> Key: FLINK-4813
> URL: https://issues.apache.org/jira/browse/FLINK-4813
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.2.0
>Reporter: Robert Metzger
>Assignee: Nico Kruber
>
> The {{flink-test-utils}} depend on {{hadoop-minikdc}}, which has a 
> dependency that is only resolvable if the {{maven-bundle-plugin}} is 
> loaded.
> This is the error message
> {code}
> [ERROR] Failed to execute goal on project quickstart-1.2-tests: Could not 
> resolve dependencies for project 
> com.dataartisans:quickstart-1.2-tests:jar:1.0-SNAPSHOT: Failure to find 
> org.apache.directory.jdbm:apacheds-jdbm1:bundle:2.0.0-M2 in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced -> [Help 1]
> {code}
> {{flink-parent}} loads that plugin, so all "internal" dependencies to the 
> test utils can resolve the plugin.
> Right now, users have to use the maven bundle plugin to use our test utils 
> externally.
> By making the hadoop minikdc dependency optional, we can probably resolve the 
> issues. Then, only users who want to use the security-related tools in the 
> test utils need to manually add the hadoop minikdc dependency + the plugin.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3322: [FLINK-4813][flink-test-utils] make the hadoop-minikdc de...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3322
  
The change looks good, thank you!
Merging this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5277) missing unit test for ensuring ResultPartition#add always recycles buffers

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872246#comment-15872246
 ] 

ASF GitHub Bot commented on FLINK-5277:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3309
  
Good addition, thanks!
Merging this...


> missing unit test for ensuring ResultPartition#add always recycles buffers
> --
>
> Key: FLINK-5277
> URL: https://issues.apache.org/jira/browse/FLINK-5277
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> We rely on ResultPartition to recycle the buffer if the add call fails.
> It makes sense to add a special test (to ResultPartitionTest or 
> RecordWriterTest) where we ensure that this actually happens to guard against 
> future behaviour changes in ResultPartition.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3309: [FLINK-5277] add unit tests for ResultPartition#add() in ...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3309
  
Good addition, thanks!
Merging this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5739) NullPointerException in CliFrontend

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872240#comment-15872240
 ] 

ASF GitHub Bot commented on FLINK-5739:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3292
  
Change looks good, thank you!
Merging this...


> NullPointerException in CliFrontend
> ---
>
> Key: FLINK-5739
> URL: https://issues.apache.org/jira/browse/FLINK-5739
> Project: Flink
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.3.0
> Environment: Mac OS X 10.12.2, Java 1.8.0_92-b14
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>  Labels: newbie, starter
> Fix For: 1.3.0
>
>
> I've run a simple program on a local cluster. It always fails with code 
> Version: 1.3-SNAPSHOT, Commit: e24a866. 
> {quote}
> Zhuoluos-MacBook-Pro:build-target zhuoluo.yzl$ bin/flink run -c 
> com.alibaba.blink.TableApp ~/gitlab/tableapp/target/tableapp-1.0-SNAPSHOT.jar 
> Cluster configuration: Standalone cluster with JobManager at 
> localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
> 
>  The program finished with the following exception:
> java.lang.NullPointerException
>   at 
> org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:845)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1076)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
>   at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120)
> {quote}
> I don't think there should be a NullPointerException here, even if you forgot 
> the "execute()" call.
> The reproducing code looks like following:
> {code:java}
> public static void main(String[] args) throws Exception {
> ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
> DataSource<String> customer = 
> env.readTextFile("/Users/zhuoluo.yzl/customer.tbl");
> customer.filter(new FilterFunction<String>() {
> public boolean filter(String value) throws Exception {
> return true;
> }
> })
> .writeAsText("/Users/zhuoluo.yzl/customer.txt");
> //env.execute();
> }
> {code}
> We can use *start-cluster.sh* on a *local* computer to reproduce the problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] flink issue #3292: [FLINK-5739] [client] fix NullPointerException in CliFron...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3292
  
Change looks good, thank you!
Merging this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-5024) Add SimpleStateDescriptor to clarify the concepts

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872233#comment-15872233
 ] 

ASF GitHub Bot commented on FLINK-5024:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3243
  
Subsumed by another pull request...


> Add SimpleStateDescriptor to clarify the concepts
> -
>
> Key: FLINK-5024
> URL: https://issues.apache.org/jira/browse/FLINK-5024
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Reporter: Xiaogang Shi
>Assignee: Xiaogang Shi
>
> Currently, StateDescriptors accept two type arguments: the first one is the 
> type of the created state and the second one is the type of the values in the 
> states. 
> The concepts, however, are a little confusing here because in ListStates, the 
> arguments passed to the StateDescriptors are the types of the list elements 
> instead of the lists. It also makes the implementation of MapStates difficult.
> I suggest not putting the type serializer in StateDescriptors, making 
> StateDescriptors independent of the data structures of the values. 
> A new type of StateDescriptor named SimpleStateDescriptor can be provided to 
> abstract those states (namely ValueState, ReducingState and FoldingState) 
> whose states are not composited. 
> The states (e.g. ListStates and MapStates) can implement their own 
> descriptors according to their data structures. 
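
A rough sketch of the proposed split; all names here are illustrative, not 
the actual Flink classes:

{code:java}
import java.io.Serializable;
import java.util.List;

// Base descriptor: no longer fixes how the state's values are serialized.
abstract class StateDescriptorSketch<S> implements Serializable {
    final String name;
    StateDescriptorSketch(String name) { this.name = name; }
}

// For non-composite states (ValueState, ReducingState, FoldingState):
// parameterized directly on the value type.
class SimpleStateDescriptorSketch<T> extends StateDescriptorSketch<T> {
    final Class<T> valueType;
    SimpleStateDescriptorSketch(String name, Class<T> valueType) {
        super(name);
        this.valueType = valueType;
    }
}

// Composite states define their own descriptors on the element type.
class ListStateDescriptorSketch<E> extends StateDescriptorSketch<List<E>> {
    final Class<E> elementType;
    ListStateDescriptorSketch(String name, Class<E> elementType) {
        super(name);
        this.elementType = elementType;
    }
}
{code}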



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-3163) Configure Flink for NUMA systems

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872236#comment-15872236
 ] 

ASF GitHub Bot commented on FLINK-3163:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3249
  
Looks good.

+1 from my side!


> Configure Flink for NUMA systems
> 
>
> Key: FLINK-3163
> URL: https://issues.apache.org/jira/browse/FLINK-3163
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.3.0
>
>
> On NUMA systems Flink can be pinned to a single physical processor ("node") 
> using {{numactl --membind=$node --cpunodebind=$node <command>}}. Commonly 
> available NUMA systems include the largest AWS and Google Compute instances.
> For example, on an AWS c4.8xlarge system with 36 hyperthreads the user could 
> configure a single TaskManager with 36 slots or have Flink create two 
> TaskManagers bound to each of the NUMA nodes, each with 18 slots.
> There may be some extra overhead in transferring network buffers between 
> TaskManagers on the same system, though the fraction of data shuffled in this 
> manner decreases with the size of the cluster. The performance improvement 
> from only accessing local memory looks to be significant though difficult to 
> benchmark.
> The JobManagers may fit into NUMA nodes rather than requiring full systems.





[jira] [Commented] (FLINK-5024) Add SimpleStateDescriptor to clarify the concepts

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872234#comment-15872234
 ] 

ASF GitHub Bot commented on FLINK-5024:
---

Github user StephanEwen closed the pull request at:

https://github.com/apache/flink/pull/3243


> Add SimpleStateDescriptor to clarify the concepts
> -
>
> Key: FLINK-5024
> URL: https://issues.apache.org/jira/browse/FLINK-5024
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Reporter: Xiaogang Shi
>Assignee: Xiaogang Shi
>
> Currently, StateDescriptors accept two type arguments: the first is the type of 
> the created state and the second is the type of the values in the state. 
> The concept, however, is a little confusing because in ListStates the 
> arguments passed to the StateDescriptors are the types of the list elements 
> instead of the lists. It also makes the implementation of MapStates difficult.
> I suggest not putting the type serializer in StateDescriptors, making 
> StateDescriptors independent of the data structures of the values. 
> A new type of StateDescriptor named SimpleStateDescriptor can be provided to 
> abstract those states (namely ValueState, ReducingState, and FoldingState) 
> whose values are not composite. 
> The states (e.g. ListStates and MapStates) can implement their own 
> descriptors according to their data structures. 





[GitHub] flink issue #3249: [FLINK-3163] [scripts] Configure Flink for NUMA systems

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3249
  
Looks good.

+1 from my side!




[GitHub] flink issue #3243: [FLINK-5024] [core] Refactor the interface of State and S...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3243
  
Subsumed by another pull request...




[jira] [Commented] (FLINK-5682) Fix scala version in flink-streaming-scala POM file

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872231#comment-15872231
 ] 

ASF GitHub Bot commented on FLINK-5682:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3231
  
@billliuatuber Since the dependency management section in the root 
`pom.xml` defines the Scala version for all sub-modules, I think this change is 
not needed.
If you agree, could you close the pull request?
If you think I am overlooking something, please let me know!


> Fix scala version in  flink-streaming-scala POM file
> 
>
> Key: FLINK-5682
> URL: https://issues.apache.org/jira/browse/FLINK-5682
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Bill Liu
>  Labels: build, easyfix
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> The flink-streaming-scala POM does not define the Scala library version, so a 
> build targeting Scala 2.10 may still pull in Scala 2.11. 
> {quote}
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-reflect</artifactId>
> </dependency>
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-library</artifactId>
> </dependency>
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-compiler</artifactId>
> </dependency>
> {quote}





[GitHub] flink pull request #3243: [FLINK-5024] [core] Refactor the interface of Stat...

2017-02-17 Thread StephanEwen
Github user StephanEwen closed the pull request at:

https://github.com/apache/flink/pull/3243




[GitHub] flink issue #3231: [FLINK-5682] Fix scala version in flink-streaming-scala P...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3231
  
@billliuatuber Since the dependency management section in the root 
`pom.xml` defines the Scala version for all sub-modules, I think this change is 
not needed.
If you agree, could you close the pull request?
If you think I am overlooking something, please let me know!




[jira] [Commented] (FLINK-5669) flink-streaming-contrib DataStreamUtils.collect in local environment mode fails when offline

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872227#comment-15872227
 ] 

ASF GitHub Bot commented on FLINK-5669:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3223
  
This change looks good, thank you!

Merging this...


> flink-streaming-contrib DataStreamUtils.collect in local environment mode 
> fails when offline
> 
>
> Key: FLINK-5669
> URL: https://issues.apache.org/jira/browse/FLINK-5669
> Project: Flink
>  Issue Type: Bug
>  Components: flink-contrib
>Reporter: Rick Cox
>Priority: Minor
>
> {{DataStreamUtils.collect()}} needs to obtain the local machine's IP so that 
> the job can send the results back. In the case of local 
> {{StreamEnvironments}}, it uses {{InetAddress.getLocalHost()}}, which 
> attempts to resolve the local hostname using DNS.
> If DNS is not available (for example, when offline) or if DNS is available 
> but cannot resolve the hostname (for example, if the hostname is an intranet 
> name but the machine is not currently on that network), an 
> {{UnknownHostException}} will be thrown (and wrapped in an {{IOException}}).
> If the resolved IP is not reachable for some reason, streaming results will 
> fail.
> Since this case is for local execution only, it seems that using 
> {{InetAddress.getLoopbackAddress()}} would work just as well, and avoid the 
> assumptions made by {{getLocalHost()}}.
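
A minimal sketch of the proposed difference, using only the JDK (nothing 
Flink-specific):

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LocalAddressDemo {
    public static void main(String[] args) throws UnknownHostException {
        // Resolves the machine's hostname, typically via DNS; throws
        // UnknownHostException when offline or on a network that cannot
        // resolve the name.
        InetAddress viaHostname = InetAddress.getLocalHost();

        // Returns the loopback address (127.0.0.1 or ::1) without any name
        // resolution -- sufficient when job and client run on the same machine.
        InetAddress loopback = InetAddress.getLoopbackAddress();

        System.out.println(viaHostname + " vs. " + loopback);
    }
}
{code}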





[GitHub] flink issue #3223: [FLINK-5669] Change DataStreamUtils to use the loopback a...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3223
  
This change looks good, thank you!

Merging this...




[jira] [Commented] (FLINK-5640) configure the explicit Unit Test file suffix

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872225#comment-15872225
 ] 

ASF GitHub Bot commented on FLINK-5640:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3211
  
Change looks good, thanks!

Merging this...


> configure the explicit Unit Test file suffix
> 
>
> Key: FLINK-5640
> URL: https://issues.apache.org/jira/browse/FLINK-5640
> Project: Flink
>  Issue Type: Test
>  Components: Tests
>Reporter: shijinkui
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> There are four types of unit-test files: *ITCase.java, *Test.java, 
> *ITSuite.scala, and *Suite.scala.
> File names ending in "IT.java" denote integration tests; file names ending in 
> "Test.java" denote unit tests.
> It is clearer for the Surefire plugin's default-test execution to declare 
> "*Test.*" as the Java unit-test pattern.
> The test file statistics below:
> * Suite  total: 10
> * ITCase  total: 378
> * Test  total: 1008
> * ITSuite  total: 14





[GitHub] flink issue #3211: [FLINK-5640][build]configure the explicit Unit Test file ...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3211
  
Change looks good, thanks!

Merging this...




[jira] [Commented] (FLINK-5634) Flink should not always redirect stdout to a file.

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872224#comment-15872224
 ] 

ASF GitHub Bot commented on FLINK-5634:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3204
  
Does anyone want to take a stab at addressing this the 
https://issues.apache.org/jira/browse/FLINK-4326 way? I think no one is active 
on that issue right now...


> Flink should not always redirect stdout to a file.
> --
>
> Key: FLINK-5634
> URL: https://issues.apache.org/jira/browse/FLINK-5634
> Project: Flink
>  Issue Type: Bug
>  Components: Docker
>Affects Versions: 1.2.0
>Reporter: Jamie Grier
>Assignee: Jamie Grier
>
> Flink always redirects stdout to a file.  While often convenient this isn't 
> always what people want.  The most obvious case of this is a Docker 
> deployment.
> It should be possible to have Flink log to stdout.
> Here is a PR for this:  https://github.com/apache/flink/pull/3204





[GitHub] flink issue #3204: [FLINK-5634] Flink should not always redirect stdout to a...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3204
  
Does anyone want to take a stab at addressing this the 
https://issues.apache.org/jira/browse/FLINK-4326 way? I think no one is active 
on that issue right now...




[jira] [Commented] (FLINK-5546) java.io.tmpdir setted as project build directory in surefire plugin

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872196#comment-15872196
 ] 

ASF GitHub Bot commented on FLINK-5546:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3190
  
All right, I am convinced now that this is a helpful change.

Merging this...


> java.io.tmpdir setted as project build directory in surefire plugin
> ---
>
> Key: FLINK-5546
> URL: https://issues.apache.org/jira/browse/FLINK-5546
> Project: Flink
>  Issue Type: Test
>  Components: Build System
> Environment: CentOS 7.2
>Reporter: Syinchwun Leo
>Assignee: shijinkui
> Fix For: 1.2.1
>
>
> When multiple Linux users run tests at the same time, the flink-runtime module 
> may fail: user A creates /tmp/cacheFile, and user B then has no permission to 
> access the file.  
> Failed tests: 
> FileCacheDeleteValidationTest.setup:79 Error initializing the test: 
> /tmp/cacheFile (Permission denied)
> Tests in error: 
> IOManagerTest.channelEnumerator:54 » Runtime Could not create storage 
> director...
> Tests run: 1385, Failures: 1, Errors: 1, Skipped: 8
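
The fix points java.io.tmpdir at the per-module build directory for test runs; 
a sketch of the pattern this relies on (the file name here is illustrative): 
test utilities resolve scratch files against the property instead of 
hard-coding /tmp.

{code}
import java.io.File;

public class ScratchFiles {
    // Resolves against whatever java.io.tmpdir is set to -- e.g. the project
    // build directory when Surefire overrides the property -- so concurrent
    // builds by different users no longer collide on a shared /tmp path.
    public static File cacheFile() {
        return new File(System.getProperty("java.io.tmpdir"), "cacheFile");
    }
}
{code}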





[GitHub] flink issue #3190: [FLINK-5546][build] java.io.tmpdir setted as project buil...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3190
  
All right, I am convinced now that this is a helpful change.

Merging this...




[jira] [Commented] (FLINK-5817) Fix test concurrent execution failure by test dir conflicts.

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872180#comment-15872180
 ] 

ASF GitHub Bot commented on FLINK-5817:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3341
  
Good change, thank you. Merging this...


> Fix test concurrent execution failure by test dir conflicts.
> 
>
> Key: FLINK-5817
> URL: https://issues.apache.org/jira/browse/FLINK-5817
> Project: Flink
>  Issue Type: Bug
>Reporter: Wenlong Lyu
>Assignee: Wenlong Lyu
>
> Currently, when different users build Flink on the same machine, failures may 
> occur because some test utilities create test files using fixed names, which 
> makes file access fail when different users process the same file at the same 
> time.
> We have seen such errors in AbstractTestBase, IOManagerTest, and FileCacheTest.
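
A minimal sketch of the usual remedy (plain JDK, not necessarily the actual 
patch): let each run create a uniquely named directory instead of reusing a 
fixed name.

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class UniqueTestDirs {
    // createTempDirectory appends a random unique suffix to the prefix, so
    // two users (or two concurrent builds) never race on the same path the
    // way a fixed file name does.
    public static Path newTestDir() throws IOException {
        return Files.createTempDirectory("flink-test-");
    }
}
{code}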





[GitHub] flink issue #3341: [FLINK-5817]Fix test concurrent execution failure by test...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3341
  
Good change, thank you. Merging this...




[GitHub] flink issue #3138: #Flink-5522 Storm Local Cluster can't work with powermock

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3138
  
I think that is a good fix, thank you!

Merging this...




[jira] [Commented] (FLINK-5497) remove duplicated tests

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872163#comment-15872163
 ] 

ASF GitHub Bot commented on FLINK-5497:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3089
  
Thank you for fixing this!


> remove duplicated tests
> ---
>
> Key: FLINK-5497
> URL: https://issues.apache.org/jira/browse/FLINK-5497
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Alexey Diomin
>Priority: Minor
>
> Now we have test which run the same code 4 times, every run 17+ seconds.
> Need do small refactoring and remove duplicated code.





[GitHub] flink issue #3089: [FLINK-5497] remove duplicated tests

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3089
  
Thank you for fixing this!




[GitHub] flink issue #3089: [FLINK-5497] remove duplicated tests

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3089
  
Okay, I finally found the time to double check this.

The changes are good, merging this...




[jira] [Commented] (FLINK-5497) remove duplicated tests

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872162#comment-15872162
 ] 

ASF GitHub Bot commented on FLINK-5497:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3089
  
Okay, I finally found the time to double check this.

The changes are good, merging this...


> remove duplicated tests
> ---
>
> Key: FLINK-5497
> URL: https://issues.apache.org/jira/browse/FLINK-5497
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Alexey Diomin
>Priority: Minor
>
> Now we have test which run the same code 4 times, every run 17+ seconds.
> Need do small refactoring and remove duplicated code.





[jira] [Commented] (FLINK-5751) 404 in documentation

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872126#comment-15872126
 ] 

ASF GitHub Bot commented on FLINK-5751:
---

Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/3332#discussion_r101801834
  
--- Diff: docs/check_links.sh ---
@@ -0,0 +1,36 @@
+#!/usr/bin/env bash

+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+target=${1:-"http://localhost:4000"}
+
+# Crawl the docs, ignoring robots.txt, storing nothing locally
+wget --spider -r -nd -nv -e robots=off -p -o spider.log "$target"
+
+# Abort for anything other than 0 and 4 ("Network failure")
+status=$?
+if [ $status -ne 0 ] && [ $status -ne 4 ]; then
+exit $status
+fi
+
+# Fail the build if any broken links are found
+broken_links_str=$(grep -e 'Found [[:digit:]]\+ broken link(s)' spider.log)
+if [ -n "$broken_links_str" ]; then
+   echo -e "\e[1;31m$broken_links_str\e[0m"
--- End diff --

Thanks for catching this! Fixed in a separate commit.


> 404 in documentation
> 
>
> Key: FLINK-5751
> URL: https://issues.apache.org/jira/browse/FLINK-5751
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Colin Breame
>Priority: Trivial
> Fix For: 1.3.0, 1.2.1
>
>
> This page:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/quickstart/setup_quickstart.html
> Contains a link with title "Flink on Windows" with URL:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows
> This gives a 404.  It should be:
> - 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows.html





[GitHub] flink pull request #3332: [FLINK-5751] [docs] Add link check script

2017-02-17 Thread uce
Github user uce commented on a diff in the pull request:

https://github.com/apache/flink/pull/3332#discussion_r101801834
  
--- Diff: docs/check_links.sh ---
@@ -0,0 +1,36 @@
+#!/usr/bin/env bash

+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.

+
+
+target=${1:-"http://localhost:4000"}
+
+# Crawl the docs, ignoring robots.txt, storing nothing locally
+wget --spider -r -nd -nv -e robots=off -p -o spider.log "$target"
+
+# Abort for anything other than 0 and 4 ("Network failure")
+status=$?
+if [ $status -ne 0 ] && [ $status -ne 4 ]; then
+exit $status
+fi
+
+# Fail the build if any broken links are found
+broken_links_str=$(grep -e 'Found [[:digit:]]\+ broken link(s)' spider.log)
+if [ -n "$broken_links_str" ]; then
+   echo -e "\e[1;31m$broken_links_str\e[0m"
--- End diff --

Thanks for catching this! Fixed in a separate commit.




[jira] [Commented] (FLINK-5178) allow BlobCache to use a distributed file system irrespective of the HA mode

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872125#comment-15872125
 ] 

ASF GitHub Bot commented on FLINK-5178:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3085
  
I would personally prefer to not make that change at this point.

Interpreting HA parameters in non-HA mode might come across as confusing to 
users.
Also, the new way of instantiating TaskManager and JobManager (FLIP-6) via 
the `HighAvailabilityServices` should make more cases use the file-system-based 
BlobStore anyway (irrespective of HA setups).


> allow BlobCache to use a distributed file system irrespective of the HA mode
> 
>
> Key: FLINK-5178
> URL: https://issues.apache.org/jira/browse/FLINK-5178
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> After FLINK-5129, high availability (HA) mode adds the ability for the 
> BlobCache instances at the task managers to download blobs directly from the 
> distributed file system. It would be nice if this also worked in non-HA mode.





[GitHub] flink issue #3085: [FLINK-5178] allow BlobCache to use a distributed file sy...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3085
  
I would personally prefer to not make that change at this point.

Interpreting HA parameters in non-HA mode might come across as confusing to 
users.
Also, the new way of instantiating TaskManager and JobManager (FLIP-6) via 
the `HighAvailabilityServices` should make more cases use the file-system-based 
BlobStore anyway (irrespective of HA setups).




[jira] [Commented] (FLINK-5129) make the BlobServer use a distributed file system

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872122#comment-15872122
 ] 

ASF GitHub Bot commented on FLINK-5129:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3084
  
Good change, thanks!

Merging this...


> make the BlobServer use a distributed file system
> -
>
> Key: FLINK-5129
> URL: https://issues.apache.org/jira/browse/FLINK-5129
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> Currently, the BlobServer uses a local storage and, in addition when the HA 
> mode is set, a distributed file system, e.g. hdfs. This, however, is only 
> used by the JobManager and all TaskManager instances request blobs from the 
> JobManager. By using the distributed file system there as well, we would 
> lower the load on the JobManager and increase scalability.





[GitHub] flink issue #3084: [FLINK-5129] make the BlobServer use a distributed file s...

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3084
  
Good change, thanks!

Merging this...




[jira] [Commented] (FLINK-5731) Split up CI builds

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872119#comment-15872119
 ] 

ASF GitHub Bot commented on FLINK-5731:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3344
  
Ten minutes allows for a significant number of new tests. Alternatively, we 
did look at parallelizing the tests, but there were some dependency issues and 
the PR expired in review.

This does need to be fixed, but as you note, the longer we put this off the 
deeper we dig ourselves into a hole. Are we looking to split the libraries out 
of the main repo?


> Split up CI builds
> --
>
> Key: FLINK-5731
> URL: https://issues.apache.org/jira/browse/FLINK-5731
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System, Tests
>Reporter: Ufuk Celebi
>Assignee: Robert Metzger
>Priority: Critical
>
> Test builds regularly time out because we are hitting the Travis 50 min 
> limit. Previously, we worked around this by splitting up the tests into 
> groups. I think we have to split them further.





[GitHub] flink issue #3344: FLINK-5731 Spilt up tests into three disjoint groups

2017-02-17 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3344
  
Ten minutes allows for a significant number of new tests. Alternatively, we 
did look at parallelizing the tests, but there were some dependency issues and 
the PR expired in review.

This does need to be fixed, but as you note, the longer we put this off the 
deeper we dig ourselves into a hole. Are we looking to split the libraries out 
of the main repo?




[GitHub] flink issue #3349: Updated DC/OS setup instructions.

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3349
  
Awesome to see that Flink is that easy to install in DC/OS...

Merging this...




[jira] [Commented] (FLINK-5828) BlobServer create cache dir has concurrency safety problem

2017-02-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15872105#comment-15872105
 ] 

ASF GitHub Bot commented on FLINK-5828:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3342
  
Merging this...


> BlobServer create cache dir has concurrency safety problem
> --
>
> Key: FLINK-5828
> URL: https://issues.apache.org/jira/browse/FLINK-5828
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Reporter: ZhengBowen
> Fix For: 1.2.0
>
>
> Caused by: java.lang.RuntimeException: 
> org.apache.flink.client.program.ProgramInvocationException: The program 
> execution failed: Could not upload the jar files to the job manager.
> at 
> FlinkJob_20170217_161058_04.bind(FlinkJob_20170217_161058_04.java:45) 
> at 
> com.aliyun.kepler.rc.query.schedule.FlinkQueryJob.call(FlinkQueryJob.java:53) 
> at 
> com.aliyun.kepler.rc.query.schedule.FlinkQueryJob.call(FlinkQueryJob.java:13) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> at 
> java.util.concurrent.AbstractExecutorService$2.run(AbstractExecutorService.java:120)
>  
> ... 3 common frames omitted
> Caused by: org.apache.flink.client.program.ProgramInvocationException: The 
> program execution failed: Could not upload the jar files to the job manager.
> at com.aliyun.kepler.rc.flink.client.Client.runBlocking(Client.java:178) 
> at 
> org.apache.flink.api.java.ClientEnvironment.execute(ClientEnvironment.java:169)
>  
> at 
> org.apache.flink.api.java.ClientEnvironment.execute(ClientEnvironment.java:225)
>  
> at 
> FlinkJob_20170217_161058_04.bind(FlinkJob_20170217_161058_04.java:42) 
> ... 7 common frames omitted
> Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not 
> upload the jar files to the job manager.
> at 
> org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:359)
>  
> at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94) 
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>  
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) 
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) 
> ... 3 common frames omitted
> Caused by: java.io.IOException: Could not retrieve the JobManager's blob port.
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:706) 
> at 
> org.apache.flink.runtime.jobgraph.JobGraph.uploadUserJars(JobGraph.java:556) 
> at 
> org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:357)
>  
> ... 7 common frames omitted
> Caused by: java.io.IOException: PUT operation failed: Server side error: 
> Could not create cache directory 
> '/home/kepler/kepler3012/data/work/blobs/blobStore-c3566cb2-b3d6-40ae-bdcf-594a81c8881b/cache'.
> at 
> org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:476) 
> at org.apache.flink.runtime.blob.BlobClient.put(BlobClient.java:338) 
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:730) 
> at 
> org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:701) 
> ... 9 common frames omitted
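
The failure is consistent with the classic check-then-create race on the cache 
directory: two concurrent PUT operations can both observe the directory as 
missing, and the mkdirs() call that loses the race returns false. A tolerant 
creation check, sketched with plain JDK calls (not necessarily the exact 
patch), avoids treating that as an error:

{code}
import java.io.File;
import java.io.IOException;

public class SafeMkdirs {
    public static void ensureDirectory(File dir) throws IOException {
        // mkdirs() returns false both on genuine failure and when another
        // thread or process created the directory first, so re-check for
        // existence rather than failing on a false return value.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Could not create cache directory '" + dir + "'.");
        }
    }
}
{code}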





[GitHub] flink issue #3342: TO FLINK-5828

2017-02-17 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3342
  
Merging this...



