[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason
[ https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873048#comment-15873048 ] ASF GitHub Bot commented on FLINK-5818: --- Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335

I've read all the comments and have listed the preconditions and the solution options for this directory permission setting.

## Preconditions

1. Every Flink job (session or single) can specify a directory for storing checkpoints, configured via `state.backend.fs.checkpointdir`.
2. Different jobs can set the same or different directories, which means their checkpoint files can be stored in one shared directory or in separate ones, with a **sub-dir** created per job-id.
3. Jobs can be run by different users, and users require that one user cannot read checkpoint files written by another, since that would cause an information leak.
4. In some conditions (which are relatively rare, I think), as @StephanEwen said, users need to access other users' checkpoint files for cloning/migrating jobs.
5. A checkpoint file path looks like: `hdfs://namenode:port/flink-checkpoints//chk-17/6ba7b810-9dad-11d1-80b4-00c04fd430c8`

## Solutions

### Solution #1 (would not require changes)

1. Admins control the permissions of the root directory via HDFS ACLs (set it like: user1 can read & write, user2 can only read, …).
2. This has two disadvantages: a) it is a huge burden for admins to set different permissions for a large number of users/groups; and b) sub-dirs inherit permissions from the root directory, which makes them basically the same and makes fine-grained control hard.

### Solution #2 (this proposal)

1. We don't care what the permission of the root dir is. It can be created at setup time or while the job is running, as long as it is available for use.
2. We control every sub-dir created by the different jobs (which are submitted by different users, in most cases) and set it to a restrictive value (like "700") to prevent it from being read by others.
3. If someone wants to migrate or clone jobs across users (again, this scenario is rare in my view), they should ask the admins (normally the HDFS admin) to add ACLs (or whatever) for this purpose.

> change checkpoint dir permission to 700 for security reason > --- > > Key: FLINK-5818 > URL: https://issues.apache.org/jira/browse/FLINK-5818 > Project: Flink > Issue Type: Sub-task > Components: Security, State Backends, Checkpointing >Reporter: Tao Wang > > Now the checkpoint directory is created without a specified permission, so it is easy for > another user to delete or read files under it, which can cause restore > failures or information leaks. > It's better to lower it down to 700. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335

I've read all the comments and have listed the preconditions and the solution options for this directory permission setting.

## Preconditions

1. Every Flink job (session or single) can specify a directory for storing checkpoints, configured via `state.backend.fs.checkpointdir`.
2. Different jobs can set the same or different directories, which means their checkpoint files can be stored in one shared directory or in separate ones, with a **sub-dir** created per job-id.
3. Jobs can be run by different users, and users require that one user cannot read checkpoint files written by another, since that would cause an information leak.
4. In some conditions (which are relatively rare, I think), as @StephanEwen said, users need to access other users' checkpoint files for cloning/migrating jobs.
5. A checkpoint file path looks like: `hdfs://namenode:port/flink-checkpoints//chk-17/6ba7b810-9dad-11d1-80b4-00c04fd430c8`

## Solutions

### Solution #1 (would not require changes)

1. Admins control the permissions of the root directory via HDFS ACLs (set it like: user1 can read & write, user2 can only read, …).
2. This has two disadvantages: a) it is a huge burden for admins to set different permissions for a large number of users/groups; and b) sub-dirs inherit permissions from the root directory, which makes them basically the same and makes fine-grained control hard.

### Solution #2 (this proposal)

1. We don't care what the permission of the root dir is. It can be created at setup time or while the job is running, as long as it is available for use.
2. We control every sub-dir created by the different jobs (which are submitted by different users, in most cases) and set it to a restrictive value (like "700") to prevent it from being read by others.
3. If someone wants to migrate or clone jobs across users (again, this scenario is rare in my view), they should ask the admins (normally the HDFS admin) to add ACLs (or whatever) for this purpose.
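Solution #2 boils down to creating each job's sub-directory with owner-only permissions while leaving the root directory alone. Below is a minimal local-filesystem sketch of that idea using `java.nio`; it is only an illustration, not Flink's actual code. On HDFS, the analogous call would be Hadoop's `FileSystem#mkdirs(Path, FsPermission)`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class CheckpointDirPermissions {

    // Creates the per-job checkpoint sub-directory with owner-only ("700")
    // access while leaving the root directory's permissions untouched,
    // mirroring Solution #2. Local-filesystem analogy only; on HDFS the
    // equivalent would be FileSystem#mkdirs(Path, FsPermission).
    static Path createJobCheckpointDir(Path root, String jobId) throws IOException {
        Files.createDirectories(root); // root keeps whatever permission it already has
        Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rwx------");
        return Files.createDirectory(root.resolve(jobId),
                PosixFilePermissions.asFileAttribute(ownerOnly));
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("flink-checkpoints");
        Path jobDir = createJobCheckpointDir(root, "6ba7b810-9dad-11d1-80b4-00c04fd430c8");
        System.out.println(Files.getPosixFilePermissions(jobDir));
    }
}
```

With a typical umask, the sub-directory ends up readable and writable only by its owner, which is exactly the isolation the proposal asks for.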
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3269: [FLINK-5698] Add NestedFieldsProjectableTableSourc...
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3269#discussion_r101887738

--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/util/RexProgramProjectExtractor.scala ---
@@ -84,6 +105,39 @@ object RexProgramProjectExtractor {
 }

 /**
+ * A RexVisitor to extract used nested input fields
+ */
+class RefFieldAccessorVisitor(
+    usedFields: Array[Int],
+    names: List[String])
+  extends RexVisitorImpl[Unit](true) {
+
+  private val group = usedFields.toList
+  private var nestedFields = mutable.LinkedHashSet[String]()
+
+  def getNestedFields: Array[String] = nestedFields.toArray
+
+  override def visitFieldAccess(fieldAccess: RexFieldAccess): Unit = {
+    fieldAccess.getReferenceExpr match {
+      case ref: RexInputRef =>
+        nestedFields += s"${names(ref.getIndex)}.${fieldAccess.getField.getName}"
--- End diff --

Yes, the parent of `RexFieldAccess` can also be a `RexFieldAccess`. We should take care of that.
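The recursion the comment asks for can be sketched with toy stand-ins for Calcite's `RexInputRef` / `RexFieldAccess` (the classes below are hypothetical simplifications, only to illustrate walking an access chain down to the input ref rather than stopping at one level):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class NestedFieldPath {

    // Toy stand-ins for Calcite's RexInputRef / RexFieldAccess.
    static class Rex {}
    static class InputRef extends Rex {
        final int index;
        InputRef(int index) { this.index = index; }
    }
    static class FieldAccess extends Rex {
        final Rex referenceExpr;
        final String fieldName;
        FieldAccess(Rex referenceExpr, String fieldName) {
            this.referenceExpr = referenceExpr;
            this.fieldName = fieldName;
        }
    }

    // Walks a chain of field accesses down to the input ref, so that a nested
    // access like student.school.city yields the full dotted path.
    static String fullPath(FieldAccess access, List<String> inputNames) {
        Deque<String> parts = new ArrayDeque<>();
        Rex current = access;
        while (current instanceof FieldAccess) {
            FieldAccess fa = (FieldAccess) current;
            parts.addFirst(fa.fieldName);
            current = fa.referenceExpr;
        }
        parts.addFirst(inputNames.get(((InputRef) current).index));
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        FieldAccess city = new FieldAccess(
                new FieldAccess(new InputRef(1), "school"), "city");
        // prints student.school.city
        System.out.println(fullPath(city, List.of("id", "student", "teacher")));
    }
}
```

A visitor that only looks one level deep would emit `school.city` without the `student` prefix; unwinding the chain first avoids that.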
[GitHub] flink pull request #3269: [FLINK-5698] Add NestedFieldsProjectableTableSourc...
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3269#discussion_r101887737 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/NestedFieldsProjectableTableSource.scala --- @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.sources + +/** + * Adds support for projection push-down to a [[TableSource]] with nested fields. + * A [[TableSource]] extending this interface is able + * to project the nested fields of the return table. + * + * @tparam T The return type of the [[NestedFieldsProjectableTableSource]]. + */ +trait NestedFieldsProjectableTableSource[T] extends ProjectableTableSource[T] { + + /** +* Creates a copy of the [[NestedFieldsProjectableTableSource]] +* that projects its output on the specified nested fields. +* +* @param fields The indexes of the fields to return. +* @return A copy of the [[NestedFieldsProjectableTableSource]] that projects its output. +*/ + def projectNestedFields(fields: Array[String]): NestedFieldsProjectableTableSource[T] --- End diff -- @fhueske , I'm fine with this, but have some questions to make sure I understand right. 
Say we have a complex table schema as shown below:

```
id, student<school<city, tuition>, age, name>, teacher
```

`id`, `student`, and `teacher` are the first-level columns; `student` has nested `school, age, name` columns, and `school` has nested `city, tuition` columns as well. If a user selects `id, student.school.city, student.age, teacher`, what should the actual arguments be? `field = [0, 1, 2]` and `nestedFields = [ [], ["school.city", "age"], ["age", "name"] ]`?
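One plausible reading of the question — grouping the selected dotted paths into top-level field indexes plus, per field, the nested paths used beneath it — can be sketched as follows. This is a hypothetical helper, not the actual Flink API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProjectionArgs {

    // Hypothetical helper: splits selected dotted paths into the arguments a
    // nested-projectable table source might receive -- the top-level field
    // indexes plus, per field, the nested paths used beneath it.
    static Map<Integer, List<String>> group(List<String> topLevel, List<String> selected) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (String path : selected) {
            int dot = path.indexOf('.');
            String head = dot < 0 ? path : path.substring(0, dot);
            List<String> nested = result.computeIfAbsent(
                    topLevel.indexOf(head), k -> new ArrayList<>());
            if (dot >= 0) {
                nested.add(path.substring(dot + 1)); // keep the remaining dotted tail
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {0=[], 1=[school.city, age], 2=[]}
        System.out.println(group(
                List.of("id", "student", "teacher"),
                List.of("id", "student.school.city", "student.age", "teacher")));
    }
}
```

Under this interpretation, an empty nested list for a top-level field would mean "project the whole field", matching the `id` and `teacher` cases in the question.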
[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface
[ https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873045#comment-15873045 ] ASF GitHub Bot commented on FLINK-5698: --- Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3269#discussion_r101887738 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/util/RexProgramProjectExtractor.scala --- @@ -84,6 +105,39 @@ object RexProgramProjectExtractor { } /** + * A RexVisitor to extract used nested input fields + */ +class RefFieldAccessorVisitor( +usedFields: Array[Int], +names: List[String]) + extends RexVisitorImpl[Unit](true) { + + private val group = usedFields.toList + private var nestedFields = mutable.LinkedHashSet[String]() + + def getNestedFields: Array[String] = nestedFields.toArray + + override def visitFieldAccess(fieldAccess: RexFieldAccess): Unit = { +fieldAccess.getReferenceExpr match { + case ref: RexInputRef => +nestedFields += s"${names(ref.getIndex)}.${fieldAccess.getField.getName}" --- End diff -- Yes, the parent of `RexFieldAccess` can also be a `RexFieldAccess`. We should take care of that. > Add NestedFieldsProjectableTableSource interface > > > Key: FLINK-5698 > URL: https://issues.apache.org/jira/browse/FLINK-5698 > Project: Flink > Issue Type: New Feature > Components: Table API & SQL >Reporter: Anton Solovev >Assignee: Anton Solovev > > Add a NestedFieldsProjectableTableSource interface for TableSource > implementations that support nested projection push-down. > The interface could look as follows > {code} > trait NestedFieldsProjectableTableSource[T] { > def projectNestedFields(fields: Array[String]): > NestedFieldsProjectableTableSource[T] > } > {code} > This interface works together with ProjectableTableSource -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface
[ https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873044#comment-15873044 ] ASF GitHub Bot commented on FLINK-5698: --- Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3269#discussion_r101887737 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/NestedFieldsProjectableTableSource.scala --- @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.table.sources + +/** + * Adds support for projection push-down to a [[TableSource]] with nested fields. + * A [[TableSource]] extending this interface is able + * to project the nested fields of the return table. + * + * @tparam T The return type of the [[NestedFieldsProjectableTableSource]]. + */ +trait NestedFieldsProjectableTableSource[T] extends ProjectableTableSource[T] { + + /** +* Creates a copy of the [[NestedFieldsProjectableTableSource]] +* that projects its output on the specified nested fields. +* +* @param fields The indexes of the fields to return. +* @return A copy of the [[NestedFieldsProjectableTableSource]] that projects its output. 
+*/ + def projectNestedFields(fields: Array[String]): NestedFieldsProjectableTableSource[T] --- End diff -- @fhueske , I'm fine with this, but have some questions to make sure I understand right. Say we have a complex table schema as shown below: ``` id, student<school<city, tuition>, age, name>, teacher ``` `id`, `student`, and `teacher` are the first-level columns; `student` has nested `school, age, name` columns, and `school` has nested `city, tuition` columns as well. If a user selects `id, student.school.city, student.age, teacher`, what should the actual arguments be? `field = [0, 1, 2]` and `nestedFields = [ [], ["school.city", "age"], ["age", "name"] ]`? > Add NestedFieldsProjectableTableSource interface > > > Key: FLINK-5698 > URL: https://issues.apache.org/jira/browse/FLINK-5698 > Project: Flink > Issue Type: New Feature > Components: Table API & SQL >Reporter: Anton Solovev >Assignee: Anton Solovev > > Add a NestedFieldsProjectableTableSource interface for TableSource > implementations that support nested projection push-down. > The interface could look as follows > {code} > trait NestedFieldsProjectableTableSource[T] { > def projectNestedFields(fields: Array[String]): > NestedFieldsProjectableTableSource[T] > } > {code} > This interface works together with ProjectableTableSource -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (FLINK-5441) Directly allow SQL queries on a Table
[ https://issues.apache.org/jira/browse/FLINK-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873043#comment-15873043 ] ASF GitHub Bot commented on FLINK-5441: --- Github user KurtYoung commented on the issue: https://github.com/apache/flink/pull/3107 Hi @fhueske, both solutions are fine with me. > Directly allow SQL queries on a Table > - > > Key: FLINK-5441 > URL: https://issues.apache.org/jira/browse/FLINK-5441 > Project: Flink > Issue Type: New Feature > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Jark Wu > > Right now a user has to register a table before it can be used in SQL > queries. In order to allow more fluent programming we propose calling SQL > directly on a table. An underscore can be used to reference the current table: > {code} > myTable.sql("SELECT a, b, c FROM _ WHERE d = 12") > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] flink issue #3107: [FLINK-5441] [table] Directly allow SQL queries on a Tabl...
Github user KurtYoung commented on the issue: https://github.com/apache/flink/pull/3107 Hi @fhueske, both solutions are fine with me.
[jira] [Commented] (FLINK-5414) Bump up Calcite version to 1.11
[ https://issues.apache.org/jira/browse/FLINK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873024#comment-15873024 ] ASF GitHub Bot commented on FLINK-5414: --- Github user wuchong commented on the issue: https://github.com/apache/flink/pull/3338 Hi @fhueske , yes, Calcite forces a Calc after each aggregate that only renames fields, because we rename every aggregate in the Table API, which is not necessary. I changed the logic of getting projections on aggregates to only rename the duplicate aggregates. That works well; no more Calc is appended. Hi @haohui , the ArrayRelDataType is still NOT NULL. I reverted [that line](https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/FlinkTypeFactory.scala#L121), which does not need to be changed in this PR. Cheers, Jark Wu > Bump up Calcite version to 1.11 > --- > > Key: FLINK-5414 > URL: https://issues.apache.org/jira/browse/FLINK-5414 > Project: Flink > Issue Type: Improvement > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Jark Wu > > The upcoming Calcite release 1.11 has a lot of stability fixes and new > features. We should update it for the Table API. > E.g. we can hopefully merge FLINK-4864 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] flink issue #3338: [FLINK-5414] [table] Bump up Calcite version to 1.11
Github user wuchong commented on the issue: https://github.com/apache/flink/pull/3338 Hi @fhueske , yes, Calcite forces a Calc after each aggregate that only renames fields, because we rename every aggregate in the Table API, which is not necessary. I changed the logic of getting projections on aggregates to only rename the duplicate aggregates. That works well; no more Calc is appended. Hi @haohui , the ArrayRelDataType is still NOT NULL. I reverted [that line](https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/FlinkTypeFactory.scala#L121), which does not need to be changed in this PR. Cheers, Jark Wu
[GitHub] flink pull request #3330: [FLINK-5795][TableAPI&SQL] Improve UDTF to support...
Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3330#discussion_r101885634

--- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/CodeGenerator.scala ---
@@ -1463,21 +1465,23 @@ class CodeGenerator(
   */
  def addReusableFunction(function: UserDefinedFunction): String = {
    val classQualifier = function.getClass.getCanonicalName
-    val fieldTerm = s"function_${classQualifier.replace('.', '$')}"
+    val functionSerializedData = serialize(function)
+    val fieldTerm =
+      s"""
+        |function_${classQualifier.replace('.', '$')}_${DigestUtils.md5Hex(functionSerializedData)}
--- End diff --

I find that the md5Hex string in the fieldTerm is never used. What about using `CodeGenUtils.newName` to generate a new function field name (as shown below)? It is a common usage in `CodeGenerator`; it guarantees there are no naming collisions, and the generated name is more readable. What do you think @sunjincheng121 @fhueske ?

```
CodeGenUtils.newName(s"function_${classQualifier.replace('.', '$')}")
```

Regarding the other PR for scalar UDFs, I think you are right. We can do that in this PR.
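The `CodeGenUtils.newName` suggestion amounts to a counter-based unique-name generator. A minimal sketch of that idea (modeled on the concept, not Flink's actual implementation):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class NameGen {

    // Minimal sketch of a newName-style generator: a shared counter guarantees
    // distinct field names even when the same function class is added twice,
    // without needing an md5 hash of the serialized function.
    private static final AtomicInteger counter = new AtomicInteger();

    static String newName(String base) {
        return base + "$" + counter.getAndIncrement();
    }

    public static void main(String[] args) {
        String a = newName("function_com$example$MyUdtf");
        String b = newName("function_com$example$MyUdtf");
        // prints false: two registrations of the same class never collide
        System.out.println(a.equals(b));
    }
}
```

The trade-off versus the md5 approach: the counter yields short, readable names and guaranteed uniqueness per code-generation run, but two identical functions get two fields instead of sharing one.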
[jira] [Commented] (FLINK-5795) Improve “UDTF" to support constructor with parameter.
[ https://issues.apache.org/jira/browse/FLINK-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873010#comment-15873010 ] ASF GitHub Bot commented on FLINK-5795: --- Github user wuchong commented on a diff in the pull request: https://github.com/apache/flink/pull/3330#discussion_r101885634 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/CodeGenerator.scala --- @@ -1463,21 +1465,23 @@ class CodeGenerator( */ def addReusableFunction(function: UserDefinedFunction): String = { val classQualifier = function.getClass.getCanonicalName -val fieldTerm = s"function_${classQualifier.replace('.', '$')}" +val functionSerializedData = serialize(function) +val fieldTerm = + s""" + |function_${classQualifier.replace('.', '$')}_${DigestUtils.md5Hex(functionSerializedData)} --- End diff -- I find that the md5Hex string in the fieldTerm is never used. What about using `CodeGenUtils.newName` to generate a new function field name (as shown below)? It is a common usage in `CodeGenerator`; it guarantees there are no naming collisions, and the generated name is more readable. What do you think @sunjincheng121 @fhueske ? ``` CodeGenUtils.newName(s"function_${classQualifier.replace('.', '$')}") ``` Regarding the other PR for scalar UDFs, I think you are right. We can do that in this PR. > Improve “UDTF" to support constructor with parameter. > - > > Key: FLINK-5795 > URL: https://issues.apache.org/jira/browse/FLINK-5795 > Project: Flink > Issue Type: Sub-task > Components: Table API & SQL >Reporter: sunjincheng >Assignee: sunjincheng > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (FLINK-3679) DeserializationSchema should handle zero or more outputs for every input
[ https://issues.apache.org/jira/browse/FLINK-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873009#comment-15873009 ] Tzu-Li (Gordon) Tai commented on FLINK-3679: Hi [~wheat9], sure! I've noticed your PR, and will schedule some time next week to review it ;-) Thank you for the reminder. > DeserializationSchema should handle zero or more outputs for every input > > > Key: FLINK-3679 > URL: https://issues.apache.org/jira/browse/FLINK-3679 > Project: Flink > Issue Type: Bug > Components: DataStream API, Kafka Connector >Reporter: Jamie Grier >Assignee: Haohui Mai > > There are a couple of issues with the DeserializationSchema API that I think > should be improved. This request has come to me via an existing Flink user. > The main issue is simply that the API assumes that there is a one-to-one > mapping between input and outputs. In reality there are scenarios where one > input message (say from Kafka) might actually map to zero or more logical > elements in the pipeline. > Particularly important here is the case where you receive a message from a > source (such as Kafka) and say the raw bytes don't deserialize properly. > Right now the only recourse is to throw IOException and therefore fail the > job. > This is definitely not good since bad data is a reality and failing the job > is not the right option. If the job fails we'll just end up replaying the > bad data and the whole thing will start again. > Instead in this case it would be best if the user could just return the empty > set. > The other case is where one input message should logically be multiple output > messages. This case is probably less important since there are other ways to > do this but in general it might be good to make the > DeserializationSchema.deserialize() method return a collection rather than a > single element. > Maybe we need to support a DeserializationSchema variant that has semantics > more like that of FlatMap. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
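The flatMap-style variant suggested in the FLINK-3679 description can be sketched as a deserializer that returns a collection, drops malformed records instead of throwing, and may expand one input into several elements. This is a hypothetical example of the semantics, not the actual DeserializationSchema API:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class FlatDeserialization {

    // Hypothetical flatMap-style schema: a malformed record yields nothing
    // instead of an IOException (so the job keeps running), and one raw
    // message may yield zero, one, or many elements.
    static List<Integer> deserialize(byte[] message) {
        List<Integer> out = new ArrayList<>();
        for (String token : new String(message, StandardCharsets.UTF_8).split(",")) {
            try {
                out.add(Integer.parseInt(token.trim()));
            } catch (NumberFormatException ignored) {
                // bad data: drop it rather than fail and replay the whole job
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [1, 2, 3]
        System.out.println(deserialize("1, 2, oops, 3".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Returning the empty list for a fully malformed message covers the "bad data is a reality" case from the issue; returning several elements covers the one-to-many case.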
[jira] [Commented] (FLINK-5353) Elasticsearch Sink loses well-formed documents when there are malformed documents
[ https://issues.apache.org/jira/browse/FLINK-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873008#comment-15873008 ] ASF GitHub Bot commented on FLINK-5353: --- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/3246 Hi @fpompermaier, sorry I was busy with other stuff over the last week. I hope to work towards merging this by the end of next week! > Elasticsearch Sink loses well-formed documents when there are malformed > documents > - > > Key: FLINK-5353 > URL: https://issues.apache.org/jira/browse/FLINK-5353 > Project: Flink > Issue Type: Bug > Components: Streaming Connectors >Affects Versions: 1.1.3 >Reporter: Flavio Pompermaier >Assignee: Tzu-Li (Gordon) Tai > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] flink issue #3246: [FLINK-5353] [elasticsearch] User-provided failure handle...
Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/3246 Hi @fpompermaier, sorry I was busy with other stuff over the last week. I hope to work towards merging this by the end of next week!
[GitHub] flink pull request #3231: [FLINK-5682] Fix scala version in flink-streaming-...
Github user billliuatuber closed the pull request at: https://github.com/apache/flink/pull/3231
[jira] [Commented] (FLINK-5682) Fix scala version in flink-streaming-scala POM file
[ https://issues.apache.org/jira/browse/FLINK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872986#comment-15872986 ] ASF GitHub Bot commented on FLINK-5682: --- Github user billliuatuber closed the pull request at: https://github.com/apache/flink/pull/3231 > Fix scala version in flink-streaming-scala POM file > > > Key: FLINK-5682 > URL: https://issues.apache.org/jira/browse/FLINK-5682 > Project: Flink > Issue Type: Bug > Components: Build System >Reporter: Bill Liu > Labels: build, easyfix > Original Estimate: 48h > Remaining Estimate: 48h > > flink-streaming-scala doesn't define the Scala library version, so > when building Flink for Scala 2.10, it may still include Scala 2.11. > {quote} > > org.scala-lang > scala-reflect > > > org.scala-lang > scala-library > > > org.scala-lang > scala-compiler > > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
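The fix described in FLINK-5682 would presumably pin the `org.scala-lang` artifacts to the build's Scala version. A sketch of what such a dependency entry could look like (the `${scala.version}` property name is an assumption, not taken from the actual flink-streaming-scala POM):

{code}
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <!-- pin to the Scala version selected for this build (e.g. 2.10.x) -->
  <version>${scala.version}</version>
</dependency>
{code}

Without an explicit version, Maven resolves whatever version arrives transitively, which is how a Scala 2.10 build can end up carrying 2.11 classes.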
[jira] [Updated] (FLINK-5818) change checkpoint dir permission to 700 for security reason
[ https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shijinkui updated FLINK-5818: - Issue Type: Sub-task (was: Improvement) Parent: FLINK-5839 > change checkpoint dir permission to 700 for security reason > --- > > Key: FLINK-5818 > URL: https://issues.apache.org/jira/browse/FLINK-5818 > Project: Flink > Issue Type: Sub-task > Components: Security, State Backends, Checkpointing >Reporter: Tao Wang > > Now checkpoint directory is made w/o specified permission, so it is easy for > another user to delete or read files under it, which will cause restore > failure or information leak. > It's better to lower it down to 700. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (FLINK-5546) java.io.tmpdir setted as project build directory in surefire plugin
[ https://issues.apache.org/jira/browse/FLINK-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shijinkui updated FLINK-5546: - Issue Type: Sub-task (was: Test) Parent: FLINK-5839 > java.io.tmpdir setted as project build directory in surefire plugin > --- > > Key: FLINK-5546 > URL: https://issues.apache.org/jira/browse/FLINK-5546 > Project: Flink > Issue Type: Sub-task > Components: Build System > Environment: CentOS 7.2 >Reporter: Syinchwun Leo >Assignee: shijinkui > Fix For: 1.2.1 > > > When multiple Linux users run tests at the same time, the flink-runtime module may > fail. User A creates /tmp/cacheFile, and User B then has no permission to > access that folder. > Failed tests: > FileCacheDeleteValidationTest.setup:79 Error initializing the test: > /tmp/cacheFile (Permission denied) > Tests in error: > IOManagerTest.channelEnumerator:54 » Runtime Could not create storage > director... > Tests run: 1385, Failures: 1, Errors: 1, Skipped: 8 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
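The fix named in the FLINK-5546 title — pointing `java.io.tmpdir` at the per-checkout build directory so concurrent users never share /tmp — could look roughly like this in the Surefire configuration (a sketch, not the actual Flink POM):

{code}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>
      <!-- keep test temp files inside the per-user build directory -->
      <java.io.tmpdir>${project.build.directory}</java.io.tmpdir>
    </systemPropertyVariables>
  </configuration>
</plugin>
{code}

Since each user builds in their own working copy, `${project.build.directory}` is never shared, so the /tmp/cacheFile permission collision cannot occur.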
[jira] [Updated] (FLINK-5640) configure the explicit Unit Test file suffix
[ https://issues.apache.org/jira/browse/FLINK-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shijinkui updated FLINK-5640: - Issue Type: Sub-task (was: Test) Parent: FLINK-5839 > configure the explicit Unit Test file suffix > > > Key: FLINK-5640 > URL: https://issues.apache.org/jira/browse/FLINK-5640 > Project: Flink > Issue Type: Sub-task > Components: Tests >Reporter: shijinkui >Assignee: shijinkui > Fix For: 1.2.1 > > > There are four types of test files: *ITCase.java, *Test.java, > *ITSuite.scala, *Suite.scala. > File names ending with "IT.java" are integration tests; file names ending with > "Test.java" are unit tests. > The Surefire plugin's default-test execution should explicitly declare > "*Test.*" as the Java unit-test pattern. > The test file statistics below: > * Suite total: 10 > * ITCase total: 378 > * Test total: 1008 > * ITSuite total: 14 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (FLINK-5839) Flink Security problem collection
[ https://issues.apache.org/jira/browse/FLINK-5839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shijinkui updated FLINK-5839: - Summary: Flink Security problem collection (was: Flink Security in Huawei's use case) > Flink Security problem collection > - > > Key: FLINK-5839 > URL: https://issues.apache.org/jira/browse/FLINK-5839 > Project: Flink > Issue Type: Improvement >Reporter: shijinkui > > This issue collects security problems found in Huawei's use cases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-5839) Flink Security in Huawei's use case
shijinkui created FLINK-5839: Summary: Flink Security in Huawei's use case Key: FLINK-5839 URL: https://issues.apache.org/jira/browse/FLINK-5839 Project: Flink Issue Type: Improvement Reporter: shijinkui This issue collects security problems found in Huawei's use cases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] flink issue #3292: [FLINK-5739] [client] fix NullPointerException in CliFron...
Github user clarkyzl commented on the issue: https://github.com/apache/flink/pull/3292 Thanks a lot
[jira] [Commented] (FLINK-5739) NullPointerException in CliFrontend
[ https://issues.apache.org/jira/browse/FLINK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872914#comment-15872914 ] ASF GitHub Bot commented on FLINK-5739: --- Github user clarkyzl commented on the issue: https://github.com/apache/flink/pull/3292 Thanks a lot > NullPointerException in CliFrontend > --- > > Key: FLINK-5739 > URL: https://issues.apache.org/jira/browse/FLINK-5739 > Project: Flink > Issue Type: Bug > Components: Client >Affects Versions: 1.3.0 > Environment: Mac OS X 10.12.2, Java 1.8.0_92-b14 >Reporter: Zhuoluo Yang >Assignee: Zhuoluo Yang > Labels: newbie, starter > Fix For: 1.3.0 > > > I've run a simple program on a local cluster. It always fails with code > Version: 1.3-SNAPSHOT, Commit: e24a866. > {quote} > Zhuoluos-MacBook-Pro:build-target zhuoluo.yzl$ bin/flink run -c > com.alibaba.blink.TableApp ~/gitlab/tableapp/target/tableapp-1.0-SNAPSHOT.jar > Cluster configuration: Standalone cluster with JobManager at > localhost/127.0.0.1:6123 > Using address localhost:6123 to connect to JobManager. 
> JobManager web interface address http://localhost:8081 > Starting execution of program > > The program finished with the following exception: > java.lang.NullPointerException > at > org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:845) > at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259) > at > org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1076) > at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123) > at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120) > at > org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at > org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40) > at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120) > {quote} > I don't think there should be a NullPointerException here, even if you forgot > the "execute()" call. > The reproducing code looks like the following: > {code:java} > public static void main(String[] args) throws Exception { > ExecutionEnvironment env = > ExecutionEnvironment.getExecutionEnvironment(); > DataSource<String> customer = > env.readTextFile("/Users/zhuoluo.yzl/customer.tbl"); > customer.filter(new FilterFunction<String>() { > public boolean filter(String value) throws Exception { > return true; > } > }) > .writeAsText("/Users/zhuoluo.yzl/customer.txt"); > //env.execute(); > } > {code} > We can use *start-cluster.sh* on a *local* computer to reproduce the problem.
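The root cause is that nothing in the client guards against a program that builds a plan but never calls execute(). A minimal sketch of the kind of guard that would turn the NPE into a readable error; the class and method names below are hypothetical, not Flink's actual CliFrontend internals:

```java
// Hypothetical guard (illustrative names, not Flink's real code): if the user
// program never calls env.execute(), the job execution result is null, and
// dereferencing it blindly produces the NullPointerException seen above.
public class ExecutionResultGuard {

    static String summarize(Object jobExecutionResult) {
        if (jobExecutionResult == null) {
            // Fail with a clear message instead of a NullPointerException.
            return "No job was executed. Did you forget to call execute() on the ExecutionEnvironment?";
        }
        return "Job finished: " + jobExecutionResult;
    }

    public static void main(String[] args) {
        System.out.println(summarize(null));
        System.out.println(summarize("JobExecutionResult"));
    }
}
```

The point is only that a null check at the dereference site converts a crash into an actionable diagnostic for the user.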
[jira] [Updated] (FLINK-4819) Checkpoint metadata+data inspection tool (view / update)
[ https://issues.apache.org/jira/browse/FLINK-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhenzhong Xu updated FLINK-4819: Description: Checkpoint inspection tool for operationalization, troubleshooting, diagnostics, etc., or performing brain surgery. If the tool can be done in a way that's programmatically accessible, that would be great for automating some of the validations. was: Checkpoint inspection tool for operationalization, troubleshooting, diagnostics, etc., or performing brain surgery.
[jira] [Commented] (FLINK-5414) Bump up Calcite version to 1.11
[ https://issues.apache.org/jira/browse/FLINK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872765#comment-15872765 ] ASF GitHub Bot commented on FLINK-5414: --- Github user haohui commented on the issue: https://github.com/apache/flink/pull/3338 Looks good to me overall. One question -- I wonder, does it mean that all array types become nullable after this change? > Bump up Calcite version to 1.11 > --- > > Key: FLINK-5414 > URL: https://issues.apache.org/jira/browse/FLINK-5414 > Project: Flink > Issue Type: Improvement > Components: Table API & SQL >Reporter: Timo Walther >Assignee: Jark Wu > > The upcoming Calcite release 1.11 has a lot of stability fixes and new > features. We should update it for the Table API. > E.g. we can hopefully merge FLINK-4864
[GitHub] flink issue #3338: [FLINK-5414] [table] Bump up Calcite version to 1.11
Github user haohui commented on the issue: https://github.com/apache/flink/pull/3338 Looks good to me overall. One question -- I wonder, does it mean that all array types become nullable after this change?
[jira] [Resolved] (FLINK-4660) HadoopFileSystem (with S3A) may leak connections, which causes jobs to get stuck in a restarting loop
[ https://issues.apache.org/jira/browse/FLINK-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhenzhong Xu resolved FLINK-4660. - Resolution: Fixed > HadoopFileSystem (with S3A) may leak connections, which causes jobs to get stuck in > a restarting loop > --- > > Key: FLINK-4660 > URL: https://issues.apache.org/jira/browse/FLINK-4660 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing > Reporter: Zhenzhong Xu > Priority: Critical > Attachments: Screen Shot 2016-09-20 at 2.49.14 PM.png, Screen Shot > 2016-09-20 at 2.49.32 PM.png > > > A Flink job with checkpoints enabled and configured to use the S3A file system > backend sometimes experiences checkpointing failures due to an S3 consistency > issue. This behavior is also reported by other people and documented in > https://issues.apache.org/jira/browse/FLINK-4218. > This problem gets magnified by the current HadoopFileSystem implementation, which > can potentially leak S3 client connections and eventually get into a > restarting loop with a “Timeout waiting for a connection from pool” exception > thrown from the aws client. > I looked at the code; it seems HadoopFileSystem.java never invokes the close() method > on the fs object upon failure, but the FileSystem may be re-initialized every > time the job gets restarted. > A few pieces of evidence I observed: > 1. When I set the connection pool limit to 128, the commands below show 128 > connections stuck in the CLOSE_WAIT state. > !Screen Shot 2016-09-20 at 2.49.14 PM.png|align=left, vspace=5! > 2. Task manager logs indicate that the state backend file system is consistently > getting initialized upon job restarting. > !Screen Shot 2016-09-20 at 2.49.32 PM.png! > 3. Logs indicate there is an NPE during cleanup of a stream task, which was > caused by a “Timeout waiting for connection from pool” exception when trying to > create a directory in the S3 bucket. 
> 2016-09-02 08:17:50,886 ERROR > org.apache.flink.streaming.runtime.tasks.StreamTask - Error during cleanup of > stream task > java.lang.NullPointerException > at > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.cleanup(OneInputStreamTask.java:73) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:323) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:589) > at java.lang.Thread.run(Thread.java:745) > 4. It appears that, from invoking the checkpointing operation to handling > failure, StreamTask has no logic associated with closing the Hadoop FileSystem object > (which internally includes the S3 aws client object) residing in > HadoopFileSystem.java.
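The fix the reporter points at is a close-on-failure pattern: release the FileSystem (and with it the pooled S3 client connections) whenever an operation fails before the object can be reused. A self-contained sketch of that pattern using a mock stand-in; this is illustrative only, not the actual HadoopFileSystem code:

```java
import java.io.Closeable;

// Illustrative sketch of the proposed fix pattern, not the real
// HadoopFileSystem implementation: close the wrapped file system when an
// operation fails, so pooled connections are returned instead of leaking.
public class SafeFsCleanup {

    // Stand-in for org.apache.hadoop.fs.FileSystem, which owns the S3 client.
    static class MockFileSystem implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true; // in the real class this releases the connection pool
        }
    }

    // Returns true if the file system was closed after the simulated failure.
    static boolean cleanupOnFailure() {
        MockFileSystem fs = new MockFileSystem();
        try {
            // Simulates the "Timeout waiting for connection from pool" failure.
            throw new RuntimeException("Timeout waiting for connection from pool");
        } catch (RuntimeException e) {
            fs.close(); // the missing step: release the client before giving up
        }
        return fs.closed;
    }

    public static void main(String[] args) {
        System.out.println(cleanupOnFailure());
    }
}
```

Without the close() in the failure path, each job restart re-initializes the file system while the previous instance still holds its connections, which matches the observed CLOSE_WAIT buildup.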
[jira] [Closed] (FLINK-4660) HadoopFileSystem (with S3A) may leak connections, which causes jobs to get stuck in a restarting loop
[ https://issues.apache.org/jira/browse/FLINK-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhenzhong Xu closed FLINK-4660.
[jira] [Commented] (FLINK-4660) HadoopFileSystem (with S3A) may leak connections, which causes jobs to get stuck in a restarting loop
[ https://issues.apache.org/jira/browse/FLINK-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872727#comment-15872727 ] Zhenzhong Xu commented on FLINK-4660: - This is fixed, closing.
[GitHub] flink pull request #3353: Multithreaded `DataSet#flatMap` function
Github user mohamagdy closed the pull request at: https://github.com/apache/flink/pull/3353
[GitHub] flink pull request #3353: Multithreaded `DataSet#flatMap` function
GitHub user mohamagdy opened a pull request: https://github.com/apache/flink/pull/3353 Multithreaded `DataSet#flatMap` function # Multi-Threaded FlatMap ## Overview The DataStream#flatMap function takes a FlatMapFunction interface that has a method named flatMap, which gets called by DataStream#flatMap. The FlatMapFunction#flatMap method takes an element (a DataStream record) and applies some transformation to it; for example, if the element is a string, the flatMap function can implement the logic for splitting the element by spaces or converting it to upper case. The current implementation of DataStream#flatMap uses a single thread to transform each element from one form to another. The idea of this change is to introduce a new DataStream#flatMap API method that takes a parallelism value for transforming the input elements. ## Implementation Details The following diagram shows the multithreaded `flatMap` function. Assume the parallelism (maximum thread pool size) is set to `3`, i.e. 3 threads can run transformations on the input elements. Briefly, when the `flatMap` function receives an element it pushes it to a _buffer_, then spawns a thread per element to do the transformation and write back to the _output_. The _output_ is thread-safe and only 1 thread can write at a time; it uses an `Object` as a lock on the output. The _buffer_ is used to accumulate elements so that when the _snapshot_ job runs while an element transformation is not yet finished, the _buffer_ writes all its elements serialized into an _output stream_. When a _restore_ is called, it deserializes the elements of the buffer and tries to run the transformation again, because their output state was not taken into consideration when the snapshot job ran. 
![multithreaded flatmap](https://cloud.githubusercontent.com/assets/1228432/23085859/699a2926-f56a-11e6-9146-1be213c7.png) You can merge this pull request into a Git repository by running: $ git pull https://github.com/mohamagdy/flink parallel-dataset-flatmap Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3353.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3353 commit bf2c710fe8af2be3475d66221f9e6b8e2090bbfc Author: Mohamed Magdy Date: 2017-02-12T21:35:06Z [FLINK-] Fix JavaDoc class name commit bbee6c580815190139f289d0309bfdc2f4ca83c6 Author: Mohamed Magdy Date: 2017-02-12T21:36:09Z [FLINK-] Override `close()` method In order to introduce threads for processing `flatMap` elements, the `close` method in `StreamFlatMap` will be overridden to shut down the thread pool. commit e0aff5e241aff97426d8c10e84ba5a932668d815 Author: Mohamed Magdy Date: 2017-02-13T20:09:15Z [FLINK-] Organize imports in test files Ran Intellij organize imports commit 73803bc8fc89f48691b4c0270b0061377a3e7e38 Author: Mohamed Magdy Date: 2017-02-14T15:08:30Z [FLINK-] Fix typo in tests commit b9343317eb35ef41aba82c70648dc8fd8767273e Author: Mohamed Magdy Date: 2017-02-14T22:21:46Z [FLINK-] Call `close` after `processElement` in tests In order to follow the flow described in `StreamOperator` interface for the `close` method which says the following: ``` This method is called after all records have been added to the operators via the methods {@link org.apache.flink.streaming.api.operators.OneInputStreamOperator#processElement(StreamRecord)}, or {@link org.apache.flink.streaming.api.operators.TwoInputStreamOperator#processElement1(StreamRecord)} and {@link org.apache.flink.streaming.api.operators.TwoInputStreamOperator#processElement2(StreamRecord)}. 
``` commit fd0c28adceca9fbe5fc9e3ffd93679495edca02b Author: Mohamed Magdy Date: 2017-02-14T23:46:01Z [FLINK-] Add tests for multi-threaded `DataStream` `flatMap` Tweak tests to check results of `flatMap` when processing elements of `DataStream` in multiple threads. commit cbe2fa8059272d8cb49e50056466564d6d8c322d Author: Mohamed Magdy Date: 2017-02-14T23:50:45Z [FLINK-] Add multiple threads for processing `flatMap` elements Add an option to the `DataStream` `flatMap` function that sets the parallelism of processing the `DataStream` elements. This helps in cases when processing elements of the `DataStream` blocks the main thread in favour of other elements. The main thread is blocked until all the elements are processed and all the threads finish. commit 357a72d0bc5cca497eb80483ca32a8d958dfdbb5 Author: Mohamed Magdy Date: 2017-02-15T16:44:46Z [FLINK-] Create a new `TimestampedCollector` for each thread In order to have a coll
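The buffer / thread-pool / locked-output design described in the PR can be sketched in plain Java. Everything below is illustrative, not the PR's actual code: `flatMapParallel` and its types are hypothetical stand-ins for the proposed operator change.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Sketch of the described design: a fixed thread pool transforms elements,
// and a single lock object serializes writes to the shared output, mirroring
// the "only 1 thread can write at a time" rule from the PR description.
public class ParallelFlatMapSketch {

    static <I, O> List<O> flatMapParallel(List<I> input, Function<I, O> fn, int parallelism) {
        List<O> output = new ArrayList<>();
        Object outputLock = new Object(); // the `Object` used as a lock on the output
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        for (I element : input) {
            pool.submit(() -> {
                O transformed = fn.apply(element); // runs concurrently
                synchronized (outputLock) {        // single writer at a time
                    output.add(transformed);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS); // block until all threads finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(flatMapParallel(Arrays.asList("a", "b", "c"), String::toUpperCase, 3));
    }
}
```

Note that output order is not guaranteed with this design, which is one reason such a change needs careful treatment around snapshots and restores, as the PR description itself acknowledges.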
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user gyfora commented on the issue: https://github.com/apache/flink/pull/3335 I agree with @StephanEwen that people probably manage the directory permissions directly when configuring the Flink jobs. It would be quite annoying if the Flink job changed the permissions you set from somewhere else.
[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason
[ https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872581#comment-15872581 ] ASF GitHub Bot commented on FLINK-5818: --- Github user gyfora commented on the issue: https://github.com/apache/flink/pull/3335 I agree with @StephanEwen that people probably manage the directory permissions directly when configuring the Flink jobs. It would be quite annoying if the Flink job changed the permissions you set from somewhere else. > change checkpoint dir permission to 700 for security reason > --- > > Key: FLINK-5818 > URL: https://issues.apache.org/jira/browse/FLINK-5818 > Project: Flink > Issue Type: Improvement > Components: Security, State Backends, Checkpointing >Reporter: Tao Wang > > Now checkpoint directory is made w/o specified permission, so it is easy for > another user to delete or read files under it, which will cause restore > failure or information leak. > It's better to lower it down to 700.
[GitHub] flink issue #3346: [FLINK-5763] [checkpoints] Add CheckpointOptions for self...
Github user gyfora commented on the issue: https://github.com/apache/flink/pull/3346 So will the same savepoint logic apply to externalized checkpoints? I think it would be good to have a similar way of restoring from checkpoints and savepoints from a usability perspective. I usually set the externalized checkpoint dir the same as the savepoint dir, to make it easy to write scripts to get the latest and restart. Otherwise I think the changes make a lot of sense and the directory structure looks very reasonable. One question, in the directory name could we maybe use the checkpoint id instead of the random suffix or something more predictable? Maybe the checkpoint date?
[jira] [Commented] (FLINK-5763) Make savepoints self-contained and relocatable
[ https://issues.apache.org/jira/browse/FLINK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872532#comment-15872532 ] ASF GitHub Bot commented on FLINK-5763: --- Github user gyfora commented on the issue: https://github.com/apache/flink/pull/3346 So will the same savepoint logic apply to externalized checkpoints? I think it would be good to have a similar way of restoring from checkpoints and savepoints from a usability perspective. I usually set the externalized checkpoint dir the same as the savepoint dir, to make it easy to write scripts to get the latest and restart. Otherwise I think the changes make a lot of sense and the directory structure looks very reasonable. One question, in the directory name could we maybe use the checkpoint id instead of the random suffix or something more predictable? Maybe the checkpoint date? > Make savepoints self-contained and relocatable > -- > > Key: FLINK-5763 > URL: https://issues.apache.org/jira/browse/FLINK-5763 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > > After a user has triggered a savepoint, a single savepoint file will be > returned as a handle to the savepoint. A savepoint to {{}} creates a > savepoint file like {{/savepoint-}}. > This file contains the metadata of the corresponding checkpoint, but not the > actual program state. While this works well for short term management > (pause-and-resume a job), it makes it hard to manage savepoints over longer > periods of time. > h4. Problems > h5. Scattered Checkpoint Files > For file system based checkpoints (FsStateBackend, RocksDBStateBackend) this > results in the savepoint referencing files from the checkpoint directory > (usually different than ). For users, it is virtually impossible to > tell which checkpoint files belong to a savepoint and which are lingering > around. 
This can easily lead to accidentally invalidating a savepoint by > deleting checkpoint files. > h5. Savepoints Not Relocatable > Even if a user is able to figure out which checkpoint files belong to a > savepoint, moving these files will invalidate the savepoint as well, because > the metadata file references absolute file paths. > h5. Forced to Use CLI for Disposal > Because of the scattered files, the user is in practice forced to use Flink’s > CLI to dispose a savepoint. This should be possible to handle in the scope of > the user’s environment via a file system delete operation. > h4. Proposal > In order to solve the described problems, savepoints should contain all their > state, both metadata and program state, inside a single directory. > Furthermore the metadata must only hold relative references to the checkpoint > files. This makes it obvious which files make up the state of a savepoint and > it is possible to move savepoints around by moving the savepoint directory. > h5. Desired File Layout > Triggering a savepoint to {{}} creates a directory as follows: > {code} > /savepoint-- > +-- _metadata > +-- data- [1 or more] > {code} > We include the JobID in the savepoint directory name in order to give some > hints about which job a savepoint belongs to. > h5. CLI > - Trigger: When triggering a savepoint to {{}} the savepoint > directory will be returned as the handle to the savepoint. > - Restore: Users can restore by pointing to the directory or the _metadata > file. The data files should be required to be in the same directory as the > _metadata file. > - Dispose: The disposal command should be deprecated and eventually removed. > While deprecated, disposal can happen by specifying the directory or the > _metadata file (same as restore).
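The "relative references" part of the proposal can be illustrated with a small path-resolution sketch. The helper below is hypothetical, not Flink's actual metadata format: it only shows why references stored relative to the _metadata file survive moving the whole savepoint directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch of the proposal's idea: _metadata records data files
// relative to its own directory, so resolution always happens against the
// current location of the savepoint directory, wherever it was moved to.
public class SavepointPaths {

    // Resolve a relative data-file reference against the _metadata location.
    static Path resolveDataFile(Path metadataFile, String relativeRef) {
        return metadataFile.getParent().resolve(relativeRef).normalize();
    }

    public static void main(String[] args) {
        Path meta = Paths.get("/checkpoints/savepoint-abc/_metadata");
        // Moving the directory only changes `meta`; "data-0001" stays valid.
        System.out.println(resolveDataFile(meta, "data-0001"));
    }
}
```

With absolute paths in the metadata, the same move would leave every reference dangling, which is exactly the "Savepoints Not Relocatable" problem above.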
[jira] [Commented] (FLINK-5818) change checkpoint dir permission to 700 for security reason
[ https://issues.apache.org/jira/browse/FLINK-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872516#comment-15872516 ] ASF GitHub Bot commented on FLINK-5818: --- Github user greghogan commented on the issue: https://github.com/apache/flink/pull/3335 The HDFS administrator can configure the parent directory for checkpoints with user and/or group ACL permissions. A default ACL is then inherited by the newly created files and subdirectories therein. If you create an ACL which blocks access for `group` and `other` the effective permissions are the requested `700`.
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/3335 The HDFS administrator can configure the parent directory for checkpoints with user and/or group ACL permissions. A default ACL is then inherited by the newly created files and subdirectories therein. If you create an ACL which blocks access for `group` and `other` the effective permissions are the requested `700`.
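The administrator setup described here can be sketched with HDFS ACL commands. This is a hedged example of the approach, not a tested recipe: the checkpoint root path is an assumption, and the exact `setfacl` spec should be verified against your Hadoop version's documentation before use.

```shell
# Illustrative admin setup (path and ACL spec are assumptions, verify against
# your Hadoop version): give the checkpoint root a default ACL that strips
# group/other access, so newly created job subdirectories inherit an
# effective permission of 700 without Flink changing anything itself.
hdfs dfs -mkdir -p /flink-checkpoints
hdfs dfs -setfacl -m default:group::---,default:other::--- /flink-checkpoints
hdfs dfs -getfacl /flink-checkpoints   # inspect the resulting ACL entries
```

This keeps the permission policy in the administrator's hands, which is the objection raised against having Flink force `700` from inside the job.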
[jira] [Commented] (FLINK-3679) DeserializationSchema should handle zero or more outputs for every input
[ https://issues.apache.org/jira/browse/FLINK-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872506#comment-15872506 ] Haohui Mai commented on FLINK-3679: --- [~tzulitai] -- would you mind taking a look? > DeserializationSchema should handle zero or more outputs for every input > > > Key: FLINK-3679 > URL: https://issues.apache.org/jira/browse/FLINK-3679 > Project: Flink > Issue Type: Bug > Components: DataStream API, Kafka Connector >Reporter: Jamie Grier >Assignee: Haohui Mai > > There are a couple of issues with the DeserializationSchema API that I think > should be improved. This request has come to me via an existing Flink user. > The main issue is simply that the API assumes that there is a one-to-one > mapping between input and outputs. In reality there are scenarios where one > input message (say from Kafka) might actually map to zero or more logical > elements in the pipeline. > Particularly important here is the case where you receive a message from a > source (such as Kafka) and say the raw bytes don't deserialize properly. > Right now the only recourse is to throw IOException and therefore fail the > job. > This is definitely not good since bad data is a reality and failing the job > is not the right option. If the job fails we'll just end up replaying the > bad data and the whole thing will start again. > Instead in this case it would be best if the user could just return the empty > set. > The other case is where one input message should logically be multiple output > messages. This case is probably less important since there are other ways to > do this but in general it might be good to make the > DeserializationSchema.deserialize() method return a collection rather than a > single element. > Maybe we need to support a DeserializationSchema variant that has semantics > more like that of FlatMap.
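The FlatMap-style variant suggested at the end of the issue can be sketched as follows. The interface below is hypothetical, not Flink's actual API: deserialize() emits zero or more records through a collector, so a malformed message can simply emit nothing instead of throwing IOException and failing the job.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of the proposed FlatMap-like schema (hypothetical interface, not
// Flink's real DeserializationSchema): one input message may map to zero,
// one, or many output records.
public class FlatDeserialization {

    interface CollectingDeserializationSchema<T> {
        void deserialize(byte[] message, Consumer<T> out);
    }

    // Example schema: skip empty payloads (the "bad data" case), emit the rest.
    static final CollectingDeserializationSchema<String> SKIP_EMPTY = (message, out) -> {
        if (message.length > 0) {
            out.accept(new String(message, StandardCharsets.UTF_8));
        }
        // Empty payload: emit nothing rather than failing the whole pipeline.
    };

    static List<String> deserializeAll(List<byte[]> messages) {
        List<String> records = new ArrayList<>();
        for (byte[] m : messages) {
            SKIP_EMPTY.deserialize(m, records::add);
        }
        return records;
    }
}
```

A one-to-many mapping (the issue's second case) falls out of the same shape, since the schema may call the collector any number of times per message.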
[jira] [Commented] (FLINK-5838) Print shell script usage
[ https://issues.apache.org/jira/browse/FLINK-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872400#comment-15872400 ] ASF GitHub Bot commented on FLINK-5838: --- GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/3352 [FLINK-5838] [scripts] Print shell script usage If jobmanager.sh, taskmanager.sh, or zookeeper.sh are called without arguments then the argument list for the call to flink-daemon.sh is misaligned and the usage for flink-daemon.sh is displayed to the user. Adds a check to each script for a valid action; otherwise the proper usage string is displayed. Note: this PR conflicts with the PR for FLINK-4326 which adds a "start-foreground" action. You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 5838_print_shell_script_usage Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3352.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3352 commit 0441164935a34032523040864ec18ee01b4adb5b Author: Greg Hogan Date: 2017-02-17T19:38:31Z [FLINK-5838] [scripts] Print shell script usage If jobmanager.sh, taskmanager.sh, or zookeeper.sh are called without arguments then the argument list for the call to flink-daemon.sh is misaligned and the usage for flink-daemon.sh is displayed to the user. Adds a check to each script for a valid action; otherwise the proper usage string is displayed. > Print shell script usage > > > Key: FLINK-5838 > URL: https://issues.apache.org/jira/browse/FLINK-5838 > Project: Flink > Issue Type: Bug > Components: Startup Shell Scripts >Affects Versions: 1.3.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Trivial > Fix For: 1.3.0 > > > {code} > $ ./bin/jobmanager.sh > Unknown daemon ''. 
Usage: flink-daemon.sh (start|stop|stop-all) > (jobmanager|taskmanager|zookeeper) [args]. > {code} > The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are > misaligned when {{$STARTSTOP}} is the null string. > {code} > "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}" > {code} > Same issue in {{taskmanager.sh}} and {{zookeeper.sh}}.
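The misalignment described above can be reproduced in a few lines of shell. Here `daemon()` is a simplified stand-in for the real flink-daemon.sh: because `$STARTSTOP` is unquoted and empty, the word disappears from the command line entirely and "jobmanager" is parsed as the action.

```shell
# Simplified reproduction of the bug and the fix; daemon() stands in for the
# real flink-daemon.sh dispatch.
daemon() { echo "action=$1 daemon=$2"; }

STARTSTOP=""
# Unquoted empty expansion drops the word, shifting the arguments left:
daemon $STARTSTOP jobmanager        # prints: action=jobmanager daemon=

# The fix from the PR: validate the action and print the wrapper's own usage.
case "$STARTSTOP" in
  (start|stop|stop-all)
    daemon "$STARTSTOP" jobmanager
    ;;
  (*)
    echo "Usage: jobmanager.sh (start|stop|stop-all) [args]"
    ;;
esac
```

Quoting `"$STARTSTOP"` alone would keep the argument positions aligned, but validating the action first gives the user the correct per-script usage string instead of flink-daemon.sh's.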
[GitHub] flink pull request #3352: [FLINK-5838] [scripts] Print shell script usage
GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/3352 [FLINK-5838] [scripts] Print shell script usage If jobmanager.sh, taskmanager.sh, or zookeeper.sh are called without arguments then the argument list for the call to flink-daemon.sh is misaligned and the usage for flink-daemon.sh is displayed to the user. Adds a check to each script for a valid action; otherwise the proper usage string is displayed. Note: this PR conflicts with the PR for FLINK-4326 which adds a "start-foreground" action. You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 5838_print_shell_script_usage Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3352.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3352 commit 0441164935a34032523040864ec18ee01b4adb5b Author: Greg Hogan Date: 2017-02-17T19:38:31Z [FLINK-5838] [scripts] Print shell script usage If jobmanager.sh, taskmanager.sh, or zookeeper.sh are called without arguments then the argument list for the call to flink-daemon.sh is misaligned and the usage for flink-daemon.sh is displayed to the user. Adds a check to each script for a valid action; otherwise the proper usage string is displayed.
[jira] [Updated] (FLINK-5838) Print shell script usage
[ https://issues.apache.org/jira/browse/FLINK-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Hogan updated FLINK-5838: -- Description: {code} $ ./bin/jobmanager.sh Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) (jobmanager|taskmanager|zookeeper) [args]. {code} The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are misaligned when {{$STARTSTOP}} is the null string. {code} "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}" {code} Same issue in {{taskmanager.sh}} and {{zookeeper.sh}}. was: {code} $ ./bin/jobmanager.sh Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) (jobmanager|taskmanager|zookeeper) [args]. {code} The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are misaligned when {{$STARTSTOP}} is the null string. {code} "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}" {code} > Print shell script usage > > > Key: FLINK-5838 > URL: https://issues.apache.org/jira/browse/FLINK-5838 > Project: Flink > Issue Type: Bug > Components: Startup Shell Scripts >Affects Versions: 1.3.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Trivial > Fix For: 1.3.0 > > > {code} > $ ./bin/jobmanager.sh > Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) > (jobmanager|taskmanager|zookeeper) [args]. > {code} > The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are > misaligned when {{$STARTSTOP}} is the null string. > {code} > "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}" > {code} > Same issue in {{taskmanager.sh}} and {{zookeeper.sh}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (FLINK-5838) Print shell script usage
[ https://issues.apache.org/jira/browse/FLINK-5838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Hogan updated FLINK-5838: -- Summary: Print shell script usage (was: Fix jobmanager.sh usage) > Print shell script usage > > > Key: FLINK-5838 > URL: https://issues.apache.org/jira/browse/FLINK-5838 > Project: Flink > Issue Type: Bug > Components: Startup Shell Scripts >Affects Versions: 1.3.0 >Reporter: Greg Hogan >Assignee: Greg Hogan >Priority: Trivial > Fix For: 1.3.0 > > > {code} > $ ./bin/jobmanager.sh > Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) > (jobmanager|taskmanager|zookeeper) [args]. > {code} > The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are > misaligned when {{$STARTSTOP}} is the null string. > {code} > "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}" > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] flink issue #2982: [FLINK-4460] Side Outputs in Flink
Github user chenqin commented on the issue: https://github.com/apache/flink/pull/2982 @aljoscha Looks good to me 👍 I briefly looked at your git branch, a minor comment would be adding comments to `sideOutputLateData` so user get better idea when they opt-in to late arriving event stream. Initial late arriving event is decided by comparing watermark & eventTime, do you think there is a need to allow user pass a kinda `Evaluator` and enable user sideOutput any kind of sideOutputs? `window.sideOutput(OutputTag, Evaluator)` `interface Evaluator{ MergedWindows, key, watermark}` - Regarding `split` `select`, I think there is a chance of consolidate select and build upon `OutputTag`, but might be out of this PR's scope. - Regarding to `WindowStream`, I am a bit confused to figure out if I use `allowedlateness` and `sideOutputLateData` at same time. Thanks, Chen
[jira] [Commented] (FLINK-5747) Eager Scheduling should deploy all Tasks together
[ https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872366#comment-15872366 ] ASF GitHub Bot commented on FLINK-5747: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/3295#discussion_r101829659 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java --- @@ -754,6 +759,139 @@ public void scheduleForExecution(SlotProvider slotProvider) throws JobException } } + private void scheduleLazy(SlotProvider slotProvider) throws NoResourceAvailableException { + // simply take the vertices without inputs. + for (ExecutionJobVertex ejv : this.tasks.values()) { + if (ejv.getJobVertex().isInputVertex()) { + ejv.scheduleAll(slotProvider, allowQueuedScheduling); + } + } + } + + /** +* +* +* @param slotProvider The resource provider from which the slots are allocated +* @param timeout The maximum time that the deployment may take, before a +* TimeoutException is thrown. +*/ + private void scheduleEager(SlotProvider slotProvider, final Time timeout) { + checkState(state == JobStatus.RUNNING, "job is not running currently"); + + // Important: reserve all the space we need up front. + // that way we do not have any operation that can fail between allocating the slots + // and adding them to the list. 
If we had a failure in between there, that would + // cause the slots to get lost + final ArrayList resources = new ArrayList<>(getNumberOfExecutionJobVertices()); + final boolean queued = allowQueuedScheduling; + + // we use this flag to handle failures in a 'finally' clause + // that allows us to not go through clumsy cast-and-rethrow logic + boolean successful = false; + + try { + // collecting all the slots may resize and fail in that operation without slots getting lost + final ArrayList> slotFutures = new ArrayList<>(getNumberOfExecutionJobVertices()); + + // allocate the slots (obtain all their futures + for (ExecutionJobVertex ejv : getVerticesTopologically()) { + // these calls are not blocking, they only return futures + ExecutionAndSlot[] slots = ejv.allocateResourcesForAll(slotProvider, queued); + + // we need to first add the slots to this list, to be safe on release + resources.add(slots); + + for (ExecutionAndSlot ens : slots) { + slotFutures.add(ens.slotFuture); + } + } + + // this future is complete once all slot futures are complete. + // the future fails once one slot future fails. + final ConjunctFuture allAllocationsComplete = FutureUtils.combineAll(slotFutures); --- End diff -- True, it is not incorrect. But some tasks would be already deployed if we start as soon as some futures are ready. They would need to be canceled again, which gives these not so nice fast deploy/out-of-resource/cancel/wait-for-cancellation/retry/etc loops. > Eager Scheduling should deploy all Tasks together > - > > Key: FLINK-5747 > URL: https://issues.apache.org/jira/browse/FLINK-5747 > Project: Flink > Issue Type: Bug > Components: JobManager >Affects Versions: 1.2.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.3.0 > > > Currently, eager scheduling immediately triggers the scheduling for all > vertices and their subtasks in topological order. > This has two problems: > - This works only, as long as resource acquisition is "synchronous". 
With > dynamic resource acquisition in FLIP-6, the resources are returned as Futures > which may complete out of order. This results in out-of-order (not in > topological order) scheduling of tasks which does not work for streaming. > - Deploying some tasks that depend on other tasks before it is clear that > the other tasks have resources as well leads to situations where many > deploy/recovery cycles happen before enough resources are available to get > the job running fully. > For eager scheduling, we should allocate all resource
[GitHub] flink pull request #3295: [FLINK-5747] [distributed coordination] Eager sche...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/3295#discussion_r101829659 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java --- @@ -754,6 +759,139 @@ public void scheduleForExecution(SlotProvider slotProvider) throws JobException } } + private void scheduleLazy(SlotProvider slotProvider) throws NoResourceAvailableException { + // simply take the vertices without inputs. + for (ExecutionJobVertex ejv : this.tasks.values()) { + if (ejv.getJobVertex().isInputVertex()) { + ejv.scheduleAll(slotProvider, allowQueuedScheduling); + } + } + } + + /** +* +* +* @param slotProvider The resource provider from which the slots are allocated +* @param timeout The maximum time that the deployment may take, before a +* TimeoutException is thrown. +*/ + private void scheduleEager(SlotProvider slotProvider, final Time timeout) { + checkState(state == JobStatus.RUNNING, "job is not running currently"); + + // Important: reserve all the space we need up front. + // that way we do not have any operation that can fail between allocating the slots + // and adding them to the list. 
If we had a failure in between there, that would + // cause the slots to get lost + final ArrayList resources = new ArrayList<>(getNumberOfExecutionJobVertices()); + final boolean queued = allowQueuedScheduling; + + // we use this flag to handle failures in a 'finally' clause + // that allows us to not go through clumsy cast-and-rethrow logic + boolean successful = false; + + try { + // collecting all the slots may resize and fail in that operation without slots getting lost + final ArrayList> slotFutures = new ArrayList<>(getNumberOfExecutionJobVertices()); + + // allocate the slots (obtain all their futures + for (ExecutionJobVertex ejv : getVerticesTopologically()) { + // these calls are not blocking, they only return futures + ExecutionAndSlot[] slots = ejv.allocateResourcesForAll(slotProvider, queued); + + // we need to first add the slots to this list, to be safe on release + resources.add(slots); + + for (ExecutionAndSlot ens : slots) { + slotFutures.add(ens.slotFuture); + } + } + + // this future is complete once all slot futures are complete. + // the future fails once one slot future fails. + final ConjunctFuture allAllocationsComplete = FutureUtils.combineAll(slotFutures); --- End diff -- True, it is not incorrect. But some tasks would be already deployed if we start as soon as some futures are ready. They would need to be canceled again, which gives these not so nice fast deploy/out-of-resource/cancel/wait-for-cancellation/retry/etc loops. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3295: [FLINK-5747] [distributed coordination] Eager sche...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/3295#discussion_r101829019 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java --- @@ -88,4 +100,104 @@ public RetryException(Throwable cause) { super(cause); } } + + // + // composing futures + // + + /** +* Creates a future that is complete once multiple other futures completed. +* The ConjunctFuture fails (completes exceptionally) once one of the Futures in the +* conjunction fails. +* +* The ConjunctFuture gives access to how many Futures in the conjunction have already +* completed successfully, via {@link ConjunctFuture#getNumFuturesCompleted()}. +* +* @param futures The futures that make up the conjunction. No null entries are allowed. +* @return The ConjunctFuture that completes once all given futures are complete (or one fails). +*/ + public static ConjunctFuture combineAll(Collection> futures) { + checkNotNull(futures, "futures"); + checkArgument(!futures.isEmpty(), "futures is empty"); + + final ConjunctFutureImpl conjunct = new ConjunctFutureImpl(futures.size()); + + for (Future future : futures) { + future.handle(conjunct.completionHandler); + } + + return conjunct; + } + + /** +* A future that is complete once multiple other futures completed. The futures are not +* necessarily of the same type, which is why the type of this Future is {@code Void}. +* The ConjunctFuture fails (completes exceptionally) once one of the Futures in the +* conjunction fails. +* +* The advantage of using the ConjunctFuture over chaining all the futures (such as via +* {@link Future#thenCombine(Future, BiFunction)}) is that ConjunctFuture also tracks how +* many of the Futures are already complete. +*/ + public interface ConjunctFuture extends CompletableFuture { + + /** +* Gets the total number of Futures in the conjunction. +* @return The total number of Futures in the conjunction. 
+*/ + int getNumFuturesTotal(); + + /** +* Gets the number of Futures in the conjunction that are already complete. +* @return The number of Futures in the conjunction that are already complete +*/ + int getNumFuturesCompleted(); + } + + /** +* The implementation of the {@link ConjunctFuture}. +* +* Implementation notice: The member fields all have package-private access, because they are +* either accessed by an inner subclass or by the enclosing class. +*/ + private static class ConjunctFutureImpl extends FlinkCompletableFuture implements ConjunctFuture { --- End diff -- Yes, with set rather then add it should work. Since the list gets initialized with an array, I would actually just use an array in the first place. Followup ;-) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5747) Eager Scheduling should deploy all Tasks together
[ https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872360#comment-15872360 ] ASF GitHub Bot commented on FLINK-5747: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/3295#discussion_r101829019 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java --- @@ -88,4 +100,104 @@ public RetryException(Throwable cause) { super(cause); } } + + // + // composing futures + // + + /** +* Creates a future that is complete once multiple other futures completed. +* The ConjunctFuture fails (completes exceptionally) once one of the Futures in the +* conjunction fails. +* +* The ConjunctFuture gives access to how many Futures in the conjunction have already +* completed successfully, via {@link ConjunctFuture#getNumFuturesCompleted()}. +* +* @param futures The futures that make up the conjunction. No null entries are allowed. +* @return The ConjunctFuture that completes once all given futures are complete (or one fails). +*/ + public static ConjunctFuture combineAll(Collection> futures) { + checkNotNull(futures, "futures"); + checkArgument(!futures.isEmpty(), "futures is empty"); + + final ConjunctFutureImpl conjunct = new ConjunctFutureImpl(futures.size()); + + for (Future future : futures) { + future.handle(conjunct.completionHandler); + } + + return conjunct; + } + + /** +* A future that is complete once multiple other futures completed. The futures are not +* necessarily of the same type, which is why the type of this Future is {@code Void}. +* The ConjunctFuture fails (completes exceptionally) once one of the Futures in the +* conjunction fails. +* +* The advantage of using the ConjunctFuture over chaining all the futures (such as via +* {@link Future#thenCombine(Future, BiFunction)}) is that ConjunctFuture also tracks how +* many of the Futures are already complete. 
+*/ + public interface ConjunctFuture extends CompletableFuture { + + /** +* Gets the total number of Futures in the conjunction. +* @return The total number of Futures in the conjunction. +*/ + int getNumFuturesTotal(); + + /** +* Gets the number of Futures in the conjunction that are already complete. +* @return The number of Futures in the conjunction that are already complete +*/ + int getNumFuturesCompleted(); + } + + /** +* The implementation of the {@link ConjunctFuture}. +* +* Implementation notice: The member fields all have package-private access, because they are +* either accessed by an inner subclass or by the enclosing class. +*/ + private static class ConjunctFutureImpl extends FlinkCompletableFuture implements ConjunctFuture { --- End diff -- Yes, with set rather then add it should work. Since the list gets initialized with an array, I would actually just use an array in the first place. Followup ;-) > Eager Scheduling should deploy all Tasks together > - > > Key: FLINK-5747 > URL: https://issues.apache.org/jira/browse/FLINK-5747 > Project: Flink > Issue Type: Bug > Components: JobManager >Affects Versions: 1.2.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.3.0 > > > Currently, eager scheduling immediately triggers the scheduling for all > vertices and their subtasks in topological order. > This has two problems: > - This works only, as long as resource acquisition is "synchronous". With > dynamic resource acquisition in FLIP-6, the resources are returned as Futures > which may complete out of order. This results in out-of-order (not in > topological order) scheduling of tasks which does not work for streaming. > - Deploying some tasks that depend on other tasks before it is clear that > the other tasks have resources as well leads to situations where many > deploy/recovery cycles happen before enough resources are available to get > the job running fully. 
> For eager scheduling, we should allocate all resources in one chunk and then > deploy
[GitHub] flink pull request #3351: [FLINK-4326] [scripts] Flink foreground services
GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/3351 [FLINK-4326] [scripts] Flink foreground services Add a "start-foreground" option to the Flink service scripts which does not daemonize the service nor redirect output. You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 4326_flink_startup_scripts_should_optionally_start_services_on_the_foreground Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3351.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3351 commit c9a84fb189ef8342f3673137c2137bcb3fb54df7 Author: Greg Hogan Date: 2016-10-07T20:06:48Z [FLINK-4326] [scripts] Flink foreground services Add a "start-foreground" option to the Flink service scripts which does not daemonize the service nor redirect output. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4460) Side Outputs in Flink
[ https://issues.apache.org/jira/browse/FLINK-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872353#comment-15872353 ] ASF GitHub Bot commented on FLINK-4460: --- Github user chenqin commented on the issue: https://github.com/apache/flink/pull/2982 @aljoscha Looks good to me 👍 I briefly looked at your git branch, a minor comment would be adding comments to `sideOutputLateData` so user get better idea when they opt-in to late arriving event stream. Initial late arriving event is decided by comparing watermark & eventTime, do you think there is a need to allow user pass a kinda `Evaluator` and enable user sideOutput any kind of sideOutputs? `window.sideOutput(OutputTag, Evaluator)` `interface Evaluator{ MergedWindows, key, watermark}` - Regarding `split` `select`, I think there is a chance of consolidate select and build upon `OutputTag`, but might be out of this PR's scope. - Regarding to `WindowStream`, I am a bit confused to figure out if I use `allowedlateness` and `sideOutputLateData` at same time. Thanks, Chen > Side Outputs in Flink > - > > Key: FLINK-4460 > URL: https://issues.apache.org/jira/browse/FLINK-4460 > Project: Flink > Issue Type: New Feature > Components: Core, DataStream API >Affects Versions: 1.2.0, 1.1.3 >Reporter: Chen Qin >Assignee: Chen Qin > Labels: latearrivingevents, sideoutput > > https://docs.google.com/document/d/1vg1gpR8JL4dM07Yu4NyerQhhVvBlde5qdqnuJv4LcV4/edit?usp=sharing -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground
[ https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872352#comment-15872352 ] ASF GitHub Bot commented on FLINK-4326: --- GitHub user greghogan opened a pull request: https://github.com/apache/flink/pull/3351 [FLINK-4326] [scripts] Flink foreground services Add a "start-foreground" option to the Flink service scripts which does not daemonize the service nor redirect output. You can merge this pull request into a Git repository by running: $ git pull https://github.com/greghogan/flink 4326_flink_startup_scripts_should_optionally_start_services_on_the_foreground Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3351.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3351 commit c9a84fb189ef8342f3673137c2137bcb3fb54df7 Author: Greg Hogan Date: 2016-10-07T20:06:48Z [FLINK-4326] [scripts] Flink foreground services Add a "start-foreground" option to the Flink service scripts which does not daemonize the service nor redirect output. > Flink start-up scripts should optionally start services on the foreground > - > > Key: FLINK-4326 > URL: https://issues.apache.org/jira/browse/FLINK-4326 > Project: Flink > Issue Type: Improvement > Components: Startup Shell Scripts >Affects Versions: 1.0.3 >Reporter: Elias Levy > > This has previously been mentioned in the mailing list, but has not been > addressed. Flink start-up scripts start the job and task managers in the > background. This makes it difficult to integrate Flink with most processes > supervisory tools and init systems, including Docker. One can get around > this via hacking the scripts or manually starting the right classes via Java, > but it is a brittle solution. 
> In addition to starting the daemons in the foreground, the start-up scripts > should use exec instead of running the commands, so as to avoid forks. Many > supervisory tools assume the PID of the process to be monitored is that of > the process it first executes, and fork chains make it difficult for the > supervisor to figure out what process to monitor. Specifically, > jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and > flink-daemon.sh should exec java.
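The exec chain described in the issue can be demonstrated with a toy script: `exec` replaces the current shell process with the next command instead of forking a child, so the PID a supervisor sees is preserved all the way down to the final process (the actual Flink script variables such as `$FLINK_BIN_DIR` are omitted here).

```shell
#!/usr/bin/env bash
# Toy demo that `exec` keeps the same PID: the first bash prints its PID,
# then exec's a second bash, which prints $$ again -- same process, same PID.
pids=$(bash -c 'echo $$; exec bash -c "echo \$\$"')
set -- $pids
echo "pid before exec: $1, pid after exec: $2"
[ "$1" = "$2" ] && echo "exec preserved the PID"
```

Had the inner bash been started without `exec` (a plain fork), the two printed PIDs would differ — which is exactly the fork chain that confuses supervisors like Docker's init or systemd.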
[GitHub] flink pull request #3295: [FLINK-5747] [distributed coordination] Eager sche...
Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/3295#discussion_r101827724 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java --- @@ -88,4 +100,104 @@ public RetryException(Throwable cause) { super(cause); } } + + // + // composing futures + // + + /** +* Creates a future that is complete once multiple other futures completed. +* The ConjunctFuture fails (completes exceptionally) once one of the Futures in the +* conjunction fails. +* +* The ConjunctFuture gives access to how many Futures in the conjunction have already +* completed successfully, via {@link ConjunctFuture#getNumFuturesCompleted()}. +* +* @param futures The futures that make up the conjunction. No null entries are allowed. +* @return The ConjunctFuture that completes once all given futures are complete (or one fails). +*/ + public static ConjunctFuture combineAll(Collection> futures) { + checkNotNull(futures, "futures"); + checkArgument(!futures.isEmpty(), "futures is empty"); --- End diff -- Yes, will change that... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-5747) Eager Scheduling should deploy all Tasks together
[ https://issues.apache.org/jira/browse/FLINK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872348#comment-15872348 ] ASF GitHub Bot commented on FLINK-5747: --- Github user StephanEwen commented on a diff in the pull request: https://github.com/apache/flink/pull/3295#discussion_r101827724 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/concurrent/FutureUtils.java --- @@ -88,4 +100,104 @@ public RetryException(Throwable cause) { super(cause); } } + + // + // composing futures + // + + /** +* Creates a future that is complete once multiple other futures completed. +* The ConjunctFuture fails (completes exceptionally) once one of the Futures in the +* conjunction fails. +* +* The ConjunctFuture gives access to how many Futures in the conjunction have already +* completed successfully, via {@link ConjunctFuture#getNumFuturesCompleted()}. +* +* @param futures The futures that make up the conjunction. No null entries are allowed. +* @return The ConjunctFuture that completes once all given futures are complete (or one fails). +*/ + public static ConjunctFuture combineAll(Collection> futures) { + checkNotNull(futures, "futures"); + checkArgument(!futures.isEmpty(), "futures is empty"); --- End diff -- Yes, will change that... > Eager Scheduling should deploy all Tasks together > - > > Key: FLINK-5747 > URL: https://issues.apache.org/jira/browse/FLINK-5747 > Project: Flink > Issue Type: Bug > Components: JobManager >Affects Versions: 1.2.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.3.0 > > > Currently, eager scheduling immediately triggers the scheduling for all > vertices and their subtasks in topological order. > This has two problems: > - This works only, as long as resource acquisition is "synchronous". With > dynamic resource acquisition in FLIP-6, the resources are returned as Futures > which may complete out of order. 
This results in out-of-order (not in > topological order) scheduling of tasks which does not work for streaming. > - Deploying some tasks that depend on other tasks before it is clear that > the other tasks have resources as well leads to situations where many > deploy/recovery cycles happen before enough resources are available to get > the job running fully. > For eager scheduling, we should allocate all resources in one chunk and then > deploy once we know that all are available. > As a follow-up, the same should be done per pipelined component in lazy batch > scheduling as well. That way we get lazy scheduling across blocking > boundaries, and bulk (gang) scheduling in pipelined subgroups. > This also does not apply for efforts of fine grained recovery, where > individual tasks request replacement resources. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (FLINK-5838) Fix jobmanager.sh usage
Greg Hogan created FLINK-5838: - Summary: Fix jobmanager.sh usage Key: FLINK-5838 URL: https://issues.apache.org/jira/browse/FLINK-5838 Project: Flink Issue Type: Bug Components: Startup Shell Scripts Affects Versions: 1.3.0 Reporter: Greg Hogan Assignee: Greg Hogan Priority: Trivial Fix For: 1.3.0 {code} $ ./bin/jobmanager.sh Unknown daemon ''. Usage: flink-daemon.sh (start|stop|stop-all) (jobmanager|taskmanager|zookeeper) [args]. {code} The arguments in {{jobmanager.sh}}'s call to {{flink-daemon.sh}} are misaligned when {{$STARTSTOP}} is the null string. {code} "${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP jobmanager "${args[@]}" {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] flink issue #3246: [FLINK-5353] [elasticsearch] User-provided failure handle...
Github user fpompermaier commented on the issue: https://github.com/apache/flink/pull/3246 Any news on this?
[jira] [Commented] (FLINK-5353) Elasticsearch Sink loses well-formed documents when there are malformed documents
[ https://issues.apache.org/jira/browse/FLINK-5353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872297#comment-15872297 ] ASF GitHub Bot commented on FLINK-5353: --- Github user fpompermaier commented on the issue: https://github.com/apache/flink/pull/3246 Any news on this? > Elasticsearch Sink loses well-formed documents when there are malformed > documents > - > > Key: FLINK-5353 > URL: https://issues.apache.org/jira/browse/FLINK-5353 > Project: Flink > Issue Type: Bug > Components: Streaming Connectors >Affects Versions: 1.1.3 >Reporter: Flavio Pompermaier >Assignee: Tzu-Li (Gordon) Tai > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (FLINK-5634) Flink should not always redirect stdout to a file.
[ https://issues.apache.org/jira/browse/FLINK-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872294#comment-15872294 ] ASF GitHub Bot commented on FLINK-5634: --- Github user greghogan commented on the issue: https://github.com/apache/flink/pull/3204 This is now looking like a game of hot potato :) I'm happy to let @iemejia create the PR but he had offered the same to me. If I don't hear otherwise first I'll create a PR for FLINK-4326. While sitting down so I don't step on anyone's toes. > Flink should not always redirect stdout to a file. > -- > > Key: FLINK-5634 > URL: https://issues.apache.org/jira/browse/FLINK-5634 > Project: Flink > Issue Type: Bug > Components: Docker >Affects Versions: 1.2.0 >Reporter: Jamie Grier >Assignee: Jamie Grier > > Flink always redirects stdout to a file. While often convenient this isn't > always what people want. The most obvious case of this is a Docker > deployment. > It should be possible to have Flink log to stdout. > Here is a PR for this: https://github.com/apache/flink/pull/3204 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] flink issue #3204: [FLINK-5634] Flink should not always redirect stdout to a...
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/3204 This is now looking like a game of hot potato :) I'm happy to let @iemejia create the PR but he had offered the same to me. If I don't hear otherwise first I'll create a PR for FLINK-4326. While sitting down so I don't step on anyone's toes.
[jira] [Commented] (FLINK-3163) Configure Flink for NUMA systems
[ https://issues.apache.org/jira/browse/FLINK-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872261#comment-15872261 ]

ASF GitHub Bot commented on FLINK-3163:
---------------------------------------

Github user greghogan commented on the issue:
https://github.com/apache/flink/pull/3249

    @StephanEwen thanks for the review. I'll verify, test, and merge.

> Configure Flink for NUMA systems
> --------------------------------
>
>          Key: FLINK-3163
>          URL: https://issues.apache.org/jira/browse/FLINK-3163
>      Project: Flink
>   Issue Type: Improvement
>   Components: Startup Shell Scripts
> Affects Versions: 1.0.0
>     Reporter: Greg Hogan
>     Assignee: Greg Hogan
>      Fix For: 1.3.0
>
> On NUMA systems Flink can be pinned to a single physical processor ("node")
> using {{numactl --membind=$node --cpunodebind=$node }}. Commonly available
> NUMA systems include the largest AWS and Google Compute instances.
> For example, on an AWS c4.8xlarge system with 36 hyperthreads the user could
> configure a single TaskManager with 36 slots or have Flink create two
> TaskManagers bound to each of the NUMA nodes, each with 18 slots.
> There may be some extra overhead in transferring network buffers between
> TaskManagers on the same system, though the fraction of data shuffled in
> this manner decreases with the size of the cluster. The performance
> improvement from only accessing local memory looks to be significant though
> difficult to benchmark.
> The JobManagers may fit into NUMA nodes rather than requiring full systems.
[jira] [Commented] (FLINK-4810) Checkpoint Coordinator should fail ExecutionGraph after "n" unsuccessful checkpoints
[ https://issues.apache.org/jira/browse/FLINK-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872260#comment-15872260 ]

ASF GitHub Bot commented on FLINK-4810:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3334

    Thank you for opening this pull request. I'll try to review it in the coming days...

> Checkpoint Coordinator should fail ExecutionGraph after "n" unsuccessful checkpoints
> ------------------------------------------------------------------------------------
>
>          Key: FLINK-4810
>          URL: https://issues.apache.org/jira/browse/FLINK-4810
>      Project: Flink
>   Issue Type: Sub-task
>   Components: State Backends, Checkpointing
>     Reporter: Stephan Ewen
>
> The Checkpoint coordinator should track the number of consecutive
> unsuccessful checkpoints.
> If more than {{n}} (configured value) checkpoints fail in a row, it should
> call {{fail()}} on the execution graph to trigger a recovery.
[GitHub] flink issue #3323: [hotfix] [core] Add missing stability annotations for cla...
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3323

    Good cleanup, thanks a lot! Merging this...

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
[jira] [Commented] (FLINK-4813) Having flink-test-utils as a dependency outside Flink fails the build
[ https://issues.apache.org/jira/browse/FLINK-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872250#comment-15872250 ]

ASF GitHub Bot commented on FLINK-4813:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3322

    The change looks good, thank you! Merging this...

> Having flink-test-utils as a dependency outside Flink fails the build
> ---------------------------------------------------------------------
>
>          Key: FLINK-4813
>          URL: https://issues.apache.org/jira/browse/FLINK-4813
>      Project: Flink
>   Issue Type: Bug
>   Components: Build System
> Affects Versions: 1.2.0
>     Reporter: Robert Metzger
>     Assignee: Nico Kruber
>
> The {{flink-test-utils}} depend on {{hadoop-minikdc}}, which has a
> dependency, which is only resolvable if the {{maven-bundle-plugin}} is
> loaded.
> This is the error message:
> {code}
> [ERROR] Failed to execute goal on project quickstart-1.2-tests: Could not
> resolve dependencies for project com.dataartisans:quickstart-1.2-tests:jar:1.0-SNAPSHOT:
> Failure to find org.apache.directory.jdbm:apacheds-jdbm1:bundle:2.0.0-M2 in
> https://repo.maven.apache.org/maven2 was cached in the local repository,
> resolution will not be reattempted until the update interval of central has
> elapsed or updates are forced -> [Help 1]
> {code}
> {{flink-parent}} loads that plugin, so all "internal" dependencies to the
> test utils can resolve the plugin.
> Right now, users have to use the maven bundle plugin to use our test utils
> externally.
> By making the hadoop minikdc dependency optional, we can probably resolve the
> issues. Then, only users who want to use the security-related tools in the
> test utils need to manually add the hadoop minikdc dependency + the plugin.
[jira] [Commented] (FLINK-5277) missing unit test for ensuring ResultPartition#add always recycles buffers
[ https://issues.apache.org/jira/browse/FLINK-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872246#comment-15872246 ]

ASF GitHub Bot commented on FLINK-5277:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3309

    Good addition, thanks! Merging this...

> missing unit test for ensuring ResultPartition#add always recycles buffers
> --------------------------------------------------------------------------
>
>          Key: FLINK-5277
>          URL: https://issues.apache.org/jira/browse/FLINK-5277
>      Project: Flink
>   Issue Type: Improvement
>   Components: Network
>     Reporter: Nico Kruber
>     Assignee: Nico Kruber
>
> We rely on ResultPartition to recycle the buffer if the add call fails.
> It makes sense to add a special test (to ResultPartitionTest or
> RecordWriterTest) where we ensure that this actually happens, to guard
> against future behaviour changes in ResultPartition.
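The invariant the requested test should guard — the buffer is recycled even when the add call fails — comes down to a try/finally pattern. A minimal stand-alone sketch with hypothetical class names (not Flink's actual Buffer/ResultPartition types):

```java
public class RecycleOnFailure {
    // Hypothetical stand-in for a reference-counted network buffer.
    static class Buffer {
        boolean recycled = false;
        void recycle() { recycled = true; }
    }

    // The guarantee under test: even if add() throws, the buffer is recycled.
    static void add(Buffer buffer, boolean fail) {
        try {
            if (fail) {
                throw new IllegalStateException("partition released");
            }
            // normal path: hand the buffer over to a subpartition
        } finally {
            buffer.recycle();
        }
    }

    public static void main(String[] args) {
        Buffer b = new Buffer();
        try {
            add(b, true);
        } catch (IllegalStateException ignored) {
            // expected: the add failed, but the buffer must still be recycled
        }
        System.out.println(b.recycled); // true
    }
}
```

A unit test would simply exercise both the failing and the succeeding path and assert that the recycle flag (or reference count) was released in each case.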
[jira] [Commented] (FLINK-5739) NullPointerException in CliFrontend
[ https://issues.apache.org/jira/browse/FLINK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872240#comment-15872240 ]

ASF GitHub Bot commented on FLINK-5739:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3292

    Change looks good, thank you! Merging this...

> NullPointerException in CliFrontend
> -----------------------------------
>
>          Key: FLINK-5739
>          URL: https://issues.apache.org/jira/browse/FLINK-5739
>      Project: Flink
>   Issue Type: Bug
>   Components: Client
> Affects Versions: 1.3.0
>  Environment: Mac OS X 10.12.2, Java 1.8.0_92-b14
>     Reporter: Zhuoluo Yang
>     Assignee: Zhuoluo Yang
>       Labels: newbie, starter
>      Fix For: 1.3.0
>
> I've run a simple program on a local cluster. It always fails with code
> Version: 1.3-SNAPSHOT, Commit: e24a866.
> {quote}
> Zhuoluos-MacBook-Pro:build-target zhuoluo.yzl$ bin/flink run -c com.alibaba.blink.TableApp ~/gitlab/tableapp/target/tableapp-1.0-SNAPSHOT.jar
> Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
>
> The program finished with the following exception:
> java.lang.NullPointerException
>     at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:845)
>     at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
>     at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1076)
>     at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1123)
>     at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1120)
>     at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>     at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
>     at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1120)
> {quote}
> I don't think there should be a NullPointerException here, even if you forgot
> the "execute()" call.
> The reproducing code looks like the following:
> {code:java}
> public static void main(String[] args) throws Exception {
>     ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     DataSource<String> customer = env.readTextFile("/Users/zhuoluo.yzl/customer.tbl");
>     customer.filter(new FilterFunction<String>() {
>         public boolean filter(String value) throws Exception {
>             return true;
>         }
>     })
>     .writeAsText("/Users/zhuoluo.yzl/customer.txt");
>     //env.execute();
> }
> {code}
> We can use *start-cluster.sh* on a *local* computer to reproduce the problem.
[jira] [Commented] (FLINK-5024) Add SimpleStateDescriptor to clarify the concepts
[ https://issues.apache.org/jira/browse/FLINK-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872233#comment-15872233 ]

ASF GitHub Bot commented on FLINK-5024:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3243

    Subsumed by another pull request...

> Add SimpleStateDescriptor to clarify the concepts
> -------------------------------------------------
>
>          Key: FLINK-5024
>          URL: https://issues.apache.org/jira/browse/FLINK-5024
>      Project: Flink
>   Issue Type: Improvement
>   Components: State Backends, Checkpointing
>     Reporter: Xiaogang Shi
>     Assignee: Xiaogang Shi
>
> Currently, StateDescriptors accept two type arguments: the first one is the
> type of the created state and the second one is the type of the values in
> the states.
> The concept, however, is a little confusing here because in ListStates, the
> arguments passed to the StateDescriptors are the types of the list elements
> instead of the lists. It also makes the implementation of MapStates difficult.
> I suggest not putting the type serializer in StateDescriptors, making
> StateDescriptors independent of the data structures of the values.
> A new type of StateDescriptor named SimpleStateDescriptor can be provided to
> abstract those states (namely ValueState, ReducingState and FoldingState)
> whose states are not composite.
> The states (e.g. ListStates and MapStates) can implement their own
> descriptors according to their data structures.
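The element-type vs. state-type mismatch described above can be illustrated with a self-contained sketch. These are deliberately simplified, hypothetical descriptor classes, not Flink's real API:

```java
public class DescriptorSketch {
    // Hypothetical, simplified descriptors; Flink's real classes carry serializers too.
    static class ValueStateDescriptor<T> {
        final Class<T> valueType;                // the state holds a single T
        ValueStateDescriptor(Class<T> type) { this.valueType = type; }
    }

    static class ListStateDescriptor<E> {
        final Class<E> elementType;              // the state holds a List<E>, yet the
        ListStateDescriptor(Class<E> type) {     // type argument names the *element*
            this.elementType = type;
        }
    }

    public static void main(String[] args) {
        ValueStateDescriptor<Long> v = new ValueStateDescriptor<>(Long.class);
        ListStateDescriptor<String> l = new ListStateDescriptor<>(String.class);
        // Same-looking type arguments, but they describe different things:
        System.out.println(v.valueType.getSimpleName());   // Long   (the whole value)
        System.out.println(l.elementType.getSimpleName()); // String (one list element)
    }
}
```

This asymmetry is exactly what the proposal wants to resolve by splitting simple-valued descriptors from composite ones.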
[jira] [Commented] (FLINK-3163) Configure Flink for NUMA systems
[ https://issues.apache.org/jira/browse/FLINK-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872236#comment-15872236 ]

ASF GitHub Bot commented on FLINK-3163:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3249

    Looks good. +1 from my side!
[jira] [Commented] (FLINK-5024) Add SimpleStateDescriptor to clarify the concepts
[ https://issues.apache.org/jira/browse/FLINK-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872234#comment-15872234 ]

ASF GitHub Bot commented on FLINK-5024:
---------------------------------------

Github user StephanEwen closed the pull request at:
https://github.com/apache/flink/pull/3243
[jira] [Commented] (FLINK-5682) Fix scala version in flink-streaming-scala POM file
[ https://issues.apache.org/jira/browse/FLINK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872231#comment-15872231 ]

ASF GitHub Bot commented on FLINK-5682:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3231

    @billliuatuber Since the dependency management section in the root `pom.xml`
    defines the Scala version for all sub-modules, I think this change is not
    needed. If you agree, could you close the pull request? If you think I am
    overlooking something, please let me know!

> Fix scala version in flink-streaming-scala POM file
> ---------------------------------------------------
>
>          Key: FLINK-5682
>          URL: https://issues.apache.org/jira/browse/FLINK-5682
>      Project: Flink
>   Issue Type: Bug
>   Components: Build System
>     Reporter: Bill Liu
>       Labels: build, easyfix
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> In flink-streaming-scala, it doesn't define the Scala library version;
> when building Flink for Scala 2.10, it still possibly includes Scala 2.11.
> {quote}
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-reflect</artifactId>
> </dependency>
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-library</artifactId>
> </dependency>
> <dependency>
>   <groupId>org.scala-lang</groupId>
>   <artifactId>scala-compiler</artifactId>
> </dependency>
> {quote}
[jira] [Commented] (FLINK-5669) flink-streaming-contrib DataStreamUtils.collect in local environment mode fails when offline
[ https://issues.apache.org/jira/browse/FLINK-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872227#comment-15872227 ]

ASF GitHub Bot commented on FLINK-5669:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3223

    This change looks good, thank you! Merging this...

> flink-streaming-contrib DataStreamUtils.collect in local environment mode fails when offline
> --------------------------------------------------------------------------------------------
>
>          Key: FLINK-5669
>          URL: https://issues.apache.org/jira/browse/FLINK-5669
>      Project: Flink
>   Issue Type: Bug
>   Components: flink-contrib
>     Reporter: Rick Cox
>     Priority: Minor
>
> {{DataStreamUtils.collect()}} needs to obtain the local machine's IP so that
> the job can send the results back. In the case of local
> {{StreamEnvironments}}, it uses {{InetAddress.getLocalHost()}}, which
> attempts to resolve the local hostname using DNS.
> If DNS is not available (for example, when offline) or if DNS is available
> but cannot resolve the hostname (for example, if the hostname is an intranet
> name but the machine is not currently on that network), an
> {{UnknownHostException}} will be thrown (and wrapped in an {{IOException}}).
> If the resolved IP is not reachable for some reason, streaming results will
> fail.
> Since this case is for local execution only, it seems that using
> {{InetAddress.getLoopbackAddress()}} would work just as well, and avoid the
> assumptions made by {{getLocalHost()}}.
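The difference between the two lookups is easy to demonstrate: `getLocalHost()` may consult DNS and can throw `UnknownHostException` offline, while `getLoopbackAddress()` is a pure constant that never performs a lookup. A minimal sketch:

```java
import java.net.InetAddress;

public class LoopbackDemo {
    // getLoopbackAddress() never performs a DNS lookup and cannot throw,
    // unlike getLocalHost(), which resolves the machine's hostname.
    public static String loopback() {
        return InetAddress.getLoopbackAddress().getHostAddress();
    }

    public static void main(String[] args) {
        // Typically prints 127.0.0.1 (or ::1 when IPv6 addresses are preferred).
        System.out.println(loopback());
    }
}
```

For purely local execution this is exactly why the loopback address is the safer default: the result is deterministic and independent of the machine's network state.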
[jira] [Commented] (FLINK-5640) configure the explicit Unit Test file suffix
[ https://issues.apache.org/jira/browse/FLINK-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872225#comment-15872225 ]

ASF GitHub Bot commented on FLINK-5640:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3211

    Change looks good, thanks! Merging this...

> configure the explicit Unit Test file suffix
> --------------------------------------------
>
>          Key: FLINK-5640
>          URL: https://issues.apache.org/jira/browse/FLINK-5640
>      Project: Flink
>   Issue Type: Test
>   Components: Tests
>     Reporter: shijinkui
>     Assignee: shijinkui
>      Fix For: 1.2.1
>
> There are four types of unit test file: *ITCase.java, *Test.java,
> *ITSuite.scala, *Suite.scala.
> File names ending with "ITCase.java" are integration tests. File names
> ending with "Test.java" are unit tests.
> It's clear for the Surefire plugin's default-test execution to declare that
> "*Test.*" is a Java unit test.
> The test file statistics:
> * Suite total: 10
> * ITCase total: 378
> * Test total: 1008
> * ITSuite total: 14
[jira] [Commented] (FLINK-5634) Flink should not always redirect stdout to a file.
[ https://issues.apache.org/jira/browse/FLINK-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872224#comment-15872224 ]

ASF GitHub Bot commented on FLINK-5634:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3204

    Does anyone want to take a stab at addressing this the
    https://issues.apache.org/jira/browse/FLINK-4326 way? I think no one is
    active on that issue right now...
[jira] [Commented] (FLINK-5546) java.io.tmpdir setted as project build directory in surefire plugin
[ https://issues.apache.org/jira/browse/FLINK-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872196#comment-15872196 ]

ASF GitHub Bot commented on FLINK-5546:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3190

    All right, I am convinced now that this is a helpful change. Merging this...

> java.io.tmpdir setted as project build directory in surefire plugin
> -------------------------------------------------------------------
>
>          Key: FLINK-5546
>          URL: https://issues.apache.org/jira/browse/FLINK-5546
>      Project: Flink
>   Issue Type: Test
>   Components: Build System
>  Environment: CentOS 7.2
>     Reporter: Syinchwun Leo
>     Assignee: shijinkui
>      Fix For: 1.2.1
>
> When multiple Linux users run tests at the same time, the flink-runtime
> module may fail. User A creates /tmp/cacheFile, and user B will have no
> permission to visit the folder.
> Failed tests:
>   FileCacheDeleteValidationTest.setup:79 Error initializing the test: /tmp/cacheFile (Permission denied)
> Tests in error:
>   IOManagerTest.channelEnumerator:54 » Runtime Could not create storage director...
> Tests run: 1385, Failures: 1, Errors: 1, Skipped: 8
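The change discussed above points `java.io.tmpdir` at the per-module build directory so concurrent users never share /tmp. Test code that resolves its scratch space from that property (instead of hardcoding /tmp) picks this up automatically; a small illustrative sketch (hypothetical helper, not Flink code):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class TmpDirDemo {
    // Resolve scratch files from java.io.tmpdir, which Surefire can redirect
    // to the project build directory, rather than hardcoding /tmp/cacheFile.
    public static Path scratchFile(String name) {
        return Paths.get(System.getProperty("java.io.tmpdir"), name);
    }

    public static void main(String[] args) {
        System.out.println(scratchFile("cacheFile"));
    }
}
```

With Surefire configured to set `java.io.tmpdir` per build, two users running the suite simultaneously write to disjoint directories and the permission conflict disappears.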
[jira] [Commented] (FLINK-5817) Fix test concurrent execution failure by test dir conflicts.
[ https://issues.apache.org/jira/browse/FLINK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872180#comment-15872180 ]

ASF GitHub Bot commented on FLINK-5817:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3341

    Good change, thank you. Merging this...

> Fix test concurrent execution failure by test dir conflicts.
> ------------------------------------------------------------
>
>          Key: FLINK-5817
>          URL: https://issues.apache.org/jira/browse/FLINK-5817
>      Project: Flink
>   Issue Type: Bug
>     Reporter: Wenlong Lyu
>     Assignee: Wenlong Lyu
>
> Currently, when different users build Flink on the same machine, failures
> may happen because some test utilities create test files using fixed names,
> which causes file access to fail when different users process the same file
> at the same time.
> We have found errors from AbstractTestBase, IOManagerTest, FileCacheTest.
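The underlying fix pattern is to replace fixed test paths with per-run unique directories. A minimal sketch using only the JDK (`Files.createTempDirectory` appends a random suffix, so two concurrent builds never collide):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class UniqueTestDir {
    // Instead of a fixed name like /tmp/cacheFile, create a per-run directory
    // with a random suffix so concurrent builds by different users never clash.
    public static Path create() throws IOException {
        return Files.createTempDirectory("flink-test-");
    }

    public static void main(String[] args) throws IOException {
        Path a = create();
        Path b = create();
        System.out.println(!a.equals(b)); // two runs always get distinct directories
    }
}
```

Test utilities then clean up their own directory on teardown, which also avoids leaking state between test runs.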
[GitHub] flink issue #3138: #Flink-5522 Storm Local Cluster can't work with powermock
Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3138

    I think that is a good fix, thank you! Merging this...
[jira] [Commented] (FLINK-5497) remove duplicated tests
[ https://issues.apache.org/jira/browse/FLINK-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872163#comment-15872163 ]

ASF GitHub Bot commented on FLINK-5497:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3089

    Thank you for fixing this!

> remove duplicated tests
> -----------------------
>
>          Key: FLINK-5497
>          URL: https://issues.apache.org/jira/browse/FLINK-5497
>      Project: Flink
>   Issue Type: Improvement
>   Components: Tests
>     Reporter: Alexey Diomin
>     Priority: Minor
>
> Now we have tests which run the same code 4 times, with every run taking
> 17+ seconds.
> We need to do a small refactoring and remove the duplicated code.
[jira] [Commented] (FLINK-5497) remove duplicated tests
[ https://issues.apache.org/jira/browse/FLINK-5497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872162#comment-15872162 ]

ASF GitHub Bot commented on FLINK-5497:
---------------------------------------

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3089

    Okay, I finally found the time to double check this. The changes are good,
    merging this...
[jira] [Commented] (FLINK-5751) 404 in documentation
[ https://issues.apache.org/jira/browse/FLINK-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872126#comment-15872126 ]

ASF GitHub Bot commented on FLINK-5751:
---

Github user uce commented on a diff in the pull request:
https://github.com/apache/flink/pull/3332#discussion_r101801834

--- Diff: docs/check_links.sh ---
@@ -0,0 +1,36 @@
+#!/usr/bin/env bash
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+target=${1:-"http://localhost:4000"}
+
+# Crawl the docs, ignoring robots.txt, storing nothing locally
+wget --spider -r -nd -nv -e robots=off -p -o spider.log "$target"
+
+# Abort for anything other than 0 and 4 ("Network failure")
+status=$?
+if [ $status -ne 0 ] && [ $status -ne 4 ]; then
+  exit $status
+fi
+
+# Fail the build if any broken links are found
+broken_links_str=$(grep -e 'Found [[:digit:]]\+ broken link(s)' spider.log)
+if [ -n "$broken_links_str" ]; then
+  echo -e "\e[1;31m$broken_links_str\e[0m"
--- End diff --

    Thanks for catching this! Fixed in a separate commit.
> 404 in documentation
> --------------------
>
>          Key: FLINK-5751
>          URL: https://issues.apache.org/jira/browse/FLINK-5751
>      Project: Flink
>   Issue Type: Bug
>   Components: Documentation
>     Reporter: Colin Breame
>     Priority: Trivial
>      Fix For: 1.3.0, 1.2.1
>
> This page:
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/quickstart/setup_quickstart.html
> contains a link titled "Flink on Windows" with the URL:
> - https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows
> This gives a 404. It should be:
> - https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/flink_on_windows.html
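The review comment on check_links.sh targets the `echo -e` line; the follow-up commit's actual change is not shown here. As a hedged aside, `echo -e` and the `\e` escape are bash extensions, while POSIX `printf` interprets backslash escapes itself and accepts the more widely supported `\033`. A minimal sketch with an illustrative message:

```shell
#!/bin/sh
# Portable alternative to `echo -e "\e[1;31m..."`: POSIX printf handles the
# escapes, and \033 (ESC) works in shells where \e does not.
broken_links_str="Found 3 broken link(s)"    # illustrative value, not real output
printf '\033[1;31m%s\033[0m\n' "$broken_links_str"
```

A CI script printing this would typically also `exit` nonzero afterwards so the build actually fails, though whether the PR did so is not visible in the excerpt.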
[jira] [Commented] (FLINK-5178) allow BlobCache to use a distributed file system irrespective of the HA mode
[ https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872125#comment-15872125 ]

ASF GitHub Bot commented on FLINK-5178:
---

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3085

    I would personally prefer not to make that change at this point. Interpreting HA parameters in non-HA mode might come across as confusing to users. Also, the new way of instantiating the TaskManager and JobManager (FLIP-6) via the `HighAvailabilityServices` should make more cases use the file-system-based BlobStore anyway (irrespective of HA setups).

> allow BlobCache to use a distributed file system irrespective of the HA mode
> ----------------------------------------------------------------------------
>
>          Key: FLINK-5178
>          URL: https://issues.apache.org/jira/browse/FLINK-5178
>      Project: Flink
>   Issue Type: Improvement
>   Components: Network
>     Reporter: Nico Kruber
>     Assignee: Nico Kruber
>
> After FLINK-5129, high availability (HA) mode adds the ability for the
> BlobCache instances at the task managers to download blobs directly from the
> distributed file system. It would be nice if this also worked in non-HA mode.
[jira] [Commented] (FLINK-5129) make the BlobServer use a distributed file system
[ https://issues.apache.org/jira/browse/FLINK-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872122#comment-15872122 ]

ASF GitHub Bot commented on FLINK-5129:
---

Github user StephanEwen commented on the issue:
https://github.com/apache/flink/pull/3084

    Good change, thanks! Merging this...

> make the BlobServer use a distributed file system
> -------------------------------------------------
>
>          Key: FLINK-5129
>          URL: https://issues.apache.org/jira/browse/FLINK-5129
>      Project: Flink
>   Issue Type: Improvement
>   Components: Network
>     Reporter: Nico Kruber
>     Assignee: Nico Kruber
>
> Currently, the BlobServer uses local storage and, in addition, when HA mode
> is set, a distributed file system such as HDFS. However, the latter is only
> used by the JobManager, and all TaskManager instances request blobs from the
> JobManager. By using the distributed file system there as well, we would
> lower the load on the JobManager and increase scalability.
[jira] [Commented] (FLINK-5731) Split up CI builds
[ https://issues.apache.org/jira/browse/FLINK-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872119#comment-15872119 ]

ASF GitHub Bot commented on FLINK-5731:
---

Github user greghogan commented on the issue:
https://github.com/apache/flink/pull/3344

    Ten minutes allows for a significant number of new tests. Alternatively, we did look at parallelizing the tests, but there were some dependency issues and the PR expired in review. This does need to be fixed, and as you note, the longer we put this off the deeper we dig ourselves into a hole. Are we looking to split the libraries out of the main repo?

> Split up CI builds
> ------------------
>
>          Key: FLINK-5731
>          URL: https://issues.apache.org/jira/browse/FLINK-5731
>      Project: Flink
>   Issue Type: Improvement
>   Components: Build System, Tests
>     Reporter: Ufuk Celebi
>     Assignee: Robert Metzger
>     Priority: Critical
>
> Test builds regularly time out because we are hitting the Travis 50-minute
> limit. Previously, we worked around this by splitting the tests into groups.
> I think we have to split them further.
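The "disjoint groups" approach discussed for FLINK-5731 can be sketched generically: each CI job sets an environment variable that selects a distinct slice of modules to test. This is a hedged illustration, not Flink's actual Travis configuration; the group names and module lists below are invented placeholders:

```shell
#!/bin/sh
# Map a test-group name (e.g. from a TEST_GROUP env var in the CI matrix)
# to the Maven modules that group covers. Groups must be disjoint so the
# union of all CI jobs runs every test exactly once.
select_modules() {
  case "$1" in
    core)       echo "flink-core,flink-runtime" ;;
    libraries)  echo "flink-libraries/flink-gelly,flink-libraries/flink-ml" ;;
    connectors) echo "flink-connectors" ;;
    *)          echo "unknown test group: $1" >&2; return 1 ;;
  esac
}

MODULES=$(select_modules "${TEST_GROUP:-core}")
echo "Running tests for modules: $MODULES"
# mvn -pl "$MODULES" verify   # hypothetical build invocation for one group
```

In a CI matrix, each entry would export a different `TEST_GROUP`, keeping every job comfortably under the time limit at the cost of more parallel jobs.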