[GitHub] [flink] flinkbot edited a comment on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9942: [FLINK-14461][configuration] Remove 
unused sessionTimeout from JobGraph
URL: https://github.com/apache/flink/pull/9942#issuecomment-54414
 
 
   
   ## CI report:
   
   * 9edacc5d121c71befd2506f276b4210288c4654e : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/132691365)
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9

2019-10-19 Thread GitBox
flinkbot commented on issue #9943: [FLINK-14463] [Kafka Consumer] Merge 
Handover in flink-connector-kafka and flink-connector-kafka-0.9 
URL: https://github.com/apache/flink/pull/9943#issuecomment-544223382
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit e816681d90c2070b6136ddaca1ce11e4d32a21fd (Sun Oct 20 
05:55:32 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-14463).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The bot tracks the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[jira] [Updated] (FLINK-14463) Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9

2019-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14463:
---
Labels: pull-request-available  (was: )

> Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9
> -
>
> Key: FLINK-14463
> URL: https://issues.apache.org/jira/browse/FLINK-14463
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.9.0
>Reporter: Jiayi Liao
>Priority: Major
>  Labels: pull-request-available
>
> Handover.java exists both in the flink-connector-kafka (Kafka 2.x) module and 
> the flink-connector-kafka-0.9 module. We should move this file into the Kafka 
> base module to avoid duplicated code.
> cc [~sewen] [~yanghua]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] buptljy opened a new pull request #9943: [FLINK-14463] [Kafka Consumer] Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9

2019-10-19 Thread GitBox
buptljy opened a new pull request #9943: [FLINK-14463] [Kafka Consumer] Merge 
Handover in flink-connector-kafka and flink-connector-kafka-0.9   
URL: https://github.com/apache/flink/pull/9943
 
 
   
   ## What is the purpose of the change
   
   Handover.java exists both in the flink-connector-kafka (Kafka 2.x) module 
and the flink-connector-kafka-0.9 module. We should move this file into the 
Kafka base module to avoid duplicated code.
   
   ## Brief change log
   
   Remove Handover from the flink-connector-kafka (Kafka 2.x) and 
flink-connector-kafka-0.9 modules, and add it to the 
flink-connector-kafka-base module.
   
   ## Verifying this change
   
   Unit testing.
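
   For readers unfamiliar with the class being moved: Flink's internal
`Handover` passes fetched records (or an exception) from the Kafka consumer
thread to the main fetcher thread through a single blocking slot. The sketch
below is a simplified illustration of that single-slot producer/consumer
pattern, not Flink's actual implementation; all names in it are illustrative.

   ```java
import java.io.Closeable;

/**
 * Minimal sketch of a single-slot "handover" between a producer thread and a
 * consumer thread. Loosely modeled on the pattern used by Flink's internal
 * Handover class; names and semantics here are illustrative only.
 */
public final class HandoverSketch {

    static final class Handover<T> implements Closeable {
        private final Object lock = new Object();
        private T element;          // the single slot
        private boolean closed;

        /** Blocks until an element is available, then takes it. */
        T pollNext() throws InterruptedException {
            synchronized (lock) {
                while (element == null && !closed) {
                    lock.wait();
                }
                if (element == null) {
                    throw new IllegalStateException("handover closed");
                }
                T result = element;
                element = null;
                lock.notifyAll();   // wake a producer waiting for the free slot
                return result;
            }
        }

        /** Blocks until the slot is free, then hands the element over. */
        void produce(T next) throws InterruptedException {
            synchronized (lock) {
                while (element != null && !closed) {
                    lock.wait();
                }
                if (closed) {
                    throw new IllegalStateException("handover closed");
                }
                element = next;
                lock.notifyAll();   // wake the consumer
            }
        }

        @Override
        public void close() {
            synchronized (lock) {
                closed = true;
                lock.notifyAll();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Handover<Integer> handover = new Handover<>();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    handover.produce(i);
                }
            } catch (InterruptedException ignored) {
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < 3; i++) {
            sum += handover.pollNext();
        }
        producer.join();
        handover.close();
        System.out.println(sum); // prints 3
    }
}
   ```

   Because the class is pure coordination logic with no dependency on a
specific Kafka client version, keeping one copy in the base module is
straightforward.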




[jira] [Created] (FLINK-14463) Merge Handover in flink-connector-kafka and flink-connector-kafka-0.9

2019-10-19 Thread Jiayi Liao (Jira)
Jiayi Liao created FLINK-14463:
--

 Summary: Merge Handover in flink-connector-kafka and 
flink-connector-kafka-0.9
 Key: FLINK-14463
 URL: https://issues.apache.org/jira/browse/FLINK-14463
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.9.0
Reporter: Jiayi Liao


Handover.java exists both in the flink-connector-kafka (Kafka 2.x) module and 
the flink-connector-kafka-0.9 module. We should move this file into the Kafka 
base module to avoid duplicated code.

cc [~sewen] [~yanghua]





[GitHub] [flink] flinkbot commented on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph

2019-10-19 Thread GitBox
flinkbot commented on issue #9942: [FLINK-14461][configuration] Remove unused 
sessionTimeout from JobGraph
URL: https://github.com/apache/flink/pull/9942#issuecomment-54414
 
 
   
   ## CI report:
   
   * 9edacc5d121c71befd2506f276b4210288c4654e : UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   




[jira] [Commented] (FLINK-8363) Build Hadoop 2.9.0 convenience binaries

2019-10-19 Thread Igor Dvorzhak (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955392#comment-16955392
 ] 

Igor Dvorzhak commented on FLINK-8363:
--

Any updates on this? It would be nice to have flink-shaded libraries published 
for Hadoop 2.9.x and Hadoop 3.x.

> Build Hadoop 2.9.0 convenience binaries
> ---
>
> Key: FLINK-8363
> URL: https://issues.apache.org/jira/browse/FLINK-8363
> Project: Flink
>  Issue Type: New Feature
>  Components: Build System, BuildSystem / Shaded
>Affects Versions: 1.5.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
>
> Hadoop 2.9.0 was released on 17 November, 2017. A local {{mvn clean verify 
> -Dhadoop.version=2.9.0}} ran successfully.
> With the new Hadoopless build we may be able to improve the build process by 
> reusing the {{flink-dist}} jar (which differ only in build timestamps) and 
> simply make each Hadoop-specific tarball by copying in the corresponding 
> {{flink-shaded-hadoop2-uber}} jar.
> What portion of the TravisCI jobs can run Hadoopless? We could build and 
> verify these once and then run a Hadoop-versioned job for each Hadoop version.





[GitHub] [flink] flinkbot commented on issue #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph

2019-10-19 Thread GitBox
flinkbot commented on issue #9942: [FLINK-14461][configuration] Remove unused 
sessionTimeout from JobGraph
URL: https://github.com/apache/flink/pull/9942#issuecomment-544221279
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 9edacc5d121c71befd2506f276b4210288c4654e (Sun Oct 20 
05:09:03 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The bot tracks the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[jira] [Updated] (FLINK-14461) Remove unused sessionTimeout from JobGraph

2019-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14461:
---
Labels: pull-request-available  (was: )

> Remove unused sessionTimeout from JobGraph
> --
>
> Key: FLINK-14461
> URL: https://issues.apache.org/jira/browse/FLINK-14461
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Affects Versions: 1.10.0
>Reporter: Zili Chen
>Assignee: Zili Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>






[GitHub] [flink] TisonKun opened a new pull request #9942: [FLINK-14461][configuration] Remove unused sessionTimeout from JobGraph

2019-10-19 Thread GitBox
TisonKun opened a new pull request #9942: [FLINK-14461][configuration] Remove 
unused sessionTimeout from JobGraph
URL: https://github.com/apache/flink/pull/9942
 
 
   ## What is the purpose of the change
   
   Remove unused sessionTimeout from JobGraph
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   
   cc @zentol @tillrohrmann 
   




[jira] [Created] (FLINK-14462) Remove JobGraph#allowQueuedScheduling flag because it is always true

2019-10-19 Thread Zili Chen (Jira)
Zili Chen created FLINK-14462:
-

 Summary: Remove JobGraph#allowQueuedScheduling flag because it is 
always true
 Key: FLINK-14462
 URL: https://issues.apache.org/jira/browse/FLINK-14462
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Configuration
Affects Versions: 1.10.0
Reporter: Zili Chen
Assignee: Zili Chen
 Fix For: 1.10.0


CC [~trohrmann][~zhuzh]

The only place where {{#setAllowQueuedScheduling(false)}} is called is 
{{JobGraphGenerator}}. IIRC we always call {{#setAllowQueuedScheduling(true)}} 
after the generation and before the submission. To reduce confusion I propose 
removing {{JobGraph#allowQueuedScheduling}} and refactoring the related logic 
to always assume {{true}}.

This flag was originally used to configure different resource allocation 
strategies between the legacy mode and the FLIP-6 architecture. There remain 
branches in {{Scheduler}} which might cause further confusion.





[jira] [Created] (FLINK-14461) Remove unused sessionTimeout from JobGraph

2019-10-19 Thread Zili Chen (Jira)
Zili Chen created FLINK-14461:
-

 Summary: Remove unused sessionTimeout from JobGraph
 Key: FLINK-14461
 URL: https://issues.apache.org/jira/browse/FLINK-14461
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Affects Versions: 1.10.0
Reporter: Zili Chen
Assignee: Zili Chen
 Fix For: 1.10.0








[jira] [Commented] (FLINK-14424) Create tpc-ds end to end test to support all tpc-ds queries

2019-10-19 Thread Jiayi Liao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955375#comment-16955375
 ] 

Jiayi Liao commented on FLINK-14424:


A quick question: is your plan to create a new tpc-ds module and put it into 
flink-end-to-end-tests?

>  Create tpc-ds end to end test to support all tpc-ds queries
> 
>
> Key: FLINK-14424
> URL: https://issues.apache.org/jira/browse/FLINK-14424
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API, Table SQL / Planner
>Reporter: Leonard Xu
>Assignee: Leonard Xu
>Priority: Major
> Fix For: 1.10.0
>
>






[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable

2019-10-19 Thread GitBox
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] 
[flink-table-planner] Fix TemporalTable row schema always inferred as nullable
URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621
 
 
   @twalthr OK, once I wrote the test I figured out my fix here won't solve the 
problem. After doing some more debugging, I've come to the following findings:
   
   Given the following test:
   
   ```scala
   val util = scalaStreamTestUtil()

   val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable",
     'f1, 'f2, 'f3)
   val temporal = util.tableEnv.sqlQuery(
     "SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2")
   val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2)

   util.tableEnv.registerFunction("f", temporalFunc)
   val queryTable = util.tableEnv.sqlQuery(
     "select f1, f2, b from MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)")

   util.verifyPlan(queryTable)
   ```
   If we first look at the table schema generated by the SQL query underlying 
the temporal table, we see:
   
   ```
   root
|-- f1: TIMESTAMP(3)
|-- f2: BIGINT
|-- f3s: MULTISET NOT NULL
   ```
   
   When `FlinkPlanner` validates the SQL query it reaches the part where it 
needs to validate the `TemporalTableFunction` I've defined called `f`. It calls 
[FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57)
 to search for the method and once it finds it, it needs to convert it to a 
standard `SqlFunction`. In order to do that, it needs to convert the 
`TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new 
`DataType` API. 
   
   The problem is that `TypeInformation[T]` doesn't carry nullability 
information for the field, so when the conversion to `Multiset[T]` happens, 
[it ends up calling the default constructor, which sets `nullable = true` by 
default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60),
 and this blows up at runtime because the `TableSchema` expected a NOT 
NULL field.
   
   I'm not entirely sure how we can get around this issue.
   
   **EDIT**:
   
   OK, it seems like the `TableFunctionDefinition` for the temporal table 
already carries the `DataType` information, which is visible via 
`((TemporalTableFunctionImpl) 
functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`,
 which we can use in order to avoid the lossy conversion from 
`TypeInformation[T]`. WDYT?
   
   **EDIT 2:**
   
   Turns out this is not enough, since 
`DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call 
`tableFunction.getResult`, which has the `TypeInformation[T]` and thus will 
result in a nullable type as well.
   
   **EDIT 3:**
   
   `SqlTypeFactoryImpl.createMultisetType` defaults to creating a non-nullable 
multiset:
   
   ```java
   public RelDataType createMultisetType(
       RelDataType type,
       long maxCardinality) {
     assert maxCardinality == -1;
     RelDataType newType = new MultisetSqlType(type, false);
     return canonize(newType);
   }
   ```
   
   This is where the two diverge and why the `TableSchema` has a non-nullable 
multiset type. It seems like this will happen for any complex data type?
   
   **EDIT 4**
   
   Perhaps the better way is to make `FlinkTypeFactory` produce a 
`MultisetSqlType` which is by default nullable? i.e.:
   
   ```scala
   override def createMultisetType(elementType: RelDataType,
       maxCardinality: Long): RelDataType = {
     // Just validate the type; make sure any failure happens in the validation phase.
     toLogicalType(elementType)
     super.createTypeWithNullability(
       super.createMultisetType(elementType, maxCardinality), true)
   }
   ```





[jira] [Commented] (FLINK-12399) FilterableTableSource does not use filters on job run

2019-10-19 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955236#comment-16955236
 ] 

Rong Rong commented on FLINK-12399:
---

merged to 1.9: c1019105c22455c554ab91b9fc2ef8512873bee8

> FilterableTableSource does not use filters on job run
> -
>
> Key: FLINK-12399
> URL: https://issues.apache.org/jira/browse/FLINK-12399
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.8.0
>Reporter: Josh Bradt
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
> Attachments: flink-filter-bug.tar.gz
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As discussed [on the mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html],
>  there appears to be a bug where a job that uses a custom 
> FilterableTableSource does not keep the filters that were pushed down into 
> the table source. More specifically, the table source does receive filters 
> via applyPredicates, and a new table source with those filters is returned, 
> but the final job graph appears to use the original table source, which does 
> not contain any filters.
> I attached a minimal example program to this ticket. The custom table source 
> is as follows: 
> {code:java}
> public class CustomTableSource implements BatchTableSource<Model>,
>         FilterableTableSource<Model> {
>
>     private static final Logger LOG =
>             LoggerFactory.getLogger(CustomTableSource.class);
>
>     private final Filter[] filters;
>     private final FilterConverter converter = new FilterConverter();
>
>     public CustomTableSource() {
>         this(null);
>     }
>
>     private CustomTableSource(Filter[] filters) {
>         this.filters = filters;
>     }
>
>     @Override
>     public DataSet<Model> getDataSet(ExecutionEnvironment execEnv) {
>         if (filters == null) {
>             LOG.info(" No filters defined ");
>         } else {
>             LOG.info(" Found filters ");
>             for (Filter filter : filters) {
>                 LOG.info("FILTER: {}", filter);
>             }
>         }
>         return execEnv.fromCollection(allModels());
>     }
>
>     @Override
>     public TableSource<Model> applyPredicate(List<Expression> predicates) {
>         LOG.info("Applying predicates");
>         List<Filter> acceptedFilters = new ArrayList<>();
>         for (final Expression predicate : predicates) {
>             converter.convert(predicate).ifPresent(acceptedFilters::add);
>         }
>         return new CustomTableSource(acceptedFilters.toArray(new Filter[0]));
>     }
>
>     @Override
>     public boolean isFilterPushedDown() {
>         return filters != null;
>     }
>
>     @Override
>     public TypeInformation<Model> getReturnType() {
>         return TypeInformation.of(Model.class);
>     }
>
>     @Override
>     public TableSchema getTableSchema() {
>         return TableSchema.fromTypeInfo(getReturnType());
>     }
>
>     private List<Model> allModels() {
>         List<Model> models = new ArrayList<>();
>         models.add(new Model(1, 2, 3, 4));
>         models.add(new Model(10, 11, 12, 13));
>         models.add(new Model(20, 21, 22, 23));
>         return models;
>     }
> }
> {code}
>  
> When run, it logs
> {noformat}
> 15:24:54,888 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,901 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,910 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,977 INFO  com.klaviyo.filterbug.CustomTableSource  -  No filters defined
> {noformat}
> which appears to indicate that although filters are getting pushed down, the 
> final job does not use them.
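
The pitfall described above is easy to reproduce outside Flink: 
`applyPredicate` returns a new source instance carrying the accepted filters, 
so the filters are silently lost if the planner keeps using the original 
instance. The following self-contained sketch (hypothetical classes, not 
Flink's API) demonstrates the pattern:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of the immutable push-down pattern: applyPredicate returns a NEW
 * source carrying the filters, leaving `this` unchanged. A planner that
 * keeps the original instance around will run the unfiltered scan.
 */
public final class PushDownSketch {

    static final class Source {
        final List<String> filters;

        Source() { this(new ArrayList<>()); }
        private Source(List<String> filters) { this.filters = filters; }

        /** Returns a copy carrying the accepted predicates. */
        Source applyPredicate(List<String> predicates) {
            List<String> accepted = new ArrayList<>(filters);
            accepted.addAll(predicates);
            return new Source(accepted);
        }

        boolean isFilterPushedDown() { return !filters.isEmpty(); }
    }

    public static void main(String[] args) {
        Source original = new Source();
        Source filtered = original.applyPredicate(Arrays.asList("a > 1"));

        // Push-down "happened", but the original instance still carries no
        // filters; only the returned copy does.
        System.out.println(original.isFilterPushedDown()); // false
        System.out.println(filtered.isFilterPushedDown()); // true
    }
}
{code}

The log output above matches this pattern: `applyPredicate` was invoked three 
times during planning, yet the instance that reached the job graph was the 
unfiltered original.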





[jira] [Updated] (FLINK-12399) FilterableTableSource does not use filters on job run

2019-10-19 Thread Rong Rong (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rong Rong updated FLINK-12399:
--
Fix Version/s: 1.9.2

> FilterableTableSource does not use filters on job run
> -
>
> Key: FLINK-12399
> URL: https://issues.apache.org/jira/browse/FLINK-12399
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.8.0
>Reporter: Josh Bradt
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.2
>
> Attachments: flink-filter-bug.tar.gz
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As discussed [on the mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Filter-push-down-not-working-for-a-custom-BatchTableSource-tp27654.html],
>  there appears to be a bug where a job that uses a custom 
> FilterableTableSource does not keep the filters that were pushed down into 
> the table source. More specifically, the table source does receive filters 
> via applyPredicates, and a new table source with those filters is returned, 
> but the final job graph appears to use the original table source, which does 
> not contain any filters.
> I attached a minimal example program to this ticket. The custom table source 
> is as follows: 
> {code:java}
> public class CustomTableSource implements BatchTableSource<Model>,
> FilterableTableSource<Model> {
>
>     private static final Logger LOG =
>             LoggerFactory.getLogger(CustomTableSource.class);
>
>     private final Filter[] filters;
>     private final FilterConverter converter = new FilterConverter();
>
>     public CustomTableSource() {
>         this(null);
>     }
>
>     private CustomTableSource(Filter[] filters) {
>         this.filters = filters;
>     }
>
>     @Override
>     public DataSet<Model> getDataSet(ExecutionEnvironment execEnv) {
>         if (filters == null) {
>             LOG.info(" No filters defined ");
>         } else {
>             LOG.info(" Found filters ");
>             for (Filter filter : filters) {
>                 LOG.info("FILTER: {}", filter);
>             }
>         }
>         return execEnv.fromCollection(allModels());
>     }
>
>     @Override
>     public TableSource<Model> applyPredicate(List<Expression> predicates) {
>         LOG.info("Applying predicates");
>         List<Filter> acceptedFilters = new ArrayList<>();
>         for (final Expression predicate : predicates) {
>             converter.convert(predicate).ifPresent(acceptedFilters::add);
>         }
>         return new CustomTableSource(acceptedFilters.toArray(new Filter[0]));
>     }
>
>     @Override
>     public boolean isFilterPushedDown() {
>         return filters != null;
>     }
>
>     @Override
>     public TypeInformation<Model> getReturnType() {
>         return TypeInformation.of(Model.class);
>     }
>
>     @Override
>     public TableSchema getTableSchema() {
>         return TableSchema.fromTypeInfo(getReturnType());
>     }
>
>     private List<Model> allModels() {
>         List<Model> models = new ArrayList<>();
>         models.add(new Model(1, 2, 3, 4));
>         models.add(new Model(10, 11, 12, 13));
>         models.add(new Model(20, 21, 22, 23));
>         return models;
>     }
> }
> {code}
>  
> When run, it logs
> {noformat}
> 15:24:54,888 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,901 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,910 INFO  com.klaviyo.filterbug.CustomTableSource  - Applying predicates
> 15:24:54,977 INFO  com.klaviyo.filterbug.CustomTableSource  -  No filters defined {noformat}
> which appears to indicate that although filters are getting pushed down, the 
> final job does not use them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable

2019-10-19 Thread GitBox
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] 
[flink-table-planner] Fix TemporalTable row schema always inferred as nullable
URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621
 
 
   @twalthr OK, once I wrote the test I figured out my fix here won't solve the 
problem. After doing some more debugging, I've come to the following findings:
   
   Given the following test:
   
   ```scala
   val util = scalaStreamTestUtil()
   
   val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 
'f1, 'f2, 'f3)
   val temporal =
   util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable 
GROUP BY f1, f2")
   val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2)
   
   util.tableEnv.registerFunction("f", temporalFunc)
   val queryTable =
 util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL 
TABLE(f(f1)) AS T(a, b, cs)")
   
   util.verifyPlan(queryTable)
   ```
   If we first look at the generated table schema for the underlying table by 
the SQL query for the temporal table, we see:
   
   ```
   root
|-- f1: TIMESTAMP(3)
|-- f2: BIGINT
|-- f3s: MULTISET NOT NULL
   ```
   
   When `FlinkPlanner` validates the SQL query it reaches the part where it 
needs to validate the `TemporalTableFunction` I've defined called `f`. It calls 
[FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57)
 to search for the method and once it finds it, it needs to convert it to a 
standard `SqlFunction`. In order to do that, it needs to convert the 
`TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new 
`DataType` API. 
   
   Problem is, `TypeInformation[T]` doesn't carry information about the 
nullability of the field, thus when the conversion to `Multiset[T]` happens, 
[it ends up calling the default constructor which sets `nullable = true` by 
default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60),
 which ends up blowing up at runtime because the `TableSchema` expected a NOT 
NULL field.
   
   I'm not entirely sure how we can get around this issue.
   
   **EDIT**:
   
   OK, it seems like the `TableFunctionDefinition` for the temporal table 
already carries the `DataType` information which is visible via: 
`((TemporalTableFunctionImpl) 
functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`,
 which we can use in order to avoid the lossy conversion from 
`TypeInformation[T]`. WDYT?
   
   **EDIT 2:**
   
   Turns out this is not enough, since 
`DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call 
`tableFunction.getResult`, which has the `TypeInformation[T]` and thus will 
result in a nullable type as well.
   
   **EDIT 3:**
   
   `SqlTypeFactoryImpl.createMultisetType` defaults to creating a non null 
multiset:
   
   ```java
   public RelDataType createMultisetType(
 RelDataType type,
 long maxCardinality) {
   assert maxCardinality == -1;
   RelDataType newType = new MultisetSqlType(type, false);
   return canonize(newType);
 }
   ```
   
   This is where the types converge and why the `TableSchema` has a non null 
multiset type, and it seems like this will happen for any complex data type?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable

2019-10-19 Thread GitBox
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] 
[flink-table-planner] Fix TemporalTable row schema always inferred as nullable
URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621
 
 
   @twalthr OK, once I wrote the test I figured out my fix here won't solve the 
problem. After doing some more debugging, I've come to the following findings:
   
   Given the following test:
   
   ```scala
   val util = scalaStreamTestUtil()
   
   val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 
'f1, 'f2, 'f3)
   val temporal =
   util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable 
GROUP BY f1, f2")
   val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2)
   
   util.tableEnv.registerFunction("f", temporalFunc)
   val queryTable =
 util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL 
TABLE(f(f1)) AS T(a, b, cs)")
   
   util.verifyPlan(queryTable)
   ```
   If we first look at the generated table schema for the underlying table by 
the SQL query for the temporal table, we see:
   
   ```
   root
|-- f1: TIMESTAMP(3)
|-- f2: BIGINT
|-- f3s: MULTISET NOT NULL
   ```
   
   When `FlinkPlanner` validates the SQL query it reaches the part where it 
needs to validate the `TemporalTableFunction` I've defined called `f`. It calls 
[FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57)
 to search for the method and once it finds it, it needs to convert it to a 
standard `SqlFunction`. In order to do that, it needs to convert the 
`TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new 
`DataType` API. 
   
   Problem is, `TypeInformation[T]` doesn't carry information about the 
nullability of the field, thus when the conversion to `Multiset[T]` happens, 
[it ends up calling the default constructor which sets `nullable = true` by 
default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60),
 which ends up blowing up at runtime because the `TableSchema` expected a NOT 
NULL field.
   
   I'm not entirely sure how we can get around this issue.
   
   **EDIT**:
   
   OK, it seems like the `TableFunctionDefinition` for the temporal table 
already carries the `DataType` information which is visible via: 
`((TemporalTableFunctionImpl) 
functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`,
 which we can use in order to avoid the lossy conversion from 
`TypeInformation[T]`. WDYT?
   
   **EDIT 2:**
   
   Turns out this is not enough, since 
`DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call 
`tableFunction.getResult`, which has the `TypeInformation[T]` and thus will 
result in a nullable type as well.
   
   **EDIT 3:**
   
   `SqlTypeFactoryImpl.createMultisetType` defaults to creating a non null 
multiset:
   
   ```java
   public RelDataType createMultisetType(
 RelDataType type,
 long maxCardinality) {
   assert maxCardinality == -1;
   RelDataType newType = new MultisetSqlType(type, false);
   return canonize(newType);
 }
   ```
   
   This is where the types converge, and it seems like this will happen for any 
complex data type?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable

2019-10-19 Thread GitBox
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] 
[flink-table-planner] Fix TemporalTable row schema always inferred as nullable
URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621
 
 
   @twalthr OK, once I wrote the test I figured out my fix here won't solve the 
problem. After doing some more debugging, I've come to the following findings:
   
   Given the following test:
   
   ```scala
   val util = scalaStreamTestUtil()
   
   val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 
'f1, 'f2, 'f3)
   val temporal =
   util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable 
GROUP BY f1, f2")
   val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2)
   
   util.tableEnv.registerFunction("f", temporalFunc)
   val queryTable =
 util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL 
TABLE(f(f1)) AS T(a, b, cs)")
   
   util.verifyPlan(queryTable)
   ```
   If we first look at the generated table schema for the underlying table by 
the SQL query for the temporal table, we see:
   
   ```
   root
|-- f1: TIMESTAMP(3)
|-- f2: BIGINT
|-- f3s: MULTISET NOT NULL
   ```
   
   When `FlinkPlanner` validates the SQL query it reaches the part where it 
needs to validate the `TemporalTableFunction` I've defined called `f`. It calls 
[FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57)
 to search for the method and once it finds it, it needs to convert it to a 
standard `SqlFunction`. In order to do that, it needs to convert the 
`TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new 
`DataType` API. 
   
   Problem is, `TypeInformation[T]` doesn't carry information about the 
nullability of the field, thus when the conversion to `Multiset[T]` happens, 
[it ends up calling the default constructor which sets `nullable = true` by 
default](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60),
 which ends up blowing up at runtime because the `TableSchema` expected a NOT 
NULL field.
   
   I'm not entirely sure how we can get around this issue.
   
   **EDIT**:
   
   OK, it seems like the `TableFunctionDefinition` for the temporal table 
already carries the `DataType` information which is visible via: 
`((TemporalTableFunctionImpl) 
functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`,
 which we can use in order to avoid the lossy conversion from 
`TypeInformation[T]`. WDYT?
   
   **EDIT 2:**
   
   Turns out this is not enough, since 
`DeferredTypeFlinkTableFunction.getExternalResultType()` will attempt to call 
`tableFunction.getResult`, which has the `TypeInformation[T]` and thus will 
result in a nullable type as well.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-13850) Refactor part file configuration into a single method

2019-10-19 Thread lichong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955200#comment-16955200
 ] 

lichong commented on FLINK-13850:
-

This is exactly what I need at the moment. I want to name the target file and 
the in-progress file myself instead of relying on the defaults, e.g. a target 
name without any subtask info, removing the dot prefix of the in-progress 
file, etc.

OutputFileConfig makes more sense to me, since it signals that this is the 
configuration for the output file; there should also be a way for users to 
provide their own configuration or simply fall back to the default values.

Just my opinion, thanks.

I am looking forward to this feature.
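The refactoring under discussion can be sketched roughly as follows. This is illustrative Python, not the actual Flink builder API; the class and method names (`OutputFileConfig`, `with_output_file_config`, etc.) are assumptions based on the proposal in the ticket:

```python
# Sketch of folding withPartFilePrefix/withPartFileSuffix into one
# extensible config object, so future file-naming options can be added
# without growing the builder's API surface.

class OutputFileConfig:
    def __init__(self, part_prefix="part", part_suffix=""):
        self.part_prefix = part_prefix
        self.part_suffix = part_suffix

    def with_part_prefix(self, prefix):
        return OutputFileConfig(prefix, self.part_suffix)

    def with_part_suffix(self, suffix):
        return OutputFileConfig(self.part_prefix, suffix)

class FormatBuilder:
    def __init__(self):
        self.output_file_config = OutputFileConfig()

    # Single entry point instead of two prefix/suffix methods.
    def with_output_file_config(self, config):
        self.output_file_config = config
        return self

builder = FormatBuilder().with_output_file_config(
    OutputFileConfig().with_part_prefix("data").with_part_suffix(".csv"))
assert builder.output_file_config.part_prefix == "data"
```

New settings (e.g. separate directories for pending/in-progress files) would then extend `OutputFileConfig` alone, leaving the builder untouched.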

> Refactor part file configuration into a single method
> -
>
> Key: FLINK-13850
> URL: https://issues.apache.org/jira/browse/FLINK-13850
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / FileSystem
>Reporter: Gyula Fora
>Assignee: João Boto
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently there are only two methods on both format builders,
> withPartFilePrefix and withPartFileSuffix, for configuring the part files,
> but this surface is likely to grow in the future.
>  * More settings, different directories for pending / in-progress files, etc.
> I suggest we remove these two methods and replace them with a single
> withPartFileConfig(..) that takes an extensible config class.
> This should be fixed before 1.10 so that the other methods are not released.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #9576: [FLINK-13915][ml] Add several base classes of summarizer.

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9576: [FLINK-13915][ml] Add several base 
classes of summarizer.
URL: https://github.com/apache/flink/pull/9576#issuecomment-526645449
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 010706cbbd12d506f23a52e8b71474ab2b06c2b6 (Sat Oct 19 
15:10:48 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-13915).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add several base classes of summarizer.

2019-10-19 Thread GitBox
walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add 
several base classes of summarizer.
URL: https://github.com/apache/flink/pull/9576#discussion_r336740930
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/statistics/basicstatistic/BaseSummarizer.java
 ##
 @@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flink.ml.common.statistics.basicstatistic;
+
+import org.apache.flink.ml.common.linalg.DenseMatrix;
+
+import java.io.Serializable;
+
+/**
+ * Summarizer is the base class to calculate summary and store intermediate 
results, and Summary is the result of Summarizer.
 
 Review comment:
   I actually have a higher-level question regarding the type of summary we 
are going to support and what the roadmap looks like going forward.
   
   In general, a "summary" to me should be something that can be serialized 
into a format that an external system can read and use to produce meaningful 
insights, such as visualization (e.g. 
[tf.summary](https://www.tensorflow.org/api_docs/python/tf/summary)), or 
something that can be transformed into an object used to export data 
([R.summary](https://www.rdocumentation.org/packages/base/versions/3.6.1/topics/summary)).
   
   In this case, the "summary" class in this PR is more like a wrapper around 
`sum()`, `count()`, `correlate()`, and `covariance()`. I am not sure what the 
value add to the ML library is. Maybe I am misunderstanding the intention 
entirely.
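   For reference, the pattern a base summarizer typically implements can be sketched as follows. This is an illustrative model in Python, not the code from this PR; the `visit`/`merge`/`summary` names are assumptions about the usual accumulate-then-merge design:

   ```python
   # Sketch of the summarizer pattern: accumulate per-record statistics,
   # merge partial results (e.g. across parallel subtasks), then expose
   # the final summary as a plain result object.

   class Summarizer:
       def __init__(self):
           self.count = 0
           self.total = 0.0

       def visit(self, value):
           self.count += 1
           self.total += value

       def merge(self, other):
           merged = Summarizer()
           merged.count = self.count + other.count
           merged.total = self.total + other.total
           return merged

       def summary(self):
           # The "Summary" is the result of the Summarizer.
           return {"count": self.count, "sum": self.total,
                   "mean": self.total / self.count if self.count else None}

   a, b = Summarizer(), Summarizer()
   for v in (1.0, 2.0):
       a.visit(v)
   for v in (3.0, 4.0):
       b.visit(v)
   assert a.merge(b).summary() == {"count": 4, "sum": 10.0, "mean": 2.5}
   ```

   Whether such a result object should additionally be serializable for external tools (the tf.summary comparison above) is exactly the open question in this thread.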


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9576: [FLINK-13915][ml] Add several base classes of summarizer.

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9576: [FLINK-13915][ml] Add several base 
classes of summarizer.
URL: https://github.com/apache/flink/pull/9576#issuecomment-526645449
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 010706cbbd12d506f23a52e8b71474ab2b06c2b6 (Sat Oct 19 
15:05:44 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-13915).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add several base classes of summarizer.

2019-10-19 Thread GitBox
walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add 
several base classes of summarizer.
URL: https://github.com/apache/flink/pull/9576#discussion_r33674
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/statistics/basicstatistic/BaseSummarizer.java
 ##
 @@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flink.ml.common.statistics.basicstatistic;
+
+import org.apache.flink.ml.common.linalg.DenseMatrix;
+
+import java.io.Serializable;
+
+/**
+ * Summarizer is the base class to calculate summary and store intermediate 
results, and Summary is the result of Summarizer.
 
 Review comment:
   To add to this discussion: some of the summaries used by TensorFlow, for 
example, are meant to be consumed by TensorBoard, which understands the format 
of the summary in order to plot visualizations via 
[protobuf](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/summary.proto);
 the same goes for 
[tf.Event](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/util/event.proto).
 
   
   If this is the case, is there an existing discussion regarding what public 
APIs a `Summary` class should have in Flink-ML? I haven't followed up closely 
with the ML effort, but I can't find any in the 
[FLIP-39 
documentation](https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add several base classes of summarizer.

2019-10-19 Thread GitBox
walterddr commented on a change in pull request #9576: [FLINK-13915][ml] Add 
several base classes of summarizer.
URL: https://github.com/apache/flink/pull/9576#discussion_r33674
 
 

 ##
 File path: 
flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/statistics/basicstatistic/BaseSummarizer.java
 ##
 @@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.flink.ml.common.statistics.basicstatistic;
+
+import org.apache.flink.ml.common.linalg.DenseMatrix;
+
+import java.io.Serializable;
+
+/**
+ * Summarizer is the base class to calculate summary and store intermediate 
results, and Summary is the result of Summarizer.
 
 Review comment:
   To add to this discussion: some of the summaries used by TensorFlow, for 
example, are meant to be consumed by TensorBoard, which understands the format 
of the summary in order to plot visualizations via 
[protobuf](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/summary.proto);
 the same goes for 
[tf.Event](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/util/event.proto).
 
   
   If this is the case, is there an existing discussion regarding what public 
APIs a `Summary` class should have in Flink-ML? I haven't followed up closely 
with the ML effort, but I can't find any in the 
[FLIP-39 
documentation](https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable

2019-10-19 Thread GitBox
YuvalItzchakov edited a comment on issue #9780: [FLINK-14042] 
[flink-table-planner] Fix TemporalTable row schema always inferred as nullable
URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621
 
 
   @twalthr OK, once I wrote the test I figured out my fix here won't solve the 
problem. After doing some more debugging, I've come to the following findings:
   
   Given the following test:
   
   ```scala
   val util = scalaStreamTestUtil()
   
   val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 
'f1, 'f2, 'f3)
   val temporal =
   util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable 
GROUP BY f1, f2")
   val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2)
   
   util.tableEnv.registerFunction("f", temporalFunc)
   val queryTable =
 util.tableEnv.sqlQuery("select f1, f2, b from MyTable, LATERAL 
TABLE(f(f1)) AS T(a, b, cs)")
   
   util.verifyPlan(queryTable)
   ```
   If we first look at the generated table schema for the underlying table by 
the SQL query for the temporal table, we see:
   
   ```
   root
|-- f1: TIMESTAMP(3)
|-- f2: BIGINT
|-- f3s: MULTISET NOT NULL
   ```
   
   When `FlinkPlanner` validates the SQL query it reaches the part where it 
needs to validate the `TemporalTableFunction` I've defined called `f`. It calls 
[FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57)
 to search for the method and once it finds it, it needs to convert it to a 
standard `SqlFunction`. In order to do that, it needs to convert the 
`TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new 
`DataType` API. 
   
   The problem is that `TypeInformation[T]` doesn't carry nullability information for the field, so when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor, which sets `nullable = true`](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60). This blows up at runtime because the `TableSchema` expected a NOT NULL field.
   
   I'm not entirely sure how we can get around this issue.
   
   **EDIT**:
   
   OK, it seems the `TableFunctionDefinition` for the temporal table already carries the `DataType` information, accessible via `((TemporalTableFunctionImpl) functionDefinition.tableFunction).getUnderlyingHistoryTable().getTableSchema().getFieldDataTypes()`, which we could use to avoid the lossy conversion from `TypeInformation[T]`. WDYT?
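To make the failure mode concrete, here is a minimal, self-contained model of the pitfall described above. These are stand-in classes written for illustration, not Flink's actual `MultisetType`: when the conversion source carries no nullability at all, a nullable-by-default convenience constructor silently disagrees with a schema that declared the field NOT NULL.

```java
// Stand-in model (not Flink code) of the pattern in MultisetType.java:
// the convenience constructor defaults to nullable = true, mirroring what
// happens when the conversion source (TypeInformation) has no nullability.
public class NullabilityDefaultDemo {
    static final class SimpleMultisetType {
        private final boolean nullable;

        SimpleMultisetType(boolean nullable) {
            this.nullable = nullable;
        }

        // Convenience constructor: no nullability given, assume nullable.
        SimpleMultisetType() {
            this(true);
        }

        boolean isNullable() {
            return nullable;
        }
    }

    public static void main(String[] args) {
        // Schema side: the aggregate result was declared NOT NULL.
        SimpleMultisetType fromSchema = new SimpleMultisetType(false);
        // Conversion side: nullability was lost, so the default kicks in.
        SimpleMultisetType fromTypeInfo = new SimpleMultisetType();
        System.out.println(fromSchema.isNullable());   // false
        System.out.println(fromTypeInfo.isNullable()); // true -> mismatch
    }
}
```

The mismatch between the two `isNullable()` results is exactly the schema/type disagreement that surfaces at runtime in the comment above.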


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9780: [FLINK-14042] [flink-table-planner] 
Fix TemporalTable row schema always inferred as nullable
URL: https://github.com/apache/flink/pull/9780#issuecomment-535560331
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 8c0e3b4b94421c9f5ac4cfa470d0859e88471f38 (Sat Oct 19 
14:29:10 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-14042).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[GitHub] [flink] flinkbot edited a comment on issue #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9832: [FLINK-11843] Bind lifespan of 
Dispatcher to leader session
URL: https://github.com/apache/flink/pull/9832#issuecomment-537039262
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 271703eda6f6c55b1641a54206109ef659f62854 (Sat Oct 19 
14:29:14 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[GitHub] [flink] YuvalItzchakov commented on issue #9780: [FLINK-14042] [flink-table-planner] Fix TemporalTable row schema always inferred as nullable

2019-10-19 Thread GitBox
YuvalItzchakov commented on issue #9780: [FLINK-14042] [flink-table-planner] 
Fix TemporalTable row schema always inferred as nullable
URL: https://github.com/apache/flink/pull/9780#issuecomment-544151621
 
 
   @twalthr OK, once I wrote the test I realized my fix here won't solve the problem. After some more debugging, I've come to the following findings:
   
   Given the following test:
   
   ```scala
   val util = scalaStreamTestUtil()

   val sourceTable = util.addTableSource[(Timestamp, Long, String)]("MyTable", 'f1, 'f2, 'f3)
   val temporal =
     util.tableEnv.sqlQuery("SELECT f1, f2, COLLECT(DISTINCT f3) f3s FROM MyTable GROUP BY f1, f2")
   val temporalFunc = temporal.createTemporalTableFunction('f1, 'f2)

   util.tableEnv.registerFunction("f", temporalFunc)
   val queryTable =
     util.tableEnv.sqlQuery("SELECT f1, f2, b FROM MyTable, LATERAL TABLE(f(f1)) AS T(a, b, cs)")

   util.verifyPlan(queryTable)
   ```
   If we first look at the table schema that the SQL query for the temporal table generates for the underlying table, we see:
   
   ```
   root
|-- f1: TIMESTAMP(3)
|-- f2: BIGINT
|-- f3s: MULTISET NOT NULL
   ```
   
   When `FlinkPlanner` validates the SQL query, it reaches the point where it needs to validate the `TemporalTableFunction` I've defined, called `f`. It calls [FunctionCatalogOperatorTable.lookupOperatorOverloads](https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/catalog/FunctionCatalogOperatorTable.java#L57) to look up the function, and once it finds it, it converts it to a standard `SqlFunction`. To do that, it has to convert the `TypeInformation[T]` of the `TemporalTableFunction`'s result type to the new `DataType` API.
   
   The problem is that `TypeInformation[T]` doesn't carry nullability information for the field, so when the conversion to `Multiset[T]` happens, [it ends up calling the default constructor, which sets `nullable = true`](https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/MultisetType.java#L60). This blows up at runtime because the `TableSchema` expected a NOT NULL field.
   
   I'm not entirely sure how we can get around this issue.




[GitHub] [flink] tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session

2019-10-19 Thread GitBox
tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind 
lifespan of Dispatcher to leader session
URL: https://github.com/apache/flink/pull/9832#discussion_r336739367
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/runner/StoppedDispatcherLeaderProcess.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.dispatcher.runner;
+
+import org.apache.flink.runtime.clusterframework.ApplicationStatus;
+import org.apache.flink.runtime.dispatcher.DispatcherGateway;
+
+import java.util.UUID;
+import java.util.concurrent.CompletableFuture;
+
+/**
+ * {@link DispatcherLeaderProcess} implementation which is stopped. This class
+ * is useful as the initial state of the {@link DefaultDispatcherRunner}.
+ */
+public enum StoppedDispatcherLeaderProcess implements DispatcherLeaderProcess {
+   INSTANCE;
+
+   private static final CompletableFuture<Void> TERMINATION_FUTURE = CompletableFuture.completedFuture(null);
+   private static final UUID LEADER_SESSION_ID = new UUID(0L, 0L);
+   private static final CompletableFuture<UUID> NEVER_COMPLETED_LEADER_SESSION_FUTURE = new CompletableFuture<>();
+   private static final CompletableFuture<ApplicationStatus> NEVER_COMPLETED_SHUTDOWN_FUTURE = new CompletableFuture<>();
+
+   @Override
+   public void start() {
+
+   }
+
+   @Override
+   public UUID getLeaderSessionId() {
+   return LEADER_SESSION_ID;
+   }
+
+   @Override
+   public CompletableFuture<DispatcherGateway> getDispatcherGateway() {
+   return null;
+   }
+
+   @Override
+   public CompletableFuture<UUID> getConfirmLeaderSessionFuture() {
+   return NEVER_COMPLETED_LEADER_SESSION_FUTURE;
 
 Review comment:
   I think you are right. I will update the implementation.
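
The `StoppedDispatcherLeaderProcess` quoted above uses the enum-singleton idiom: an enum with a single `INSTANCE` constant implementing an interface, which gives a thread-safe, serialization-safe singleton. A minimal sketch of the idiom, with illustrative names rather than Flink's actual interfaces:

```java
// Sketch of the enum-singleton idiom: the single enum constant is the
// canonical instance, created once by the JVM with no locking needed.
interface Process {
    void start();
}

enum StoppedProcess implements Process {
    INSTANCE;

    @Override
    public void start() {
        // intentionally a no-op: this singleton models the "stopped" state
    }
}

public class EnumSingletonDemo {
    public static void main(String[] args) {
        Process p = StoppedProcess.INSTANCE;
        p.start(); // no-op
        // Any two references resolve to the same instance.
        System.out.println(StoppedProcess.INSTANCE == p); // true
    }
}
```

Compared to a hand-rolled static singleton, the enum variant cannot be duplicated via reflection or deserialization, which makes it a good fit for stateless placeholder objects like an initial "stopped" state.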




[GitHub] [flink] flinkbot edited a comment on issue #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9832: [FLINK-11843] Bind lifespan of 
Dispatcher to leader session
URL: https://github.com/apache/flink/pull/9832#issuecomment-537039262
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 271703eda6f6c55b1641a54206109ef659f62854 (Sat Oct 19 
14:25:08 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[GitHub] [flink] tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session

2019-10-19 Thread GitBox
tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind 
lifespan of Dispatcher to leader session
URL: https://github.com/apache/flink/pull/9832#discussion_r336739208
 
 

 ##
 File path: 
flink-runtime/src/test/java/org/apache/flink/runtime/dispatcher/runner/DispatcherLeaderProcessImplTest.java
 ##
 @@ -239,6 +246,109 @@ public void closeAsync_duringJobRecovery_preventsDispatcherServiceCreation() throws Exception {
}
}
 
+   @Test
+   public void onRemovedJobGraph_cancelsRunningJob() throws Exception {
 
 Review comment:
   In this test class I'm trying to follow a new naming scheme which I believe 
is more helpful and is described here: 
https://osherove.com/blog/2005/4/3/naming-standards-for-unit-tests.html. The 
idea is to have more descriptive names for tests.
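
The linked post's scheme is commonly summarized as `unitOfWork_stateUnderTest_expectedBehavior`, which is the shape of names like `onRemovedJobGraph_cancelsRunningJob` above. An illustrative sketch (not actual Flink test code; the withdrawal domain is invented purely to demonstrate the naming):

```java
// Illustrative example of the unitOfWork_stateUnderTest_expectedBehavior
// test-naming scheme: the method name alone states what is exercised,
// under which condition, and what outcome is expected.
public class NamingSchemeDemo {
    // unit of work: withdraw; state: insufficient funds; expectation: rejected
    static boolean withdraw_insufficientFunds_isRejected() {
        int balance = 10;
        int request = 25;
        boolean accepted = request <= balance;
        return !accepted; // the withdrawal should be rejected
    }

    public static void main(String[] args) {
        System.out.println(withdraw_insufficientFunds_isRejected()); // true
    }
}
```

The payoff is that a failing test's name reads as a specification ("withdraw, with insufficient funds, is rejected") without opening the test body.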




[GitHub] [flink] tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind lifespan of Dispatcher to leader session

2019-10-19 Thread GitBox
tillrohrmann commented on a change in pull request #9832: [FLINK-11843] Bind 
lifespan of Dispatcher to leader session
URL: https://github.com/apache/flink/pull/9832#discussion_r336739241
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/runner/DefaultDispatcherRunner.java
 ##
 @@ -0,0 +1,235 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.dispatcher.runner;
+
+import org.apache.flink.runtime.clusterframework.ApplicationStatus;
+import org.apache.flink.runtime.concurrent.FutureUtils;
+import org.apache.flink.runtime.dispatcher.DispatcherGateway;
+import org.apache.flink.runtime.leaderelection.LeaderContender;
+import org.apache.flink.runtime.leaderelection.LeaderElectionService;
+import org.apache.flink.runtime.rpc.FatalErrorHandler;
+import org.apache.flink.util.FlinkException;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Arrays;
+import java.util.UUID;
+import java.util.concurrent.CompletableFuture;
+
+/**
+ * Runner for the {@link org.apache.flink.runtime.dispatcher.Dispatcher} which 
is responsible for the
+ * leader election.
+ */
+public class DefaultDispatcherRunner implements DispatcherRunner, 
LeaderContender {
+
+   private static final Logger LOG = 
LoggerFactory.getLogger(DefaultDispatcherRunner.class);
+
+   private final Object lock = new Object();
+
+   private final LeaderElectionService leaderElectionService;
+
+   private final FatalErrorHandler fatalErrorHandler;
+
+   private final DispatcherLeaderProcessFactory 
dispatcherLeaderProcessFactory;
+
+   private final CompletableFuture<Void> terminationFuture;
+
+   private final CompletableFuture<ApplicationStatus> shutDownFuture;
+
+   private boolean isRunning;
+
+   private DispatcherLeaderProcess dispatcherLeaderProcess;
+
+   private CompletableFuture<Void> previousDispatcherLeaderProcessTerminationFuture;
+
+   private CompletableFuture<DispatcherGateway> dispatcherGatewayFuture;
+
+   DefaultDispatcherRunner(
+   LeaderElectionService leaderElectionService,
+   FatalErrorHandler fatalErrorHandler,
+   DispatcherLeaderProcessFactory 
dispatcherLeaderProcessFactory) throws Exception {
+   this.leaderElectionService = leaderElectionService;
+   this.fatalErrorHandler = fatalErrorHandler;
+   this.dispatcherLeaderProcessFactory = 
dispatcherLeaderProcessFactory;
+   this.terminationFuture = new CompletableFuture<>();
+   this.shutDownFuture = new CompletableFuture<>();
+
+   this.isRunning = true;
+   this.dispatcherLeaderProcess = 
StoppedDispatcherLeaderProcess.INSTANCE;
+   this.previousDispatcherLeaderProcessTerminationFuture = 
CompletableFuture.completedFuture(null);
+   this.dispatcherGatewayFuture = new CompletableFuture<>();
+
+   startDispatcherRunner(leaderElectionService);
+   }
+
+   private void startDispatcherRunner(LeaderElectionService 
leaderElectionService) throws Exception {
+   LOG.info("Starting {}.", getClass().getName());
+
+   leaderElectionService.start(this);
+   }
+
+   @Override
+   public CompletableFuture<DispatcherGateway> getDispatcherGateway() {
+   synchronized (lock) {
+   return dispatcherGatewayFuture;
+   }
+   }
+
+   @Override
+   public CompletableFuture<ApplicationStatus> getShutDownFuture() {
+   return shutDownFuture;
+   }
+
+   @Override
+   public CompletableFuture<Void> closeAsync() {
+   synchronized (lock) {
+   if (!isRunning) {
+   return terminationFuture;
+   } else {
+   isRunning = false;
+   }
+   }
+
+   stopDispatcherLeaderProcess();
+   final CompletableFuture<Void> servicesTerminationFuture = stopServices();
+
+   FutureUtils.forward(
+   FutureUtils.completeAll(
+   Arrays.asList(
+   

[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add 
documentation for Python User-Defined Scalar functions.
URL: https://github.com/apache/flink/pull/9886#issuecomment-541326107
 
 
   
   ## CI report:
   
   * e431c1c28a35dd86e60a0892ff0e0d15b8da7245 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/131646097)
   * 0b45fe65b2a35154095af4944ea8c33e36165d0a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/131895829)
   * 586ea9b19299a7ea6bb9efd679522a872c15d0a8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/132129350)
   * 9ee50d8d31d809e5c5eb578b3d1d428658006554 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/132643699)
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   




[jira] [Closed] (FLINK-14459) Python module build hangs

2019-10-19 Thread Hequn Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hequn Cheng closed FLINK-14459.
---
Resolution: Fixed

> Python module build hangs
> -
>
> Key: FLINK-14459
> URL: https://issues.apache.org/jira/browse/FLINK-14459
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.9.0, 1.9.1
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The build of python module hangs when installing conda. See travis log: 
> https://api.travis-ci.org/v3/job/599704570/log.txt
> Can't reproduce it either on my local Mac or on my repo with Travis. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14459) Python module build hangs

2019-10-19 Thread Hequn Cheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955156#comment-16955156
 ] 

Hequn Cheng commented on FLINK-14459:
-

Fix in
1.10.0: 2b1187d299bc6fd8dcae0d4e565238d7800dd4bb
1.9.2: 4474748c31a0d71e23147409ad338d7f5b37d5e4

> Python module build hangs
> -
>
> Key: FLINK-14459
> URL: https://issues.apache.org/jira/browse/FLINK-14459
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.9.0, 1.10.0, 1.9.1
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The build of python module hangs when installing conda. See travis log: 
> https://api.travis-ci.org/v3/job/599704570/log.txt
> Can't reproduce it either on my local Mac or on my repo with Travis. 





[jira] [Updated] (FLINK-14459) Python module build hangs

2019-10-19 Thread Hequn Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hequn Cheng updated FLINK-14459:

Affects Version/s: (was: 1.10.0)

> Python module build hangs
> -
>
> Key: FLINK-14459
> URL: https://issues.apache.org/jira/browse/FLINK-14459
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.9.0, 1.9.1
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The build of python module hangs when installing conda. See travis log: 
> https://api.travis-ci.org/v3/job/599704570/log.txt
> Can't reproduce it either on my local Mac or on my repo with Travis. 





[jira] [Updated] (FLINK-14459) Python module build hangs

2019-10-19 Thread Hequn Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hequn Cheng updated FLINK-14459:

Fix Version/s: 1.9.2
   1.10.0

> Python module build hangs
> -
>
> Key: FLINK-14459
> URL: https://issues.apache.org/jira/browse/FLINK-14459
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.9.0, 1.10.0, 1.9.1
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0, 1.9.2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The build of python module hangs when installing conda. See travis log: 
> https://api.travis-ci.org/v3/job/599704570/log.txt
> Can't reproduce it either on my local Mac or on my repo with Travis. 





[GitHub] [flink] flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python 
module build hang problem
URL: https://github.com/apache/flink/pull/9941#issuecomment-544126860
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit d14be543d3faf99a424a5480fda53ca7b7a65e49 (Sat Oct 19 
12:21:56 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

 Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[GitHub] [flink] hequn8128 closed pull request #9941: [FLINK-14459][python] Fix python module build hang problem

2019-10-19 Thread GitBox
hequn8128 closed pull request #9941: [FLINK-14459][python] Fix python module 
build hang problem
URL: https://github.com/apache/flink/pull/9941
 
 
   




[GitHub] [flink] flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python 
module build hang problem
URL: https://github.com/apache/flink/pull/9941#issuecomment-544131502
 
 
   
   ## CI report:
   
   * d14be543d3faf99a424a5480fda53ca7b7a65e49 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/132643044)
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   




[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add 
documentation for Python User-Defined Scalar functions.
URL: https://github.com/apache/flink/pull/9886#issuecomment-541326107
 
 
   
   ## CI report:
   
   * e431c1c28a35dd86e60a0892ff0e0d15b8da7245 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/131646097)
   * 0b45fe65b2a35154095af4944ea8c33e36165d0a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/131895829)
   * 586ea9b19299a7ea6bb9efd679522a872c15d0a8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/132129350)
   * 9ee50d8d31d809e5c5eb578b3d1d428658006554 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/132643699)
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   




[GitHub] [flink] flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python module build hang problem

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9941: [FLINK-14459][python] Fix python 
module build hang problem
URL: https://github.com/apache/flink/pull/9941#issuecomment-544131502
 
 
   
   ## CI report:
   
   * d14be543d3faf99a424a5480fda53ca7b7a65e49 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/132643044)
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   




[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add 
documentation for Python User-Defined Scalar functions.
URL: https://github.com/apache/flink/pull/9886#issuecomment-541326107
 
 
   
   ## CI report:
   
   * e431c1c28a35dd86e60a0892ff0e0d15b8da7245 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/131646097)
   * 0b45fe65b2a35154095af4944ea8c33e36165d0a : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/131895829)
   * 586ea9b19299a7ea6bb9efd679522a872c15d0a8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/132129350)
   * 9ee50d8d31d809e5c5eb578b3d1d428658006554 : UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   




[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add 
documentation for Python User-Defined Scalar functions.
URL: https://github.com/apache/flink/pull/9886#issuecomment-541324020
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 9ee50d8d31d809e5c5eb578b3d1d428658006554 (Sat Oct 19 
11:35:06 UTC 2019)
   
   **Warnings:**
* **1 pom.xml files were touched**: Check for build and licensing issues.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] WeiZhong94 commented on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.

2019-10-19 Thread GitBox
WeiZhong94 commented on issue #9886: [FLINK-14027][python][doc] Add 
documentation for Python User-Defined Scalar functions.
URL: https://github.com/apache/flink/pull/9886#issuecomment-544132906
 
 
   @hequn8128 Thank you for reminding me of that! I have updated the 
python_configuration.html in the latest commit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add documentation for Python User-Defined Scalar functions.

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9886: [FLINK-14027][python][doc] Add 
documentation for Python User-Defined Scalar functions.
URL: https://github.com/apache/flink/pull/9886#issuecomment-541324020
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 9ee50d8d31d809e5c5eb578b3d1d428658006554 (Sat Oct 19 
11:27:59 UTC 2019)
   
   **Warnings:**
* **1 pom.xml files were touched**: Check for build and licensing issues.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #9941: [FLINK-14459][python] Fix python module build hang problem

2019-10-19 Thread GitBox
flinkbot commented on issue #9941: [FLINK-14459][python] Fix python module 
build hang problem
URL: https://github.com/apache/flink/pull/9941#issuecomment-544131502
 
 
   
   ## CI report:
   
   * d14be543d3faf99a424a5480fda53ca7b7a65e49 : UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-9953) Active Kubernetes integration

2019-10-19 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-9953:
-
Description: 
This is the umbrella issue tracking Flink's active Kubernetes integration. 
Active means in this context that the {{ResourceManager}} can talk to 
Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration.

The phase 1 implementation will have complete functionality to run Flink on 
Kubernetes. Phase 2 is mainly focused on production optimization, including 
K8s-native high availability, storage, network, log collection, etc.

  was:
This is the umbrella issue tracking Flink's active Kubernetes integration. 
Active means in this context that the {{ResourceManager}} can talk to 
Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration.

Phase1 implementation will have complete functions to make flink running on 
kubernetes. Phrase1 is mainly focused on production optimization, including k8s 
native high-availability, storage, network, log collector and etc.


> Active Kubernetes integration
> -
>
> Key: FLINK-9953
> URL: https://issues.apache.org/jira/browse/FLINK-9953
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Till Rohrmann
>Assignee: Yang Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is the umbrella issue tracking Flink's active Kubernetes integration. 
> Active means in this context that the {{ResourceManager}} can talk to 
> Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration.
> The phase 1 implementation will have complete functionality to run Flink on 
> Kubernetes. Phase 2 is mainly focused on production optimization, including 
> K8s-native high availability, storage, network, log collection, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14460) Active Kubernetes integration phase2 - Production Optimization

2019-10-19 Thread Yang Wang (Jira)
Yang Wang created FLINK-14460:
-

 Summary: Active Kubernetes integration phase2 - Production 
Optimization
 Key: FLINK-14460
 URL: https://issues.apache.org/jira/browse/FLINK-14460
 Project: Flink
  Issue Type: Improvement
Reporter: Yang Wang


This is phase 2 of the active Kubernetes integration. It is an umbrella Jira to 
track all production optimization features.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-9953) Active Kubernetes integration

2019-10-19 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955141#comment-16955141
 ] 

Yang Wang edited comment on FLINK-9953 at 10/19/19 10:59 AM:
-

> Perjob cluster mode

I suggest building the user image with the required dependencies in per-job 
mode. Actually, a standalone job cluster already works this way, and many 
companies use it in production. To handle dynamic dependency management, we 
could add an init container that runs before the JM and TM pods start. The 
init container could download jars and other files from an HTTP server, HDFS, 
or other shared storage. This makes the Flink application more K8s-native. In 
this way, the `MiniDispatcher` and `ClassPathJobGraphRetriever` are enough for 
the per-job mode.
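
To make the idea concrete, here is a rough sketch of such an init container as 
a pod-spec fragment. This is illustrative only, not an actual Flink template: 
the container image, artifact URL, volume name, and mount paths are all 
assumptions.

```yaml
# Hypothetical fragment of a JobManager/TaskManager pod spec.
# The image, artifact URL, and mount paths below are illustrative assumptions.
initContainers:
  - name: fetch-user-artifacts
    image: busybox:1.31
    # Download the user jar into a shared volume before the main
    # Flink container starts; it then picks the jar up from usrlib.
    command: ["wget", "-O", "/opt/flink/usrlib/job.jar",
              "http://artifact-server/jobs/job.jar"]
    volumeMounts:
      - name: usrlib
        mountPath: /opt/flink/usrlib
```

The same pattern works for fetching from HDFS or other shared storage by 
swapping the init-container image and command.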

The two-part submission is more like starting a session cluster to simulate 
per-job mode, so we would need a new dispatcher that accepts a job via REST 
and allows only one job. Maybe we could support this in the future, but it 
needs more discussion.

 

> Submission cli

Currently, `flink run` only supports detached mode for a per-job cluster on 
Yarn. In attached mode, we use a session cluster to simulate a per-job cluster 
for the multi-part submission. Do we need to keep the same behavior as Flink 
on Yarn? We do not need the user jar in K8s per-job mode, so using `flink run` 
to start a per-job cluster would be strange.

 
{code:java}
// detach, DeployJobCluster() Use the jar in the image, not in the cli.
./bin/flink run -d -m kubernetes-cluster ./examples/batch/WordCount.jar

// attach, DeploySessionCluster()
./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar{code}
 

 

> Implementation plan

Let's focus on the current design and move the production optimizations to 
phase 2; I have created another umbrella Jira to track them. We also need more 
feedback from other users to improve the active Kubernetes integration after 
phase 1.

I will attach the PRs in the next few days.

 

[~felixzheng] [~trohrmann] What do you think?


was (Author: fly_in_gis):
> Perjob cluster mode

I suggest to build the user image with required dependencies in per job mode. 
And actually, standalone job cluster is also like. Many companies has use this 
way in production.  In order to solve dynamic dependency management, we could 
add the init container before jm and tm pod starting. The init container could 
download the jars and other files from http server, hdfs and other shared 
storage. This make flink application more like k8s style. In this way, the 
`MiniDispatcher` and `ClassPathJobGraphRetriever` is enough for the per job 
mode.

The two parts submission is more like to start a session cluster to simulate 
per job. So we will need to new dispatcher to accept job from rest and allow 
only one job. Maybe we could support this in the future, but it need more 
discussion.

 

> Submission cli

Currently the `flink run` coud only support detach mode for per job cluster on 
Yarn. In attach mode, we use a session to simulate a per job cluster for 
multi-parts. Do we need to keep the same behavior as flink on Yarn? We do not 
need user jar in k8s per job mode, so using the `flink run` to start per job 
cluster will be strange.

 
{code:java}
// detach, DeployJobCluster() Use the jar in the image, not in the cli.
./bin/flink run -d -m kubernetes-cluster ./examples/batch/WordCount.jar

// attach, DeploySessionCluster()
./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar{code}
 

 

> Implementation plan

Let's focus on current design and move the production optimization to phase2. I 
have created another umbrella jira to track. Also we need more feedback from 
other users to improve the active kubernetes integration after phase1.

I will attach the PRs in the next few days.

 

[~felixzheng] [~trohrmann] How do you think?

> Active Kubernetes integration
> -
>
> Key: FLINK-9953
> URL: https://issues.apache.org/jira/browse/FLINK-9953
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Till Rohrmann
>Assignee: Yang Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is the umbrella issue tracking Flink's active Kubernetes integration. 
> Active means in this context that the {{ResourceManager}} can talk to 
> Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration.
> Phase1 implementation will have complete functions to make flink running on 
> kubernetes. Phrase1 is mainly focused on production optimization, including 
> k8s native high-availability, storage, network, log collector and etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-9953) Active Kubernetes integration

2019-10-19 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955141#comment-16955141
 ] 

Yang Wang commented on FLINK-9953:
--

> Perjob cluster mode

I suggest building the user image with the required dependencies in per-job 
mode. Actually, a standalone job cluster already works this way, and many 
companies use it in production. To handle dynamic dependency management, we 
could add an init container that runs before the JM and TM pods start. The 
init container could download jars and other files from an HTTP server, HDFS, 
or other shared storage. This makes the Flink application more K8s-native. In 
this way, the `MiniDispatcher` and `ClassPathJobGraphRetriever` are enough for 
the per-job mode.

The two-part submission is more like starting a session cluster to simulate 
per-job mode, so we would need a new dispatcher that accepts a job via REST 
and allows only one job. Maybe we could support this in the future, but it 
needs more discussion.

 

> Submission cli

Currently, `flink run` only supports detached mode for a per-job cluster on 
Yarn. In attached mode, we use a session cluster to simulate a per-job cluster 
for the multi-part submission. Do we need to keep the same behavior as Flink 
on Yarn? We do not need the user jar in K8s per-job mode, so using `flink run` 
to start a per-job cluster would be strange.

 
{code:java}
// detach, DeployJobCluster() Use the jar in the image, not in the cli.
./bin/flink run -d -m kubernetes-cluster ./examples/batch/WordCount.jar

// attach, DeploySessionCluster()
./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar{code}
 

 

> Implementation plan

Let's focus on the current design and move the production optimizations to 
phase 2; I have created another umbrella Jira to track them. We also need more 
feedback from other users to improve the active Kubernetes integration after 
phase 1.

I will attach the PRs in the next few days.

 

[~felixzheng] [~trohrmann] What do you think?

> Active Kubernetes integration
> -
>
> Key: FLINK-9953
> URL: https://issues.apache.org/jira/browse/FLINK-9953
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Till Rohrmann
>Assignee: Yang Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is the umbrella issue tracking Flink's active Kubernetes integration. 
> Active means in this context that the {{ResourceManager}} can talk to 
> Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration.
> Phase1 implementation will have complete functions to make flink running on 
> kubernetes. Phrase1 is mainly focused on production optimization, including 
> k8s native high-availability, storage, network, log collector and etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on issue #9941: [FLINK-14459][python] Fix python module build hang problem

2019-10-19 Thread GitBox
flinkbot commented on issue #9941: [FLINK-14459][python] Fix python module 
build hang problem
URL: https://github.com/apache/flink/pull/9941#issuecomment-544126860
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit d14be543d3faf99a424a5480fda53ca7b7a65e49 (Sat Oct 19 
10:44:34 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-14459) Python module build hangs

2019-10-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-14459:
---
Labels: pull-request-available  (was: )

> Python module build hangs
> -
>
> Key: FLINK-14459
> URL: https://issues.apache.org/jira/browse/FLINK-14459
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.9.0, 1.10.0, 1.9.1
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>  Labels: pull-request-available
>
> The build of python module hangs when installing conda. See travis log: 
> https://api.travis-ci.org/v3/job/599704570/log.txt
> Can't reproduce it neither on my local mac nor on my repo with travis. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] hequn8128 opened a new pull request #9941: [FLINK-14459][python] Fix python module build hang problem

2019-10-19 Thread GitBox
hequn8128 opened a new pull request #9941: [FLINK-14459][python] Fix python 
module build hang problem
URL: https://github.com/apache/flink/pull/9941
 
 
   
   ## What is the purpose of the change
   
   This pull request is a hotfix for the build failure on master and 
release-1.9.
   The problem is caused by the latest conda installer: 
https://github.com/conda/conda/issues/9345.
   In this PR, we specify a stable conda version instead of the latest one.
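   A minimal sketch of the pinning idea (the Miniconda version and download 
URL below are illustrative assumptions, not necessarily what the PR uses):

```shell
#!/bin/sh
# Pin a known-good Miniconda release instead of fetching "latest",
# whose contents can change underneath the build and break it.
# NOTE: the version and URL below are illustrative assumptions.
CONDA_VERSION="4.7.10"
CONDA_INSTALLER="Miniconda3-${CONDA_VERSION}-Linux-x86_64.sh"
CONDA_URL="https://repo.continuum.io/miniconda/${CONDA_INSTALLER}"
echo "Would download: ${CONDA_URL}"
# Actual install step (disabled here):
# wget -q "${CONDA_URL}" && sh "${CONDA_INSTALLER}" -b -p "${HOME}/miniconda"
```

   The key point is that the installer filename is derived from an explicit 
version string, so a new upstream release cannot silently change what the CI 
job installs.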
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-14459) Python module build hangs

2019-10-19 Thread Hequn Cheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955132#comment-16955132
 ] 

Hequn Cheng commented on FLINK-14459:
-

It seems to be a problem with the latest conda installer: 
https://github.com/conda/conda/issues/9345

> Python module build hangs
> -
>
> Key: FLINK-14459
> URL: https://issues.apache.org/jira/browse/FLINK-14459
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.9.0, 1.10.0, 1.9.1
>Reporter: Hequn Cheng
>Priority: Major
>
> The build of python module hangs when installing conda. See travis log: 
> https://api.travis-ci.org/v3/job/599704570/log.txt
> Can't reproduce it neither on my local mac nor on my repo with travis. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-14459) Python module build hangs

2019-10-19 Thread Hequn Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hequn Cheng reassigned FLINK-14459:
---

Assignee: Hequn Cheng

> Python module build hangs
> -
>
> Key: FLINK-14459
> URL: https://issues.apache.org/jira/browse/FLINK-14459
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.9.0, 1.10.0, 1.9.1
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>
> The build of python module hangs when installing conda. See travis log: 
> https://api.travis-ci.org/v3/job/599704570/log.txt
> Can't reproduce it neither on my local mac nor on my repo with travis. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14459) Python module build hangs

2019-10-19 Thread Hequn Cheng (Jira)
Hequn Cheng created FLINK-14459:
---

 Summary: Python module build hangs
 Key: FLINK-14459
 URL: https://issues.apache.org/jira/browse/FLINK-14459
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.9.0, 1.10.0, 1.9.1
Reporter: Hequn Cheng


The build of the python module hangs when installing conda. See the Travis 
log: https://api.travis-ci.org/v3/job/599704570/log.txt

Can't reproduce it either on my local Mac or on my repo with Travis.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-14370) KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink fails on Travis

2019-10-19 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955110#comment-16955110
 ] 

Arvid Heise edited comment on FLINK-14370 at 10/19/19 8:32 AM:
---

We found the root cause and are currently discussing fixes as this is a 
non-trivial interplay of components.

If you are in need of a quick (and dirty) fix: 
[https://github.com/apache/flink/pull/9918/commits/921ef31baa96bfc7c0629854104515dd856a6d29|https://github.com/apache/flink/pull/9918/commits/6a79fb8e9272b5d56ecb286634170c72403c751e]


was (Author: arvid.he...@gmail.com):
We found the root cause and are currently discussing fixes as this is a 
non-trivial interplay of components.

If you are in need of a quick (and dirty) fix: 
https://github.com/apache/flink/pull/9918/commits/921ef31baa96bfc7c0629854104515dd856a6d29

> KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink
>  fails on Travis
> ---
>
> Key: FLINK-14370
> URL: https://issues.apache.org/jira/browse/FLINK-14370
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.10.0
>Reporter: Till Rohrmann
>Assignee: Jiangjie Qin
>Priority: Critical
>  Labels: pull-request-available, test-stability
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The 
> {{KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink}}
>  fails on Travis with
> {code}
> Test 
> testOneToOneAtLeastOnceRegularSink(org.apache.flink.streaming.connectors.kafka.KafkaProducerAtLeastOnceITCase)
>  failed with:
> java.lang.AssertionError: Job should fail!
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnce(KafkaProducerTestBase.java:280)
>   at 
> org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink(KafkaProducerTestBase.java:206)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> 

[jira] [Comment Edited] (FLINK-14370) KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink fails on Travis

2019-10-19 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955110#comment-16955110
 ] 

Arvid Heise edited comment on FLINK-14370 at 10/19/19 8:30 AM:
---

We found the root cause and are currently discussing fixes as this is a 
non-trivial interplay of components.

If you are in need of a quick (and dirty) fix: 
https://github.com/apache/flink/pull/9918/commits/921ef31baa96bfc7c0629854104515dd856a6d29


was (Author: arvid.he...@gmail.com):
We found the root cause and are currently discussing fixes as this is a 
non-trivial interplay of components.

> KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink
>  fails on Travis
> ---
>
> Key: FLINK-14370
> URL: https://issues.apache.org/jira/browse/FLINK-14370
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.10.0
>Reporter: Till Rohrmann
>Assignee: Jiangjie Qin
>Priority: Critical
>  Labels: pull-request-available, test-stability
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The 
> {{KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink}}
>  fails on Travis with
> {code}
> Test 
> testOneToOneAtLeastOnceRegularSink(org.apache.flink.streaming.connectors.kafka.KafkaProducerAtLeastOnceITCase)
>  failed with:
> java.lang.AssertionError: Job should fail!
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnce(KafkaProducerTestBase.java:280)
>   at 
> org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink(KafkaProducerTestBase.java:206)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> https://api.travis-ci.com/v3/job/244297223/log.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14370) KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink fails on Travis

2019-10-19 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955110#comment-16955110
 ] 

Arvid Heise commented on FLINK-14370:
-

We found the root cause and are currently discussing fixes, as this is a 
non-trivial interplay of components.

> KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink
>  fails on Travis
> ---
>
> Key: FLINK-14370
> URL: https://issues.apache.org/jira/browse/FLINK-14370
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.10.0
>Reporter: Till Rohrmann
>Assignee: Jiangjie Qin
>Priority: Critical
>  Labels: pull-request-available, test-stability
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The 
> {{KafkaProducerAtLeastOnceITCase>KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink}}
>  fails on Travis with
> {code}
> Test 
> testOneToOneAtLeastOnceRegularSink(org.apache.flink.streaming.connectors.kafka.KafkaProducerAtLeastOnceITCase)
>  failed with:
> java.lang.AssertionError: Job should fail!
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnce(KafkaProducerTestBase.java:280)
>   at 
> org.apache.flink.streaming.connectors.kafka.KafkaProducerTestBase.testOneToOneAtLeastOnceRegularSink(KafkaProducerTestBase.java:206)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> https://api.travis-ci.com/v3/job/244297223/log.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize 
the execution plan for Python Calc when there is a condition
URL: https://github.com/apache/flink/pull/9907#issuecomment-542243031
 
 
   
   ## CI report:
   
   * dd14191ee919be31148be254070e7a777cf9cb4d : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/131984556)
   * 1559637102832603a0dc0d09ab730e00f2e9d224 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/132628239)
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9953) Active Kubernetes integration

2019-10-19 Thread Canbin Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955105#comment-16955105
 ] 

Canbin Zheng commented on FLINK-9953:
-

[~trohrmann] Thanks very much for taking time to read my design doc.

I think this is something really great to have. We plan to run Flink on 
Kubernetes natively in production shortly; our clusters could be massive and 
may rapidly scale up in size or in number.

Since this is a significant feature, the future maintenance and enhancement 
work could be much larger than the initial commit, so I have some suggestions 
for moving it forward.
 # We can further discuss the design proposals to reach a general consensus, 
especially on resource resolution, managed-pod lifecycle management, the worker 
store, garbage collection, and user-oriented interfaces (shell scripts). After 
that we can merge the current design docs into a final one; we may also need a 
new FLIP.
 # We can take some time to list the small features we want and plan the 
implementation incrementally in phases. It’s better to keep the initial commit 
as small as possible to ease code review and help the initial version get 
merged into a Flink release more quickly; we can then improve it iteratively 
according to those plans and user feedback.

> Active Kubernetes integration
> -
>
> Key: FLINK-9953
> URL: https://issues.apache.org/jira/browse/FLINK-9953
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Till Rohrmann
>Assignee: Yang Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is the umbrella issue tracking Flink's active Kubernetes integration. 
> Active means in this context that the {{ResourceManager}} can talk to 
> Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration.
> Phase 1 implementation will have complete functions to make Flink run on 
> Kubernetes. Phase 1 is mainly focused on production optimization, including 
> K8s-native high availability, storage, network, log collection, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize 
the execution plan for Python Calc when there is a condition
URL: https://github.com/apache/flink/pull/9907#issuecomment-542243031
 
 
   
   ## CI report:
   
   * dd14191ee919be31148be254070e7a777cf9cb4d : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/131984556)
   * 1559637102832603a0dc0d09ab730e00f2e9d224 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/132628239)
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize the execution plan for Python Calc when there is a condition

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9907: [FLINK-14202][table][python] Optimize 
the execution plan for Python Calc when there is a condition
URL: https://github.com/apache/flink/pull/9907#issuecomment-542243031
 
 
   
   ## CI report:
   
   * dd14191ee919be31148be254070e7a777cf9cb4d : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/131984556)
   * 1559637102832603a0dc0d09ab730e00f2e9d224 : UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9953) Active Kubernetes integration

2019-10-19 Thread Canbin Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955099#comment-16955099
 ] 

Canbin Zheng commented on FLINK-9953:
-

[~fly_in_gis] Thanks a lot for your quick response; I'm very glad to have a 
chance to work with you on this exciting feature.

> I think in the per-job cluster in flink means it is dedicated cluster for 
> only one job and will not accept more other jobs.

Yep, you are right; a per-job cluster is always a dedicated cluster for only 
one job. One option mentioned in my design doc is to split the deployment into 
two parts:
 * Part one: deploy a cluster without a JobGraph attached;
 * Part two: submit a job to that cluster via `RestClusterClient` and shut the 
cluster down when the job finishes. For this idea, some new Kubernetes-dedicated 
specializations, such as `KubernetesMiniDispatcher` and 
`KubernetesSingleJobGraphStore`, would be introduced to support the new per-job 
cluster workflow. Neither needs a JobGraph at construction time; each accepts a 
single `JobGraph` after it is instantiated. So this solution does not actually 
change the current definition of a per-job cluster.

It seems slightly customised, but at a higher level I think this solution 
provides a possible way to unify the deployment process for session and 
per-job clusters.

In addition to the previous option, we have other options to solve dynamic 
dependency management for a per-job cluster. As a prerequisite, we upload the 
locally-hosted dependencies to a Hadoop Compatible File System, referred to as 
HCFS, which is accessible to the Flink cluster. Then we fetch those 
dependencies for the job to run; there are at least two ways to achieve this.
 * Solution One
Download dependencies from HCFS after starting the JM(s) or TMs.
1. The JM localizes the dependencies by downloading them when a JobManagerImpl 
is instantiated.
2. TMs fetch the dependencies when Task#run() is invoked.
 * Solution Two
Download dependencies before starting the JM(s) or TMs by utilizing a 
Kubernetes feature known as init containers. Init containers always run to 
completion before the main container is started and are typically used to 
handle initialization work for the primary containers.
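
A minimal sketch of what Solution Two could look like as a TaskManager pod 
fragment (the image names, paths, and HCFS URL below are illustrative 
assumptions, not part of any agreed design):

```yaml
# Hypothetical pod fragment; names, images, and paths are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: flink-taskmanager
spec:
  initContainers:
    - name: fetch-user-deps
      image: hadoop-client:latest   # assumed image shipping an HCFS client
      # Download the job's dependencies from HCFS into a shared volume
      # before the main Flink container starts.
      command: ["hadoop", "fs", "-copyToLocal", "hdfs:///flink/deps/*", "/opt/flink/usrlib"]
      volumeMounts:
        - name: user-deps
          mountPath: /opt/flink/usrlib
  containers:
    - name: taskmanager
      image: flink:latest
      args: ["taskmanager"]
      volumeMounts:
        - name: user-deps
          mountPath: /opt/flink/usrlib
  volumes:
    - name: user-deps
      emptyDir: {}
```

Kubernetes only starts the `taskmanager` container after `fetch-user-deps` 
runs to completion, so the TM sees the dependencies in place on startup.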


> The users could put their jars in the image. And the 
> `ClassPathJobGraphRetriever` will be used to generate and retrieve the job 
> graph in flink master pod.

This is a straightforward workflow; we build an image containing all the 
necessary application resources, such as application code, input files, etc., 
and then run the application entirely from that image. Many applications work 
this way in the Kubernetes ecosystem, so we can add support for this use case.

But some dependencies may not be known at image build time, may be too large 
to be baked into a container image, or may need frequent changes for new 
business scenarios. For these cases, I propose using a standard image with the 
Flink distribution and supplying the dependencies at runtime; we have several 
ways to support such dynamic dependencies.

> So i suggest to add kubernetes-job.sh to start per-job cluster. It will not 
> need a user jar as required argument. `flink run` could be used to submitted 
> a flink job to existed session.

We can change the existing `flink` shell script to meet this requirement; it’s 
better not to introduce another dedicated kubernetes-job.sh just to start a 
per-job cluster on Kubernetes.

> Active Kubernetes integration
> -
>
> Key: FLINK-9953
> URL: https://issues.apache.org/jira/browse/FLINK-9953
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Coordination
>Reporter: Till Rohrmann
>Assignee: Yang Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is the umbrella issue tracking Flink's active Kubernetes integration. 
> Active means in this context that the {{ResourceManager}} can talk to 
> Kubernetes to launch new pods similar to Flink's Yarn and Mesos integration.
> Phase 1 implementation will have complete functions to make Flink run on 
> Kubernetes. Phase 1 is mainly focused on production optimization, including 
> K8s-native high availability, storage, network, log collection, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #9935: [FLINK-14456][client] Remove or shift down field from ClusterClient

2019-10-19 Thread GitBox
flinkbot edited a comment on issue #9935: [FLINK-14456][client] Remove or shift 
down field from ClusterClient
URL: https://github.com/apache/flink/pull/9935#issuecomment-543796546
 
 
   
   ## CI report:
   
   * c4aef47f26bb9fe9cde0291046c365533ec5b08d : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/132555336)
   * ab926c8b9a9235489c35c7f05b3a7ebde3696558 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/132564151)
   * 02cb7f81f32c17d1d1fd6cc2e9713b20bb5fe733 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/132622502)
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services