[jira] [Created] (FLINK-16416) Shutdown the task manager gracefully in standalone mode

2020-03-03 Thread Yangze Guo (Jira)
Yangze Guo created FLINK-16416:
--

 Summary: Shutdown the task manager gracefully in standalone mode
 Key: FLINK-16416
 URL: https://issues.apache.org/jira/browse/FLINK-16416
 Project: Flink
  Issue Type: Improvement
  Components: Command Line Client
Reporter: Yangze Guo


Recently, I tried to add a new {{GPUManager}} to the {{TaskExecutorServices}}. 
I registered a "close" function, which contains some cleanup logic, with 
{{TaskExecutorServices#shutDown}}. However, I found that the cleanup logic does 
not run as expected in standalone mode.
After investigating the codebase, I found that 
{{TaskExecutorServices#shutDown}} will be called only on a fatal error, while 
we just kill the TM process in {{flink-daemon.sh}}. However, the log shows that 
some services did clean up after themselves by registering a {{shutdownHook}}.
If that is the right way, then we need to register a {{shutdownHook}} for 
{{TaskExecutorServices}} as well.
If it is not, we may need to find another solution to shut down the TM 
gracefully.
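
For illustration, a minimal sketch of the {{shutdownHook}} approach using plain 
JDK APIs (a hypothetical helper class, not actual Flink code):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch: register a JVM shutdown hook so that
// TaskExecutorServices-style cleanup (e.g. a GPUManager's "close" logic)
// also runs when flink-daemon.sh kills the TM process with SIGTERM,
// not only on fatal errors. Note: a hard "kill -9" would still skip it.
public final class ShutdownHookSketch {

	private static final Logger LOG = LoggerFactory.getLogger(ShutdownHookSketch.class);

	public static Thread registerCleanupHook(AutoCloseable services, String name) {
		Thread hook = new Thread(() -> {
			try {
				services.close(); // runs the registered cleanup logic
			} catch (Exception e) {
				LOG.error("Error while shutting down {}.", name, e);
			}
		}, name + "-shutdown-hook");
		Runtime.getRuntime().addShutdownHook(hook);
		return hook;
	}
}
{code}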



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on issue #11303: [Flink-16245] Decoupling user classloader from context classloader.

2020-03-03 Thread GitBox
flinkbot commented on issue #11303: [Flink-16245] Decoupling user classloader 
from context classloader.
URL: https://github.com/apache/flink/pull/11303#issuecomment-594377249
 
 
   
   ## CI report:
   
   * af7e6a441a3f9105e4cfda044cc0a76331c91c33 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #11261: [hotfix][streaming] Clean up redundant & dead code about StreamExecutionEnvironment

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11261: [hotfix][streaming] Clean up 
redundant & dead code about StreamExecutionEnvironment
URL: https://github.com/apache/flink/pull/11261#issuecomment-592684125
 
 
   
   ## CI report:
   
   * ed504c69ae44a27c482bfbf82679ab40fc638d1d Travis: 
[FAILURE](https://travis-ci.com/flink-ci/flink/builds/151079491) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5759)
 
   * 1f93243324fed1d67e4d17360e80bb91d6aa1726 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #11302: [FLINK-16414]fix sql validation failed when using udaf/udtf which doesn't implement getResultType

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11302: [FLINK-16414]fix sql validation 
failed when using udaf/udtf which doesn't implement getResultType
URL: https://github.com/apache/flink/pull/11302#issuecomment-594371839
 
 
   
   ## CI report:
   
   * d549e324ccf8459f5731f72a9d1ff0fa87188182 Travis: 
[PENDING](https://travis-ci.com/flink-ci/flink/builds/151683087) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5894)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (FLINK-16362) Remove deprecated method in StreamTableSink

2020-03-03 Thread Kurt Young (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Young closed FLINK-16362.
--
  Assignee: godfrey he
Resolution: Fixed

merged to master: 2b13a4155fd4284f6092decba867e71eea058043

> Remove deprecated method in StreamTableSink
> ---
>
> Key: FLINK-16362
> URL: https://issues.apache.org/jira/browse/FLINK-16362
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: godfrey he
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [FLIP-84|https://cwiki.apache.org/confluence/display/FLINK/FLIP-84%3A+Improve+%26+Refactor+API+of+TableEnvironment]
>  proposes to unify the behavior of {{TableEnvironment}} and 
> {{StreamTableEnvironment}}, and requires that {{StreamTableSink}} always 
> return a {{DataStream}}. However, {{StreamTableSink.emitDataStream}} returns 
> nothing and has been deprecated since Flink 1.9, so we will remove it.
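
For context, a minimal sketch of the interface shape involved (an assumption 
based on the description above, not the actual Flink source):

{code:java}
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSink;

// Sketch of the relevant part of the sink interface: the deprecated void
// method goes away; only the DataStreamSink-returning method remains.
public interface StreamTableSinkSketch<T> {

	/** Deprecated since Flink 1.9; returns nothing. Removed by this ticket. */
	@Deprecated
	void emitDataStream(DataStream<T> dataStream);

	/** Replacement: consumes the stream and returns the resulting sink. */
	DataStreamSink<?> consumeDataStream(DataStream<T> dataStream);
}
{code}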



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16362) Remove deprecated method in StreamTableSink

2020-03-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-16362:
---
Labels: pull-request-available  (was: )

> Remove deprecated method in StreamTableSink
> ---
>
> Key: FLINK-16362
> URL: https://issues.apache.org/jira/browse/FLINK-16362
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: godfrey he
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>
> [FLIP-84|https://cwiki.apache.org/confluence/display/FLINK/FLIP-84%3A+Improve+%26+Refactor+API+of+TableEnvironment]
>  proposes to unify the behavior of {{TableEnvironment}} and 
> {{StreamTableEnvironment}}, and requires that {{StreamTableSink}} always 
> return a {{DataStream}}. However, {{StreamTableSink.emitDataStream}} returns 
> nothing and has been deprecated since Flink 1.9, so we will remove it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] KurtYoung merged pull request #11279: [FLINK-16362] [table] Remove deprecated method in StreamTableSink

2020-03-03 Thread GitBox
KurtYoung merged pull request #11279: [FLINK-16362] [table] Remove deprecated 
method in StreamTableSink
URL: https://github.com/apache/flink/pull/11279
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (FLINK-16140) Translate "Event Processing (CEP)" page into Chinese

2020-03-03 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu closed FLINK-16140.
---
Resolution: Resolved

Merged to master via b3f8e33121b83628e2d20e12ebe84e269a05e7fe

> Translate "Event Processing (CEP)" page into Chinese
> 
>
> Key: FLINK-16140
> URL: https://issues.apache.org/jira/browse/FLINK-16140
> Project: Flink
>  Issue Type: Task
>  Components: chinese-translation, Documentation
>Reporter: shuai.xu
>Assignee: shuai.xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Translate the internal page 
> [https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html] 
> into Chinese.
> The doc is located in "flink/docs/dev/libs/cep.zh.md".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16140) Translate "Event Processing (CEP)" page into Chinese

2020-03-03 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-16140:

Fix Version/s: 1.11.0

> Translate "Event Processing (CEP)" page into Chinese
> 
>
> Key: FLINK-16140
> URL: https://issues.apache.org/jira/browse/FLINK-16140
> Project: Flink
>  Issue Type: Task
>  Components: chinese-translation, Documentation
>Reporter: shuai.xu
>Assignee: shuai.xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Translate the internal page 
> [https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html] 
> into Chinese.
> The doc is located in "flink/docs/dev/libs/cep.zh.md".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16140) Translate "Event Processing (CEP)" page into Chinese

2020-03-03 Thread Dian Fu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu updated FLINK-16140:

Affects Version/s: (was: 1.10.0)

> Translate "Event Processing (CEP)" page into Chinese
> 
>
> Key: FLINK-16140
> URL: https://issues.apache.org/jira/browse/FLINK-16140
> Project: Flink
>  Issue Type: Task
>  Components: chinese-translation, Documentation
>Reporter: shuai.xu
>Assignee: shuai.xu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Translate the internal page 
> [https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html] 
> into Chinese.
> The doc is located in "flink/docs/dev/libs/cep.zh.md".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] dianfu merged pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
dianfu merged pull request #11168: [FLINK-16140] [docs-zh] Translate Event 
Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dianfu commented on issue #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
dianfu commented on issue #11168: [FLINK-16140] [docs-zh] Translate Event 
Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#issuecomment-594374801
 
 
   @shuai-xu Thanks for the update. Merging...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #11303: [Flink-16245] Decoupling user classloader from context classloader.

2020-03-03 Thread GitBox
flinkbot commented on issue #11303: [Flink-16245] Decoupling user classloader 
from context classloader.
URL: https://github.com/apache/flink/pull/11303#issuecomment-594373776
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit af7e6a441a3f9105e4cfda044cc0a76331c91c33 (Wed Mar 04 
07:46:45 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-16245).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (FLINK-15249) Improve PipelinedRegions calculation with Union Set

2020-03-03 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050963#comment-17050963
 ] 

Zhu Zhu edited comment on FLINK-15249 at 3/4/20, 7:45 AM:
--

I ran the benchmark multiple (20) times for each; the performance 
difference is obvious and stable. So it does not invalidate the conclusion made 
above, even though the result numbers may not be that accurate.


was (Author: zhuzh):
I ran the benchmark multiple (20) times for each; the performance 
difference is obvious and stable. So it does not invalidate the conclusion made 
above, even though the exact number may not be that accurate.

> Improve PipelinedRegions calculation with Union Set
> ---
>
> Key: FLINK-15249
> URL: https://issues.apache.org/jira/browse/FLINK-15249
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Chongchen Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: PipelinedRegionComputeUtil.diff, 
> RegionFailoverPerfTest.java, new.diff
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A union set (disjoint set) merges two sets in O(1), while the current 
> implementation is O(N). The attachment is a patch.
> [Disjoint Set Data 
> Structure|https://en.wikipedia.org/wiki/Disjoint-set_data_structure]
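
For reference, a minimal union-find sketch with path compression and union by 
rank (an illustration of the proposal, not the attached patch):

{code:java}
// Disjoint-set (union-find) sketch: near-O(1) amortized merges via
// path compression and union by rank, versus the O(N) merge today.
public final class UnionFind {

	private final int[] parent;
	private final int[] rank;

	public UnionFind(int size) {
		parent = new int[size];
		rank = new int[size];
		for (int i = 0; i < size; i++) {
			parent[i] = i;
		}
	}

	public int find(int x) {
		while (parent[x] != x) {
			parent[x] = parent[parent[x]]; // path compression (halving)
			x = parent[x];
		}
		return x;
	}

	public void union(int a, int b) {
		int rootA = find(a);
		int rootB = find(b);
		if (rootA == rootB) {
			return;
		}
		if (rank[rootA] < rank[rootB]) {
			int tmp = rootA;
			rootA = rootB;
			rootB = tmp;
		}
		parent[rootB] = rootA; // attach the shallower tree under the deeper one
		if (rank[rootA] == rank[rootB]) {
			rank[rootA]++;
		}
	}
}
{code}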



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] AHeise opened a new pull request #11303: [Flink-16245] Decoupling user classloader from context classloader.

2020-03-03 Thread GitBox
AHeise opened a new pull request #11303: [Flink-16245] Decoupling user 
classloader from context classloader.
URL: https://github.com/apache/flink/pull/11303
 
 
   
   
   ## What is the purpose of the change
   
   Decouples the user classloader from the context classloader.
   
   Thus, the user classloader can be unloaded even if a reference to the 
context classloader outlives the user code (see the sketch after the change 
log below).
   
   ## Brief change log
   
   - Adding SafetyNetWrapperClassLoader
   - Use it for FlinkUserCodeClassLoaders
   - Make sure LocalExecutor closes user classloader
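   
   A minimal sketch of the wrapper idea (an assumed shape for illustration; 
the actual `SafetyNetWrapperClassLoader` in this PR may differ):
   
```java
import java.io.IOException;
import java.net.URLClassLoader;

// Sketch: install this wrapper as the context classloader instead of the
// user classloader itself. Closing it drops the only strong reference to
// the user classloader, so the latter can be garbage-collected even if
// something keeps a reference to the (wrapper) context classloader.
final class WrapperClassLoaderSketch extends ClassLoader {

	private volatile URLClassLoader inner; // the real user code classloader

	WrapperClassLoaderSketch(URLClassLoader inner, ClassLoader parent) {
		super(parent);
		this.inner = inner;
	}

	@Override
	protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
		URLClassLoader delegate = inner;
		if (delegate == null) {
			throw new ClassNotFoundException(name + " (user classloader already closed)");
		}
		return delegate.loadClass(name);
	}

	void close() throws IOException {
		URLClassLoader delegate = inner;
		inner = null; // release the reference so GC can unload user classes
		if (delegate != null) {
			delegate.close();
		}
	}
}
```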
   
   
   ## Verifying this change
   
   Added unit test.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-15249) Improve PipelinedRegions calculation with Union Set

2020-03-03 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050963#comment-17050963
 ] 

Zhu Zhu commented on FLINK-15249:
-

I ran the benchmark multiple (20) times for each; the performance 
difference is obvious and stable. So it does not invalidate the conclusion made 
above, even though the exact number may not be that accurate.

> Improve PipelinedRegions calculation with Union Set
> ---
>
> Key: FLINK-15249
> URL: https://issues.apache.org/jira/browse/FLINK-15249
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Chongchen Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: PipelinedRegionComputeUtil.diff, 
> RegionFailoverPerfTest.java, new.diff
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A union set (disjoint set) merges two sets in O(1), while the current 
> implementation is O(N). The attachment is a patch.
> [Disjoint Set Data 
> Structure|https://en.wikipedia.org/wiki/Disjoint-set_data_structure]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot commented on issue #11302: [FLINK-16414]fix sql validation failed when using udaf/udtf which doesn't implement getResultType

2020-03-03 Thread GitBox
flinkbot commented on issue #11302: [FLINK-16414]fix sql validation failed when 
using udaf/udtf which doesn't implement getResultType
URL: https://github.com/apache/flink/pull/11302#issuecomment-594371839
 
 
   
   ## CI report:
   
   * d549e324ccf8459f5731f72a9d1ff0fa87188182 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-16415) Add a version property to module.yaml

2020-03-03 Thread Igal Shilman (Jira)
Igal Shilman created FLINK-16415:


 Summary: Add a version property to module.yaml
 Key: FLINK-16415
 URL: https://issues.apache.org/jira/browse/FLINK-16415
 Project: Flink
  Issue Type: Improvement
  Components: Stateful Functions
Reporter: Igal Shilman


Having a version would allow us to evolve the schema of the yaml-based module 
configuration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16405) Fix hive-metastore dependency description in connecting to hive docs

2020-03-03 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-16405:
---
Description: We need the hive-metastore dependency when connecting to Hive 
(version 3.1.x) if using hive-metastore in embedded mode, but it looks like we 
missed it: 
[https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#connecting-to-hive]
  (was:  we need hive-metastore dependency when connect to hive(version 3.1.x)

but looks like we missed it: 
[https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#connecting-to-hive])

> Fix hive-metastore dependency description in connecting to hive docs
> 
>
> Key: FLINK-16405
> URL: https://issues.apache.org/jira/browse/FLINK-16405
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Leonard Xu
>Priority: Major
> Fix For: 1.10.1, 1.11.0
>
>
> We need the hive-metastore dependency when connecting to Hive (version 
> 3.1.x) if using hive-metastore in embedded mode, but it looks like we missed 
> it: 
> [https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/#connecting-to-hive]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #10922: [FLINK-11899][parquet] Introduce parquet ColumnarRow split reader

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #10922: [FLINK-11899][parquet] Introduce 
parquet ColumnarRow split reader
URL: https://github.com/apache/flink/pull/10922#issuecomment-577180180
 
 
   
   ## CI report:
   
   * 37cb28b6b3238516df476318cb28d30114689442 Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/151674029) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5893)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] hequn8128 commented on a change in pull request #11220: [FLINK-16249][python][ml] Add interfaces for Params, ParamInfo and WithParams

2020-03-03 Thread GitBox
hequn8128 commented on a change in pull request #11220: 
[FLINK-16249][python][ml] Add interfaces for Params, ParamInfo and WithParams
URL: https://github.com/apache/flink/pull/11220#discussion_r387489337
 
 

 ##
 File path: flink-python/pyflink/ml/lib/param/colname.py
 ##
 @@ -0,0 +1,55 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+
+
+from pyflink.ml.api.param import WithParams, ParamInfo, TypeConverters
+
+
+class HasSelectedCols(WithParams):
+    """
+    An interface for classes with a parameter specifying the name of multiple
+    table columns.
+    """
+
+    selected_cols = ParamInfo(
+        "selectedCols",
+        "Names of the columns used for processing",
+        is_optional=False,
+        type_converter=TypeConverters.to_list_string)
+
+    def set_selected_cols(self, v: list) -> 'HasSelectedCols':
+        return super().set(self.selected_cols, v)
+
+    def get_selected_cols(self) -> list:
+        return super().get(self.selected_cols)
+
+
+class HasOutputCol(WithParams):
 
 Review comment:
   This is the behavior of the Java API. Maybe the reason is that we use 
another interface, i.e., `HasOutputCols`, to support multiple columns?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] hequn8128 commented on a change in pull request #11220: [FLINK-16249][python][ml] Add interfaces for Params, ParamInfo and WithParams

2020-03-03 Thread GitBox
hequn8128 commented on a change in pull request #11220: 
[FLINK-16249][python][ml] Add interfaces for Params, ParamInfo and WithParams
URL: https://github.com/apache/flink/pull/11220#discussion_r387487800
 
 

 ##
 File path: flink-python/pyflink/ml/api/param/base.py
 ##
 @@ -0,0 +1,357 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+
+import array
+from typing import TypeVar, Generic
+
+V = TypeVar('V')
+
+
+class WithParams(Generic[V]):
+    """
+    Parameters are widely used in the machine learning realm. This class
+    defines a common interface to interact with classes with parameters.
+    """
+
+    def get_params(self) -> 'Params':
+        """
+        Returns all the parameters.
+
+        :return: all the parameters.
+        """
+        pass
+
+    def set(self, info: 'ParamInfo', value: V) -> 'WithParams':
+        """
+        Sets the value of a specific parameter.
+
+        :param info: the info of the specific param to set.
+        :param value: the value to be set to the specific param.
+        :return: the WithParams itself.
+        """
+        self.get_params().set(info, value)
+        return self
+
+    def get(self, info: 'ParamInfo') -> V:
+        """
+        Returns the value of the specific param.
+
+        :param info: the info of the specific param, usually with default value.
+        :return: the value of the specific param, or default value defined in the \
+        ParamInfo if the inner Params doesn't contain this param.
+        """
+        return self.get_params().get(info)
+
+    def _set(self, **kwargs):
+        """
+        Sets user-supplied params.
+        """
+        for param, value in kwargs.items():
+            p = getattr(self, param)
+            if value is not None:
+                try:
+                    value = p.type_converter(value)
+                except TypeError as e:
+                    raise TypeError('Invalid param value given for param "%s". %s' % (p.name, e))
+            self.get_params().set(p, value)
+        return self
+
+
+class Params(Generic[V]):
+    """
+    The map-like container class for parameters. This class is provided to unify
+    the interaction with parameters.
+    """
+
+    def __init__(self):
+        self._param_map = {}
+
+    def set(self, info: 'ParamInfo', value: V) -> 'Params':
+        """
+        Sets the value of the specific parameter.
+
+        :param info: the info of the specific parameter to set.
+        :param value: the value to be set to the specific parameter.
+        :return: return the current Params.
+        """
+        self._param_map[info] = value
+        return self
+
+    def get(self, info: 'ParamInfo') -> V:
+        """
+        Returns the value of the specific parameter, or the default value defined
+        in the info if this Params doesn't have a value set for the parameter. An
+        exception will be thrown in the following cases because no value could be
+        found for the specified parameter.
+
+        :param info: the info of the specific parameter to get.
+        :return: the value of the specific param, or default value defined in the \
+        info if this Params doesn't contain the parameter.
+        """
+        if info not in self._param_map:
+            if not info.is_optional:
+                raise ValueError("Missing non-optional parameter %s" % info.name)
+            elif not info.has_default_value:
+                raise ValueError("Cannot find default value for optional parameter %s" % info.name)
+            else:
+                return info.default_value
+        else:
+            return self._param_map[info]
+
+    def remove(self, info: 'ParamInfo') -> V:
+        """
+        Removes the specific parameter from this Params.
+
+        :param info: the info of the specific parameter to remove.
+        :return: the removed value of the specific parameter.
+        """
+        return self._param_map.pop(info)
+
+    def contains(self, info: 'ParamInfo') -> bool:
+        """
+        Check whether this params has a value set for 

[GitHub] [flink] flinkbot commented on issue #11302: [FLINK-16414]fix sql validation failed when using udaf/udtf which doesn't implement getResultType

2020-03-03 Thread GitBox
flinkbot commented on issue #11302: [FLINK-16414]fix sql validation failed when 
using udaf/udtf which doesn't implement getResultType
URL: https://github.com/apache/flink/pull/11302#issuecomment-594367237
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit d549e324ccf8459f5731f72a9d1ff0fa87188182 (Wed Mar 04 
07:25:55 UTC 2020)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] zjuwangg commented on issue #11302: [FLINK-16414]fix sql validation failed when using udaf/udtf which doesn't implement getResultType

2020-03-03 Thread GitBox
zjuwangg commented on issue #11302: [FLINK-16414]fix sql validation failed when 
using udaf/udtf which doesn't implement getResultType
URL: https://github.com/apache/flink/pull/11302#issuecomment-594367128
 
 
   cc @bowenli86 or @KurtYoung to have a review


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-16414) create udaf/udtf function using sql causing ValidationException: SQL validation failed. null

2020-03-03 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-16414:

Fix Version/s: 1.10.1

> create udaf/udtf function using sql causing ValidationException: SQL 
> validation failed. null
> 
>
> Key: FLINK-16414
> URL: https://issues.apache.org/jira/browse/FLINK-16414
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: Terry Wang
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.10.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When using TableEnvironment#sqlUpdate to create a udaf or udtf function that 
> doesn't override the getResultType() method, everything is normal. But when 
> using this function in a later insert SQL statement, an exception like the 
> following will be thrown:
> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
> SQL validation failed. null
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:130)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:127)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlInsert(SqlToOperationConverter.java:342)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:142)
>   at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)
> The reason is that in FunctionDefinitionUtil#createFunctionDefinition we 
> shouldn't directly call t.getResultType(), a.getAccumulatorType() or 
> a.getResultType(), but should use 
> UserDefinedFunctionHelper#getReturnTypeOfTableFunction, 
> UserDefinedFunctionHelper#getAccumulatorTypeOfAggregateFunction and 
> UserDefinedFunctionHelper#getReturnTypeOfAggregateFunction instead.
> ```
>   if (udf instanceof ScalarFunction) {
>   return new ScalarFunctionDefinition(
>   name,
>   (ScalarFunction) udf
>   );
>   } else if (udf instanceof TableFunction) {
>   TableFunction t = (TableFunction) udf;
>   return new TableFunctionDefinition(
>   name,
>   t,
>   t.getResultType()
>   );
>   } else if (udf instanceof AggregateFunction) {
>   AggregateFunction a = (AggregateFunction) udf;
>   return new AggregateFunctionDefinition(
>   name,
>   a,
>   a.getAccumulatorType(),
>   a.getResultType()
>   );
>   } else if (udf instanceof TableAggregateFunction) {
>   TableAggregateFunction a = (TableAggregateFunction) udf;
>   return new TableAggregateFunctionDefinition(
>   name,
>   a,
>   a.getAccumulatorType(),
>   a.getResultType()
>   );
> ```
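
A sketch of the proposed fix, mirroring the quoted branches (an illustration 
of the ticket's suggestion, not necessarily the merged patch):

{code:java}
// Derive type information via UserDefinedFunctionHelper, which falls back to
// type extraction when getResultType()/getAccumulatorType() return null.
} else if (udf instanceof TableFunction) {
	TableFunction t = (TableFunction) udf;
	return new TableFunctionDefinition(
		name,
		t,
		UserDefinedFunctionHelper.getReturnTypeOfTableFunction(t)
	);
} else if (udf instanceof AggregateFunction) {
	AggregateFunction a = (AggregateFunction) udf;
	return new AggregateFunctionDefinition(
		name,
		a,
		UserDefinedFunctionHelper.getAccumulatorTypeOfAggregateFunction(a),
		UserDefinedFunctionHelper.getReturnTypeOfAggregateFunction(a)
	);
}
{code}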



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16414) create udaf/udtf function using sql causing ValidationException: SQL validation failed. null

2020-03-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-16414:
---
Labels: pull-request-available  (was: )

> create udaf/udtf function using sql causing ValidationException: SQL 
> validation failed. null
> 
>
> Key: FLINK-16414
> URL: https://issues.apache.org/jira/browse/FLINK-16414
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: Terry Wang
>Priority: Critical
>  Labels: pull-request-available
>
> When using TableEnvironment#sqlUpdate to create a udaf or udtf function that 
> doesn't override the getResultType() method, everything is normal. But when 
> using this function in a later insert SQL statement, an exception like the 
> following will be thrown:
> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
> SQL validation failed. null
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:130)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:127)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlInsert(SqlToOperationConverter.java:342)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:142)
>   at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)
> The reason is that in FunctionDefinitionUtil#createFunctionDefinition we 
> shouldn't directly call t.getResultType(), a.getAccumulatorType() or 
> a.getResultType(), but should use 
> UserDefinedFunctionHelper#getReturnTypeOfTableFunction, 
> UserDefinedFunctionHelper#getAccumulatorTypeOfAggregateFunction and 
> UserDefinedFunctionHelper#getReturnTypeOfAggregateFunction instead.
> ```
>   if (udf instanceof ScalarFunction) {
>   return new ScalarFunctionDefinition(
>   name,
>   (ScalarFunction) udf
>   );
>   } else if (udf instanceof TableFunction) {
>   TableFunction t = (TableFunction) udf;
>   return new TableFunctionDefinition(
>   name,
>   t,
>   t.getResultType()
>   );
>   } else if (udf instanceof AggregateFunction) {
>   AggregateFunction a = (AggregateFunction) udf;
>   return new AggregateFunctionDefinition(
>   name,
>   a,
>   a.getAccumulatorType(),
>   a.getResultType()
>   );
>   } else if (udf instanceof TableAggregateFunction) {
>   TableAggregateFunction a = (TableAggregateFunction) udf;
>   return new TableAggregateFunctionDefinition(
>   name,
>   a,
>   a.getAccumulatorType(),
>   a.getResultType()
>   );
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-16414) create udaf/udtf function using sql causing ValidationException: SQL validation failed. null

2020-03-03 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-16414:
---

Assignee: Terry Wang

> create udaf/udtf function using sql causing ValidationException: SQL 
> validation failed. null
> 
>
> Key: FLINK-16414
> URL: https://issues.apache.org/jira/browse/FLINK-16414
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: Terry Wang
>Assignee: Terry Wang
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.10.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When using TableEnvironment#sqlUpdate to create a udaf or udtf function that 
> doesn't override the getResultType() method, everything is normal. But when 
> using this function in a later insert SQL statement, an exception like the 
> following will be thrown:
> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
> SQL validation failed. null
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:130)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:127)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlInsert(SqlToOperationConverter.java:342)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:142)
>   at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)
> The reason is that in FunctionDefinitionUtil#createFunctionDefinition we 
> shouldn't directly call t.getResultType(), a.getAccumulatorType() or 
> a.getResultType(), but should use 
> UserDefinedFunctionHelper#getReturnTypeOfTableFunction, 
> UserDefinedFunctionHelper#getAccumulatorTypeOfAggregateFunction and 
> UserDefinedFunctionHelper#getReturnTypeOfAggregateFunction instead.
> ```
>   if (udf instanceof ScalarFunction) {
>   return new ScalarFunctionDefinition(
>   name,
>   (ScalarFunction) udf
>   );
>   } else if (udf instanceof TableFunction) {
>   TableFunction t = (TableFunction) udf;
>   return new TableFunctionDefinition(
>   name,
>   t,
>   t.getResultType()
>   );
>   } else if (udf instanceof AggregateFunction) {
>   AggregateFunction a = (AggregateFunction) udf;
>   return new AggregateFunctionDefinition(
>   name,
>   a,
>   a.getAccumulatorType(),
>   a.getResultType()
>   );
>   } else if (udf instanceof TableAggregateFunction) {
>   TableAggregateFunction a = (TableAggregateFunction) udf;
>   return new TableAggregateFunctionDefinition(
>   name,
>   a,
>   a.getAccumulatorType(),
>   a.getResultType()
>   );
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zjuwangg opened a new pull request #11302: [FLINK-16414]fix sql validation failed when using udaf/udtf which doesn't implement getResultType

2020-03-03 Thread GitBox
zjuwangg opened a new pull request #11302: [FLINK-16414]fix sql validation 
failed when using udaf/udtf which doesn't implement getResultType
URL: https://github.com/apache/flink/pull/11302
 
 
   ## What is the purpose of the change
   
   *Fix SQL validation failing when using a udaf/udtf which doesn't implement 
getResultType.*
   
   
   ## Brief change log
   
 - *d549e32: fix sql validation bug and add test case*
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
 - *add more test cases in FunctionDefinitionUtilTest.java*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no )
 - The runtime per-record code paths (performance sensitive): (no )
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? ( no)
 - If yes, how is the feature documented? (not applicable)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Comment Edited] (FLINK-16398) Rename Protobuf Kafka Ingress / Routable Kafka Ingress type identifier strings

2020-03-03 Thread Igal Shilman (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050957#comment-17050957
 ] 

Igal Shilman edited comment on FLINK-16398 at 3/4/20, 7:23 AM:
---

+1 for ingress, and +1 for shorter strings.

statefun.sdk/


was (Author: igal):
+1 for ingress, and +1 for shorter strings.

statefun.sdk/

> Rename Protobuf Kafka Ingress / Routable Kafka Ingress type identifier strings
> --
>
> Key: FLINK-16398
> URL: https://issues.apache.org/jira/browse/FLINK-16398
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-1.1
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Blocker
>
> Currently the strings for the ingress type identifier for the (Routable) 
> Kafka ingresses are:
> {{org.apache.flink.statefun.sdk.kafka/(routable-)protobuf-kafka-connector}}
> The term {{connector}} is better renamed as "ingress", to be consistent with 
> Stateful Functions terminology.
> Also, we could maybe consider shortening the namespace string 
> ({{org.apache.flink.statefun.sdk.kafka}}) to something more compact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16398) Rename Protobuf Kafka Ingress / Routable Kafka Ingress type identifier strings

2020-03-03 Thread Igal Shilman (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050957#comment-17050957
 ] 

Igal Shilman commented on FLINK-16398:
--

+1 for ingress, and +1 for shorter strings.

statefun.sdk/

> Rename Protobuf Kafka Ingress / Routable Kafka Ingress type identifier strings
> --
>
> Key: FLINK-16398
> URL: https://issues.apache.org/jira/browse/FLINK-16398
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-1.1
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Blocker
>
> Currently the strings for the ingress type identifier for the (Routable) 
> Kafka ingresses are:
> {{org.apache.flink.statefun.sdk.kafka/(routable-)protobuf-kafka-connector}}
> The term {{connector}} is better renamed as "ingress", to be consistent with 
> Stateful Functions terminology.
> Also, we could maybe consider shortening the namespace string 
> ({{org.apache.flink.statefun.sdk.kafka}}) to something more compact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] hequn8128 commented on a change in pull request #11220: [FLINK-16249][python][ml] Add interfaces for Params, ParamInfo and WithParams

2020-03-03 Thread GitBox
hequn8128 commented on a change in pull request #11220: 
[FLINK-16249][python][ml] Add interfaces for Params, ParamInfo and WithParams
URL: https://github.com/apache/flink/pull/11220#discussion_r387485499
 
 

 ##
 File path: flink-python/pyflink/ml/api/param/base.py
 ##
 @@ -0,0 +1,357 @@
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+
+import array
+from typing import TypeVar, Generic
+
+V = TypeVar('V')
+
+
+class WithParams(Generic[V]):
+    """
+    Parameters are widely used in the machine learning realm. This class
+    defines a common interface to interact with classes with parameters.
+    """
+
+    def get_params(self) -> 'Params':
+        """
+        Returns all the parameters.
+
+        :return: all the parameters.
+        """
+        pass
+
+    def set(self, info: 'ParamInfo', value: V) -> 'WithParams':
+        """
+        Sets the value of a specific parameter.
+
+        :param info: the info of the specific param to set.
+        :param value: the value to be set to the specific param.
+        :return: the WithParams itself.
+        """
+        self.get_params().set(info, value)
+        return self
+
+    def get(self, info: 'ParamInfo') -> V:
+        """
+        Returns the value of the specific param.
+
+        :param info: the info of the specific param, usually with default value.
+        :return: the value of the specific param, or default value defined in the \
+        ParamInfo if the inner Params doesn't contain this param.
+        """
+        return self.get_params().get(info)
+
+    def _set(self, **kwargs):
+        """
+        Sets user-supplied params.
+        """
+        for param, value in kwargs.items():
+            p = getattr(self, param)
+            if value is not None:
+                try:
+                    value = p.type_converter(value)
+                except TypeError as e:
+                    raise TypeError('Invalid param value given for param "%s". %s' % (p.name, e))
+            self.get_params().set(p, value)
+        return self
+
+
+class Params(Generic[V]):
+    """
+    The map-like container class for parameters. This class is provided to unify
+    the interaction with parameters.
+    """
+
+    def __init__(self):
+        self._param_map = {}
+
+    def set(self, info: 'ParamInfo', value: V) -> 'Params':
+        """
+        Sets the value of the specific parameter.
+
+        :param info: the info of the specific parameter to set.
+        :param value: the value to be set to the specific parameter.
+        :return: return the current Params.
+        """
+        self._param_map[info] = value
+        return self
+
+    def get(self, info: 'ParamInfo') -> V:
+        """
+        Returns the value of the specific parameter, or the default value defined
+        in the info if this Params doesn't have a value set for the parameter. An
+        exception will be thrown in the following cases because no value could be
+        found for the specified parameter.
+
+        :param info: the info of the specific parameter to get.
+        :return: the value of the specific param, or default value defined in the \
+        info if this Params doesn't contain the parameter.
+        """
+        if info not in self._param_map:
+            if not info.is_optional:
+                raise ValueError("Missing non-optional parameter %s" % info.name)
+            elif not info.has_default_value:
+                raise ValueError("Cannot find default value for optional parameter %s" % info.name)
+            else:
+                return info.default_value
+        else:
+            return self._param_map[info]
+
+    def remove(self, info: 'ParamInfo') -> V:
+        """
+        Removes the specific parameter from this Params.
+
+        :param info: the info of the specific parameter to remove.
+        :return: the removed value of the specific parameter.
+        """
+        return self._param_map.pop(info)
+
+    def contains(self, info: 'ParamInfo') -> bool:
+        """
+        Check whether this params has a value set for 

[jira] [Commented] (FLINK-15249) Improve PipelinedRegions calculation with Union Set

2020-03-03 Thread Gary Yao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050956#comment-17050956
 ] 

Gary Yao commented on FLINK-15249:
--

I just wanted to point out that microbenchmarks can be difficult to get right 
and running the code with a single iteration most likely will not yield 
accurate results. It is recommended to use JMH or Caliper. Also see 
https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java
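
For illustration, a minimal JMH skeleton along those lines (a hypothetical 
benchmark with a placeholder union-find workload, not the attached 
RegionFailoverPerfTest.java):

{code:java}
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

// JMH takes care of JVM warmup, forking and repeated measurement
// iterations, which a single hand-rolled timing loop does not.
@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
@Fork(3)
public class RegionMergeBenchmark {

	@Param({"1000", "100000"})
	public int size;

	@Benchmark
	public int mergeChain() {
		// placeholder workload standing in for the region-merge code under
		// test: chain-merge `size` singleton sets with a union-find
		int[] parent = new int[size];
		for (int i = 0; i < size; i++) {
			parent[i] = i;
		}
		for (int i = 1; i < size; i++) {
			parent[find(parent, i)] = find(parent, i - 1);
		}
		return find(parent, size - 1); // return a result to defeat dead-code elimination
	}

	private static int find(int[] parent, int x) {
		while (parent[x] != x) {
			parent[x] = parent[parent[x]];
			x = parent[x];
		}
		return x;
	}
}
{code}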

> Improve PipelinedRegions calculation with Union Set
> ---
>
> Key: FLINK-15249
> URL: https://issues.apache.org/jira/browse/FLINK-15249
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Chongchen Chen
>Priority: Major
>  Labels: pull-request-available
> Attachments: PipelinedRegionComputeUtil.diff, 
> RegionFailoverPerfTest.java, new.diff
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A union set (disjoint set) merges two sets in O(1), while the current 
> implementation is O(N). The attachment is a patch.
> [Disjoint Set Data 
> Structure|https://en.wikipedia.org/wiki/Disjoint-set_data_structure]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-15670) Provide a Kafka Source/Sink pair that aligns Kafka's Partitions and Flink's KeyGroups

2020-03-03 Thread Yuan Mei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050031#comment-17050031
 ] 

Yuan Mei edited comment on FLINK-15670 at 3/4/20, 6:41 AM:
---

[~sewen]

Need to chat a bit about two things:
 # Redefine the scope of the problem, at least for 1.11; 
 # Watermark handling when multiple subtasks write to the same partition
 ** This is a common problem for intermediate persistence, not just for Kafka
 ** The current mechanism relies on the downstream `ExecutionVertex` to 
progress the watermark. However, in the case of a sink, there is no such thing 
as a `downstream OP`.
 ** I was thinking that if there were a coordinator of all subtasks of an 
ExecutionJobVertex, the watermark progress logic could be handled in the 
coordinator. 
 ** I found there is an interface `OperatorCoordinator` that may be usable in 
this case, but its only two usages are under `test`.

 

*A bit more details for reference:*

Downstream watermark is handled in

`StreamTaskNetworkInput.processElement` ->

`StatusWatermarkValue.inputWatermark`

In such a case, the watermark in each channel is kept and aligned until 
reaching downstream.

 

Upstream data is written and buffered through `ChannelSelectorRecordWriter`, 
which maintains bufferBuilders for each subpartition (channel).

 

 

 


was (Author: ym):
[~sewen]

Need to chat a bit about two things:
 # Redefine the scope of the problem, at least for 1.11; 
 # Watermark handling when multiple subtasks write to the same partition
 ** This is a common problem for intermediate persistence, not just for Kafka
 ** The current mechanism relies on the downstream `ExecutionVertex` to 
progress the watermark. However, in the case of a sink, there is no such thing 
as a `downstream OP`.
 ** I was thinking that if there were a coordinator of all subtasks of an 
ExecutionJobVertex, the watermark progress logic could be handled in the 
coordinator. 
 ** I found there is an interface `OperatorCoordinator` that may be usable in 
this case, but its only two usages are under `test`.

 

*A bit more details for reference:*

Downstream watermark is handled in

`StreamTaskNetworkInput.processElement` ->

`StatusWatermarkValue.inputWatermark`

In such a case, the watermark in each channel is kept and aligned until 
reaching downstream.

 

Upstream data is buffered through `ChannelSelectorRecordWriter`, which 
maintains bufferBuilders for each subpartition (channel).

 

 

 

> Provide a Kafka Source/Sink pair that aligns Kafka's Partitions and Flink's 
> KeyGroups
> -
>
> Key: FLINK-15670
> URL: https://issues.apache.org/jira/browse/FLINK-15670
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream, Connectors / Kafka
>Reporter: Stephan Ewen
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> This Source/Sink pair would serve two purposes:
> 1. You can read topics that are already partitioned by key and process them 
> without partitioning them again (avoid shuffles)
> 2. You can use this to shuffle through Kafka, thereby decomposing the job 
> into smaller jobs and independent pipelined regions that fail over 
> independently.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-15670) Provide a Kafka Source/Sink pair that aligns Kafka's Partitions and Flink's KeyGroups

2020-03-03 Thread Yuan Mei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050031#comment-17050031
 ] 

Yuan Mei edited comment on FLINK-15670 at 3/4/20, 6:37 AM:
---

[~sewen]

Need to chat a bit about two things:
 # Redefine the scope of the problem, at least for 1.11; 
 # Watermark handling when multiple subtasks write to the same partition
 ** This is a common problem for intermediate persistence, not just for Kafka
 ** The current mechanism relies on the downstream `ExecutionVertex` to 
progress the watermark. However, in the case of a sink, there is no such thing 
as a `downstream OP`.
 ** I was thinking that if there were a coordinator of all subtasks of an 
ExecutionJobVertex, the watermark progress logic could be handled in the 
coordinator. 
 ** I found there is an interface `OperatorCoordinator` that may be usable in 
this case, but its only two usages are under `test`.

 

*A bit more details for reference:*

Downstream watermark is handled in

`StreamTaskNetworkInput.processElement` ->

`StatusWatermarkValue.inputWatermark`

In such a case, the watermark in each channel is kept and aligned until 
reaching downstream.

 

Upstream data is buffered through `ChannelSelectorRecordWriter`, which 
maintains bufferBuilders for each subpartition (channel).

 

 

 


was (Author: ym):
[~sewen]

Need to chat a bit about two things:
 # Redefine the scope of the problem, at least for 1.11; 
 # Watermark handling when multiple subtasks write to the same partition
 ** This is a common problem for intermediate persistence, not just for Kafka
 ** The current mechanism relies on the downstream `ExecutionVertex` to 
progress the watermark. However, in the case of a sink, there is no such thing 
as a `downstream OP`.
 ** I was thinking that if there were a coordinator of all subtasks of an 
ExecutionJobVertex, the watermark progress logic could be handled in the 
coordinator. 
 ** I found there is an interface `OperatorCoordinator` that may be usable in 
this case, but its only two usages are under `test`.

 

*A bit more details for reference:*

Downstream watermarks are handled in
`StreamTaskNetworkInput.processElement` ->
`StatusWatermarkValve.inputWatermark`.
This way, the watermark of each channel is tracked and aligned before being 
forwarded downstream.

 

> Provide a Kafka Source/Sink pair that aligns Kafka's Partitions and Flink's 
> KeyGroups
> -
>
> Key: FLINK-15670
> URL: https://issues.apache.org/jira/browse/FLINK-15670
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream, Connectors / Kafka
>Reporter: Stephan Ewen
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> This Source/Sink pair would serve two purposes:
> 1. You can read topics that are already partitioned by key and process them 
> without partitioning them again (avoid shuffles)
> 2. You can use this to shuffle through Kafka, thereby decomposing the job 
> into smaller jobs and independent pipelined regions that fail over 
> independently.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16392) oneside sorted cache in intervaljoin

2020-03-03 Thread Chen Qin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Qin updated FLINK-16392:
-
Description: 
IntervalJoin is seeing lots of use cases on our side. Those use cases share the 
following pattern:
 * the left stream is pulled from a low-QPS source
 * the lookup time range from the right stream into the left stream is very 
large (days to weeks)
 * the right stream is web traffic with high QPS

In the current interval join implementation, we treat both streams equally and 
pull / scan keyed state from the backend (RocksDB here) on demand. As RocksDB 
fetches and updates get more expensive, performance takes a hit, and large use 
cases are blocked.

In the proposed implementation, we plan to introduce two changes (a sketch 
follows the open discussion below):
 * allow users to opt in to more aggressive right-buffer cleanup
 ** allow overriding cleanup so the right stream is cleaned up earlier than the 
interval upper bound
 * leverage a RAM cache that builds a sortedMap on demand from the other side's 
buffer for each join key; in our use cases, it helps
 ** expedite right-stream lookups of left buffers without accessing RocksDB 
every time (disk -> sorted memory cache)
 ** if a key sees an event from the left side, the cache is cleared and 
reloaded from the right side
 ** in the worst-case scenario, where two streams round-robin processElement1 
and processElement2 over the same set of keys at the same frequency, 
performance is expected to be similar to the current implementation, and the 
memory footprint will be bounded by 1/2 of the state size.

 

Open discussion
 * how to control the cache size?
 ** by default the cache size is set to 1 key
 * how to avoid a dirty cache?
 ** if a given key sees an insertion from the other side, the cache is cleared 
for that key and rebuilt.
 * what happens on checkpoint/restore?
 ** state still persists in the state backend; the cache is cleared and rebuilt 
for each newly seen key.
 * how is performance?
 ** Under the assumption that RAM is an order of magnitude faster than disk, 
populating the cache is a small overhead (<5%), compared with the current 
RocksDB implementation, where we do a full loop over the state at every event; 
it saves on the bucket-scan logic. If a key gets more than one access in the 
same direction on the cache, we expect a significant perf gain.
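
A minimal sketch of the per-key sorted cache described above (the class and 
method names are illustrative, not the actual patch):

```
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch of the proposed one-side cache: a sorted view of the
// "other side" buffer for the current join key, invalidated whenever that
// side receives a new element for the key.
class OneSideSortedCache<T> {

    private Object cachedKey;                   // join key the view was built for
    private TreeMap<Long, List<T>> sortedView;  // timestamp -> elements, ascending

    interface StateReader<T> {
        TreeMap<Long, List<T>> readSortedBuffer(Object key) throws Exception;
    }

    /** Returns the sorted view for the key, rebuilding it from RocksDB on a miss. */
    TreeMap<Long, List<T>> get(Object key, StateReader<T> reader) throws Exception {
        if (!key.equals(cachedKey)) {
            sortedView = reader.readSortedBuffer(key); // one backend scan per new key
            cachedKey = key;
        }
        return sortedView;
    }

    /** Called when the other side sees an insertion for this key: drop the stale view. */
    void invalidate(Object key) {
        if (key.equals(cachedKey)) {
            cachedKey = null;
            sortedView = null;
        }
    }
}
```

With a default cache size of one key, the worst case (strictly alternating 
sides per key) rebuilds the view on every element and degenerates to today's 
full scan, matching the expectation stated above.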

 

  was:
IntervalJoin is seeing lots of use cases on our side. Those use cases share the 
following pattern:
 * the left stream is pulled periodically from a slowly evolving static dataset
 * the lookup time range is very large (days to weeks)
 * the right stream is web traffic with high QPS

In the current interval join implementation, we treat both streams equally and 
pull / scan keyed state from the backend (RocksDB here) on demand. As RocksDB 
fetches and updates get more expensive, performance takes a hit, and large use 
cases are blocked.

In the proposed implementation, we plan to introduce two changes
 * allow users to opt in to use cases like the above by customizing and 
inheriting from ProcessJoinFunction.
 ** whether to skip triggering a scan from left events (static data set)
 ** allow setting cleanup of the right stream earlier than the interval upper 
bound
 * leverage a RAM cache that builds a sortedMap on demand from the other side's 
buffer for each join key; in our use cases, it helps
 ** expedite right-stream lookups of left buffers without accessing RocksDB 
every time (disk -> sorted memory cache)
 ** if a key sees an event from the left side, the cache is cleared and 
reloaded from the right side
 ** in the worst-case scenario, where two streams round-robin processElement1 
and processElement2 over the same set of keys at the same frequency, 
performance is expected to be similar to the current implementation, and the 
memory footprint will be bounded by 1/2 of the state size.

 

Open discussion
 * how to control the cache size?
 ** by default the cache size is set to 1 key
 * how to avoid a dirty cache?
 ** if a given key sees an insertion from the other side, the cache is cleared 
for that key and rebuilt.
 * what happens on checkpoint/restore?
 ** state still persists in the state backend; the cache is cleared and rebuilt 
for each newly seen key.
 * how is performance?
 ** Under the assumption that RAM is an order of magnitude faster than disk, 
populating the cache is a small overhead (<5%), compared with the current 
RocksDB implementation, where we do a full loop over the state at every event; 
it saves on the bucket-scan logic. If a key gets more than one access in the 
same direction on the cache, we expect a significant perf gain.

 


> oneside sorted cache in intervaljoin
> 
>
> Key: FLINK-16392
> URL: https://issues.apache.org/jira/browse/FLINK-16392
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Affects Versions: 1.10.0
>Reporter: Chen Qin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> IntervalJoin is getting lots of usecases in our side. Those use cases shares 
> following similar pattern
>  * left stream pulled from low QPS source
>  * from right stream to left stream lookup 

[jira] [Comment Edited] (FLINK-15670) Provide a Kafka Source/Sink pair that aligns Kafka's Partitions and Flink's KeyGroups

2020-03-03 Thread Yuan Mei (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050031#comment-17050031
 ] 

Yuan Mei edited comment on FLINK-15670 at 3/4/20, 6:31 AM:
---

[~sewen]

Need to chat a bit about two things:
 # Redefine the scope of the problem, at least for 1.11; 
 # Watermark handling when multiple subtasks write to the same partition
 ** This is a common problem for intermediate persistence, not just for Kafka
 ** The current mechanism relies on the downstream `ExecutionVertex` to progress 
the watermark. However, in the case of a sink, there is no such thing as a 
`downstream OP`.
 ** I was wondering whether, if there were a coordinator for all subtasks of an 
ExecutionJobVertex, the watermark-progress logic could be handled in that 
coordinator
 ** I found an interface `OperatorCoordinator` that might be usable here, but 
its only two usages are under `test`

 

*A bit more details for reference:*

Downstream watermarks are handled in
`StreamTaskNetworkInput.processElement` ->
`StatusWatermarkValve.inputWatermark`.
This way, the watermark of each channel is tracked and aligned before being 
forwarded downstream.

 


was (Author: ym):
[~sewen]

Need to chat a bit about two things:
 # Redefine the scope of the problem, at least for 1.11; 
 # Watermark handling when multiple subtasks write to the same partition
 ** This is a common problem for intermediate persistence, not just for Kafka
 ** The current mechanism relies on the downstream `ExecutionVertex` to progress 
the watermark. However, in the case of a sink, there is no such thing as a 
`downstream OP`.
 ** I was wondering whether, if there were a coordinator for all subtasks of an 
ExecutionJobVertex, the watermark-progress logic could be handled in that 
coordinator
 ** I found an interface `OperatorCoordinator` that might be usable here, but 
its only two usages are under `test`

> Provide a Kafka Source/Sink pair that aligns Kafka's Partitions and Flink's 
> KeyGroups
> -
>
> Key: FLINK-15670
> URL: https://issues.apache.org/jira/browse/FLINK-15670
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream, Connectors / Kafka
>Reporter: Stephan Ewen
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> This Source/Sink pair would serve two purposes:
> 1. You can read topics that are already partitioned by key and process them 
> without partitioning them again (avoid shuffles)
> 2. You can use this to shuffle through Kafka, thereby decomposing the job 
> into smaller jobs and independent pipelined regions that fail over 
> independently.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #10922: [FLINK-11899][parquet] Introduce parquet ColumnarRow split reader

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #10922: [FLINK-11899][parquet] Introduce 
parquet ColumnarRow split reader
URL: https://github.com/apache/flink/pull/10922#issuecomment-577180180
 
 
   
   ## CI report:
   
   * 1177cdde9727901d1be7938a1c2d609c02181b3b Travis: 
[FAILURE](https://travis-ci.com/flink-ci/flink/builds/151535331) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5870)
 
   * 37cb28b6b3238516df476318cb28d30114689442 Travis: 
[PENDING](https://travis-ci.com/flink-ci/flink/builds/151674029) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5893)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #11291: [FLINK-16392] [API / DataStream] oneside sorted cache in intervaljoin

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11291: [FLINK-16392] [API / DataStream] 
oneside sorted cache in intervaljoin
URL: https://github.com/apache/flink/pull/11291#issuecomment-593677297
 
 
   
   ## CI report:
   
   * d6b2919dd28c55230e530ccc45ca7d93d90a60df UNKNOWN
   * 3d121cb731b45565917fa47ecfd999163ef06625 UNKNOWN
   * 87148bdb635a6981d9ecc6c827061f3e13a47966 Travis: 
[CANCELED](https://travis-ci.com/flink-ci/flink/builds/151673112) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5892)
 
   * b09c4d1a7612259e851ad12620ebb3751adb55be UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11168: [FLINK-16140] [docs-zh] Translate 
Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#issuecomment-589576910
 
 
   
   ## CI report:
   
   * 829770f1ca71973e386bf808fd6b9d813179061b Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/151669711) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5891)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-16414) create udaf/udtf function using sql causing ValidationException: SQL validation failed. null

2020-03-03 Thread Terry Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050923#comment-17050923
 ] 

Terry Wang commented on FLINK-16414:


cc [~bli] to confirm~

> create udaf/udtf function using sql causing ValidationException: SQL 
> validation failed. null
> 
>
> Key: FLINK-16414
> URL: https://issues.apache.org/jira/browse/FLINK-16414
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.10.0
>Reporter: Terry Wang
>Priority: Critical
>
> When using TableEnvironment#sqlUpdate to create a udaf or udtf function 
> that doesn't override the getResultType() method, creation works fine. But 
> when using this function in a later insert sql, an exception like the 
> following will be thrown:
> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
> SQL validation failed. null
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:130)
>   at 
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:127)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlInsert(SqlToOperationConverter.java:342)
>   at 
> org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:142)
>   at 
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
>   at 
> org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)
> The reason is that in FunctionDefinitionUtil#createFunctionDefinition, we 
> shouldn't directly call t.getResultType(), a.getAccumulatorType() or 
> a.getResultType(), but should use 
> UserDefinedFunctionHelper#getReturnTypeOfTableFunction,
>  UserDefinedFunctionHelper#getAccumulatorTypeOfAggregateFunction and 
> UserDefinedFunctionHelper#getReturnTypeOfAggregateFunction instead.
> ```
> if (udf instanceof ScalarFunction) {
>     return new ScalarFunctionDefinition(
>         name,
>         (ScalarFunction) udf
>     );
> } else if (udf instanceof TableFunction) {
>     TableFunction t = (TableFunction) udf;
>     return new TableFunctionDefinition(
>         name,
>         t,
>         t.getResultType()
>     );
> } else if (udf instanceof AggregateFunction) {
>     AggregateFunction a = (AggregateFunction) udf;
>     return new AggregateFunctionDefinition(
>         name,
>         a,
>         a.getAccumulatorType(),
>         a.getResultType()
>     );
> } else if (udf instanceof TableAggregateFunction) {
>     TableAggregateFunction a = (TableAggregateFunction) udf;
>     return new TableAggregateFunctionDefinition(
>         name,
>         a,
>         a.getAccumulatorType(),
>         a.getResultType()
>     );
> }
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #11291: [FLINK-16392] [API / DataStream] oneside sorted cache in intervaljoin

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11291: [FLINK-16392] [API / DataStream] 
oneside sorted cache in intervaljoin
URL: https://github.com/apache/flink/pull/11291#issuecomment-593677297
 
 
   
   ## CI report:
   
   * d6b2919dd28c55230e530ccc45ca7d93d90a60df UNKNOWN
   * 3d121cb731b45565917fa47ecfd999163ef06625 UNKNOWN
   * 718dc465d0653fc2f168de023bda603d78fa2101 Travis: 
[FAILURE](https://travis-ci.com/flink-ci/flink/builds/151652734) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5889)
 
   * 87148bdb635a6981d9ecc6c827061f3e13a47966 Travis: 
[PENDING](https://travis-ci.com/flink-ci/flink/builds/151673112) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5892)
 
   * b09c4d1a7612259e851ad12620ebb3751adb55be UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #10922: [FLINK-11899][parquet] Introduce parquet ColumnarRow split reader

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #10922: [FLINK-11899][parquet] Introduce 
parquet ColumnarRow split reader
URL: https://github.com/apache/flink/pull/10922#issuecomment-577180180
 
 
   
   ## CI report:
   
   * 1177cdde9727901d1be7938a1c2d609c02181b3b Travis: 
[FAILURE](https://travis-ci.com/flink-ci/flink/builds/151535331) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5870)
 
   * 37cb28b6b3238516df476318cb28d30114689442 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-16414) create udaf/udtf function using sql causing ValidationException: SQL validation failed. null

2020-03-03 Thread Terry Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Terry Wang updated FLINK-16414:
---
Description: 
When using TableEnvironment#sqlUpdate to create a udaf or udtf function that 
doesn't override the getResultType() method, creation works fine. But when 
using this function in a later insert sql, an exception like the following will 
be thrown:

Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL 
validation failed. null
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:130)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:127)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlInsert(SqlToOperationConverter.java:342)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:142)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)

The reason is that in FunctionDefinitionUtil#createFunctionDefinition, we 
shouldn't directly call t.getResultType(), a.getAccumulatorType() or 
a.getResultType(), but should use 
UserDefinedFunctionHelper#getReturnTypeOfTableFunction,
 UserDefinedFunctionHelper#getAccumulatorTypeOfAggregateFunction and 
UserDefinedFunctionHelper#getReturnTypeOfAggregateFunction instead.
```
if (udf instanceof ScalarFunction) {
    return new ScalarFunctionDefinition(
        name,
        (ScalarFunction) udf
    );
} else if (udf instanceof TableFunction) {
    TableFunction t = (TableFunction) udf;
    return new TableFunctionDefinition(
        name,
        t,
        t.getResultType()
    );
} else if (udf instanceof AggregateFunction) {
    AggregateFunction a = (AggregateFunction) udf;
    return new AggregateFunctionDefinition(
        name,
        a,
        a.getAccumulatorType(),
        a.getResultType()
    );
} else if (udf instanceof TableAggregateFunction) {
    TableAggregateFunction a = (TableAggregateFunction) udf;
    return new TableAggregateFunctionDefinition(
        name,
        a,
        a.getAccumulatorType(),
        a.getResultType()
    );
}
```




  was:
When using TableEnvironment.sqlUpdate() to create a udaf or udtf function, if 
the function doesn't override the getResultType() method, creation works fine. 
But when using this function in the insert sql, an exception like the following 
will be thrown:

Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL 
validation failed. null
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:130)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:127)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlInsert(SqlToOperationConverter.java:342)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:142)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)

The reason is that in FunctionDefinitionUtil#createFunctionDefinition, we 
shouldn't directly call t.getResultType(), a.getAccumulatorType() or 
a.getResultType(), but should use 
UserDefinedFunctionHelper#getReturnTypeOfTableFunction,
 UserDefinedFunctionHelper#getAccumulatorTypeOfAggregateFunction and 
UserDefinedFunctionHelper#getReturnTypeOfAggregateFunction instead.
```

if (udf instanceof ScalarFunction) {
return new ScalarFunctionDefinition(
name,
(ScalarFunction) udf
);
} else if (udf instanceof 

[jira] [Created] (FLINK-16414) create udaf/udtf function using sql causing ValidationException: SQL validation failed. null

2020-03-03 Thread Terry Wang (Jira)
Terry Wang created FLINK-16414:
--

 Summary: create udaf/udtf function using sql causing 
ValidationException: SQL validation failed. null
 Key: FLINK-16414
 URL: https://issues.apache.org/jira/browse/FLINK-16414
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.10.0
Reporter: Terry Wang


When using TableEnvironment.sqlUpdate() to create a udaf or udtf function, if 
the function doesn't override the getResultType() method, creation works fine. 
But when using this function in the insert sql, an exception like the following 
will be thrown:

Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL 
validation failed. null
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:130)
at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:105)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:127)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertSqlInsert(SqlToOperationConverter.java:342)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:142)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:66)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlUpdate(TableEnvironmentImpl.java:484)

The reason is that in FunctionDefinitionUtil#createFunctionDefinition, we 
shouldn't directly call t.getResultType(), a.getAccumulatorType() or 
a.getResultType(), but should use 
UserDefinedFunctionHelper#getReturnTypeOfTableFunction,
 UserDefinedFunctionHelper#getAccumulatorTypeOfAggregateFunction and 
UserDefinedFunctionHelper#getReturnTypeOfAggregateFunction instead.
```
if (udf instanceof ScalarFunction) {
    return new ScalarFunctionDefinition(
        name,
        (ScalarFunction) udf
    );
} else if (udf instanceof TableFunction) {
    TableFunction t = (TableFunction) udf;
    return new TableFunctionDefinition(
        name,
        t,
        t.getResultType()
    );
} else if (udf instanceof AggregateFunction) {
    AggregateFunction a = (AggregateFunction) udf;
    return new AggregateFunctionDefinition(
        name,
        a,
        a.getAccumulatorType(),
        a.getResultType()
    );
} else if (udf instanceof TableAggregateFunction) {
    TableAggregateFunction a = (TableAggregateFunction) udf;
    return new TableAggregateFunctionDefinition(
        name,
        a,
        a.getAccumulatorType(),
        a.getResultType()
    );
}
```
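
A minimal sketch of the suggested fix, as a fragment of the same method 
(assuming the Flink 1.10 UserDefinedFunctionHelper methods named above; this is 
not the merged patch):

```
// Sketch: derive the types via UserDefinedFunctionHelper, which falls back to
// reflective extraction when getResultType()/getAccumulatorType() are not
// overridden and would otherwise return null.
} else if (udf instanceof TableFunction) {
    TableFunction t = (TableFunction) udf;
    return new TableFunctionDefinition(
        name,
        t,
        UserDefinedFunctionHelper.getReturnTypeOfTableFunction(t)
    );
} else if (udf instanceof AggregateFunction) {
    AggregateFunction a = (AggregateFunction) udf;
    return new AggregateFunctionDefinition(
        name,
        a,
        UserDefinedFunctionHelper.getAccumulatorTypeOfAggregateFunction(a),
        UserDefinedFunctionHelper.getReturnTypeOfAggregateFunction(a)
    );
}
```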






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #11291: [FLINK-16392] [API / DataStream] oneside sorted cache in intervaljoin

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11291: [FLINK-16392] [API / DataStream] 
oneside sorted cache in intervaljoin
URL: https://github.com/apache/flink/pull/11291#issuecomment-593677297
 
 
   
   ## CI report:
   
   * d6b2919dd28c55230e530ccc45ca7d93d90a60df UNKNOWN
   * 3d121cb731b45565917fa47ecfd999163ef06625 UNKNOWN
   * 718dc465d0653fc2f168de023bda603d78fa2101 Travis: 
[FAILURE](https://travis-ci.com/flink-ci/flink/builds/151652734) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5889)
 
   * 87148bdb635a6981d9ecc6c827061f3e13a47966 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-16413) Reduce hive source parallelism when limit push down

2020-03-03 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-16413:


 Summary: Reduce hive source parallelism when limit push down
 Key: FLINK-16413
 URL: https://issues.apache.org/jira/browse/FLINK-16413
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: Jingsong Lee
 Fix For: 1.11.0


Users enable automatic parallelism inference for the Hive source. For example, 
the maximum inferred parallelism is set to 10.

A user has SQL similar to: SELECT * FROM mytable LIMIT 1;

There are more than 10 files in the Hive table mytable. Isn't it a bit wasteful 
to start 10 parallel source tasks for that? A sketch of a possible heuristic 
follows.
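
A minimal sketch of the heuristic this issue suggests (the method and parameter 
names are assumptions): cap the inferred parallelism by the pushed-down limit.

```
// Sketch: when a LIMIT is pushed down, there is no point in starting more
// source tasks than the limit; fall back to the inferred value otherwise.
static int decideSourceParallelism(int inferredParallelism, long pushedDownLimit) {
    if (pushedDownLimit >= 0) {
        return (int) Math.max(1, Math.min(inferredParallelism, pushedDownLimit));
    }
    return inferredParallelism;
}
```

For SELECT * FROM mytable LIMIT 1 this yields a parallelism of 1 regardless of 
the file count.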



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-16326) Eagerly validate strictly required Flink configurations for Stateful Functions

2020-03-03 Thread Tzu-Li (Gordon) Tai (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai reassigned FLINK-16326:
---

Assignee: Tzu-Li (Gordon) Tai

> Eagerly validate strictly required Flink configurations for Stateful Functions
> --
>
> Key: FLINK-16326
> URL: https://issues.apache.org/jira/browse/FLINK-16326
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-1.1
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Critical
>  Labels: pull-request-available
> Fix For: statefun-1.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when Stateful Functions users want to set their own Flink 
> configurations, they are required to build on top of a base template 
> {{flink-conf.yaml}} which has some strictly required configurations 
> predefined, such as parent-first classloading and state backend settings.
> These Flink settings should never (as of now) be changed by the user, but 
> there is no validation of that in place. We should do that eagerly 
> pre-submission of the translated job, probably in {{StatefulFunctionsConfig}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (FLINK-16326) Eagerly validate strictly required Flink configurations for Stateful Functions

2020-03-03 Thread Tzu-Li (Gordon) Tai (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai closed FLINK-16326.
---
Fix Version/s: statefun-1.1
   Resolution: Fixed

> Eagerly validate strictly required Flink configurations for Stateful Functions
> --
>
> Key: FLINK-16326
> URL: https://issues.apache.org/jira/browse/FLINK-16326
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-1.1
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Critical
>  Labels: pull-request-available
> Fix For: statefun-1.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when Stateful Functions users want to set their own Flink 
> configurations, they are required to build on top of a base template 
> {{flink-conf.yaml}} which has some strictly required configurations 
> predefined, such as parent-first classloading and state backend settings.
> These Flink settings should never (as of now) be changed by the user, but 
> there is no validation of that in place. We should do that eagerly 
> pre-submission of the translated job, probably in {{StatefulFunctionsConfig}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16326) Eagerly validate strictly required Flink configurations for Stateful Functions

2020-03-03 Thread Tzu-Li (Gordon) Tai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050893#comment-17050893
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-16326:
-

That indeed does look like a good direction for centralizing validation of 
configs.

In Stateful Functions' scenario though, the problem is that some configs, like 
{{execution.checkpointing.max-concurrent-checkpoints}}, are strictly required 
to be set to 1, because statefun currently doesn't support concurrent 
checkpoints (a sketch of such a check is below).

I think that's not the use case that 
{{ConfigOption.withValidator(OptionValidator)}} was targeting in the first 
place, where there is some user-space validation.

Nonetheless, I think FLINK-54 is good to have.
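
For illustration, an eager pre-submission check could look roughly like the 
following (a sketch only; the class name, error message, and placement are 
assumptions, not the merged code):

```
import org.apache.flink.configuration.Configuration;

final class StrictConfigValidation {

    /** Fails fast, pre-submission, if a strictly required setting was changed. */
    static void validate(Configuration flinkConf) {
        int maxConcurrent = flinkConf.getInteger(
            "execution.checkpointing.max-concurrent-checkpoints", 1);
        if (maxConcurrent != 1) {
            throw new IllegalStateException(
                "Stateful Functions requires "
                    + "execution.checkpointing.max-concurrent-checkpoints to be 1,"
                    + " but it was set to: " + maxConcurrent);
        }
    }
}
```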

> Eagerly validate strictly required Flink configurations for Stateful Functions
> --
>
> Key: FLINK-16326
> URL: https://issues.apache.org/jira/browse/FLINK-16326
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-1.1
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when Stateful Functions users want to set their own Flink 
> configurations, they are required to build on top of a base template 
> {{flink-conf.yaml}} which has some strictly required configurations 
> predefined, such as parent-first classloading and state backend settings.
> These Flink settings should never (as of now) be changed by the user, but 
> there is no validation of that in place. We should do that eagerly 
> pre-submission of the translated job, probably in {{StatefulFunctionsConfig}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16326) Eagerly validate strictly required Flink configurations for Stateful Functions

2020-03-03 Thread Tzu-Li (Gordon) Tai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050894#comment-17050894
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-16326:
-

Merged for master via 7b0970812f69d2140cd5a3c5087ff399f57a6487.

> Eagerly validate strictly required Flink configurations for Stateful Functions
> --
>
> Key: FLINK-16326
> URL: https://issues.apache.org/jira/browse/FLINK-16326
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-1.1
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when Stateful Functions users want to set their own Flink 
> configurations, they are required to build on top of a base template 
> {{flink-conf.yaml}} which has some strictly required configurations 
> predefined, such as parent-first classloading and state backend settings.
> These Flink settings should never (as of now) be changed by the user, but 
> there is no validation of that in place. We should do that eagerly 
> pre-submission of the translated job, probably in {{StatefulFunctionsConfig}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11168: [FLINK-16140] [docs-zh] Translate 
Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#issuecomment-589576910
 
 
   
   ## CI report:
   
   * 76146c2111a47b68765168064b4d1dd90448789c Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/151304506) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5800)
 
   * 829770f1ca71973e386bf808fd6b9d813179061b Travis: 
[PENDING](https://travis-ci.com/flink-ci/flink/builds/151669711) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5891)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Closed] (FLINK-16390) Add view and clear methods to PersistedTable

2020-03-03 Thread Tzu-Li (Gordon) Tai (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tzu-Li (Gordon) Tai closed FLINK-16390.
---
Fix Version/s: statefun-1.1
   Resolution: Fixed

Merged to master via 44a103fa64b122adef7c6fb2008c148f2d512307

> Add view and clear methods to PersistedTable
> 
>
> Key: FLINK-16390
> URL: https://issues.apache.org/jira/browse/FLINK-16390
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-1.1
>Reporter: Seth Wiesman
>Assignee: Seth Wiesman
>Priority: Major
>  Labels: pull-request-available
> Fix For: statefun-1.1
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add common collection methods to PersistedTable. 
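
For context, a hypothetical usage sketch of the change (the view/clear method 
names follow the issue title; exact signatures are assumptions, not the merged 
SDK API):

```
import org.apache.flink.statefun.sdk.annotations.Persisted;
import org.apache.flink.statefun.sdk.state.PersistedTable;

public class SeenCounts {

    @Persisted
    private final PersistedTable<String, Integer> counts =
        PersistedTable.of("counts", String.class, Integer.class);

    void record(String user) {
        Integer seen = counts.get(user);
        counts.set(user, seen == null ? 1 : seen + 1);
    }

    void report() {
        // Assumed view method: iterate entries without knowing keys up front.
        for (String user : counts.keys()) {
            System.out.println(user + " -> " + counts.get(user));
        }
        counts.clear(); // assumed method: drop all entries for this address
    }
}
```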



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-statefun] tzulitai closed pull request #47: [FLINK-16326] [core] Eagerly validate strictly required Flink configurations

2020-03-03 Thread GitBox
tzulitai closed pull request #47: [FLINK-16326] [core] Eagerly validate 
strictly required Flink configurations
URL: https://github.com/apache/flink-statefun/pull/47
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-16326) Eagerly validate strictly required Flink configurations for Stateful Functions

2020-03-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-16326:
---
Labels: pull-request-available  (was: )

> Eagerly validate strictly required Flink configurations for Stateful Functions
> --
>
> Key: FLINK-16326
> URL: https://issues.apache.org/jira/browse/FLINK-16326
> Project: Flink
>  Issue Type: Improvement
>  Components: Stateful Functions
>Affects Versions: statefun-1.1
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Critical
>  Labels: pull-request-available
>
> Currently, when Stateful Functions users want to set their own Flink 
> configurations, they are required to build on top of a base template 
> {{flink-conf.yaml}} which has some strictly required configurations 
> predefined, such as parent-first classloading and state backend settings.
> These Flink settings should never (as of now) be changed by the user, but 
> there is no validation of that in place. We should do that eagerly 
> pre-submission of the translated job, probably in {{StatefulFunctionsConfig}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink-statefun] tzulitai closed pull request #42: [FLINK-16390][sdk] Add view and clear methods to PersistedTable

2020-03-03 Thread GitBox
tzulitai closed pull request #42: [FLINK-16390][sdk] Add view and clear methods 
to PersistedTable
URL: https://github.com/apache/flink-statefun/pull/42
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink-statefun] tzulitai commented on issue #47: [FLINK-16326] [core] Eagerly validate strictly required Flink configurations

2020-03-03 Thread GitBox
tzulitai commented on issue #47: [FLINK-16326] [core] Eagerly validate strictly 
required Flink configurations
URL: https://github.com/apache/flink-statefun/pull/47#issuecomment-594326230
 
 
   Thanks for the reviews @igalshilman @sjwiesman!
   All your comments have been addressed with the suggestions.
   
   Will proceed to merge this now ..


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11168: [FLINK-16140] [docs-zh] Translate 
Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#issuecomment-589576910
 
 
   
   ## CI report:
   
   * 76146c2111a47b68765168064b4d1dd90448789c Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/151304506) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5800)
 
   * 829770f1ca71973e386bf808fd6b9d813179061b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-16406) Increase default value for JVM Metaspace to minimise its OutOfMemoryError

2020-03-03 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050890#comment-17050890
 ] 

Xintong Song commented on FLINK-16406:
--

A summary of bad cases collected so far:
 * [~Jamalarm] reported in FLINK-16142. He had both problems: a memory leak 
and an insufficient default size. He has not mentioned how large a metaspace 
size fixed his problem.
 * [~blablabla123] also reported in FLINK-16142 about the insufficient default 
size, fixed by increasing it to 256 MB.
 * [~nielsbasjes] reported in a [user ML 
thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kubernetes-java-lang-OutOfMemoryError-Metaspace-td33285.html]
 and in FLINK-16142. From his description that the problem occurs when 
repeatedly executing a job, it sounds like a memory-leak problem to me. We 
would need to investigate this further.
 * John Smith reported in another [user ML 
thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/MaxMetaspace-default-may-be-to-low-td33049.html]
 about the insufficient default size, fixed by increasing it to 256 MB. I 
believe he is also the one who opened FLINK-16278 ([~javadevmtl]).

> Increase default value for JVM Metaspace to minimise its OutOfMemoryError
> -
>
> Key: FLINK-16406
> URL: https://issues.apache.org/jira/browse/FLINK-16406
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration, Runtime / Task
>Affects Versions: 1.10.0
>Reporter: Andrey Zagrebin
>Assignee: Andrey Zagrebin
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.10.1, 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With FLIP-49 
> ([FLINK-13980|https://issues.apache.org/jira/browse/FLINK-13980]), we 
> introduced a limit for JVM Metaspace 
> ('taskmanager.memory.jvm-metaspace.size') when TM JVM process is started. It 
> caused '_OutOfMemoryError: Metaspace_' for some use cases after upgrading to 
> the latest 1.10 version. In some cases, a real class loading leak has been 
> discovered, like in 
> [FLINK-16142|https://issues.apache.org/jira/browse/FLINK-16142]. Some users 
> had to increase the default value to accommodate for their use cases (mostly 
> from 96Mb to 256Mb).
> While this limit was introduced to properly plan Flink resources, especially 
> for container environments, and to detect class loading leaks, the user 
> experience should be as smooth as possible. One way is to provide good 
> documentation for this change 
> ([FLINK-16278|https://issues.apache.org/jira/browse/FLINK-16278]).
> Another question is the sanity of the default value. It is still arguable 
> what the default value should be (currently 96Mb). In general, the size 
> depends on the use case (job user code, how many jobs are deployed in the 
> cluster, etc).
> This issue tries to tackle the problem by increasing the default to 256Mb and 
> the overall default process size to 1728Mb in flink-conf.yaml, so as to have 
> no impact on the default sizes of other memory components. We also want to 
> poll which Metaspace setting resolved the _OutOfMemoryError_. Please, if you 
> encountered this problem, report here any relevant specifics of your job and 
> your Metaspace size if there was no class loading leak.
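
For reference, the flink-conf.yaml change discussed above would look roughly 
like this (values taken from the description; the process-size key name follows 
the Flink 1.10 memory options, so treat the exact spelling as an assumption):

```
# Raise the metaspace limit from 96m to 256m and the overall process size to
# 1728m so the default sizes of other memory components are unaffected.
taskmanager.memory.process.size: 1728m
taskmanager.memory.jvm-metaspace.size: 256m
```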



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387442283
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -736,41 +715,39 @@ Pattern relaxedNot = 
start.notFollowedBy("not").where(...);
 
 {% highlight scala %}
 
-// strict contiguity
+// 严格连续
 val strict: Pattern[Event, _] = start.next("middle").where(...)
 
-// relaxed contiguity
+// 松散连续
 val relaxed: Pattern[Event, _] = start.followedBy("middle").where(...)
 
-// non-deterministic relaxed contiguity
+// 不确定的松散连续
 val nonDetermin: Pattern[Event, _] = start.followedByAny("middle").where(...)
 
-// NOT pattern with strict contiguity
+// 严格连续的NOT模式
 val strictNot: Pattern[Event, _] = start.notNext("not").where(...)
 
-// NOT pattern with relaxed contiguity
+// 松散连续的NOT模式
 val relaxedNot: Pattern[Event, _] = start.notFollowedBy("not").where(...)
 
 {% endhighlight %}
 
 
 
-Relaxed contiguity means that only the first succeeding matching event will be 
matched, while
-with non-deterministic relaxed contiguity, multiple matches will be emitted 
for the same beginning. As an example,
-a pattern `"a b"`, given the event sequence `"a", "c", "b1", "b2"`, will give 
the following results:
+松散连续意味着跟着的事件中,只有第一个可匹配的事件会被匹配上,而不确定的松散连接情况下,有着同样起始的多个匹配会被输出。
+举例来说,模式`"a b"`,给定事件序列`"a","c","b1","b2"`,会产生如下的结果:
 
-1. Strict Contiguity between `"a"` and `"b"`: `{}` (no match), the `"c"` after 
`"a"` causes `"a"` to be discarded.
+1. `"a"`和`"b"`之间严格连续: `{}` (没有匹配),`"a"`之后的`"c"`导致`"a"`被丢弃。
 
-2. Relaxed Contiguity between `"a"` and `"b"`: `{a b1}`, as relaxed continuity 
is viewed as "skip non-matching events
-till the next matching one".
+2. `"a"`和`"b"`之间松散连续: `{a b1}`,松散连续会"跳过不匹配的事件直到匹配上的事件"。
 
-3. Non-Deterministic Relaxed Contiguity between `"a"` and `"b"`: `{a b1}`, `{a 
b2}`, as this is the most general form.
+3. `"a"`和`"b"`之间不确定的松散连续: `{a b1}`, `{a b2}`,这是最常见的情况。
 
-It's also possible to define a temporal constraint for the pattern to be valid.
-For example, you can define that a pattern should occur within 10 seconds via 
the `pattern.within()` method.
-Temporal patterns are supported for both [processing and event 
time]({{site.baseurl}}/dev/event_time.html).
+也可以为模式定义一个有效时间约束。
+例如,你可以通过`pattern.within()`方法指定一个模式应该在10秒内发生。
+这种暂时的模式支持[处理时间和事件时间]({{site.baseurl}}/zh/dev/event_time.html).
 
 Review comment:
Let's call it "时间模式"; that seems a bit more appropriate.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dianfu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
dianfu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387442140
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -445,15 +440,14 @@ pattern.where(new IterativeCondition() {
   
  until(condition)
  
- Specifies a stop condition for a looping pattern. 
Meaning if event matching the given condition occurs, no more
- events will be accepted into the pattern.
- Applicable only in conjunction with 
oneOrMore()
- NOTE: It allows for cleaning state for 
corresponding pattern on event-based condition.
 
 Review comment:
Yes, that's right.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387441314
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -445,15 +440,14 @@ pattern.where(new IterativeCondition() {
   
  until(condition)
  
- Specifies a stop condition for a looping pattern. 
Meaning if event matching the given condition occurs, no more
- events will be accepted into the pattern.
- Applicable only in conjunction with 
oneOrMore()
- NOTE: It allows for cleaning state for 
corresponding pattern on event-based condition.
 
 Review comment:
Calling it "提示" would be a bit better.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-16142) Memory Leak causes Metaspace OOM error on repeated job submission

2020-03-03 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050884#comment-17050884
 ] 

Xintong Song commented on FLINK-16142:
--

Hi [~Jamalarm],

Could you share how large you increased the metaspace size to in order to solve 
your problem? I mean after applying Stephan's fix for the leak. I'm asking 
because we are collecting information about a proper default metaspace size 
that might work for most use cases. See FLINK-16406 for more details.

> Memory Leak causes Metaspace OOM error on repeated job submission
> -
>
> Key: FLINK-16142
> URL: https://issues.apache.org/jira/browse/FLINK-16142
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.10.0
>Reporter: Thomas Wozniakowski
>Priority: Blocker
> Fix For: 1.10.1, 1.11.0
>
> Attachments: Leak-GC-root.png, java_pid1.hprof, java_pid1.hprof
>
>
> Hi Guys,
> We've just tried deploying 1.10.0 as it has lots of shiny stuff that fits our 
> use-case exactly (RocksDB state backend running in a containerised cluster). 
> Unfortunately, it seems like there is a memory leak somewhere in the job 
> submission logic. We are getting this error:
> {code:java}
> 2020-02-18 10:22:10,020 INFO 
> org.apache.flink.runtime.executiongraph.ExecutionGraph - OPERATOR_NAME 
> switched from RUNNING to FAILED.
> java.lang.OutOfMemoryError: Metaspace
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
> at 
> org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:60)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.jmx.SdkMBeanRegistrySupport.registerMetricAdminMBean(SdkMBeanRegistrySupport.java:27)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:398)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.metrics.AwsSdkMetrics.(AwsSdkMetrics.java:359)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:728)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:660)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:652)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:611)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:606)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.executeListShards(AmazonKinesisClient.java:1534)
> at 
> org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.listShards(AmazonKinesisClient.java:1528)
> at 
> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.listShards(KinesisProxy.java:439)
> at 
> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardsOfStream(KinesisProxy.java:389)
> at 
> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardList(KinesisProxy.java:279)
> at 
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher.discoverNewShardsToSubscribe(KinesisDataFetcher.java:686)
> at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.run(FlinkKinesisConsumer.java:287)
> at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
> at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
> {code}
> (The only change in the above text is the OPERATOR_NAME text where I removed 
> some of the internal specifics of our system).
> This will reliably happen on a fresh cluster after submitting and cancelling 
> our job 3 times.
> We are using the presto-s3 plugin, the CEP library and the Kinesis connector.
> Please let me know what other diagnostics would be useful.
> Tom



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387440988
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -23,23 +23,20 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-FlinkCEP is the Complex Event Processing (CEP) library implemented on top of 
Flink.
-It allows you to detect event patterns in an endless stream of events, giving 
you the opportunity to get hold of what's important in your
-data.
+FlinkCEP是在Flink上层实现的复杂事件处理库。
+它可以让你在无限事件流中检测出特定的事件模型,有机会掌握数据中重要的那部分。
 
-This page describes the API calls available in Flink CEP. We start by 
presenting the [Pattern API](#the-pattern-api),
-which allows you to specify the patterns that you want to detect in your 
stream, before presenting how you can
-[detect and act upon matching event sequences](#detecting-patterns). We then 
present the assumptions the CEP
-library makes when [dealing with lateness](#handling-lateness-in-event-time) 
in event time and how you can
-[migrate your job](#migrating-from-an-older-flink-versionpre-13) from an older 
Flink version to Flink-1.3.
+本页讲述了Flink 
CEP中可用的API,我们首先讲述[模式API](#模式api),它可以让你指定想在数据流中检测的模式,然后讲述如何[检测匹配的事件序列并进行处理](#检测模式)。
+再然后我们讲述Flink在按照事件时间[处理迟到事件](#按照事件时间处理晚到事件)时的假设,
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] xintongsong commented on issue #11300: [FLINK-16406] Increase default value for JVM Metaspace to minimise its OutOfMemoryError

2020-03-03 Thread GitBox
xintongsong commented on issue #11300: [FLINK-16406] Increase default value for 
JVM Metaspace to minimise its OutOfMemoryError
URL: https://github.com/apache/flink/pull/11300#issuecomment-594316624
 
 
   Thanks for the PR @azagrebin. LGTM.
I'm wondering, shall we hold this PR for now? Just to give some time for 
people to respond to the user ML survey. I guess there is still time before 
releasing 1.10.1?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink-statefun] tzulitai commented on issue #42: [FLINK-16390][sdk] Add view and clear methods to PersistedTable

2020-03-03 Thread GitBox
tzulitai commented on issue #42: [FLINK-16390][sdk] Add view and clear methods 
to PersistedTable
URL: https://github.com/apache/flink-statefun/pull/42#issuecomment-594316276
 
 
   Thanks for the update @sjwiesman.
   
   +1 LGTM, merging ..


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387438715
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -1537,14 +1493,14 @@ class MyPatternProcessFunction<IN, OUT> extends PatternProcessFunction<IN, OUT>
 }
 {% endhighlight %}
 
-Note The `processTimedOutMatch` does not 
give one access to the main output. You can still emit results
-through [side-outputs]({{ site.baseurl }}/dev/stream/side_output.html) though, 
through the `Context` object.
+Note `processTimedOutMatch`不能访问主输出。
+但你可以通过`Context`对象把结果输出到[侧输出]({{ site.baseurl 
}}/zh/dev/stream/side_output.html)。
 
 
- Convenience API
+ 便捷的API
 
-The aforementioned `PatternProcessFunction` was introduced in Flink 1.8 and 
since then it is the recommended way to interact with matches.
-One can still use the old style API like `select`/`flatSelect`, which 
internally will be translated into a `PatternProcessFunction`.
+前面提到的`PatternProcessFunction`是在Flink 1.8之后引入的,从那之后推荐使用这个接口来处理匹配到的结果。
+仍然可以使用像`select`/`flatSelect`这样旧格式的API,它们会在内部被转换为`PatternProcessFunction`。
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387438670
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -1598,17 +1554,17 @@ val timeoutResult: DataStream[TimeoutEvent] = 
result.getSideOutput(outputTag)
 
 
 
-## Time in CEP library
+## CEP库中的时间
 
-### Handling Lateness in Event Time
+### 按照事件时间处理晚到事件
 
-In `CEP` the order in which elements are processed matters. To guarantee that 
elements are processed in the correct order when working in event time, an 
incoming element is initially put in a buffer where elements are *sorted in 
ascending order based on their timestamp*, and when a watermark arrives, all 
the elements in this buffer with timestamps smaller than that of the watermark 
are processed. This implies that elements between watermarks are processed in 
event-time order.
+在`CEP`中,事件的处理顺序很重要。在使用事件时间时,为了保证事件按照正确的顺序被处理,一个事件到来后会先被放到一个缓冲区中,
+在缓冲区里事件都按照时间戳从小到大排序,当水位线到达后,缓冲区中所有小于水位线的事件被处理。这意味这水位线之间的数据都按照事件戳被顺序处理。
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387438919
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -425,18 +421,17 @@ pattern.where(new IterativeCondition() {
 
 or(condition)
 
-Adds a new condition which is ORed with an existing one. An 
event can match the pattern only if it
-passes at least one of the conditions:
+增加一个新的判断和当前的判断取或。一个事件只要满足至少一个判断条件就匹配到模式:
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387438639
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -136,140 +132,143 @@ val result: DataStream[Alert] = patternStream.process(
 
 
 
-## The Pattern API
+## 模式API
 
-The pattern API allows you to define complex pattern sequences that you want 
to extract from your input stream.
+模式API可以让你定义想从输入流中抽取的复杂模式序列。
 
-Each complex pattern sequence consists of multiple simple patterns, i.e. 
patterns looking for individual events with the same properties. From now on, 
we will call these simple patterns **patterns**, and the final complex pattern 
sequence we are searching for in the stream, the **pattern sequence**. You can 
see a pattern sequence as a graph of such patterns, where transitions from one 
pattern to the next occur based on user-specified
-*conditions*, e.g. `event.getName().equals("end")`. A **match** is a sequence 
of input events which visits all
-patterns of the complex pattern graph, through a sequence of valid pattern 
transitions.
+每个复杂的模式序列包括多个简单的模式,比如,寻找拥有相同属性事件序列的模式。从现在开始,我们把这些简单的模式称作**模式**,
+把我们在数据流中最终寻找的复杂模式序列称作**模式序列**,你可以把模式序列看作是这样的模式构成的图,
+这些模式基于用户指定的**条件**从一个转换到另外一个,比如 `event.getName().equals("end")`。
+一个**匹配**是输入事件的一个序列,这些事件通过一系列有效的模式转换,能够访问到复杂模式图中的所有模式。
 
-{% warn Attention %} Each pattern must have a unique name, which you use later 
to identify the matched events.
+{% warn 注意 %} 每个模式必须有一个独一无二的名字,你可以在后面使用它来识别匹配到的事件。
 
-{% warn Attention %} Pattern names **CANNOT** contain the character `":"`.
+{% warn 注意 %} 模式的名字不能包含字符`":"`.
 
-In the rest of this section we will first describe how to define [Individual 
Patterns](#individual-patterns), and then how you can combine individual 
patterns into [Complex Patterns](#combining-patterns).
+这一节的剩余部分我们会先讲述如何定义[单个模式](#单个模式),然后讲如何将单个模式组合成[复杂模式](#组合模式)。
 
-### Individual Patterns
+### 单个模式
 
-A **Pattern** can be either a *singleton* or a *looping* pattern. Singleton 
patterns accept a single
-event, while looping patterns can accept more than one. In pattern matching 
symbols, the pattern `"a b+ c? d"` (or `"a"`, followed by *one or more* 
`"b"`'s, optionally followed by a `"c"`, followed by a `"d"`), `a`, `c?`, and 
`d` are
-singleton patterns, while `b+` is a looping one. By default, a pattern is a 
singleton pattern and you can transform
-it to a looping one by using [Quantifiers](#quantifiers). Each pattern can 
have one or more
-[Conditions](#conditions) based on which it accepts events.
+一个**模式**可以是一个**单例**或者**循环**模式。单例模式只接受一个事件,循环模式可以接受多个事件。
+在模式匹配表达式中,模式`"a b+ c? 
d"`(或者`"a"`,后面跟着一个或者多个`"b"`,再往后可选择的跟着一个`"c"`,最后跟着一个`"d"`),
+`a`,`c?`,和 `d`都是单例模式,`b+`是一个循环模式。默认情况下,模式都是单例的,你可以通过使用[量词](#量词)把它们转换成循环模式。
+每个模式可以有一个或者多个[条件](#条件)来决定它接受哪些事件。
 
- Quantifiers
+ 量词
 
-In FlinkCEP, you can specify looping patterns using these methods: 
`pattern.oneOrMore()`, for patterns that expect one or more occurrences of a 
given event (e.g. the `b+` mentioned before); and `pattern.times(#ofTimes)`, 
for patterns that
-expect a specific number of occurrences of a given type of event, e.g. 4 
`a`'s; and `pattern.times(#fromTimes, #toTimes)`, for patterns that expect a 
specific minimum number of occurrences and a maximum number of occurrences of a 
given type of event, e.g. 2-4 `a`s.
+在FlinkCEP中,你可以通过这些方法指定循环模式:`pattern.oneOrMore()`,指定期望一个给定事件出现一次或者多次的模式(例如前面提到的`b+`模式);
+`pattern.times(#ofTimes)`,指定期望一个给定事件出现特定次数的模式,例如出现4次`a`;
+`pattern.times(#fromTimes, 
#toTimes)`,指定期望一个给定事件出现次数在一个最小值和最大值中间的模式,比如出现2-4次`a`。
 
-You can make looping patterns greedy using the `pattern.greedy()` method, but 
you cannot yet make group patterns greedy. You can make all patterns, looping 
or not, optional using the `pattern.optional()` method.
+你可以使用`pattern.greedy()`方法让循环模式变成贪心的,但现在还不能让模式组贪心。
+你可以使用`pattern.optional()`方法让所有的模式变成可选的,不管是否是循环模式。
 
-For a pattern named `start`, the following are valid quantifiers:
+对一个命名为`start`的模式,以下量词是有效的:
 
 
 
 {% highlight java %}
-// expecting 4 occurrences
+// 期望出现4次
 start.times(4);
 
-// expecting 0 or 4 occurrences
+// 期望出现0或者4次
 start.times(4).optional();
 
-// expecting 2, 3 or 4 occurrences
+// 期望出现2、3或者4次
 start.times(2, 4);
 
-// expecting 2, 3 or 4 occurrences and repeating as many as possible
+// 期望出现2、3或者4次,并且尽可能的重复次数多
 start.times(2, 4).greedy();
 
-// expecting 0, 2, 3 or 4 occurrences
+// 期望出现0、2、3或者4次
 start.times(2, 4).optional();
 
-// expecting 0, 2, 3 or 4 occurrences and repeating as many as possible
+// 期望出现0、2、3或者4次,并且尽可能的重复次数多
 start.times(2, 4).optional().greedy();
 
-// expecting 1 or more occurrences
+// 期望出现1到多次
 start.oneOrMore();
 
-// expecting 1 or more occurrences and repeating as many as possible
+// 期望出现1到多次,并且尽可能的重复次数多
 start.oneOrMore().greedy();
 
-// expecting 0 or more occurrences
+// 期望出现0到多次
 start.oneOrMore().optional();
 
-// expecting 0 or more occurrences and repeating as many as 

[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387437963
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -736,41 +715,39 @@ Pattern<Event, ?> relaxedNot = start.notFollowedBy("not").where(...);
 
 {% highlight scala %}
 
-// strict contiguity
+// 严格连续
 val strict: Pattern[Event, _] = start.next("middle").where(...)
 
-// relaxed contiguity
+// 松散连续
 val relaxed: Pattern[Event, _] = start.followedBy("middle").where(...)
 
-// non-deterministic relaxed contiguity
+// 不确定的松散连续
 val nonDetermin: Pattern[Event, _] = start.followedByAny("middle").where(...)
 
-// NOT pattern with strict contiguity
+// 严格连续的NOT模式
 val strictNot: Pattern[Event, _] = start.notNext("not").where(...)
 
-// NOT pattern with relaxed contiguity
+// 松散连续的NOT模式
 val relaxedNot: Pattern[Event, _] = start.notFollowedBy("not").where(...)
 
 {% endhighlight %}
 
 
 
-Relaxed contiguity means that only the first succeeding matching event will be 
matched, while
-with non-deterministic relaxed contiguity, multiple matches will be emitted 
for the same beginning. As an example,
-a pattern `"a b"`, given the event sequence `"a", "c", "b1", "b2"`, will give 
the following results:
+松散连续意味着跟着的事件中,只有第一个可匹配的事件会被匹配上,而不确定的松散连接情况下,有着同样起始的多个匹配会被输出。
+举例来说,模式`"a b"`,给定事件序列`"a","c","b1","b2"`,会产生如下的结果:
 
-1. Strict Contiguity between `"a"` and `"b"`: `{}` (no match), the `"c"` after 
`"a"` causes `"a"` to be discarded.
+1. `"a"`和`"b"`之间严格连续: `{}` (没有匹配),`"a"`之后的`"c"`导致`"a"`被丢弃。
 
-2. Relaxed Contiguity between `"a"` and `"b"`: `{a b1}`, as relaxed continuity 
is viewed as "skip non-matching events
-till the next matching one".
+2. `"a"`和`"b"`之间松散连续: `{a b1}`,松散连续会"跳过不匹配的事件直到匹配上的事件"。
 
-3. Non-Deterministic Relaxed Contiguity between `"a"` and `"b"`: `{a b1}`, `{a 
b2}`, as this is the most general form.
+3. `"a"`和`"b"`之间不确定的松散连续: `{a b1}`, `{a b2}`,这是最常见的情况。
 
-It's also possible to define a temporal constraint for the pattern to be valid.
-For example, you can define that a pattern should occur within 10 seconds via 
the `pattern.within()` method.
-Temporal patterns are supported for both [processing and event 
time]({{site.baseurl}}/dev/event_time.html).
+也可以为模式定义一个有效时间约束。
+例如,你可以通过`pattern.within()`方法指定一个模式应该在10秒内发生。
+这种暂时的模式支持[处理时间和事件时间]({{site.baseurl}}/zh/dev/event_time.html).
 
-{% warn Attention %} A pattern sequence can only have one temporal constraint. 
If multiple such constraints are defined on different individual patterns, then 
the smallest is applied.
+{% warn 注意 %} 一个模式序列只能有一个时间限制。如果在限制了多个时间在不同的单个模式上,会使用最小的那个时间限制。
 
 Review comment:
   done
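
   A minimal Java sketch of the quoted "a b" contiguity example, assuming the
   standard FlinkCEP Pattern API (the Event type and its getName() accessor
   are illustrative):

       import org.apache.flink.cep.pattern.Pattern;
       import org.apache.flink.cep.pattern.conditions.SimpleCondition;
       import org.apache.flink.streaming.api.windowing.time.Time;

       Pattern<Event, ?> relaxed = Pattern.<Event>begin("a")
               .where(new SimpleCondition<Event>() {
                   @Override
                   public boolean filter(Event e) {
                       return e.getName().equals("a");
                   }
               })
               // followedBy = relaxed contiguity: skips non-matching events, so
               // "a", "c", "b1", "b2" yields {a b1}; followedByAny would also
               // yield {a b2}.
               .followedBy("b")
               .where(new SimpleCondition<Event>() {
                   @Override
                   public boolean filter(Event e) {
                       return e.getName().startsWith("b");
                   }
               })
               // the temporal constraint discussed above
               .within(Time.seconds(10));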


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387438795
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -335,88 +332,87 @@ start.where(event => event.getName.startsWith("foo"))
 
 
 
-Finally, you can also restrict the type of the accepted event to a subtype of 
the initial event type (here `Event`)
-via the `pattern.subtype(subClass)` method.
+最后,你可以通过`pattern.subtype(subClass)`方法限制接受的事件类型是初始事件的子类型。
 
 
 
 {% highlight java %}
 start.subtype(SubEvent.class).where(new SimpleCondition<SubEvent>() {
 @Override
 public boolean filter(SubEvent value) {
-return ... // some condition
+return ... // 一些判断条件
 }
 });
 {% endhighlight %}
 
 
 
 {% highlight scala %}
-start.subtype(classOf[SubEvent]).where(subEvent => ... /* some condition */)
+start.subtype(classOf[SubEvent]).where(subEvent => ... /* 一些判断条件 */)
 {% endhighlight %}
 
 
 
-**Combining Conditions:** As shown above, you can combine the `subtype` 
condition with additional conditions. This holds for every condition. You can 
arbitrarily combine conditions by sequentially calling `where()`. The final 
result will be the logical **AND** of the results of the individual conditions. 
To combine conditions using **OR**, you can use the `or()` method, as shown 
below.
+**组合条件:** 如上所示,你可以把`subtype`条件和其他的条件结合起来使用。这适用于任何条件,你可以通过依次调用`where()`来组合条件。
+最终的结果是是每个单一条件的结果的逻辑**AND**。如果想使用**OR**来组合条件,你可以使用像下面这样使用`or()`方法。
 
 Review comment:
   done
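
   In short, the combination semantics described in the quoted paragraph can
   be sketched as follows (c1 and c2 stand for arbitrary IterativeCondition
   instances):

       // chained where() calls are ANDed; or() adds an ORed condition
       pattern.where(c1).where(c2);   // matches iff c1 AND c2 hold
       pattern.where(c1).or(c2);      // matches iff c1 OR c2 holds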


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] libenchao commented on a change in pull request #11119: [FLINK-15396][json] Support to ignore parse errors for JSON format

2020-03-03 Thread GitBox
libenchao commented on a change in pull request #11119: [FLINK-15396][json] 
Support to ignore parse errors for JSON format
URL: https://github.com/apache/flink/pull/11119#discussion_r387438612
 
 

 ##
 File path: 
flink-formats/flink-json/src/main/java/org/apache/flink/formats/json/JsonRowDeserializationSchema.java
 ##
 @@ -351,76 +379,187 @@ private DeserializationRuntimeConverter createFallbackConverter(Class<?> valueType) {
}
}
 
+   private boolean convertToBoolean(ObjectMapper mapper, JsonNode 
jsonNode) {
+   try {
+   return jsonNode.asBoolean();
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
boolean.", t);
+   }
+   }
+
+   private String convertToString(ObjectMapper mapper, JsonNode jsonNode) {
+   try {
+   return jsonNode.asText();
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
string.", t);
+   }
+   }
+
+   private int convertToInt(ObjectMapper mapper, JsonNode jsonNode) {
+   try {
+   return jsonNode.asInt();
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
int.", t);
+   }
+   }
+
+   private long convertToLong(ObjectMapper mapper, JsonNode jsonNode) {
+   try {
+   return jsonNode.asLong();
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
long.", t);
+   }
+   }
+
+   private double convertToDouble(ObjectMapper mapper, JsonNode jsonNode) {
+   try {
+   return jsonNode.asDouble();
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
double.", t);
+   }
+   }
+
+   private float convertToFloat(ObjectMapper mapper, JsonNode jsonNode) {
+   try {
+   return Float.parseFloat(jsonNode.asText().trim());
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
float.", t);
+   }
+   }
+
+   private short convertToShot(ObjectMapper mapper, JsonNode jsonNode) {
+   try {
+   return Short.parseShort(jsonNode.asText().trim());
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
short.", t);
+   }
+   }
+
+   private byte convertToByte(ObjectMapper mapper, JsonNode jsonNode) {
+   try {
+   return Byte.parseByte(jsonNode.asText().trim());
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
byte.", t);
+   }
+   }
+
+   private BigDecimal convertToBigDecimal(ObjectMapper mapper, JsonNode 
jsonNode) {
+   try {
+   return jsonNode.decimalValue();
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
BigDecimal.", t);
+   }
+   }
+
+   private BigInteger convertToBigInt(ObjectMapper mapper, JsonNode 
jsonNode) {
+   try {
+   return jsonNode.bigIntegerValue();
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
BigInteger.", t);
+   }
+   }
+
private LocalDate convertToLocalDate(ObjectMapper mapper, JsonNode 
jsonNode) {
-   return 
ISO_LOCAL_DATE.parse(jsonNode.asText()).query(TemporalQueries.localDate());
+   try {
+   return 
ISO_LOCAL_DATE.parse(jsonNode.asText()).query(TemporalQueries.localDate());
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
local date.", t);
+   }
}
 
private Date convertToDate(ObjectMapper mapper, JsonNode jsonNode) {
-   return Date.valueOf(convertToLocalDate(mapper, jsonNode));
+   try {
+   return Date.valueOf(convertToLocalDate(mapper, 
jsonNode));
+   } catch (Throwable t) {
+   throw new ParseErrorException("Unable to deserialize 
date.", t);
+   }
}
 
private LocalDateTime convertToLocalDateTime(ObjectMapper mapper, 
JsonNode jsonNode) {
// according to RFC 3339 every date-time must have a timezone;
// until we have full timezone support, we only support UTC;
// users can parse their time as string as a workaround
-   TemporalAccessor parsedTimestamp = 
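
   For context, a minimal sketch of the ignore-parse-errors behaviour this PR
   adds, assuming the `ignoreParseErrors` flag and the ParseErrorException
   naming from the quoted diff (the surrounding deserialize() method is
   simplified; objectMapper and runtimeConverter are assumed fields, and this
   is not the exact PR code):

       public Row deserialize(byte[] message) throws IOException {
           try {
               final JsonNode root = objectMapper.readTree(message);
               return (Row) runtimeConverter.convert(objectMapper, root);
           } catch (Throwable t) {
               if (ignoreParseErrors) {
                   return null; // swallow the parse error and skip the record
               }
               throw new IOException("Failed to deserialize JSON object.", t);
           }
       }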

[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387438690
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -335,88 +332,87 @@ start.where(event => event.getName.startsWith("foo"))
 
 
 
-Finally, you can also restrict the type of the accepted event to a subtype of 
the initial event type (here `Event`)
-via the `pattern.subtype(subClass)` method.
+最后,你可以通过`pattern.subtype(subClass)`方法限制接受的事件类型是初始事件的子类型。
 
 
 
 {% highlight java %}
 start.subtype(SubEvent.class).where(new SimpleCondition<SubEvent>() {
 @Override
 public boolean filter(SubEvent value) {
-return ... // some condition
+return ... // 一些判断条件
 }
 });
 {% endhighlight %}
 
 
 
 {% highlight scala %}
-start.subtype(classOf[SubEvent]).where(subEvent => ... /* some condition */)
+start.subtype(classOf[SubEvent]).where(subEvent => ... /* 一些判断条件 */)
 {% endhighlight %}
 
 
 
-**Combining Conditions:** As shown above, you can combine the `subtype` 
condition with additional conditions. This holds for every condition. You can 
arbitrarily combine conditions by sequentially calling `where()`. The final 
result will be the logical **AND** of the results of the individual conditions. 
To combine conditions using **OR**, you can use the `or()` method, as shown 
below.
+**组合条件:** 如上所示,你可以把`subtype`条件和其他的条件结合起来使用。这适用于任何条件,你可以通过依次调用`where()`来组合条件。
+最终的结果是是每个单一条件的结果的逻辑**AND**。如果想使用**OR**来组合条件,你可以使用像下面这样使用`or()`方法。
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387438680
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -1651,44 +1607,42 @@ val lateData: DataStream[String] = 
result.getSideOutput(lateDataOutputTag)
 
 
 
-### Time context
+### 时间上下文
 
-In [PatternProcessFunction](#selecting-from-patterns) as well as in 
[IterativeCondition](#conditions) user has access to a context
-that implements `TimeContext` as follows:
+在[PatternProcessFunction](#从模式中选取)中,用户可以和[IterativeCondition](#条件)中
+一样按照下面的方法使用实现了`TimeContext`的上下文:
 
 {% highlight java %}
 /**
- * Enables access to time related characteristics such as current processing 
time or timestamp of
- * currently processed element. Used in {@link PatternProcessFunction} and
- * {@link org.apache.flink.cep.pattern.conditions.IterativeCondition}
+ * 支持获取事件属性比如当前处理事件或当前正处理的事件的时间。
+ * 用在{@link PatternProcessFunction}和{@link 
org.apache.flink.cep.pattern.conditions.IterativeCondition}中
  */
 @PublicEvolving
 public interface TimeContext {
 
/**
-* Timestamp of the element currently being processed.
+* 当前正处理的事件的时间戳。
 *
-* In case of {@link 
org.apache.flink.streaming.api.TimeCharacteristic#ProcessingTime} this
-* will be set to the time when event entered the cep operator.
+* 如果是{@link 
org.apache.flink.streaming.api.TimeCharacteristic#ProcessingTime},这个值会被设置为事件进入CEP算子的时间。
 */
long timestamp();
 
-   /** Returns the current processing time. */
+   /** 返回当前的处理时间。 */
long currentProcessingTime();
 }
 {% endhighlight %}
 
-This context gives user access to time characteristics of processed events 
(incoming records in case of `IterativeCondition` and matches in case of 
`PatternProcessFunction`).
-Call to `TimeContext#currentProcessingTime` always gives you the value of 
current processing time and this call should be preferred to e.g. calling 
`System.currentTimeMillis()`.
+这个上下文让用户可以获取处理的事件(在`IterativeCondition`时候是进来的记录,在`PatternProcessFunction`是匹配的结果)的时间属性。
+调用`TimeContext#currentProcessingTime`总是返回当前的处理时间,而且尽量去调用这个函数而不是调用其它的比如说`System.currentTimeMillis()`。
 
-In case of `TimeContext#timestamp()` the returned value is equal to assigned 
timestamp in case of `EventTime`. In `ProcessingTime` this will equal to the 
point of time when said event entered
-cep operator (or when the match was generated in case of 
`PatternProcessFunction`). This means that the value will be consistent across 
multiple calls to that method.
+使用`EventTime`时,`TimeContext#timestamp()`返回的值等于分配的时间戳。
+使用`ProcessingTime`是,这个值等于事件进入CEP算子的时间点(在`PatternProcessFunction`中是匹配产生的时间)。
 
 Review comment:
   done
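
   A short usage sketch of the quoted TimeContext inside a
   PatternProcessFunction (the Event and Alert types are illustrative):

       import java.util.List;
       import java.util.Map;
       import org.apache.flink.cep.functions.PatternProcessFunction;
       import org.apache.flink.util.Collector;

       public class TimedAlertFunction extends PatternProcessFunction<Event, Alert> {
           @Override
           public void processMatch(Map<String, List<Event>> match, Context ctx, Collector<Alert> out) {
               long matchTs = ctx.timestamp();          // timestamp of the match
               long now = ctx.currentProcessingTime();  // preferred over System.currentTimeMillis()
               out.collect(new Alert(matchTs, now));
           }
       }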


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387437113
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -1509,18 +1466,17 @@ class MyPatternProcessFunction<IN, OUT> extends PatternProcessFunction<IN, OUT>
 }
 {% endhighlight %}
 
-The `PatternProcessFunction` gives access to a `Context` object. Thanks to it, 
one can access time related
-characteristics such as `currentProcessingTime` or `timestamp` of current 
match (which is the timestamp of the last element assigned to the match).
-For more info see [Time context](#time-context).
-Through this context one can also emit results to a [side-output]({{ 
site.baseurl }}/dev/stream/side_output.html).
+`PatternProcessFunction`可以访问`Context`对象。有了它之后,你可以访问时间属性比如`currentProcessingTime`或者当前匹配的`timestamp`
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387435811
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -1509,18 +1466,17 @@ class MyPatternProcessFunction<IN, OUT> extends PatternProcessFunction<IN, OUT>
 }
 {% endhighlight %}
 
-The `PatternProcessFunction` gives access to a `Context` object. Thanks to it, 
one can access time related
-characteristics such as `currentProcessingTime` or `timestamp` of current 
match (which is the timestamp of the last element assigned to the match).
-For more info see [Time context](#time-context).
-Through this context one can also emit results to a [side-output]({{ 
site.baseurl }}/dev/stream/side_output.html).
+`PatternProcessFunction`可以访问`Context`对象。有了它之后,你可以访问时间属性比如`currentProcessingTime`或者当前匹配的`timestamp`
+(最新分配到匹配上的事件的时间戳).
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387435604
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -1598,17 +1554,17 @@ val timeoutResult: DataStream[TimeoutEvent] = 
result.getSideOutput(outputTag)
 
 
 
-## Time in CEP library
+## CEP库中的时间
 
-### Handling Lateness in Event Time
+### 按照事件时间处理晚到事件
 
-In `CEP` the order in which elements are processed matters. To guarantee that 
elements are processed in the correct order when working in event time, an 
incoming element is initially put in a buffer where elements are *sorted in 
ascending order based on their timestamp*, and when a watermark arrives, all 
the elements in this buffer with timestamps smaller than that of the watermark 
are processed. This implies that elements between watermarks are processed in 
event-time order.
+在`CEP`中,事件的处理顺序很重要。在使用事件时间时,为了保证事件按照正确的顺序被处理,一个事件到来后会先被放到一个缓冲区中,
+在缓冲区里事件都按照时间戳从小到大排序,当水位线到达后,缓冲区中所有小于水位线的事件被处理。这意味这水位线之间的数据都按照事件戳被顺序处理。
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387435594
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -1750,31 +1704,25 @@ val alerts = patternStream.select(createAlert(_))
 
 
 
-## Migrating from an older Flink version(pre 1.3)
+## 从旧版本迁移(1.3之前)
 
-### Migrating to 1.4+
+### 迁移到1.4+
 
-In Flink-1.4 the backward compatibility of CEP library with <= Flink 1.2 was 
dropped. Unfortunately 
-it is not possible to restore a CEP job that was once run with 1.2.x
+在Flink-1.4放弃了和<= Flink 1.2版本的兼容性。很不幸,不能再恢复用1.2.x运行过的CEP作业。
 
-### Migrating to 1.3.x
+### 迁移到1.3.x
 
-The CEP library in Flink-1.3 ships with a number of new features which have 
led to some changes in the API. Here we
-describe the changes that you need to make to your old CEP jobs, in order to 
be able to run them with Flink-1.3. After
-making these changes and recompiling your job, you will be able to resume its 
execution from a savepoint taken with the
-old version of your job, *i.e.* without having to re-process your past data.
+CEP库在Flink-1.3发布的一系列的新特性引入了一些API上的修改。这里我们描述你需要对旧的CEP作业所做的修改,以能够用Flink-1.3来运行它们。
+在做完这些修改并重新编译你的作业之后,可以从旧版本作业的保存点之后继续运行,*也就是说*不需要再重新处理旧的数据。
 
-The changes required are:
+需要的修改是:
 
-1. Change your conditions (the ones in the `where(...)` clause) to extend the 
`SimpleCondition` class instead of
-implementing the `FilterFunction` interface.
+1. 修改你的条件(在`where(...)`语句中的)来继承`SimpleCondition`类而不是实现`FilterFunction`接口。
 
-2. Change your functions provided as arguments to the `select(...)` and 
`flatSelect(...)` methods to expect a list of
-events associated with each pattern (`List` in `Java`, `Iterable` in `Scala`). 
This is because with the addition of
-the looping patterns, multiple input events can match a single (looping) 
pattern.
+2. 
修改你作为`select(...)`和`flatSelect(...)`方法的参数的函数为期望每个模式关联一个事件列表(`Java`中`List`,`Scala`中`Iterable`)。
+这是因为增加了循环模式后,多个事件可能匹配一个单一的(循环)模式。
 
-3. The `followedBy()` in Flink 1.1 and 1.2 implied `non-deterministic relaxed 
contiguity` (see
-[here](#conditions-on-contiguity)). In Flink 1.3 this has changed and 
`followedBy()` implies `relaxed contiguity`,
-while `followedByAny()` should be used if `non-deterministic relaxed 
contiguity` is required.
+3. 在Flink 1.1和1.2中,`followedBy()` in Flink 1.1 and 1.2隐含了`不确定的松散连续` 
(参见[这里](#组合模式))。
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] Translate Event Processing (CEP) page into Chinese

2020-03-03 Thread GitBox
shuai-xu commented on a change in pull request #11168: [FLINK-16140] [docs-zh] 
Translate Event Processing (CEP) page into Chinese
URL: https://github.com/apache/flink/pull/11168#discussion_r387435614
 
 

 ##
 File path: docs/dev/libs/cep.zh.md
 ##
 @@ -1509,18 +1466,17 @@ class MyPatternProcessFunction<IN, OUT> extends PatternProcessFunction<IN, OUT>
 }
 {% endhighlight %}
 
-The `PatternProcessFunction` gives access to a `Context` object. Thanks to it, 
one can access time related
-characteristics such as `currentProcessingTime` or `timestamp` of current 
match (which is the timestamp of the last element assigned to the match).
-For more info see [Time context](#time-context).
-Through this context one can also emit results to a [side-output]({{ 
site.baseurl }}/dev/stream/side_output.html).
+`PatternProcessFunction`可以访问`Context`对象。有了它之后,你可以访问时间属性比如`currentProcessingTime`或者当前匹配的`timestamp`
+(最新分配到匹配上的事件的时间戳).
+更多信息可以看[时间上下文](#时间上下文)。
+通过这个上下文也可以将结果输出到[侧输出]({{ site.baseurl }}/zh/dev/stream/side_output.html).
 
 
- Handling Timed Out Partial Patterns
+ 处理超时的部分匹配
 
-Whenever a pattern has a window length attached via the `within` keyword, it 
is possible that partial event sequences
-are discarded because they exceed the window length. To act upon a timed out 
partial match one can use `TimedOutPartialMatchHandler` interface.
-The interface is supposed to be used in a mixin style. This mean you can 
additionally implement this interface with your `PatternProcessFunction`.
-The `TimedOutPartialMatchHandler` provides the additional 
`processTimedOutMatch` method which will be called for every timed out partial 
match.
+当一个模式上通过`within`加上窗口长度后,部分匹配的事件序列就可能因为超过窗口长度而被丢弃。可以使用`TimedOutPartialMatchHandler`接口
+来处理超时的部分匹配。这个接口可以和其它的混合使用。也就是说你可以在自己的`PatternProcessFunction`里另外实现这个接口。
+`TimedOutPartialMatchHandler`提供了另外的`processTimedOutMatch`方法,这个方法对每个超时的部分匹配都对调用。
 
 Review comment:
   done
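
   A minimal sketch of the mixin style described above (Event, Alert and the
   side-output tag are illustrative):

       import java.util.List;
       import java.util.Map;
       import org.apache.flink.cep.functions.PatternProcessFunction;
       import org.apache.flink.cep.functions.TimedOutPartialMatchHandler;
       import org.apache.flink.util.Collector;
       import org.apache.flink.util.OutputTag;

       public class MyFunction extends PatternProcessFunction<Event, Alert>
               implements TimedOutPartialMatchHandler<Event> {

           private final OutputTag<Event> timedOutTag = new OutputTag<Event>("timed-out"){};

           @Override
           public void processMatch(Map<String, List<Event>> match, Context ctx, Collector<Alert> out) {
               out.collect(new Alert(match));
           }

           @Override
           public void processTimedOutMatch(Map<String, List<Event>> match, Context ctx) {
               // the main output is not available here; emit to a side output instead
               ctx.output(timedOutTag, match.get("start").get(0));
           }
       }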


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #11252: [FLINK-16337][python][table-planner-blink] Add support of vectorized Python UDF in blink planner

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11252: 
[FLINK-16337][python][table-planner-blink] Add support of vectorized Python UDF 
in blink planner
URL: https://github.com/apache/flink/pull/11252#issuecomment-592511050
 
 
   
   ## CI report:
   
   * 542c6f80de4976c204a68968f6d5af383dad1ac5 Travis: 
[SUCCESS](https://travis-ci.com/flink-ci/flink/builds/151659712) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5890)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-16412) Disallow embedded metastore in HiveCatalog production code

2020-03-03 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050738#comment-17050738
 ] 

Jingsong Lee commented on FLINK-16412:
--

[~lirui] Assigned to you. I've marked it as a usability issue, since a lot of 
users are bothered by this.

> Disallow embedded metastore in HiveCatalog production code
> --
>
> Key: FLINK-16412
> URL: https://issues.apache.org/jira/browse/FLINK-16412
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> Embedded metastore can cause weird problems for HiveCatalog, e.g. missing DN 
> dependencies. Since embedded mode is rarely used in production, we should ban 
> it in HiveCatalog production code. This can give users a clearer message when 
> something goes wrong, and makes it easier for dependency management.
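
A hedged sketch of what such a ban could look like, assuming a validation step
when the HiveCatalog is opened (the method name and error message are
illustrative, not the actual fix):

    import org.apache.hadoop.hive.conf.HiveConf;

    private static void disallowEmbeddedMetastore(HiveConf hiveConf) {
        // embedded mode is indicated by an empty hive.metastore.uris
        String uris = hiveConf.getVar(HiveConf.ConfVars.METASTOREURIS);
        if (uris == null || uris.trim().isEmpty()) {
            throw new IllegalArgumentException(
                    "Embedded metastore is not allowed. Please set "
                            + "hive.metastore.uris to a standalone metastore.");
        }
    }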



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (FLINK-16412) Disallow embedded metastore in HiveCatalog production code

2020-03-03 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-16412:


Assignee: Rui Li

> Disallow embedded metastore in HiveCatalog production code
> --
>
> Key: FLINK-16412
> URL: https://issues.apache.org/jira/browse/FLINK-16412
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>
> Embedded metastore can cause weird problems for HiveCatalog, e.g. missing DN 
> dependencies. Since embedded mode is rarely used in production, we should ban 
> it in HiveCatalog production code. This can give users a clearer message when 
> something goes wrong, and makes it easier for dependency management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16412) Disallow embedded metastore in HiveCatalog production code

2020-03-03 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-16412:
-
Affects Version/s: 1.10.0

> Disallow embedded metastore in HiveCatalog production code
> --
>
> Key: FLINK-16412
> URL: https://issues.apache.org/jira/browse/FLINK-16412
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Fix For: 1.11.0
>
>
> Embedded metastore can cause weird problems for HiveCatalog, e.g. missing DN 
> dependencies. Since embedded mode is rarely used in production, we should ban 
> it in HiveCatalog production code. This can give users a clearer message when 
> something goes wrong, and makes it easier for dependency management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16412) Disallow embedded metastore in HiveCatalog production code

2020-03-03 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-16412:
-
Labels: usability  (was: )

> Disallow embedded metastore in HiveCatalog production code
> --
>
> Key: FLINK-16412
> URL: https://issues.apache.org/jira/browse/FLINK-16412
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>  Labels: usability
> Fix For: 1.11.0
>
>
> Embedded metastore can cause weird problems for HiveCatalog, e.g. missing DN 
> dependencies. Since embedded mode is rarely used in production, we should ban 
> it in HiveCatalog production code. This can give users a clearer message when 
> something goes wrong, and makes it easier for dependency management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16412) Disallow embedded metastore in HiveCatalog production code

2020-03-03 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-16412:
-
Fix Version/s: 1.11.0

> Disallow embedded metastore in HiveCatalog production code
> --
>
> Key: FLINK-16412
> URL: https://issues.apache.org/jira/browse/FLINK-16412
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Fix For: 1.11.0
>
>
> Embedded metastore can cause weird problems for HiveCatalog, e.g. missing DN 
> dependencies. Since embedded mode is rarely used in production, we should ban 
> it in HiveCatalog production code. This can give users a clearer message when 
> something goes wrong, and makes it easier for dependency management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-16412) Disallow embedded metastore in HiveCatalog production code

2020-03-03 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050735#comment-17050735
 ] 

Rui Li commented on FLINK-16412:


[~lzljs3620320] could you please assign this to me?

> Disallow embedded metastore in HiveCatalog production code
> --
>
> Key: FLINK-16412
> URL: https://issues.apache.org/jira/browse/FLINK-16412
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Reporter: Rui Li
>Priority: Major
>
> Embedded metastore can cause weird problems for HiveCatalog, e.g. missing DN 
> dependencies. Since embedded mode is rarely used in production, we should ban 
> it in HiveCatalog production code. This can give users a clearer message when 
> something goes wrong, and makes it easier for dependency management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-16412) Disallow embedded metastore in HiveCatalog production code

2020-03-03 Thread Rui Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated FLINK-16412:
---
Description: Embedded metastore can cause weird problems for HiveCatalog, 
e.g. missing DN dependencies. Since embedded mode is rarely used in production, 
we should ban it in HiveCatalog production code. This can give users a clearer 
message when something goes wrong, and makes it easier for dependency 
management.

> Disallow embedded metastore in HiveCatalog production code
> --
>
> Key: FLINK-16412
> URL: https://issues.apache.org/jira/browse/FLINK-16412
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Reporter: Rui Li
>Priority: Major
>
> Embedded metastore can cause weird problems for HiveCatalog, e.g. missing DN 
> dependencies. Since embedded mode is rarely used in production, we should ban 
> it in HiveCatalog production code. This can give users a clearer message when 
> something goes wrong, and makes it easier for dependency management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on issue #11252: [FLINK-16337][python][table-planner-blink] Add support of vectorized Python UDF in blink planner

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11252: 
[FLINK-16337][python][table-planner-blink] Add support of vectorized Python UDF 
in blink planner
URL: https://github.com/apache/flink/pull/11252#issuecomment-592511050
 
 
   
   ## CI report:
   
   * 542c6f80de4976c204a68968f6d5af383dad1ac5 Travis: 
[PENDING](https://travis-ci.com/flink-ci/flink/builds/151659712) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5890)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-16379) Introduce fromValues in TableEnvironment

2020-03-03 Thread Zhenghua Gao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050732#comment-17050732
 ] 

Zhenghua Gao commented on FLINK-16379:
--

From the user perspective, constructing a Table from raw objects is a more 
convenient and direct way. Expression brings with it additional complexity and 
a steeper learning curve (especially among Java users). Another concern is 
that the optimization of LogicalValues and LogicalTableScan is very different: 
there would be big changes to the plan tests if we use fromValues.
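
A short sketch contrasting the two styles under discussion (the raw-object
variant is hypothetical, illustrating the "user perspective" point; the
expression-based call follows the Javadoc quoted below):

    // expression-based, as proposed in this issue
    // (import static org.apache.flink.table.api.Expressions.row)
    Table t1 = tEnv.fromValues(row(1, "ABC"), row(2L, "ABCDE"));

    // hypothetical raw-object variant, not an existing API
    Table t2 = tEnv.fromValues(Row.of(1, "ABC"), Row.of(2L, "ABCDE"));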

> Introduce fromValues in TableEnvironment
> 
>
> Key: FLINK-16379
> URL: https://issues.apache.org/jira/browse/FLINK-16379
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.11.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Introduce a fromValues method to TableEnvironment similar to {{VALUES}} 
> clause in SQL
> The suggested API could look like:
> {code}
>   /**
>* Creates a Table from a given row constructing expressions.
>*
>* Examples:
>*
>* You can use {@link Expressions#row(Object, Object...)} to create 
> composite rows:
>* {@code
>*  tEnv.fromValues(
>*  row(1, "ABC"),
>*  row(2L, "ABCDE")
>*  )
>* }
>* will produce a Table with a schema as follows:
>* {@code
>*  root
>*  |-- f0: BIGINT NOT NULL
>*  |-- f1: VARCHAR(5) NOT NULL
>* }
>*
>* ROWs that are a result of e.g. a function call are not flattened
>* {@code
>*  public class RowFunction extends ScalarFunction {
>*  @DataTypeHint("ROW<f0 BIGINT, f1 VARCHAR(5)>")
>*  Row eval();
>*  }
>*
>*  tEnv.fromValues(
>*  call(new RowFunction()),
>*  call(new RowFunction())
>*  )
>* }
>* will produce a Table with a schema as follows:
>* {@code
>*  root
>*  |-- f0: ROW<`f0` BIGINT, `f1` VARCHAR(5)>
>* }
>*
>* The row constructor can be dropped to create a table with a 
> single row:
>*
>* {@code
>*  tEnv.fromValues(
>*  1,
>*  2L,
>*  3
>*  )
>* }
>* will produce a Table with a schema as follows:
>* {@code
>*  root
>*  |-- f0: BIGINT NOT NULL
>* }
>*
>* @param expressions Expressions for constructing rows of the VALUES 
> table.
>*/
>   Table fromValues(Expression... expressions);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16412) Disallow embedded metastore in HiveCatalog production code

2020-03-03 Thread Rui Li (Jira)
Rui Li created FLINK-16412:
--

 Summary: Disallow embedded metastore in HiveCatalog production code
 Key: FLINK-16412
 URL: https://issues.apache.org/jira/browse/FLINK-16412
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] WeiZhong94 commented on a change in pull request #11238: [FLINK-16304][python] Remove python packages bundled in the flink-python jar.

2020-03-03 Thread GitBox
WeiZhong94 commented on a change in pull request #11238: [FLINK-16304][python] 
Remove python packages bundled in the flink-python jar.
URL: https://github.com/apache/flink/pull/11238#discussion_r387422670
 
 

 ##
 File path: 
flink-python/src/main/resources/META-INF/licenses/LICENSE.cloudpickle
 ##
 @@ -1,32 +0,0 @@
-This module was extracted from the `cloud` package, developed by
 
 Review comment:
   Thanks for your reminder, Chesnay. But `cloudpickle` still exists in the 
`build-target` of `flink-dist`, i.e. 
`flink/build-target/opt/python/cloudpickle-1.2.2-src.zip`, so I think there is 
no need to change the root NOTICE and the `licenses` directory.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] vthinkxie commented on issue #11041: [hotfix] Fix broken link and wrong node version command in runtime-we…

2020-03-03 Thread GitBox
vthinkxie commented on issue #11041: [hotfix] Fix broken link and wrong node 
version command in runtime-we…
URL: https://github.com/apache/flink/pull/11041#issuecomment-594294518
 
 
   Hi @rmetzger,
   I didn't update this part last time; I've left some comments here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] vthinkxie commented on a change in pull request #11041: [hotfix] Fix broken link and wrong node version command in runtime-we…

2020-03-03 Thread GitBox
vthinkxie commented on a change in pull request #11041: [hotfix] Fix broken 
link and wrong node version command in runtime-we…
URL: https://github.com/apache/flink/pull/11041#discussion_r387418806
 
 

 ##
 File path: flink-runtime-web/README.md
 ##
 @@ -55,9 +55,9 @@ Depending on your version of Linux, Windows or MacOS, you 
may need to manually i
 
  Ubuntu Linux
 
-Install *node.js* by following [these 
instructions](https://github.com/joyent/node/wiki/installing-node.js-via-package-manager).
+Install *node.js* by following [these 
instructions](https://github.com/nodesource/distributions/blob/master/README.md#debinstall).
 
 Review comment:
   ```suggestion
   Install *node.js* by following [these 
instructions](https://nodejs.org/en/download/).
   ```
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #11252: [FLINK-16337][python][table-planner-blink] Add support of vectorized Python UDF in blink planner

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11252: 
[FLINK-16337][python][table-planner-blink] Add support of vectorized Python UDF 
in blink planner
URL: https://github.com/apache/flink/pull/11252#issuecomment-592511050
 
 
   
   ## CI report:
   
   * d90b0eb350b2062894d7c9da8230deda7e592b45 Travis: 
[FAILURE](https://travis-ci.com/flink-ci/flink/builds/151592119) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5885)
 
   * 542c6f80de4976c204a68968f6d5af383dad1ac5 Travis: 
[PENDING](https://travis-ci.com/flink-ci/flink/builds/151659712) Azure: 
[PENDING](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5890)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] wuchong commented on issue #11236: [FLINK-16269][FLINK-16108][table-planner-blink] Fix schema of query and sink do not match when generic or POJO type is requested

2020-03-03 Thread GitBox
wuchong commented on issue #11236: 
[FLINK-16269][FLINK-16108][table-planner-blink] Fix schema of query and sink do 
not match when generic or POJO type is requested
URL: https://github.com/apache/flink/pull/11236#issuecomment-594284462
 
 
   Travis failed on flink-runtime-web, which is not related to these changes. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #11252: [FLINK-16337][python][table-planner-blink] Add support of vectorized Python UDF in blink planner

2020-03-03 Thread GitBox
flinkbot edited a comment on issue #11252: 
[FLINK-16337][python][table-planner-blink] Add support of vectorized Python UDF 
in blink planner
URL: https://github.com/apache/flink/pull/11252#issuecomment-592511050
 
 
   
   ## CI report:
   
   * d90b0eb350b2062894d7c9da8230deda7e592b45 Travis: 
[FAILURE](https://travis-ci.com/flink-ci/flink/builds/151592119) Azure: 
[FAILURE](https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_build/results?buildId=5885)
 
   * 542c6f80de4976c204a68968f6d5af383dad1ac5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run travis` re-run the last Travis build
- `@flinkbot run azure` re-run the last Azure build
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-15585) Improve function identifier string in plan digest

2020-03-03 Thread godfrey he (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050686#comment-17050686
 ] 

godfrey he commented on FLINK-15585:


I'm working on this

> Improve function identifier string in plan digest
> -
>
> Key: FLINK-15585
> URL: https://issues.apache.org/jira/browse/FLINK-15585
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Reporter: Jark Wu
>Assignee: godfrey he
>Priority: Major
> Fix For: 1.11.0
>
>
> Currently, we are using {{UserDefinedFunction#functionIdentifier}} as the 
> identifier string of UDFs in plan digest, for example: 
> {code:java}
> LogicalTableFunctionScan(invocation=[org$apache$flink$table$planner$utils$TableFunc1$8050927803993624f40152a838c98018($2)],
>  rowType=...)
> {code}
> However, the result of {{UserDefinedFunction#functionIdentifier}} will change 
> if we just add a method in UserDefinedFunction, because it uses Java 
> serialization. Then we have to update 60 plan tests which is very annoying. 
> In the other hand, displaying the function identifier string in operator name 
> in Web UI is verbose to users. 
> In order to improve this situation, there are something we can do:
> 1) If the UDF has a catalog function name, we can just use the catalog name 
> as the digest. Otherwise, fallback to (2). 
> 2) If the UDF doesn't contain fields, we just use the full calss name as the 
> digest. Otherwise, fallback to (3).
> 3) Use identifier string which will do the full serialization.
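
A sketch of the three-step fallback described above (hasNoFields and the
method shape are illustrative, not the final implementation):

    String toDigest(UserDefinedFunction f, @Nullable ObjectIdentifier catalogName) {
        if (catalogName != null) {
            return catalogName.asSummaryString();  // (1) catalog function name
        } else if (hasNoFields(f.getClass())) {
            return f.getClass().getName();         // (2) full class name
        } else {
            return f.functionIdentifier();         // (3) full serialization
        }
    }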



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] KarmaGYZ closed pull request #11248: [FLINK-16299] Release containers recovered from previous attempt in w…

2020-03-03 Thread GitBox
KarmaGYZ closed pull request #11248: [FLINK-16299] Release containers recovered 
from previous attempt in w…
URL: https://github.com/apache/flink/pull/11248
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] KarmaGYZ commented on issue #11248: [FLINK-16299] Release containers recovered from previous attempt in w…

2020-03-03 Thread GitBox
KarmaGYZ commented on issue #11248: [FLINK-16299] Release containers recovered 
from previous attempt in w…
URL: https://github.com/apache/flink/pull/11248#issuecomment-594262138
 
 
   Closed since it is not a problem. For details, see 
[FLINK-16299](https://issues.apache.org/jira/browse/FLINK-16299).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Resolved] (FLINK-16299) Release containers recovered from previous attempt in which TaskExecutor is not started.

2020-03-03 Thread Yangze Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-16299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangze Guo resolved FLINK-16299.

Resolution: Not A Problem

> Release containers recovered from previous attempt in which TaskExecutor is 
> not started.
> 
>
> Key: FLINK-16299
> URL: https://issues.apache.org/jira/browse/FLINK-16299
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / YARN
>Reporter: Xintong Song
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As discussed in FLINK-16215, on Yarn deployment, {{YarnResourceManager}} 
> starts a new {{TaskExecutor}} in two steps:
>  # Request a new container from Yarn
>  # Starts a {{TaskExecutor}} process in the allocated container
> If JM failover happens between the two steps, in the new attempt 
> {{YarnResourceManager}} will not start {{TaskExecutor}} processes in 
> recovered containers. That means such containers are neither used nor 
> released.
> A potential fix to this problem is to query for the container status by 
> calling {{NMClientAsync#getContainerStatusAsync}}, and release the containers 
> whose state is {{NEW}}, keeps only those whose state is {{RUNNING}} and 
> waiting for them to register.
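
A sketch of the potential fix described above, using the Hadoop YARN async
client API (nmClientAsync and resourceManagerClient are assumed fields of the
resource manager, and handler wiring is omitted; note the issue was ultimately
closed as "Not A Problem", so this was never needed):

    // ask the NodeManager for the status of a recovered container
    nmClientAsync.getContainerStatusAsync(container.getId(), container.getNodeId());

    // in the NMClientAsync.CallbackHandler:
    @Override
    public void onContainerStatusReceived(ContainerId containerId, ContainerStatus status) {
        if (status.getState() == ContainerState.NEW) {
            // never started in the previous attempt, so release it
            resourceManagerClient.releaseAssignedContainer(containerId);
        } // RUNNING containers are kept and expected to register
    }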



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

