[GitHub] [flink] flinkbot edited a comment on issue #9469: [FLINK-13757][Docs] Update logical functions for this expression `boolean IS NOT TRUE`

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9469: [FLINK-13757][Docs] Update logical 
functions for this expression `boolean IS NOT TRUE`
URL: https://github.com/apache/flink/pull/9469#issuecomment-522203394
 
 
   ## CI report:
   
   * 6a793ee313851c7e8afc09f4b77534f00e23017c : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123584918)
   * b8ecce2c9e2642c295c4770123443dd796e5d5da : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123585169)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9469: [FLINK-13757][Docs] Update logical functions for this expression `boolean IS NOT TRUE`

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9469: [FLINK-13757][Docs] Update logical 
functions for this expression `boolean IS NOT TRUE`
URL: https://github.com/apache/flink/pull/9469#issuecomment-522203394
 
 
   ## CI report:
   
   * 6a793ee313851c7e8afc09f4b77534f00e23017c : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123584918)
   * b8ecce2c9e2642c295c4770123443dd796e5d5da : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123585169)
   




[GitHub] [flink] flinkbot commented on issue #9469: [FLINK-13757][Docs] Update logical functions for this expression `boolean IS NOT TRUE`

2019-08-16 Thread GitBox
flinkbot commented on issue #9469: [FLINK-13757][Docs] Update logical functions 
for this expression `boolean IS NOT TRUE`
URL: https://github.com/apache/flink/pull/9469#issuecomment-522203394
 
 
   ## CI report:
   
   * 6a793ee313851c7e8afc09f4b77534f00e23017c : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123584918)
   




[GitHub] [flink] flinkbot commented on issue #9469: [FLINK-13757][Docs] Update logical functions for this expression `boolean IS NOT TRUE`

2019-08-16 Thread GitBox
flinkbot commented on issue #9469: [FLINK-13757][Docs] Update logical functions 
for this expression `boolean IS NOT TRUE`
URL: https://github.com/apache/flink/pull/9469#issuecomment-522202884
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 6a793ee313851c7e8afc09f4b77534f00e23017c (Sat Aug 17 
04:16:46 UTC 2019)
   
   **Warnings:**
* Documentation files were touched, but no `.zh.md` files: Update Chinese 
documentation or file Jira ticket.
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-13757).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[GitHub] [flink] hehuiyuan opened a new pull request #9469: [FLINK-13757][Docs] Update logical functions for this expression `boolean IS NOT TRUE`

2019-08-16 Thread GitBox
hehuiyuan opened a new pull request #9469: [FLINK-13757][Docs] Update logical 
functions for this expression `boolean IS NOT TRUE`
URL: https://github.com/apache/flink/pull/9469
 
 
   
   
   ## What is the purpose of the change
   
   
![image](https://user-images.githubusercontent.com/18002496/63206504-147a0f00-c0e8-11e9-993f-0e1308d942d4.png)
   
   
   Incorrect (current docs):
   
   boolean IS NOT TRUE | Returns TRUE if boolean is FALSE or UNKNOWN; returns 
FALSE if boolean is **FALSE**.
   -- | --
   
   
   Correct:
   
   boolean IS NOT TRUE | Returns TRUE if boolean is FALSE or UNKNOWN; returns 
FALSE if boolean is **TRUE**.
   -- | --
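The corrected row follows SQL three-valued logic, where UNKNOWN (NULL) is a third truth value. A minimal Python sketch of `boolean IS NOT TRUE`, using `None` to model UNKNOWN (the helper name is illustrative, not part of Flink):

```python
def is_not_true(b):
    # b is True, False, or None (None models SQL's UNKNOWN/NULL).
    # `boolean IS NOT TRUE` is TRUE unless the operand is exactly TRUE.
    return b is not True

print(is_not_true(False))  # True  (FALSE   -> TRUE)
print(is_not_true(None))   # True  (UNKNOWN -> TRUE)
print(is_not_true(True))   # False (TRUE    -> FALSE)
```

This makes the doc error easy to see: only the TRUE operand yields FALSE, which is what the corrected table says.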
   
   
   
   
   
   
   
   ## Brief change log
   
   Update the logical functions documentation for the expression `boolean IS NOT TRUE`.
   
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
   




[jira] [Updated] (FLINK-13757) Document error for `logical functions`

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13757:
---
Labels: pull-request-available  (was: )

> Document error for  `logical functions`
> ---
>
> Key: FLINK-13757
> URL: https://issues.apache.org/jira/browse/FLINK-13757
> Project: Flink
>  Issue Type: Wish
>  Components: Documentation
>Reporter: hehuiyuan
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2019-08-17-11-58-53-247.png
>
>
> [https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]
> False:
> |{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; 
> returns FALSE if _boolean_ is *FALSE*.|
> True:
> |{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; 
> returns FALSE if _boolean_ is *TRUE*.|
> [!image-2019-08-17-11-58-53-247.png!|https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13757) Document error for `logical functions`

2019-08-16 Thread hehuiyuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hehuiyuan updated FLINK-13757:
--
Description: 
[https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]

False:
|{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; returns 
FALSE if _boolean_ is *FALSE*.|

True:
|{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; returns 
FALSE if _boolean_ is *TRUE*.|

[!image-2019-08-17-11-58-53-247.png!|https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]

  was:
[https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]

False:
|{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; returns 
FALSE if _boolean_ is FALSE.|

True:
|{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; returns 
FALSE if _boolean_ is TURE.|

[!image-2019-08-17-11-58-53-247.png!|https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]


> Document error for  `logical functions`
> ---
>
> Key: FLINK-13757
> URL: https://issues.apache.org/jira/browse/FLINK-13757
> Project: Flink
>  Issue Type: Wish
>  Components: Documentation
>Reporter: hehuiyuan
>Priority: Major
> Attachments: image-2019-08-17-11-58-53-247.png
>
>
> [https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]
> False:
> |{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; 
> returns FALSE if _boolean_ is *FALSE*.|
> True:
> |{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; 
> returns FALSE if _boolean_ is *TRUE*.|
> [!image-2019-08-17-11-58-53-247.png!|https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]





[jira] [Created] (FLINK-13757) Document error for `logical functions`

2019-08-16 Thread hehuiyuan (JIRA)
hehuiyuan created FLINK-13757:
-

 Summary: Document error for  `logical functions`
 Key: FLINK-13757
 URL: https://issues.apache.org/jira/browse/FLINK-13757
 Project: Flink
  Issue Type: Wish
  Components: Documentation
Reporter: hehuiyuan
 Attachments: image-2019-08-17-11-58-53-247.png

[https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]

False:
|{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; returns 
FALSE if _boolean_ is FALSE.|

True:
|{{boolean IS NOT TRUE}}|Returns TRUE if _boolean_ is FALSE or UNKNOWN; returns 
FALSE if _boolean_ is TURE.|

[!image-2019-08-17-11-58-53-247.png!|https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/functions.html#logical-functions]





[jira] [Created] (FLINK-13756) Modify Code Annotations for findAndCreateTableSource in TableFactoryUtil

2019-08-16 Thread hehuiyuan (JIRA)
hehuiyuan created FLINK-13756:
-

 Summary:  Modify Code Annotations for findAndCreateTableSource  in 
TableFactoryUtil
 Key: FLINK-13756
 URL: https://issues.apache.org/jira/browse/FLINK-13756
 Project: Flink
  Issue Type: Wish
  Components: Table SQL / API
Reporter: hehuiyuan


 

/**
 * Returns a *table sink* matching the {@link org.apache.flink.table.catalog.CatalogTable}.
 */
public static <T> TableSource<T> findAndCreateTableSource(CatalogTable table) {
    return findAndCreateTableSource(table.toProperties());
}

Hi, this method `findAndCreateTableSource` is used to return a 
`TableSource`, but the Javadoc annotation says *`Returns a table sink`*; it 
should say *`Returns a table source`*.

 





[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Description: 
Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
are registered into an in-memory function catalog at runtime, which makes them 
architecturally hard for Flink to integrate with.

The first and most basic option is to do it the hard way by integrating Hive's 
function registry, which can be architecturally difficult.

A second option to support the rich set of Hive built-in functions is to develop 
built-in functions in Flink with the same logic. I did a simple comparison: with 
Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. 
Please see the attached files. According to my sampling of the 195 functions, 
some are straightforward to rewrite, and some don't seem to be frequently used.

Besides rewriting all of them, another option is for users to manually register 
those built-in functions in the Hive Metastore, so that Flink can load them 
through HiveCatalog at runtime.

Lastly, we can load and hold Hive built-in functions in an in-memory map in 
HiveCatalog, as if they came from Hive's function registry.

 

cc [~xuefuz] [~lirui] [~Terry1897]

  was:
Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
registered into in-memory function catalog at runtime, which makes it hard for 
Flink to integrate with architecturely.

One option to support rich Hive built-in functions is to develop builtin 
functions in Flink with the same logic. I did a simple comparison. With Flink 
1.10.0 and Hive 2.3.4, they have 56 common (of same name) built-in functions; 
there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. Please 
see attached files. According my sampling in the 195 functions, some are 
straight-forward to rewrite, some don't seem to be frequently used. Besides 
rewriting all of them, another option for users is to manually register those 
builtin functions in Hive metastore, so Flink can load them thru HiveCatalog at 
runtime.

 

cc [~xuefuz] [~lirui] [~Terry1897]


> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
> are registered into an in-memory function catalog at runtime, which makes them 
> architecturally hard for Flink to integrate with.
> The first and most basic option is to do it the hard way by integrating Hive's 
> function registry, which can be architecturally difficult.
> A second option to support the rich set of Hive built-in functions is to develop 
> built-in functions in Flink with the same logic. I did a simple comparison: with 
> Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
> and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. 
> Please see the attached files. According to my sampling of the 195 functions, 
> some are straightforward to rewrite, and some don't seem to be frequently used.
> Besides rewriting all of them, another option is for users to manually register 
> those built-in functions in the Hive Metastore, so that Flink can load them 
> through HiveCatalog at runtime.
> Lastly, we can load and hold Hive built-in functions in an in-memory map in 
> HiveCatalog, as if they came from Hive's function registry.
>  
> cc [~xuefuz] [~lirui] [~Terry1897]
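The same-name comparison described above amounts to plain set operations over the two function catalogs. A hypothetical Python sketch (the tiny function lists are illustrative stand-ins; the real 56/195 lists are in the attached .txt files):

```python
# Hypothetical stand-in catalogs, NOT the real Flink/Hive built-in lists.
flink_builtins = {"concat", "lower", "upper", "abs"}
hive_builtins = {"concat", "lower", "upper", "hex", "unhex"}

common = flink_builtins & hive_builtins            # same-name built-ins
missing_in_flink = hive_builtins - flink_builtins  # Hive-only built-ins

print(sorted(common))            # ['concat', 'lower', 'upper']
print(sorted(missing_in_flink))  # ['hex', 'unhex']
```

Run against the real catalogs, `common` would have 56 entries and `missing_in_flink` 195, per the comparison above.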





[jira] [Commented] (FLINK-13749) Make Flink client respect classloading policy

2019-08-16 Thread Paul Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909547#comment-16909547
 ] 

Paul Lin commented on FLINK-13749:
--

Please assign this issue to me. Thanks!

> Make Flink client respect classloading policy
> -
>
> Key: FLINK-13749
> URL: https://issues.apache.org/jira/browse/FLINK-13749
> Project: Flink
>  Issue Type: Improvement
>  Components: Command Line Client, Runtime / REST
>Affects Versions: 1.9.0
>Reporter: Paul Lin
>Priority: Minor
>
> Currently, the Flink client does not respect the classloading policy: it uses a 
> hardcoded parent-first classloader, while the other components like the 
> jobmanager and taskmanager use a child-first classloader by default and respect 
> the classloading options. This makes the client more likely to have dependency 
> conflicts, especially after we removed the convenient Hadoop binaries (so users 
> need to add the Hadoop classpath to the client classpath).
> So I propose aligning the Flink client's classloading behavior (including the 
> CLI and REST handler) with the other components.
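For context, the resolve order the issue refers to is a standard Flink configuration option; a sketch of the relevant `flink-conf.yaml` entry (behavior as documented around Flink 1.9):

```yaml
# Class resolution order used by the JobManager/TaskManager user-code class
# loaders. 'child-first' is the default; the issue proposes that the client
# (CLI and REST handler) honor this setting too instead of hardcoding
# parent-first.
classloader.resolve-order: child-first
```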





[GitHub] [flink] zjuwangg commented on issue #9447: [FLINK-13643][docs]Document the workaround for users with a different minor Hive version

2019-08-16 Thread GitBox
zjuwangg commented on issue #9447: [FLINK-13643][docs]Document the workaround 
for users with a different minor Hive version
URL: https://github.com/apache/flink/pull/9447#issuecomment-522194254
 
 
   > @zjuwangg Thanks for updating. I think we should make the following points 
clear to users:
   > 
   > 1. Only 2.3.4 and 1.2.1 have been tested.
   > 2. Users are welcome to try out different versions with this workaround, 
but there might be unexpected issues.
   > 3. We will test and support more versions in future releases.
   > 
   > Besides, I think we also have to mention how to set the version via Table 
API -- pass the version string when creating HiveCatalog instance.
   
   Agreed.
   As for `how to set the version via Table API`, mentioning it may be redundant, 
since there are examples following.




[jira] [Commented] (FLINK-13025) Elasticsearch 7.x support

2019-08-16 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909531#comment-16909531
 ] 

vinoyang commented on FLINK-13025:
--

[~victorvilladev] Thanks for your opinion. If [~aljoscha] agrees, I will start 
implementing a new connector.

> Elasticsearch 7.x support
> -
>
> Key: FLINK-13025
> URL: https://issues.apache.org/jira/browse/FLINK-13025
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.8.0
>Reporter: Keegan Standifer
>Priority: Major
>
> Elasticsearch 7.0.0 was released in April of 2019: 
> [https://www.elastic.co/blog/elasticsearch-7-0-0-released]
> The latest elasticsearch connector is 
> [flink-connector-elasticsearch6|https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-elasticsearch6]





[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Description: 
Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
are registered into an in-memory function catalog at runtime, which makes them 
architecturally hard for Flink to integrate with.

One option to support the rich set of Hive built-in functions is to develop 
built-in functions in Flink with the same logic. I did a simple comparison: with 
Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. 
Please see the attached files. According to my sampling of the 195 functions, 
some are straightforward to rewrite, and some don't seem to be frequently used. 
Besides rewriting all of them, another option is for users to manually register 
those built-in functions in the Hive Metastore, so that Flink can load them 
through HiveCatalog at runtime.

 

cc [~xuefuz] [~lirui] [~Terry1897]

  was:
Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
registered into in-memory function catalog at runtime, which makes it hard for 
Flink to integrate with architecturely.

One option to support rich Hive built-in functions is to develop builtin 
functions in Flink with the same logic. I did a simple comparison. With Flink 
1.10.0 and Hive 2.3.4, they have 56 common (of same name) built-in functions; 
there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. Please 
see attached files

 

cc [~xuefuz] [~lirui] [~Terry1897]


> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
> are registered into an in-memory function catalog at runtime, which makes them 
> architecturally hard for Flink to integrate with.
> One option to support the rich set of Hive built-in functions is to develop 
> built-in functions in Flink with the same logic. I did a simple comparison: with 
> Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
> and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. 
> Please see the attached files. According to my sampling of the 195 functions, 
> some are straightforward to rewrite, and some don't seem to be frequently used. 
> Besides rewriting all of them, another option is for users to manually register 
> those built-in functions in the Hive Metastore, so that Flink can load them 
> through HiveCatalog at runtime.
>  
> cc [~xuefuz] [~lirui] [~Terry1897]





[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Description: 
Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
are registered into an in-memory function catalog at runtime, which makes them 
architecturally hard for Flink to integrate with.

One option to support the rich set of Hive built-in functions is to develop 
built-in functions in Flink with the same logic. I did a simple comparison: with 
Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. 
Please see the attached files.

 

cc [~xuefuz] [~lirui] [~Terry1897]

  was:
Unlike UDFs that are persisted in Hive Metastore, Hive builtin functions are 
registered into in-memory function catalog at runtime, which makes it hard for 
Flink to integrate with architecturely.

One option to support rich Hive built-in functions is to develop builtin 
functions in Flink with the same logic. I did a simple comparison. With Flink 
1.10.0 and Hive 2.3.4, they have 56 common (of same name) built-in functions; 
there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0.


> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
> are registered into an in-memory function catalog at runtime, which makes them 
> architecturally hard for Flink to integrate with.
> One option to support the rich set of Hive built-in functions is to develop 
> built-in functions in Flink with the same logic. I did a simple comparison: with 
> Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
> and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0. 
> Please see the attached files.
>  
> cc [~xuefuz] [~lirui] [~Terry1897]





[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Attachment: hive builtin functions that are missing in flink.txt

> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
> are registered into an in-memory function catalog at runtime, which makes them 
> architecturally hard for Flink to integrate with.
> One option to support the rich set of Hive built-in functions is to develop 
> built-in functions in Flink with the same logic. I did a simple comparison: with 
> Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
> and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0.





[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Attachment: (was: hive builtin functions that are missing in flink.txt)

> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
> are registered into an in-memory function catalog at runtime, which makes them 
> architecturally hard for Flink to integrate with.
> One option to support the rich set of Hive built-in functions is to develop 
> built-in functions in Flink with the same logic. I did a simple comparison: with 
> Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
> and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0.





[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Description: 
Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
are registered into an in-memory function catalog at runtime, which makes them 
architecturally hard for Flink to integrate with.

One option to support the rich set of Hive built-in functions is to develop 
built-in functions in Flink with the same logic. I did a simple comparison: with 
Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0.

> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>
> Unlike UDFs, which are persisted in the Hive Metastore, Hive built-in functions 
> are registered into an in-memory function catalog at runtime, which makes them 
> architecturally hard for Flink to integrate with.
> One option to support the rich set of Hive built-in functions is to develop 
> built-in functions in Flink with the same logic. I did a simple comparison: with 
> Flink 1.10.0 and Hive 2.3.4, they have 56 common (same-name) built-in functions, 
> and there are 195 functions in Hive 2.3.4 that don't exist in Flink 1.10.0.





[GitHub] [flink] flinkbot edited a comment on issue #9457: [FLINK-13741][table] "SHOW FUNCTIONS" should include Flink built-in functions' names

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9457: [FLINK-13741][table] "SHOW FUNCTIONS" 
should include Flink built-in functions' names
URL: https://github.com/apache/flink/pull/9457#issuecomment-521829752
 
 
   ## CI report:
   
   * 55c0e5843e029f022ff59fe14a9e6c1d2c5ac69e : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123443311)
   * 006236fff94d0204223a2c3b89f621da3248f6a4 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123444248)
   * 726259a0a1bfb2061f77a82c586d9b3a4c70abb6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123572731)
   




[GitHub] [flink] YngwieWang commented on a change in pull request #7751: [FLINK-11608] [docs] Translate the "Local Setup Tutorial" page into Chinese

2019-08-16 Thread GitBox
YngwieWang commented on a change in pull request #7751:  [FLINK-11608] [docs] 
Translate the "Local Setup Tutorial" page into Chinese
URL: https://github.com/apache/flink/pull/7751#discussion_r314920585
 
 

 ##
 File path: docs/tutorials/local_setup.zh.md
 ##
 @@ -0,0 +1,288 @@
+---
+title: "本地安装教程"
+nav-title: 'Local Setup'
+nav-parent_id: setuptutorials
+nav-pos: 10
+---
+
+
+* This will be replaced by the TOC
+{:toc}
+
+只需要几个简单的步骤即可启动并运行 Flink 示例程序。
+
+## 安装:下载并启动 Flink 
+
+Flink 可以在 __Linux、Mac OS X、和 Windows__ 环境中运行。为了能够运行 Flink 唯一要求是安装 __Java 8.x__ 
。 Windows 用户,请查阅[在  Windows 上运行 Flink]({{ site.baseurl 
}}/tutorials/flink_on_windows.html)上面描述了如何在 windows 上以本地模式运行 Flink。
+
+你可以用下面的命令来检查一下是否正确安装了 Java 程序:
+
+{% highlight bash %}
+java -version
+{% endhighlight %}
+
+如果你已经安装了 Java 8,应该输出如下内容:
+
+{% highlight bash %}
+java version "1.8.0_111"
+Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
+Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
+{% endhighlight %}
+
+{% if site.is_stable %}
+
+
+1. 从[下载页](http://flink.apache.org/downloads.html)下载二进制文件。你可以选择任何喜欢的 Scala 
版本。针对某些特性,你可能还需要下载预捆绑的 Hadoop jar 包并将它们放入 `/lib` 目录。
+2. 进入下载后的目录。
+3. 解压下载的文件。
+
+
+{% highlight bash %}
+$ cd ~/Downloads        # Go to download directory
+$ tar xzf flink-*.tgz   # Unpack the downloaded archive
+$ cd flink-{{site.version}}
+{% endhighlight %}
+
+
+
+对于 MacOS X 用户,Flink 可以通过[Homebrew](https://brew.sh/)进行安装。
+
+{% highlight bash %}
+$ brew install apache-flink
+...
+$ flink --version
+Version: 1.2.0, Commit ID: 1c659cf
+{% endhighlight %}
+
+
+
+
+{% else %}
+### 下载和编译
+从我们的[代码仓库](http://flink.apache.org/community.html#source-code)中克隆源码,比如:
+
+{% highlight bash %}
+$ git clone https://github.com/apache/flink.git
+$ cd flink
+$ mvn clean package -DskipTests # this will take up to 10 minutes
+$ cd build-target   # this is where Flink is installed to
+{% endhighlight %}
+{% endif %}
+
+### 启动 Flink 本地集群
+
+{% highlight bash %}
+$ ./bin/start-cluster.sh  # Start Flink
+{% endhighlight %}
+
+检查位于[http://localhost:8081](http://localhost:8081)的 __web 调度界面__以确保一切正常运行。Web 
界面上会仅显示一个可用的 TaskManager 实例。
+
+
+
+还可以通过检查 `logs` 目录中的日志文件来验证系统是否正在运行:
+
+{% highlight bash %}
+$ tail log/flink-*-standalonesession-*.log
+INFO ... - Rest endpoint listening at localhost:8081
+INFO ... - http://localhost:8081 was granted leadership ...
+INFO ... - Web frontend listening at http://localhost:8081.
+INFO ... - Starting RPC endpoint for StandaloneResourceManager at 
akka://flink/user/resourcemanager .
+INFO ... - Starting RPC endpoint for StandaloneDispatcher at 
akka://flink/user/dispatcher .
+INFO ... - ResourceManager 
akka.tcp://flink@localhost:6123/user/resourcemanager was granted leadership ...
+INFO ... - Starting the SlotManager.
+INFO ... - Dispatcher akka.tcp://flink@localhost:6123/user/dispatcher was 
granted leadership ...
+INFO ... - Recovering all persisted jobs.
+INFO ... - Registering TaskManager ... under ... at the SlotManager.
+{% endhighlight %}
+
+## 阅读代码
+
+你可以在 Github 
上看到分别用[scala](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/scala/org/apache/flink/streaming/scala/examples/socket/SocketWindowWordCount.scala)和[java](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/socket/SocketWindowWordCount.java)编写的
 SocketWindowWordCount 的完整代码。
 
 Review comment:
   
![image](https://user-images.githubusercontent.com/22651167/63203413-d833b800-c0c1-11e9-886d-87016a4cd881.png)
  Run `build_docs.sh` to see the effect. In Markdown, a square-bracket pair followed by parentheses forms a hyperlink; when the hyperlink text is in English, leave a space between it and the surrounding Chinese.




[GitHub] [flink] YngwieWang commented on a change in pull request #7751: [FLINK-11608] [docs] Translate the "Local Setup Tutorial" page into Chinese

2019-08-16 Thread GitBox
YngwieWang commented on a change in pull request #7751:  [FLINK-11608] [docs] 
Translate the "Local Setup Tutorial" page into Chinese
URL: https://github.com/apache/flink/pull/7751#discussion_r314919768
 
 

 ##
 File path: docs/tutorials/local_setup.zh.md
 ##
 @@ -0,0 +1,288 @@
+---
+title: "本地安装教程"
+nav-title: 'Local Setup'
+nav-parent_id: setuptutorials
+nav-pos: 10
+---
+
+
+* This will be replaced by the TOC
+{:toc}
+
+只需要几个简单的步骤即可启动并运行 Flink 示例程序。
+
+## 安装:下载并启动 Flink 
+
+Flink 可以在 __Linux、Mac OS X、和 Windows__ 环境中运行。为了能够运行 Flink 唯一要求是安装 __Java 8.x__ 
。 Windows 用户,请查阅[在  Windows 上运行 Flink]({{ site.baseurl 
}}/tutorials/flink_on_windows.html)上面描述了如何在 windows 上以本地模式运行 Flink。
+
+你可以用下面的命令来检查一下是否正确安装了 Java 程序:
+
+{% highlight bash %}
+java -version
+{% endhighlight %}
+
+如果你已经安装了 Java 8,应该输出如下内容:
+
+{% highlight bash %}
+java version "1.8.0_111"
+Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
+Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
+{% endhighlight %}
+
+{% if site.is_stable %}
+
+
+1. 从[下载页](http://flink.apache.org/downloads.html)下载二进制文件。你可以选择任何喜欢的 Scala 
版本。针对某些特性,你可能还需要下载预捆绑的 Hadoop jar 包并将它们放入 `/lib` 目录。
+2. 进入下载后的目录。
+3. 解压下载的文件。
+
+
+{% highlight bash %}
+$ cd ~/Downloads        # Go to download directory
+$ tar xzf flink-*.tgz   # Unpack the downloaded archive
+$ cd flink-{{site.version}}
+{% endhighlight %}
+
+
+
+对于 MacOS X 用户,Flink 可以通过[Homebrew](https://brew.sh/)进行安装。
+
+{% highlight bash %}
+$ brew install apache-flink
+...
+$ flink --version
+Version: 1.2.0, Commit ID: 1c659cf
+{% endhighlight %}
+
+
+
+
+{% else %}
+### 下载和编译
+从我们的[代码仓库](http://flink.apache.org/community.html#source-code)中克隆源码,比如:
+
+{% highlight bash %}
+$ git clone https://github.com/apache/flink.git
+$ cd flink
+$ mvn clean package -DskipTests # this will take up to 10 minutes
+$ cd build-target   # this is where Flink is installed to
+{% endhighlight %}
+{% endif %}
+
+### 启动 Flink 本地集群
+
+{% highlight bash %}
+$ ./bin/start-cluster.sh  # Start Flink
+{% endhighlight %}
+
+检查位于[http://localhost:8081](http://localhost:8081)的 __web 调度界面__以确保一切正常运行。Web 
界面上会仅显示一个可用的 TaskManager 实例。
+
+
+
+还可以通过检查 `logs` 目录中的日志文件来验证系统是否正在运行:
+
+{% highlight bash %}
+$ tail log/flink-*-standalonesession-*.log
+INFO ... - Rest endpoint listening at localhost:8081
+INFO ... - http://localhost:8081 was granted leadership ...
+INFO ... - Web frontend listening at http://localhost:8081.
+INFO ... - Starting RPC endpoint for StandaloneResourceManager at 
akka://flink/user/resourcemanager .
+INFO ... - Starting RPC endpoint for StandaloneDispatcher at 
akka://flink/user/dispatcher .
+INFO ... - ResourceManager 
akka.tcp://flink@localhost:6123/user/resourcemanager was granted leadership ...
+INFO ... - Starting the SlotManager.
+INFO ... - Dispatcher akka.tcp://flink@localhost:6123/user/dispatcher was 
granted leadership ...
+INFO ... - Recovering all persisted jobs.
+INFO ... - Registering TaskManager ... under ... at the SlotManager.
+{% endhighlight %}
+
+## 阅读代码
+
+你可以在 Github 
上看到分别用[scala](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/scala/org/apache/flink/streaming/scala/examples/socket/SocketWindowWordCount.scala)和[java](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/socket/SocketWindowWordCount.java)编写的
 SocketWindowWordCount 的完整代码。
+
+
+
+{% highlight scala %}
+object SocketWindowWordCount {
+
+    def main(args: Array[String]) : Unit = {
+
+        // the port to connect to
+        val port: Int = try {
+            ParameterTool.fromArgs(args).getInt("port")
+        } catch {
+            case e: Exception => {
+                System.err.println("No port specified. Please run 'SocketWindowWordCount --port <port>'")
+                return
+            }
+        }
+
+        // get the execution environment
+        val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
+
+        // get input data by connecting to the socket
+        val text = env.socketTextStream("localhost", port, '\n')
+
+        // parse the data, group it, window it, and aggregate the counts
+        val windowCounts = text
+            .flatMap { w => w.split("\\s") }
+            .map { w => WordWithCount(w, 1) }
+            .keyBy("word")
+            .timeWindow(Time.seconds(5), Time.seconds(1))
+            .sum("count")
+
+        // print the results with a single thread, rather than in parallel
+        windowCounts.print().setParallelism(1)
+
+        env.execute("Socket Window WordCount")
+    }
+
+    // Data type for words with count
+    case class WordWithCount(word: String, count: Long)
+}
+{% endhighlight %}
+
+
+{% highlight java %}
+public class SocketWindowWordCount {
+
+public static void main(String[] args) throws Exception {
+
+// the port to connect to
+final int 

[GitHub] [flink] YngwieWang commented on a change in pull request #7751: [FLINK-11608] [docs] Translate the "Local Setup Tutorial" page into Chinese

2019-08-16 Thread GitBox
YngwieWang commented on a change in pull request #7751:  [FLINK-11608] [docs] 
Translate the "Local Setup Tutorial" page into Chinese
URL: https://github.com/apache/flink/pull/7751#discussion_r314919426
 
 

 ##
 File path: docs/tutorials/local_setup.zh.md
 ##
 @@ -0,0 +1,288 @@
+---
+title: "本地安装教程"
+nav-title: 'Local Setup'
+nav-parent_id: setuptutorials
+nav-pos: 10
+---
+
+
+* This will be replaced by the TOC
+{:toc}
+
+只需要几个简单的步骤即可启动并运行 Flink 示例程序。
+
+## 安装:下载并启动 Flink 
+
+Flink 可以在 __Linux、Mac OS X、和 Windows__ 环境中运行。为了能够运行 Flink 唯一要求是安装 __Java 8.x__ 
。 Windows 用户,请查阅[在  Windows 上运行 Flink]({{ site.baseurl 
}}/tutorials/flink_on_windows.html)上面描述了如何在 windows 上以本地模式运行 Flink。
+
+你可以用下面的命令来检查一下是否正确安装了 Java 程序:
+
+{% highlight bash %}
+java -version
+{% endhighlight %}
+
+如果你已经安装了 Java 8,应该输出如下内容:
+
+{% highlight bash %}
+java version "1.8.0_111"
+Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
+Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
+{% endhighlight %}
+
+{% if site.is_stable %}
+
+
+1. 从[下载页](http://flink.apache.org/downloads.html)下载二进制文件。你可以选择任何喜欢的 Scala 
版本。针对某些特性,你可能还需要下载预捆绑的 Hadoop jar 包并将它们放入 `/lib` 目录。
+2. 进入下载后的目录。
+3. 解压下载的文件。
+
+
+{% highlight bash %}
+$ cd ~/Downloads        # Go to download directory
+$ tar xzf flink-*.tgz   # Unpack the downloaded archive
+$ cd flink-{{site.version}}
+{% endhighlight %}
+
+
+
+对于 MacOS X 用户,Flink 可以通过[Homebrew](https://brew.sh/)进行安装。
+
+{% highlight bash %}
+$ brew install apache-flink
+...
+$ flink --version
+Version: 1.2.0, Commit ID: 1c659cf
+{% endhighlight %}
+
+
+
+
+{% else %}
+### 下载和编译
+从我们的[代码仓库](http://flink.apache.org/community.html#source-code)中克隆源码,比如:
+
+{% highlight bash %}
+$ git clone https://github.com/apache/flink.git
+$ cd flink
+$ mvn clean package -DskipTests # this will take up to 10 minutes
+$ cd build-target   # this is where Flink is installed to
+{% endhighlight %}
+{% endif %}
+
+### 启动 Flink 本地集群
+
+{% highlight bash %}
+$ ./bin/start-cluster.sh  # Start Flink
+{% endhighlight %}
+
+检查位于[http://localhost:8081](http://localhost:8081)的 __web 调度界面__以确保一切正常运行。Web 
界面上会仅显示一个可用的 TaskManager 实例。
+
+
+
+还可以通过检查 `logs` 目录中的日志文件来验证系统是否正在运行:
+
+{% highlight bash %}
+$ tail log/flink-*-standalonesession-*.log
+INFO ... - Rest endpoint listening at localhost:8081
+INFO ... - http://localhost:8081 was granted leadership ...
+INFO ... - Web frontend listening at http://localhost:8081.
+INFO ... - Starting RPC endpoint for StandaloneResourceManager at 
akka://flink/user/resourcemanager .
+INFO ... - Starting RPC endpoint for StandaloneDispatcher at 
akka://flink/user/dispatcher .
+INFO ... - ResourceManager 
akka.tcp://flink@localhost:6123/user/resourcemanager was granted leadership ...
+INFO ... - Starting the SlotManager.
+INFO ... - Dispatcher akka.tcp://flink@localhost:6123/user/dispatcher was 
granted leadership ...
+INFO ... - Recovering all persisted jobs.
+INFO ... - Registering TaskManager ... under ... at the SlotManager.
+{% endhighlight %}
+
+## 阅读代码
+
+你可以在 Github 
上看到分别用[scala](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/scala/org/apache/flink/streaming/scala/examples/socket/SocketWindowWordCount.scala)和[java](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/socket/SocketWindowWordCount.java)编写的
 SocketWindowWordCount 的完整代码。
 
 Review comment:
   ```suggestion
   你可以在 Github 上看到分别用 
[scala](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/scala/org/apache/flink/streaming/scala/examples/socket/SocketWindowWordCount.scala)
 和 
[java](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/socket/SocketWindowWordCount.java)
 编写的 SocketWindowWordCount 示例程序的完整代码。
   ```




[GitHub] [flink] YngwieWang commented on a change in pull request #7751: [FLINK-11608] [docs] Translate the "Local Setup Tutorial" page into Chinese

2019-08-16 Thread GitBox
YngwieWang commented on a change in pull request #7751:  [FLINK-11608] [docs] 
Translate the "Local Setup Tutorial" page into Chinese
URL: https://github.com/apache/flink/pull/7751#discussion_r313786797
 
 

 ##
 File path: docs/tutorials/local_setup.zh.md
 ##
 @@ -0,0 +1,288 @@
+---
+title: "本地安装教程"
+nav-title: 'Local Setup'
+nav-parent_id: setuptutorials
+nav-pos: 10
+---
+
+
+* This will be replaced by the TOC
+{:toc}
+
+只需要几个简单的步骤即可启动并运行 Flink 示例程序。
+
+## 安装:下载并启动 Flink 
+
+Flink 可以在 __Linux、 Mac OS X、 和 Windows__ 环境中运行。为了能够运行 Flink 唯一的要求是安装 __Java 
8.x__ 。 Windows 用户, 请查阅[在  Windows 上运行 Flink]({{ site.baseurl 
}}/tutorials/flink_on_windows.html)  上面描述了如何在 windows 上以本地模式运行 Flink。
+
+你可以用下面的命令来检查一下是否正确的安装了 Java 程序:
+
+{% highlight bash %}
+java -version
+{% endhighlight %}
+
+如果你已经安装了 Java 8,应该输出如下内容:
+
+{% highlight bash %}
+java version "1.8.0_111"
+Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
+Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
+{% endhighlight %}
+
+{% if site.is_stable %}
+
+
+1. 从[下载页](http://flink.apache.org/downloads.html)下载二进制文件。你可以选择任何喜欢的 Scala 
版本。针对某些特性,你可能还需要下载预捆绑的 Hadoop jar 包并将它们放入 `/lib`  目录。
+2. 进入下载后的目录。
+3. 解压下载的文件。
+
+
+{% highlight bash %}
+$ cd ~/Downloads        # Go to download directory
+$ tar xzf flink-*.tgz   # Unpack the downloaded archive
+$ cd flink-{{site.version}}
+{% endhighlight %}
+
+
+
+对于 MacOS X 用户,Flink 可以通过[Homebrew](https://brew.sh/)进行安装。
+
+{% highlight bash %}
+$ brew install apache-flink
+...
+$ flink --version
+Version: 1.2.0, Commit ID: 1c659cf
+{% endhighlight %}
+
+
+
+
+{% else %}
+### 下载和编译
+从我们的[代码仓库](http://flink.apache.org/community.html#source-code)中克隆源码,比如:
+
+{% highlight bash %}
+$ git clone https://github.com/apache/flink.git
+$ cd flink
+$ mvn clean package -DskipTests # this will take up to 10 minutes
+$ cd build-target   # this is where Flink is installed to
+{% endhighlight %}
+{% endif %}
+
+### 启动 Flink 本地集群
+
+{% highlight bash %}
+$ ./bin/start-cluster.sh  # Start Flink
+{% endhighlight %}
+
+检查位于[http://localhost:8081](http://localhost:8081)的 __web 调度界面__以确保一切正常运行。Web 
界面上会仅显示一个可用的 TaskManager 实例。
 
 Review comment:
   ```suggestion
   检查位于 [http://localhost:8081](http://localhost:8081) 的 __web 调度界面__ 
以确保一切正常运行。Web 界面上会仅显示一个可用的 TaskManager 实例。
   ```




[GitHub] [flink] YngwieWang commented on a change in pull request #7751: [FLINK-11608] [docs] Translate the "Local Setup Tutorial" page into Chinese

2019-08-16 Thread GitBox
YngwieWang commented on a change in pull request #7751:  [FLINK-11608] [docs] 
Translate the "Local Setup Tutorial" page into Chinese
URL: https://github.com/apache/flink/pull/7751#discussion_r314919093
 
 

 ##
 File path: docs/tutorials/local_setup.zh.md
 ##
 @@ -0,0 +1,288 @@
+---
+title: "本地安装教程"
+nav-title: 'Local Setup'
+nav-parent_id: setuptutorials
+nav-pos: 10
+---
+
+
+* This will be replaced by the TOC
+{:toc}
+
+只需要几个简单的步骤即可启动并运行 Flink 示例程序。
+
+## 安装:下载并启动 Flink 
+
+Flink 可以在 __Linux、Mac OS X、和 Windows__ 环境中运行。为了能够运行 Flink 唯一要求是安装 __Java 8.x__ 
。 Windows 用户,请查阅[在  Windows 上运行 Flink]({{ site.baseurl 
}}/tutorials/flink_on_windows.html)上面描述了如何在 windows 上以本地模式运行 Flink。
 
 Review comment:
   ```suggestion
   Flink 可以在 __Linux、Mac OS X 和 Windows__ 环境中运行。运行 Flink 的唯一要求是安装 __Java 
8.x__。Windows 用户请查阅[在  Windows 上运行 Flink]({{ site.baseurl 
}}/zh/tutorials/flink_on_windows.html),上面描述了如何在 windows 上以本地模式运行 Flink。
   ```




[jira] [Updated] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13755:
-
Attachment: common builtin functions is flink and hive.txt
hive builtin functions that are missing in flink.txt

> support Hive built-in functions in Flink
> 
>
> Key: FLINK-13755
> URL: https://issues.apache.org/jira/browse/FLINK-13755
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.10.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.10.0
>
> Attachments: common builtin functions is flink and hive.txt, hive 
> builtin functions that are missing in flink.txt
>
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13755) support Hive built-in functions in Flink

2019-08-16 Thread Bowen Li (JIRA)
Bowen Li created FLINK-13755:


 Summary: support Hive built-in functions in Flink
 Key: FLINK-13755
 URL: https://issues.apache.org/jira/browse/FLINK-13755
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / Hive
Affects Versions: 1.10.0
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.10.0








[GitHub] [flink] flinkbot edited a comment on issue #9457: [FLINK-13741][table] "SHOW FUNCTIONS" should include Flink built-in functions' names

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9457: [FLINK-13741][table] "SHOW FUNCTIONS" 
should include Flink built-in functions' names
URL: https://github.com/apache/flink/pull/9457#issuecomment-521829752
 
 
   ## CI report:
   
   * 55c0e5843e029f022ff59fe14a9e6c1d2c5ac69e : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123443311)
   * 006236fff94d0204223a2c3b89f621da3248f6a4 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123444248)
   * 726259a0a1bfb2061f77a82c586d9b3a4c70abb6 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123572731)
   




[jira] [Closed] (FLINK-13747) Remove some TODOs in Hive connector

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li closed FLINK-13747.

Resolution: Fixed

merged in 1.10.0: a6571bb61f41a65b47ec250231a32e14b1390069

> Remove some TODOs in Hive connector
> ---
>
> Key: FLINK-13747
> URL: https://issues.apache.org/jira/browse/FLINK-13747
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.9.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[GitHub] [flink] asfgit closed pull request #9460: [FLINK-13747][hive] Remove some TODOs in Hive connector

2019-08-16 Thread GitBox
asfgit closed pull request #9460: [FLINK-13747][hive] Remove some TODOs in Hive 
connector
URL: https://github.com/apache/flink/pull/9460
 
 
   




[jira] [Updated] (FLINK-13711) Hive array values not properly displayed in SQL CLI

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13711:
-
Fix Version/s: 1.9.1

> Hive array values not properly displayed in SQL CLI
> ---
>
> Key: FLINK-13711
> URL: https://issues.apache.org/jira/browse/FLINK-13711
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.9.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Array values are displayed like:
> {noformat}
>  [Ljava.lang.Integer;@632~
>  [Ljava.lang.Integer;@6de~
> {noformat}



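The `[Ljava.lang.Integer;@632~` strings quoted in the issue are Java's default `Object.toString()` for arrays (class name, `@`, hex hash code). A minimal sketch of the symptom and the usual remedy, formatting with `Arrays.deepToString` (an illustration only, not the actual SQL CLI patch):

```java
import java.util.Arrays;

public class ArrayDisplaySketch {
    public static void main(String[] args) {
        Integer[] row = {1, 2, 3};
        // Default array toString: "[Ljava.lang.Integer;@" + hex hash code,
        // which is what leaked into the SQL CLI output.
        System.out.println(row.toString());
        // Explicit formatting produces the human-readable form.
        System.out.println(Arrays.deepToString(row)); // [1, 2, 3]
    }
}
```

Any code path that prints result rows has to special-case array-typed columns this way, because `String.valueOf` on an array never recurses into its elements.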


[jira] [Updated] (FLINK-13711) Hive array values not properly displayed in SQL CLI

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13711:
-
Affects Version/s: 1.9.0

> Hive array values not properly displayed in SQL CLI
> ---
>
> Key: FLINK-13711
> URL: https://issues.apache.org/jira/browse/FLINK-13711
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.9.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Array values are displayed like:
> {noformat}
>  [Ljava.lang.Integer;@632~
>  [Ljava.lang.Integer;@6de~
> {noformat}





[jira] [Assigned] (FLINK-13711) Hive array values not properly displayed in SQL CLI

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li reassigned FLINK-13711:


Assignee: Rui Li

> Hive array values not properly displayed in SQL CLI
> ---
>
> Key: FLINK-13711
> URL: https://issues.apache.org/jira/browse/FLINK-13711
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Array values are displayed like:
> {noformat}
>  [Ljava.lang.Integer;@632~
>  [Ljava.lang.Integer;@6de~
> {noformat}





[GitHub] [flink] bowenli86 commented on issue #9450: [FLINK-13711][sql-client] Hive array values not properly displayed in…

2019-08-16 Thread GitBox
bowenli86 commented on issue #9450: [FLINK-13711][sql-client] Hive array values 
not properly displayed in…
URL: https://github.com/apache/flink/pull/9450#issuecomment-522159565
 
 
   @twalthr given this is a bug, I think we can merge it into 1.9.1, what do 
you think?




[jira] [Updated] (FLINK-13747) Remove some TODOs in Hive connector

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13747:
-
Issue Type: Improvement  (was: Bug)

> Remove some TODOs in Hive connector
> ---
>
> Key: FLINK-13747
> URL: https://issues.apache.org/jira/browse/FLINK-13747
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.9.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (FLINK-13747) Remove some TODOs in Hive connector

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13747:
-
Fix Version/s: 1.10.0

> Remove some TODOs in Hive connector
> ---
>
> Key: FLINK-13747
> URL: https://issues.apache.org/jira/browse/FLINK-13747
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive
>Affects Versions: 1.9.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (FLINK-13747) Remove some TODOs in Hive connector

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-13747:
-
Affects Version/s: 1.9.0

> Remove some TODOs in Hive connector
> ---
>
> Key: FLINK-13747
> URL: https://issues.apache.org/jira/browse/FLINK-13747
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Affects Versions: 1.9.0
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (FLINK-13747) Remove some TODOs in Hive connector

2019-08-16 Thread Bowen Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li reassigned FLINK-13747:


Assignee: Rui Li

> Remove some TODOs in Hive connector
> ---
>
> Key: FLINK-13747
> URL: https://issues.apache.org/jira/browse/FLINK-13747
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Hive
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[GitHub] [flink] asfgit closed pull request #9217: [FLINK-13277][hive] add documentation of Hive source/sink

2019-08-16 Thread GitBox
asfgit closed pull request #9217: [FLINK-13277][hive] add documentation of Hive 
source/sink
URL: https://github.com/apache/flink/pull/9217
 
 
   




[GitHub] [flink] bowenli86 commented on issue #9446: [hotfix][hive][doc] refine Hive related documentations

2019-08-16 Thread GitBox
bowenli86 commented on issue #9446: [hotfix][hive][doc] refine Hive related 
documentations
URL: https://github.com/apache/flink/pull/9446#issuecomment-522150808
 
 
   merged in master and release-1.9




[GitHub] [flink] bowenli86 closed pull request #9446: [hotfix][hive][doc] refine Hive related documentations

2019-08-16 Thread GitBox
bowenli86 closed pull request #9446: [hotfix][hive][doc] refine Hive related 
documentations
URL: https://github.com/apache/flink/pull/9446
 
 
   




[GitHub] [flink] flinkbot edited a comment on issue #9468: [FLINK-13689] [Connectors/ElasticSearch] Fix thread leak when elasticsearch6 rest high level cli…

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9468: [FLINK-13689] 
[Connectors/ElasticSearch] Fix thread leak when elasticsearch6 rest high level 
cli…
URL: https://github.com/apache/flink/pull/9468#issuecomment-522106295
 
 
   ## CI report:
   
   * 78654e8f80aa0b1a01654e7684f4610eb1d3aff4 : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/123548600)
   




[GitHub] [flink] flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - DataStream Example Walkthrough

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - 
DataStream Example Walkthrough
URL: https://github.com/apache/flink/pull/9210#issuecomment-514437706
 
 
   ## CI report:
   
   * 5eb979da047c442c0205464c92b5bd9ee3a740dc : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120299964)
   * d7bf53a30514664925357bd5817305a02553d0a3 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120506936)
   * 02cca7fb6283b84a20ee019159ccb023ccffbd82 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120769129)
   * 5009b10d38eef92f25bfe4ff4608f2dd121ea9c6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120915709)
   * e3b272586d8f41d3800e86134730c4dc427952a6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120916220)
   * f1aee543a7aef88e3cf052f4d686ab0a8e5938e5 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120996260)
   * c66060dba290844085f90f554d447c6d7033779d : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121131224)
   * 700e5c19a3d49197ef2b18a646f0b6e1bf783ba8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121174288)
   * 6f3fccea82189ef95d46f12212f6f7386fc11668 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123540519)
   * 829c9c0505b6f08bb68e20a34e0613d83ae21758 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123545553)
   




[jira] [Commented] (FLINK-13025) Elasticsearch 7.x support

2019-08-16 Thread Victor (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909329#comment-16909329
 ] 

Victor commented on FLINK-13025:


Hi [~aljoscha], I was looking for an ES 7.x connector and came across this issue.
My apologies for chiming in, but I believe the vast majority of those 
compilation issues would go away if you add a "RequestOptions" parameter, since 
the ES client methods changed their signatures and have required it since 7.0: 
[https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.x/java-rest-high-getting-started-request-options.html]

Also, I've noticed the connector submits a "type" name with the IndexRequest. 
Although that is only marked as deprecated in 7.2.0, it will go away soon: ES 
moved to a one-to-one relation between mapping type and index in later versions 
of 6, and since 6.8 you can no longer name the type; it defaults to "_doc": 
[https://www.elastic.co/guide/en/elasticsearch/reference/7.0/breaking-changes-7.0.html#include-type-name-defaults-false]

So overall I believe a new connector will be needed for 7.x, even though most of 
the code will still work with minor changes.

 

> Elasticsearch 7.x support
> -
>
> Key: FLINK-13025
> URL: https://issues.apache.org/jira/browse/FLINK-13025
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.8.0
>Reporter: Keegan Standifer
>Priority: Major
>
> Elasticsearch 7.0.0 was released in April of 2019: 
> [https://www.elastic.co/blog/elasticsearch-7-0-0-released]
> The latest elasticsearch connector is 
> [flink-connector-elasticsearch6|https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-elasticsearch6]



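The signature change described in the comment above can be sketched as follows. The real types (`RestHighLevelClient`, `IndexRequest`, `RequestOptions`) live in the `org.elasticsearch` client library; the minimal stand-ins below are assumptions that only mirror the shape of the 6.x-to-7.x change, so the example stays self-contained:

```java
// Stand-in for org.elasticsearch.client.RequestOptions (7.x).
class RequestOptions {
    static final RequestOptions DEFAULT = new RequestOptions();
}

// Stand-in for org.elasticsearch.action.index.IndexRequest. In 7.x the
// mapping type can no longer be named; documents implicitly go under "_doc".
class IndexRequest {
    final String index;
    IndexRequest(String index) { this.index = index; }
}

// Stand-in for org.elasticsearch.client.RestHighLevelClient.
class RestHighLevelClient {
    // 6.x style (gone in 7.x): index(IndexRequest request)
    // 7.x style: every request method takes an explicit RequestOptions argument.
    String index(IndexRequest request, RequestOptions options) {
        return request.index + "/_doc";
    }
}

public class Es7SignatureSketch {
    public static void main(String[] args) {
        RestHighLevelClient client = new RestHighLevelClient();
        // A 6.x call site fails to compile against 7.x until the
        // RequestOptions argument is added, which is the bulk of the
        // compilation errors mentioned above.
        String location = client.index(new IndexRequest("my-index"), RequestOptions.DEFAULT);
        System.out.println(location); // my-index/_doc
    }
}
```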


[GitHub] [flink] flinkbot commented on issue #9468: [FLINK-13689] [Connectors/ElasticSearch] Fix thread leak when elasticsearch6 rest high level cli…

2019-08-16 Thread GitBox
flinkbot commented on issue #9468: [FLINK-13689] [Connectors/ElasticSearch] Fix 
thread leak when elasticsearch6 rest high level cli…
URL: https://github.com/apache/flink/pull/9468#issuecomment-522106295
 
 
   ## CI report:
   
   * 78654e8f80aa0b1a01654e7684f4610eb1d3aff4 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123548600)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-13689) Rest High Level Client for Elasticsearch6.x connector leaks threads if no connection could be established

2019-08-16 Thread Rishindra Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909313#comment-16909313
 ] 

Rishindra Kumar commented on FLINK-13689:
-

Hi [~fhueske],

I created pull request with the mentioned change.

[https://github.com/apache/flink/pull/9468]

 

> Rest High Level Client for Elasticsearch6.x connector leaks threads if no 
> connection could be established
> -
>
> Key: FLINK-13689
> URL: https://issues.apache.org/jira/browse/FLINK-13689
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.8.1
>Reporter: Rishindra Kumar
>Assignee: Rishindra Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If the created Elasticsearch REST high-level client (rhlClient) is 
> unreachable, the current code throws a RuntimeException, but it doesn't close 
> the client, which causes a thread leak.
>  
> *Current Code*
> if (!rhlClient.ping()) {
>     throw new RuntimeException("There are no reachable Elasticsearch nodes!");
> }
>  
> *Change Needed*
> rhlClient needs to be closed before the exception is thrown.
>  
> *Steps to Reproduce*
> 1. Add the Elasticsearch sink to the stream. Start the Flink program without 
> starting Elasticsearch.
> 2. The program will give the error "*Too many open files*" and doesn't write 
> even though you start Elasticsearch later.
>  
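The fix described in the issue can be sketched with a hypothetical stand-in for the REST high-level client (the real `RestHighLevelClient` comes from the Elasticsearch library; `FakeRestClient` here only models the ping/close behavior):

```java
import java.io.Closeable;

// Minimal stand-in for the REST high-level client, to show the fix pattern only.
class FakeRestClient implements Closeable {
    private boolean closed = false;
    boolean ping() { return false; }           // simulate an unreachable cluster
    public void close() { closed = true; }     // releases the client's I/O threads
    boolean isClosed() { return closed; }
}

public class PingAndCloseSketch {
    public static void main(String[] args) {
        FakeRestClient client = new FakeRestClient();
        try {
            if (!client.ping()) {
                // Fix: close the client *before* failing, so its threads are released
                // instead of leaking on every restart attempt.
                client.close();
                throw new RuntimeException("There are no reachable Elasticsearch nodes!");
            }
        } catch (RuntimeException e) {
            System.out.println("closed=" + client.isClosed() + ", message=" + e.getMessage());
        }
    }
}
```

Without the `close()` call before the throw, each failed ping leaves the client's worker threads and file descriptors alive, which is what eventually produces the "Too many open files" error.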





[GitHub] [flink] flinkbot commented on issue #9468: [FLINK-13689] [Connectors/ElasticSearch] Fix thread leak when elasticsearch6 rest high level cli…

2019-08-16 Thread GitBox
flinkbot commented on issue #9468: [FLINK-13689] [Connectors/ElasticSearch] Fix 
thread leak when elasticsearch6 rest high level cli…
URL: https://github.com/apache/flink/pull/9468#issuecomment-522103874
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 78654e8f80aa0b1a01654e7684f4610eb1d3aff4 (Fri Aug 16 
18:17:30 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[jira] [Updated] (FLINK-13689) Rest High Level Client for Elasticsearch6.x connector leaks threads if no connection could be established

2019-08-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-13689:
---
Labels: pull-request-available  (was: )

> Rest High Level Client for Elasticsearch6.x connector leaks threads if no 
> connection could be established
> -
>
> Key: FLINK-13689
> URL: https://issues.apache.org/jira/browse/FLINK-13689
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.8.1
>Reporter: Rishindra Kumar
>Assignee: Rishindra Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.2
>
>
> If the created Elasticsearch REST high-level client (rhlClient) is 
> unreachable, the current code throws a RuntimeException, but it doesn't close 
> the client, which causes a thread leak.
>  
> *Current Code*
> if (!rhlClient.ping()) {
>     throw new RuntimeException("There are no reachable Elasticsearch nodes!");
> }
>  
> *Change Needed*
> rhlClient needs to be closed before the exception is thrown.
>  
> *Steps to Reproduce*
> 1. Add the Elasticsearch sink to the stream. Start the Flink program without 
> starting Elasticsearch.
> 2. The program will give the error "*Too many open files*" and doesn't write 
> even though you start Elasticsearch later.
>  





[GitHub] [flink] Rishi55 opened a new pull request #9468: [FLINK-13689] [Connectors/ElasticSearch] Fix thread leak when elasticsearch6 rest high level cli…

2019-08-16 Thread GitBox
Rishi55 opened a new pull request #9468: [FLINK-13689] 
[Connectors/ElasticSearch] Fix thread leak when elasticsearch6 rest high level 
cli…
URL: https://github.com/apache/flink/pull/9468
 
 
   **(The sections below can be removed for hotfixes of typos)**
   
   ## What is the purpose of the change
   
   *(To fix the thread leak when elasticsearch6 rest high level client ping is 
unsuccessful)*
   
   
   ## Brief change log
   
   *(Added close statement for the rhlClient when ping is unsuccessful)*
   
   
   ## Verifying this change
   
   *(This change is a trivial rework / code cleanup without any test coverage. 
Tested locally, though.)*
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   




[GitHub] [flink] flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - DataStream Example Walkthrough

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - 
DataStream Example Walkthrough
URL: https://github.com/apache/flink/pull/9210#issuecomment-514437706
 
 
   ## CI report:
   
   * 5eb979da047c442c0205464c92b5bd9ee3a740dc : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120299964)
   * d7bf53a30514664925357bd5817305a02553d0a3 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120506936)
   * 02cca7fb6283b84a20ee019159ccb023ccffbd82 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120769129)
   * 5009b10d38eef92f25bfe4ff4608f2dd121ea9c6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120915709)
   * e3b272586d8f41d3800e86134730c4dc427952a6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120916220)
   * f1aee543a7aef88e3cf052f4d686ab0a8e5938e5 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120996260)
   * c66060dba290844085f90f554d447c6d7033779d : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121131224)
   * 700e5c19a3d49197ef2b18a646f0b6e1bf783ba8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121174288)
   * 6f3fccea82189ef95d46f12212f6f7386fc11668 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123540519)
   * 829c9c0505b6f08bb68e20a34e0613d83ae21758 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123545553)
   




[jira] [Created] (FLINK-13754) Decouple OperatorChain from StreamStatusMaintainer

2019-08-16 Thread zhijiang (JIRA)
zhijiang created FLINK-13754:


 Summary: Decouple OperatorChain from StreamStatusMaintainer
 Key: FLINK-13754
 URL: https://issues.apache.org/jira/browse/FLINK-13754
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Task
Reporter: zhijiang
Assignee: zhijiang


There are two motivations for this refactoring:
 * It is the precondition for the follow-up work of decoupling the dependency 
between the two inputs' status in ForwardingValveOutputHandler.
 * From a design-rule perspective, the current OperatorChain takes on many 
unrelated roles, such as StreamStatusMaintainer, which makes it hard to maintain. 
The root cause is the cyclic dependency between RecordWriterOutput (created by 
OperatorChain) and StreamStatusMaintainer.

The solution is to move the creation of StreamStatusMaintainer and 
RecordWriterOutput up to the StreamTask level, thereby breaking the cyclic 
dependency between them. The array of RecordWriters, which is closely related 
to RecordWriterOutput, is already created in StreamTask, so it is reasonable to 
create them together. The StreamStatusMaintainer created in StreamTask can then 
be directly referenced by subclasses like 
OneInputStreamTask/TwoInputStreamTask.
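The proposed decoupling can be sketched with hypothetical stand-ins (the class bodies below are illustrative, not the actual Flink implementations): the task creates the maintainer and injects it into the output, so the output no longer needs to reach back into the chain that created it.

```java
// Hypothetical sketch of the refactoring: the task owns the StreamStatusMaintainer
// and hands it to the output, instead of OperatorChain creating both (a cycle).
interface StreamStatusMaintainer {
    void toggleStreamStatus(String status);
    String current();
}

class SimpleMaintainer implements StreamStatusMaintainer {
    private String status = "ACTIVE";
    public void toggleStreamStatus(String s) { status = s; }
    public String current() { return status; }
}

class RecordWriterOutput {
    private final StreamStatusMaintainer maintainer; // injected, not self-created
    RecordWriterOutput(StreamStatusMaintainer maintainer) { this.maintainer = maintainer; }
    void emitStatus(String s) { maintainer.toggleStreamStatus(s); }
}

public class DecouplingSketch {
    public static void main(String[] args) {
        // Created at the "task" level, then passed down: no cyclic construction.
        StreamStatusMaintainer maintainer = new SimpleMaintainer();
        RecordWriterOutput output = new RecordWriterOutput(maintainer);
        output.emitStatus("IDLE");
        System.out.println(maintainer.current());
    }
}
```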





[GitHub] [flink] Johnlon commented on issue #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging

2019-08-16 Thread GitBox
Johnlon commented on issue #9456: FLINK-13588 flink-streaming-java don't throw 
away exception info in logging 
URL: https://github.com/apache/flink/pull/9456#issuecomment-522091063
 
 
   Will look over the weekend.




[GitHub] [flink] Johnlon commented on a change in pull request #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging

2019-08-16 Thread GitBox
Johnlon commented on a change in pull request #9456: FLINK-13588 
flink-streaming-java don't throw away exception info in logging 
URL: https://github.com/apache/flink/pull/9456#discussion_r314819650
 
 

 ##
 File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/AsynchronousException.java
 ##
 @@ -38,6 +38,10 @@ public AsynchronousException(String message, Throwable cause) {
 
 	@Override
 	public String toString() {
-		return "AsynchronousException{" + getCause() + "}";
+		if (getMessage() != null) {
+			return "AsynchronousException{" + getMessage() + ", caused by " + getCause() + "}";
+		} else {
+			return "AsynchronousException{" + getCause() + "}";
+		}
 
 Review comment:
   Done.
   
   This is a more intrusive change that may break more tests, as it impacts the 
route where the message was null.
   
   Any objection to me changing the tests to assert on the default 
AsynchronousException.toString() rendering?
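The behavior of the proposed toString() can be exercised in a self-contained sketch (a simplified stand-in for Flink's AsynchronousException, assuming the cause-only constructor keeps the message null, as the null branch in the diff implies):

```java
// Simplified stand-in mirroring the toString() logic from the diff above.
class AsynchronousException extends Exception {
    AsynchronousException(String message, Throwable cause) { super(message, cause); }
    AsynchronousException(Throwable cause) { super(null, cause); } // keep the message null

    @Override
    public String toString() {
        if (getMessage() != null) {
            return "AsynchronousException{" + getMessage() + ", caused by " + getCause() + "}";
        } else {
            return "AsynchronousException{" + getCause() + "}";
        }
    }
}

public class ToStringSketch {
    public static void main(String[] args) {
        Throwable cause = new RuntimeException("RUNTIME EXCEPTION");
        // With a message: both message and cause are rendered.
        System.out.println(new AsynchronousException("EXPECTED_ERROR", cause));
        // Without a message: falls back to the old cause-only rendering.
        System.out.println(new AsynchronousException(cause));
    }
}
```

Note that the cause is rendered via Throwable.toString(), so the class name is included in the output.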
   
   




[GitHub] [flink] Johnlon commented on a change in pull request #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging

2019-08-16 Thread GitBox
Johnlon commented on a change in pull request #9456: FLINK-13588 
flink-streaming-java don't throw away exception info in logging 
URL: https://github.com/apache/flink/pull/9456#discussion_r314818982
 
 

 ##
 File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/tasks/StreamTaskTest.java
 ##
 @@ -180,6 +180,25 @@
 	@Rule
 	public final Timeout timeoutPerTest = Timeout.seconds(30);
 
+	/**
+	 * This test checks the async exceptions handling wraps the message and cause as an AsynchronousException
+	 * and propagates this to the environment.
+	 */
+	@Test
+	public void exceptionReporting() {
+		Environment e = mock(Environment.class);
+		RuntimeException expectedException = new RuntimeException("RUNTIME EXCEPTION");
+
+		SourceStreamTask sut = new SourceStreamTask(e);
+		sut.handleAsyncException("EXPECTED_ERROR", expectedException);
+
+		ArgumentCaptor<AsynchronousException> c = ArgumentCaptor.forClass(AsynchronousException.class);
+		verify(e).failExternally(c.capture());
+		assertEquals(c.getValue().getMessage(), "EXPECTED_ERROR");
+		assertEquals(c.getValue().getCause(), expectedException);
+		assertEquals(expectedException, "AsynchronousException{EXPECTED_ERROR, caused by RUNTIME EXCEPTION}");
 
 Review comment:
   done




[jira] [Commented] (FLINK-13741) "SHOW FUNCTIONS" should include Flink built-in functions' names

2019-08-16 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909252#comment-16909252
 ] 

Xuefu Zhang commented on FLINK-13741:
-

Re: "Once FunctionCatalog has been integrated into the catalog API and built-in 
functions are stored there as well."

Hi [~twalthr], thanks for bringing this up. I think we briefly touched upon 
this before, but there wasn't consensus on how we are going to do this. Thus, I 
look forward to the proposal. To me, certain logic in FunctionCatalog, such as 
function resolution, doesn't seem to belong in the Catalog API, and I'm also a 
little dubious about whether built-in functions need to be stored in a catalog. 
I think we can have more discussions once we get to it.

Thanks.

> "SHOW FUNCTIONS" should include Flink built-in functions' names
> ---
>
> Key: FLINK-13741
> URL: https://issues.apache.org/jira/browse/FLINK-13741
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.9.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently "SHOW FUNCTIONS;" only returns catalog functions and 
> FunctionDefinitions registered in memory, but does not include Flink built-in 
> functions' names.
> AFAIK, it's standard for "SHOW FUNCTIONS;" to show all functions available 
> for use in queries in SQL systems like Hive, Presto, Teradata, etc., so it 
> includes built-in functions naturally. Besides, 
> {{FunctionCatalog.lookupFunction(name)}} resolves calls to built-in 
> functions, so it doesn't feel right to not display functions that can 
> nevertheless be resolved successfully.
> It seems to me that the root cause is that the call stack for "SHOW 
> FUNCTIONS;" has become a bit messy - it calls 
> {{tEnv.listUserDefinedFunctions()}}, which further calls 
> {{FunctionCatalog.getUserDefinedFunctions()}}, and I'm not sure what the 
> intention of those two APIs is. Are they dedicated to getting all functions, 
> or just user-defined functions excluding built-in ones?
> In the end, I believe "SHOW FUNCTIONS;" should display built-in functions. To 
> achieve that, we either need to modify and/or rename the existing APIs 
> mentioned above, or add new APIs to return all functions from FunctionCatalog.
> cc [~xuefuz] [~lirui] [~twalthr]
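The proposed behavior can be sketched as a simple set union (the function names below are invented for illustration; this is not the FunctionCatalog API):

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: SHOW FUNCTIONS as the union of built-in and catalog function names.
public class ShowFunctionsSketch {
    public static void main(String[] args) {
        Set<String> builtInFunctions = new TreeSet<>(Arrays.asList("abs", "concat", "substring"));
        Set<String> catalogFunctions = new TreeSet<>(Arrays.asList("my_udf"));

        // The proposed "SHOW FUNCTIONS;" result: all available functions, sorted.
        Set<String> allFunctions = new TreeSet<>(builtInFunctions);
        allFunctions.addAll(catalogFunctions);
        System.out.println(allFunctions);
    }
}
```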





[GitHub] [flink] flinkbot edited a comment on issue #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9456: FLINK-13588 flink-streaming-java 
don't throw away exception info in logging 
URL: https://github.com/apache/flink/pull/9456#issuecomment-521825874
 
 
   ## CI report:
   
   * 1242679f7bd5ec3f7c1115006e978267abafc84b : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123441772)
   * c2e57b175b07e9ee854598140676ab428c2b4b8f : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123442281)
   * cd9568ae549b007727edaacb0607c7310b2fd520 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123541232)
   




[jira] [Updated] (FLINK-13753) Integrate new Source Operator with Mailbox Model in StreamTask

2019-08-16 Thread zhijiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhijiang updated FLINK-13753:
-
Description: 
This is the umbrella issue for integrating new source operator with mailbox 
model in StreamTask.

The motivation is based on 
[FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
 which proposes to refactor the whole source API and the integration of 
task-level actions (including checkpoint, timer, async operator) with unified 
mailbox model on runtime side.
 * The benefits are simple unified processing logics because only one single 
thread handles all the actions without concurrent issue, and further getting 
rid of lock dependency which causes unfair lock concern in checkpoint process.
 * We still need to support the current legacy source in some releases which 
would probably be used for a while, especially for the scenario of performance 
concern.

The design doc is 
[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]

  was:
This is the umbrella issue for integrating new source operator with mailbox 
model in StreamTask.

The motivation is based on 
[FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
 which proposes to refactor the whole source API and the integration of 
task-level actions (including checkpoint, timer, async operator) with unified 
mailbox model on runtime side.
 * The benefits are simple unified processing logics because only one single 
thread handles all the actions without concurrent issue, and further getting 
rid of lock dependency which causes unfair lock concern in checkpoint process.
 * We still need to support the current legacy source in some releases which 
would probably be used for a while, especially for the scenario of performance 
concern.

The design doc is 
[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]|[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]


> Integrate new Source Operator with Mailbox Model in StreamTask
> --
>
> Key: FLINK-13753
> URL: https://issues.apache.org/jira/browse/FLINK-13753
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>
> This is the umbrella issue for integrating new source operator with mailbox 
> model in StreamTask.
> The motivation is based on 
> [FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
>  which proposes to refactor the whole source API and the integration of 
> task-level actions (including checkpoint, timer, async operator) with unified 
> mailbox model on runtime side.
>  * The benefits are simple unified processing logics because only one single 
> thread handles all the actions without concurrent issue, and further getting 
> rid of lock dependency which causes unfair lock concern in checkpoint process.
>  * We still need to support the current legacy source in some releases which 
> would probably be used for a while, especially for the scenario of 
> performance concern.
> The design doc is 
> [https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]
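The single-threaded mailbox idea in the description can be sketched in plain Java (a toy model, not Flink's actual mailbox implementation): all task-level actions are enqueued and drained by one thread, so they never race with each other and no lock is needed.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy mailbox: one thread drains a queue of actions, so record processing,
// checkpoints and timers never run concurrently (the claimed benefit).
public class MailboxSketch {
    public static void main(String[] args) throws Exception {
        BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
        StringBuilder log = new StringBuilder(); // touched only by the mailbox thread

        mailbox.add(() -> log.append("record;"));
        mailbox.add(() -> log.append("checkpoint;"));
        mailbox.add(() -> log.append("timer;"));

        Thread mailboxThread = new Thread(() -> {
            Runnable action;
            while ((action = mailbox.poll()) != null) {
                action.run(); // every action runs on this single thread, in order
            }
        });
        mailboxThread.start();
        mailboxThread.join(); // join() makes the log safely visible to main

        System.out.println(log);
    }
}
```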





[jira] [Updated] (FLINK-13753) Integrate new Source Operator with Mailbox Model in StreamTask

2019-08-16 Thread zhijiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhijiang updated FLINK-13753:
-
Description: 
This is the umbrella issue for integrating new source operator with mailbox 
model in StreamTask.

The motivation is based on 
[FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
 which proposes to refactor the whole source API and the integration of 
task-level actions (including checkpoint, timer, async operator) with unified 
mailbox model on runtime side.
 * The benefits are simple unified processing logics because only one single 
thread handles all the actions without concurrent issue, and further getting 
rid of lock dependency which causes unfair lock concern in checkpoint process.
 * We still need to support the current legacy source in some releases which 
would probably be used for a while, especially for the scenario of performance 
concern.

The design doc is 
[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]|[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]

  was:
This is the umbrella issue for integrating new source operator with mailbox 
model in StreamTask.

The motivation is based on 
[FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
 which proposes to refactor the whole source API and the integration of 
task-level actions (including checkpoint, timer, async operator) with unified 
[mailbox 
model|[https://docs.google.com/document/d/1eDpsUKv2FqwZiS1Pm6gYO5eFHScBHfULKmH1-ZEWB4g]]
 on runtime side.
 * The benefits are simple unified processing logics because only one single 
thread handles all the actions without concurrent issue, and further getting 
rid of lock dependency which causes unfair lock concern in checkpoint process.
 * We still need to support the current legacy source in some releases which 
would probably be used for a while, especially for the scenario of performance 
concern.

The design doc is 
[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]|[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]]


> Integrate new Source Operator with Mailbox Model in StreamTask
> --
>
> Key: FLINK-13753
> URL: https://issues.apache.org/jira/browse/FLINK-13753
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>
> This is the umbrella issue for integrating new source operator with mailbox 
> model in StreamTask.
> The motivation is based on 
> [FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
>  which proposes to refactor the whole source API and the integration of 
> task-level actions (including checkpoint, timer, async operator) with unified 
> mailbox model on runtime side.
>  * The benefits are simple unified processing logics because only one single 
> thread handles all the actions without concurrent issue, and further getting 
> rid of lock dependency which causes unfair lock concern in checkpoint process.
>  * We still need to support the current legacy source in some releases which 
> would probably be used for a while, especially for the scenario of 
> performance concern.
> The design doc is 
> [https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]|[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]





[jira] [Created] (FLINK-13753) Integrate new Source Operator with Mailbox Model in StreamTask

2019-08-16 Thread zhijiang (JIRA)
zhijiang created FLINK-13753:


 Summary: Integrate new Source Operator with Mailbox Model in 
StreamTask
 Key: FLINK-13753
 URL: https://issues.apache.org/jira/browse/FLINK-13753
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Task
Reporter: zhijiang
Assignee: zhijiang


This is the umbrella issue for integrating new source operator with mailbox 
model in StreamTask.

The motivation is based on 
[FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
 which proposes to refactor the whole source API and the integration of 
task-level actions (including checkpoint, timer, async operator) with unified 
[mailbox model| 
[https://docs.google.com/document/d/1eDpsUKv2FqwZiS1Pm6gYO5eFHScBHfULKmH1-ZEWB4g]]
 on runtime side.
 * The benefits are simple unified processing logics because only one single 
thread handles all the actions without concurrent issue, and further getting 
rid of lock dependency which causes unfair lock concern in checkpoint process.
 * We still need to support the current legacy source in some releases which 
would probably be used for a while, especially for the scenario of performance 
concern.

The design doc is 
[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]|[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]]





[jira] [Updated] (FLINK-13753) Integrate new Source Operator with Mailbox Model in StreamTask

2019-08-16 Thread zhijiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhijiang updated FLINK-13753:
-
Description: 
This is the umbrella issue for integrating new source operator with mailbox 
model in StreamTask.

The motivation is based on 
[FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
 which proposes to refactor the whole source API and the integration of 
task-level actions (including checkpoint, timer, async operator) with unified 
[mailbox 
model|[https://docs.google.com/document/d/1eDpsUKv2FqwZiS1Pm6gYO5eFHScBHfULKmH1-ZEWB4g]]
 on runtime side.
 * The benefits are simple unified processing logics because only one single 
thread handles all the actions without concurrent issue, and further getting 
rid of lock dependency which causes unfair lock concern in checkpoint process.
 * We still need to support the current legacy source in some releases which 
would probably be used for a while, especially for the scenario of performance 
concern.

The design doc is 
[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]|[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]]

  was:
This is the umbrella issue for integrating new source operator with mailbox 
model in StreamTask.

The motivation is based on 
[FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface]
 which proposes to refactor the whole source API and the integration of 
task-level actions (including checkpoint, timer, async operator) with unified 
[mailbox model| 
[https://docs.google.com/document/d/1eDpsUKv2FqwZiS1Pm6gYO5eFHScBHfULKmH1-ZEWB4g]]
 on runtime side.
 * The benefits are simple unified processing logics because only one single 
thread handles all the actions without concurrent issue, and further getting 
rid of lock dependency which causes unfair lock concern in checkpoint process.
 * We still need to support the current legacy source in some releases which 
would probably be used for a while, especially for the scenario of performance 
concern.

The design doc is 
[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]|[https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit#|https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]]


> Integrate new Source Operator with Mailbox Model in StreamTask
> --
>
> Key: FLINK-13753
> URL: https://issues.apache.org/jira/browse/FLINK-13753
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>
> This is the umbrella issue for integrating new source operator with mailbox 
> model in StreamTask.
> The motivation is based on 
> [FLIP-27|https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface], 
> which proposes to refactor the whole source API and to integrate task-level 
> actions (including checkpoints, timers, and async operators) with the unified 
> [mailbox model|https://docs.google.com/document/d/1eDpsUKv2FqwZiS1Pm6gYO5eFHScBHfULKmH1-ZEWB4g] 
> on the runtime side.
>  * The benefit is simpler, unified processing logic: a single thread handles 
> all the actions, so there are no concurrency issues, and we can get rid of 
> the lock dependency that causes unfairness concerns during checkpointing.
>  * We still need to support the current legacy source for some releases, 
> since it will probably remain in use for a while, especially where 
> performance is a concern.
> The design doc is 
> [https://docs.google.com/document/d/13x9M7k1SRqkOFXP0bETcJemIRyJzoqGgkdy11pz5qHM/edit]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[GitHub] [flink] flinkbot edited a comment on issue #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9456: FLINK-13588 flink-streaming-java 
don't throw away exception info in logging 
URL: https://github.com/apache/flink/pull/9456#issuecomment-521825874
 
 
   ## CI report:
   
   * 1242679f7bd5ec3f7c1115006e978267abafc84b : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123441772)
   * c2e57b175b07e9ee854598140676ab428c2b4b8f : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123442281)
   * cd9568ae549b007727edaacb0607c7310b2fd520 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123541232)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - DataStream Example Walkthrough

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9210: [FLINK-12746][docs] Getting Started - 
DataStream Example Walkthrough
URL: https://github.com/apache/flink/pull/9210#issuecomment-514437706
 
 
   ## CI report:
   
   * 5eb979da047c442c0205464c92b5bd9ee3a740dc : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120299964)
   * d7bf53a30514664925357bd5817305a02553d0a3 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120506936)
   * 02cca7fb6283b84a20ee019159ccb023ccffbd82 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120769129)
   * 5009b10d38eef92f25bfe4ff4608f2dd121ea9c6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120915709)
   * e3b272586d8f41d3800e86134730c4dc427952a6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120916220)
   * f1aee543a7aef88e3cf052f4d686ab0a8e5938e5 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/120996260)
   * c66060dba290844085f90f554d447c6d7033779d : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121131224)
   * 700e5c19a3d49197ef2b18a646f0b6e1bf783ba8 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121174288)
   * 6f3fccea82189ef95d46f12212f6f7386fc11668 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123540519)
   




[jira] [Commented] (FLINK-13741) "SHOW FUNCTIONS" should include Flink built-in functions' names

2019-08-16 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909226#comment-16909226
 ] 

Xuefu Zhang commented on FLINK-13741:
-

Thanks all for the inputs. I think we are pretty much aligned:

1. SQL "SHOW FUNCTIONS" and tableEnv.listFunction() should include built-in 
functions. It would be nice to have some options for filtering, like those 
suggested above, but missing them will not hurt us right away. Most users use 
this to see what is out there.

2. Showing the details of a function is the job of "DESCRIBE FUNCTION func"; 
with "EXTENDED", more info can be shown. "SHOW FUNCTIONS" just lists the 
function names. This aligns with tables/columns/databases.
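As a rough sketch of point 1 above — merging built-in function names with catalog-registered ones — the following hypothetical helper (not an actual Flink API) illustrates the intended behavior:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class ShowFunctionsSketch {

    // Hypothetical helper (not an actual Flink API): merges built-in function
    // names with catalog/user-defined ones, as proposed for "SHOW FUNCTIONS".
    static Set<String> listAllFunctions(Set<String> builtIn, Set<String> catalog) {
        Set<String> all = new TreeSet<>(builtIn); // sorted for display
        all.addAll(catalog);
        return all;
    }

    public static void main(String[] args) {
        Set<String> builtIn = new TreeSet<>(Arrays.asList("abs", "concat"));
        Set<String> catalog = new TreeSet<>(Arrays.asList("myUdf"));
        System.out.println(listAllFunctions(builtIn, catalog)); // prints "[abs, concat, myUdf]"
    }
}
```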

> "SHOW FUNCTIONS" should include Flink built-in functions' names
> ---
>
> Key: FLINK-13741
> URL: https://issues.apache.org/jira/browse/FLINK-13741
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.9.0
>Reporter: Bowen Li
>Assignee: Bowen Li
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently "SHOW FUNCTIONS;" only returns catalog functions and 
> FunctionDefinitions registered in memory, but does not include Flink built-in 
> functions' names.
> AFAIK, it's standard for "SHOW FUNCTIONS;" to show all functions available 
> for use in queries in SQL systems like Hive, Presto, Teradata, etc., so it 
> naturally includes built-in functions. Besides, 
> {{FunctionCatalog.lookupFunction(name)}} resolves calls to built-in 
> functions; it doesn't feel right that functions can be successfully resolved 
> but are never displayed.
> It seems to me that the root cause is that the call stack for "SHOW 
> FUNCTIONS;" has become a bit messy - it calls 
> {{tEnv.listUserDefinedFunctions()}} which further calls 
> {{FunctionCatalog.getUserDefinedFunctions()}}, and I'm not sure what the 
> intention of those two APIs is. Are they dedicated to getting all functions, 
> or just user-defined functions excluding built-in ones?
> In the end, I believe "SHOW FUNCTIONS;" should display built-in functions. To 
> achieve that, we either need to modify and/or rename the existing APIs 
> mentioned above, or add new APIs to return all functions from FunctionCatalog.
> cc [~xuefuz] [~lirui] [~twalthr]





[GitHub] [flink] sjwiesman commented on a change in pull request #9210: [FLINK-12746][docs] Getting Started - DataStream Example Walkthrough

2019-08-16 Thread GitBox
sjwiesman commented on a change in pull request #9210: [FLINK-12746][docs] 
Getting Started - DataStream Example Walkthrough
URL: https://github.com/apache/flink/pull/9210#discussion_r314805990
 
 

 ##
 File path: docs/getting-started/walkthroughs/datastream_api.md
 ##
 @@ -0,0 +1,897 @@
+---
+title: "DataStream API"
+nav-id: datastreamwalkthrough
+nav-title: 'DataStream API'
+nav-parent_id: walkthroughs
+nav-pos: 2
+---
+
+
+Apache Flink offers a DataStream API for building robust, stateful streaming 
applications.
+It provides fine-grained control over state and time, which allows for the 
implementation of complex event-driven systems.
+
+* This will be replaced by the TOC
+{:toc}
+
+## What Are You Building? 
+
+Credit card fraud is a growing concern in the digital age.
+Criminals steal credit card numbers by running scams or hacking into insecure 
systems.
+Stolen numbers are tested by making one or more small purchases, often for a 
dollar or less.
+If that works, they then make more significant purchases to get items they can 
sell or keep for themselves.
+
+In this tutorial, you will build a fraud detection system for alerting on 
suspicious credit card transactions.
+Using a simple set of rules, you will see how Flink allows us to implement 
advanced business logic and act in real-time.
+
+## Prerequisites
+
+This walkthrough assumes that you have some familiarity with Java or Scala, 
but you should be able to follow along even if you are coming from a different 
programming language.
+
+## Help, I’m Stuck! 
+
+If you get stuck, check out the [community support 
resources](https://flink.apache.org/community.html).
+In particular, Apache Flink's [user mailing 
list](https://flink.apache.org/community.html#mailing-lists) is consistently 
ranked as one of the most active of any Apache project and a great way to get 
help quickly.
+
+## How To Follow Along
+
+If you want to follow along, you will require a computer with:
+
+* Java 8 
+* Maven 
+
+A provided Flink Maven Archetype can quickly create a skeleton project with 
+all the necessary dependencies:
+
+
+
+{% highlight bash %}
+$ mvn archetype:generate \
+-DarchetypeGroupId=org.apache.flink \
+-DarchetypeArtifactId=flink-walkthrough-datastream-java \{% unless site.is_stable %}
+-DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/ \{% endunless %}
+-DarchetypeVersion={{ site.version }} \
+-DgroupId=frauddetection \
+-DartifactId=frauddetection \
+-Dversion=0.1 \
+-Dpackage=spendreport \
+-DinteractiveMode=false
+{% endhighlight %}
+
+
+{% highlight bash %}
+$ mvn archetype:generate \
+-DarchetypeGroupId=org.apache.flink \
+-DarchetypeArtifactId=flink-walkthrough-datastream-scala \{% unless site.is_stable %}
+-DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/ \{% endunless %}
+-DarchetypeVersion={{ site.version }} \
+-DgroupId=frauddetection \
+-DartifactId=frauddetection \
+-Dversion=0.1 \
+-Dpackage=spendreport \
+-DinteractiveMode=false
+{% endhighlight %}
+
+
+
+{% unless site.is_stable %}
+
+Note: For Maven 3.0 or higher, it is no longer possible to specify the repository (-DarchetypeCatalog) via the command line. If you wish to use the snapshot repository, you need to add a repository entry to your settings.xml. For details about this change, please refer to the <a href="http://maven.apache.org/archetype/maven-archetype-plugin/archetype-repository.html">Maven official documentation</a>.
+
+{% endunless %}
+
+You can edit the `groupId`, `artifactId` and `package` if you like. With the 
above parameters,
+Maven will create a project with all the dependencies to complete this 
tutorial.
+After importing the project into your editor, you will see a file with the 
following code which you can run directly inside your IDE.
+
+
+
+ FraudDetectionJob.java
+
+{% highlight java %}
+package frauddetection;
+
+import org.apache.flink.api.common.state.ValueState;
+import org.apache.flink.api.common.state.ValueStateDescriptor;
+import org.apache.flink.api.common.typeinfo.Types;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
+import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;
+import org.apache.flink.util.Collector;
+import org.apache.flink.walkthrough.common.entity.Alert;
+import org.apache.flink.walkthrough.common.entity.Transaction;
+import org.apache.flink.walkthrough.common.source.TransactionSource;
+
+public class FraudDetectionJob {
+
+public static void main(String[] args) throws Exception {
+StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+
+DataStream<Transaction> transactions = env
+.addSource(new TransactionSource())
+.name("transactions");
+
+

[GitHub] [flink] flinkbot edited a comment on issue #9467: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9467: [FLINK-9941][ScalaAPI] Flush in 
ScalaCsvOutputFormat before close
URL: https://github.com/apache/flink/pull/9467#issuecomment-522053886
 
 
   ## CI report:
   
   * 009da497d7c551ba854dc7ed8fa658f2acd6d6ce : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/123529451)
   




[jira] [Updated] (FLINK-13752) TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an anonymous function

2019-08-16 Thread Yun Gao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao updated FLINK-13752:

Description: 
When comparing 1.8 and 1.9.0-rc2 on a test streaming job, we found that the 
performance of 1.9.0-rc2 is much lower than that of 1.8. By comparing the two 
versions, we found that the count of Full GCs of the TaskExecutor process on 
1.9.0-rc2 is much higher than on 1.8.

A further analysis found that the difference comes from 
_TaskExecutor#setupResultPartitionBookkeeping_: the anonymous function in 
_taskTermimationWithResourceCleanFuture_ references the 
_TaskDeploymentDescriptor_. Since this function is kept until the task 
terminates, the _TaskDeploymentDescriptor_ also stays referenced in the 
closure and cannot be reclaimed by GC. In this job, the 
_TaskDeploymentDescriptor_ of some tasks is as large as 10 MB while the total 
heap is about 113 MB, so the retained _TaskDeploymentDescriptors_ have a 
relatively large impact on GC and performance.

  was:
When comparing the 1.8 and 1.9.0-rc2 on a test streaming job, we found that the 
performance on 1.9.0-rc2 is much lower than that of 1.8. By comparing the two 
versions, we found that the count of Full GC on 1.9.0-rc2 is much more than 
that on 1.8.

A further analysis found that the difference is due to in 
_TaskExecutor#setupResultPartitionBookkeeping_, the anonymous function in 
_taskTermimationWithResourceCleanFuture_ has referenced the 
_TaskDeploymentDescriptor_, since this function will be kept till the task is 
terminated,  _TaskDeploymentDescriptor_ will also be kept referenced in the 
closure and cannot be recycled by GC. In this job, _TaskDeploymentDescriptor_ 
of some tasks are as large as 10M, and the total heap is about 113M, thus the 
kept _TaskDeploymentDescriptors_ will cause relatively large impact on GC and 
performance.


> TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an 
> anonymous function
> 
>
> Key: FLINK-13752
> URL: https://issues.apache.org/jira/browse/FLINK-13752
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.9.0
>Reporter: Yun Gao
>Priority: Major
>
> When comparing 1.8 and 1.9.0-rc2 on a test streaming job, we found that the 
> performance of 1.9.0-rc2 is much lower than that of 1.8. By comparing the two 
> versions, we found that the count of Full GCs of the TaskExecutor process on 
> 1.9.0-rc2 is much higher than on 1.8.
> A further analysis found that the difference comes from 
> _TaskExecutor#setupResultPartitionBookkeeping_: the anonymous function in 
> _taskTermimationWithResourceCleanFuture_ references the 
> _TaskDeploymentDescriptor_. Since this function is kept until the task 
> terminates, the _TaskDeploymentDescriptor_ also stays referenced in the 
> closure and cannot be reclaimed by GC. In this job, the 
> _TaskDeploymentDescriptor_ of some tasks is as large as 10 MB while the total 
> heap is about 113 MB, so the retained _TaskDeploymentDescriptors_ have a 
> relatively large impact on GC and performance.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13752) TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an anonymous function

2019-08-16 Thread Yun Gao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Gao updated FLINK-13752:

Description: 
When comparing 1.8 and 1.9.0-rc2 on a test streaming job, we found that the 
performance of 1.9.0-rc2 is much lower than that of 1.8. By comparing the two 
versions, we found that the count of Full GCs on 1.9.0-rc2 is much higher than 
on 1.8.

A further analysis found that the difference comes from 
_TaskExecutor#setupResultPartitionBookkeeping_: the anonymous function in 
_taskTermimationWithResourceCleanFuture_ references the 
_TaskDeploymentDescriptor_. Since this function is kept until the task 
terminates, the _TaskDeploymentDescriptor_ also stays referenced in the 
closure and cannot be reclaimed by GC. In this job, the 
_TaskDeploymentDescriptor_ of some tasks is as large as 10 MB while the total 
heap is about 113 MB, so the retained _TaskDeploymentDescriptors_ have a 
relatively large impact on GC and performance.

  was:
When comparing the 1.8 and 1.9.0-rc2 on a test streaming job, we found that the 
performance on 1.9.0-rc2 is much lower than that of 1.8. By comparing the two 
versions, we found that the count of Full GC on 1.9.0-rc2 is much more than 
that on 1.8.

A further analysis found that the difference is due to in 
TaskExecutor#setupResultPartitionBookkeeping, the anonymous function in 
taskTermimationWithResourceCleanFuture has referenced the 
TaskDeploymentDescriptor, since this function will be kept till the task is 
terminated,  TaskDeploymentDescriptor will also be kept referenced in the 
closure and cannot be recycled by GC. In this job, TaskDeploymentDescriptor of 
some tasks are as large as 10M, and the total heap is about 113M, thus the kept 
TaskDeploymentDescriptors will cause relatively large impact on GC and 
performance.


> TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an 
> anonymous function
> 
>
> Key: FLINK-13752
> URL: https://issues.apache.org/jira/browse/FLINK-13752
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.9.0
>Reporter: Yun Gao
>Priority: Major
>
> When comparing 1.8 and 1.9.0-rc2 on a test streaming job, we found that the 
> performance of 1.9.0-rc2 is much lower than that of 1.8. By comparing the two 
> versions, we found that the count of Full GCs on 1.9.0-rc2 is much higher 
> than on 1.8.
> A further analysis found that the difference comes from 
> _TaskExecutor#setupResultPartitionBookkeeping_: the anonymous function in 
> _taskTermimationWithResourceCleanFuture_ references the 
> _TaskDeploymentDescriptor_. Since this function is kept until the task 
> terminates, the _TaskDeploymentDescriptor_ also stays referenced in the 
> closure and cannot be reclaimed by GC. In this job, the 
> _TaskDeploymentDescriptor_ of some tasks is as large as 10 MB while the total 
> heap is about 113 MB, so the retained _TaskDeploymentDescriptors_ have a 
> relatively large impact on GC and performance.





[jira] [Created] (FLINK-13752) TaskDeploymentDescriptor cannot be recycled by GC due to referenced by an anonymous function

2019-08-16 Thread Yun Gao (JIRA)
Yun Gao created FLINK-13752:
---

 Summary: TaskDeploymentDescriptor cannot be recycled by GC due to 
referenced by an anonymous function
 Key: FLINK-13752
 URL: https://issues.apache.org/jira/browse/FLINK-13752
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.9.0
Reporter: Yun Gao


When comparing 1.8 and 1.9.0-rc2 on a test streaming job, we found that the 
performance of 1.9.0-rc2 is much lower than that of 1.8. By comparing the two 
versions, we found that the count of Full GCs on 1.9.0-rc2 is much higher than 
on 1.8.

A further analysis found that the difference comes from 
TaskExecutor#setupResultPartitionBookkeeping: the anonymous function in 
taskTermimationWithResourceCleanFuture references the TaskDeploymentDescriptor. 
Since this function is kept until the task terminates, the 
TaskDeploymentDescriptor also stays referenced in the closure and cannot be 
reclaimed by GC. In this job, the TaskDeploymentDescriptor of some tasks is as 
large as 10 MB while the total heap is about 113 MB, so the retained 
TaskDeploymentDescriptors have a relatively large impact on GC and performance.





[jira] [Commented] (FLINK-4256) Fine-grained recovery

2019-08-16 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909157#comment-16909157
 ] 

Thomas Weise commented on FLINK-4256:
-

I'm surprised to see this closed when there is still no support for streaming? 
Please see 
[https://lists.apache.org/thread.html/cdb315f4b71a915c4c598b580f71cad11bc3f5bd146b916378765f2a@%3Cdev.flink.apache.org%3E]

 

> Fine-grained recovery
> -
>
> Key: FLINK-4256
> URL: https://issues.apache.org/jira/browse/FLINK-4256
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Major
> Fix For: 1.9.0
>
>
> When a task fails during execution, Flink currently resets the entire 
> execution graph and triggers complete re-execution from the last completed 
> checkpoint. This is more expensive than just re-executing the failed tasks.
> In many cases, more fine-grained recovery is possible.
> The full description and design is in the corresponding FLIP.
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures
> The detailed design for version 1 is 
> https://docs.google.com/document/d/1_PqPLA1TJgjlqz8fqnVE3YSisYBDdFsrRX_URgRSj74/edit#





[GitHub] [flink] flinkbot edited a comment on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9466: [FLINK-13737][flink-dist] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9466#issuecomment-522028628
 
 
   ## CI report:
   
   * aec1d92adaaf5fd75eb673d23c51c570c2425587 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123518971)
   




[jira] [Closed] (FLINK-13737) flink-dist should add provided dependency on flink-examples-table

2019-08-16 Thread Dawid Wysakowicz (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz closed FLINK-13737.

Resolution: Fixed

Fixed:
master: 95e3f88f08b652d01583410f9a8bb73b0d891602
1.9: e598d76418d438bcca9e7b8c50e6adc82347e561

> flink-dist should add provided dependency on flink-examples-table
> -
>
> Key: FLINK-13737
> URL: https://issues.apache.org/jira/browse/FLINK-13737
> Project: Flink
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 1.9.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In FLINK-13558 we changed the `flink-dist/bin.xml` to also include 
> flink-examples-table in the binary distribution. The flink-dist module, 
> though, does not depend on flink-examples-table.
> If only the flink-dist module is built with its dependencies (which happens 
> in the release scripts), the table examples are not built and thus not 
> included in the distribution.





[jira] [Updated] (FLINK-13737) flink-dist should add provided dependency on flink-examples-table

2019-08-16 Thread Dawid Wysakowicz (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-13737:
-
Issue Type: Bug  (was: Improvement)

> flink-dist should add provided dependency on flink-examples-table
> -
>
> Key: FLINK-13737
> URL: https://issues.apache.org/jira/browse/FLINK-13737
> Project: Flink
>  Issue Type: Bug
>  Components: Examples
>Affects Versions: 1.9.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In FLINK-13558 we changed the `flink-dist/bin.xml` to also include 
> flink-examples-table in the binary distribution. The flink-dist module, 
> though, does not depend on flink-examples-table.
> If only the flink-dist module is built with its dependencies (which happens 
> in the release scripts), the table examples are not built and thus not 
> included in the distribution.





[GitHub] [flink] dawidwys merged pull request #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
dawidwys merged pull request #9466: [FLINK-13737][flink-dist] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9466
 
 
   




[jira] [Updated] (FLINK-13737) flink-dist should add provided dependency on flink-examples-table

2019-08-16 Thread Dawid Wysakowicz (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz updated FLINK-13737:
-
Fix Version/s: (was: 1.9.1)
   (was: 1.10.0)
   1.9.0

> flink-dist should add provided dependency on flink-examples-table
> -
>
> Key: FLINK-13737
> URL: https://issues.apache.org/jira/browse/FLINK-13737
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 1.9.0
>Reporter: Dawid Wysakowicz
>Assignee: Dawid Wysakowicz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In FLINK-13558 we changed the `flink-dist/bin.xml` to also include 
> flink-examples-table in the binary distribution. The flink-dist module, 
> though, does not depend on flink-examples-table.
> If only the flink-dist module is built with its dependencies (which happens 
> in the release scripts), the table examples are not built and thus not 
> included in the distribution.





[GitHub] [flink] dawidwys merged pull request #9465: [FLINK-13737][flink-dist][bp-1.9] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
dawidwys merged pull request #9465: [FLINK-13737][flink-dist][bp-1.9] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9465
 
 
   




[GitHub] [flink] flinkbot edited a comment on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9465: [FLINK-13737][flink-dist][bp-1.9] 
Added examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9465#issuecomment-522028577
 
 
   ## CI report:
   
   * 095ba0a98f00367052feb9472c83a292a72fa98a : SUCCESS 
[Build](https://travis-ci.com/flink-ci/flink/builds/123519029)
   




[GitHub] [flink] Lemonjing commented on issue #9467: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close

2019-08-16 Thread GitBox
Lemonjing commented on issue #9467: [FLINK-9941][ScalaAPI] Flush in 
ScalaCsvOutputFormat before close
URL: https://github.com/apache/flink/pull/9467#issuecomment-522054960
 
 
   @flinkbot attention @yanghua @buptljy 




[GitHub] [flink] flinkbot commented on issue #9467: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close

2019-08-16 Thread GitBox
flinkbot commented on issue #9467: [FLINK-9941][ScalaAPI] Flush in 
ScalaCsvOutputFormat before close
URL: https://github.com/apache/flink/pull/9467#issuecomment-522053886
 
 
   ## CI report:
   
   * 009da497d7c551ba854dc7ed8fa658f2acd6d6ce : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123529451)
   




[GitHub] [flink] Lemonjing closed pull request #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close

2019-08-16 Thread GitBox
Lemonjing closed pull request #6411: [FLINK-9941][ScalaAPI] Flush in 
ScalaCsvOutputFormat before close
URL: https://github.com/apache/flink/pull/6411
 
 
   




[GitHub] [flink] flinkbot commented on issue #9467: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close

2019-08-16 Thread GitBox
flinkbot commented on issue #9467: [FLINK-9941][ScalaAPI] Flush in 
ScalaCsvOutputFormat before close
URL: https://github.com/apache/flink/pull/9467#issuecomment-522052035
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 009da497d7c551ba854dc7ed8fa658f2acd6d6ce (Fri Aug 16 
15:33:57 UTC 2019)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] Lemonjing commented on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close

2019-08-16 Thread GitBox
Lemonjing commented on issue #6411: [FLINK-9941][ScalaAPI] Flush in 
ScalaCsvOutputFormat before close
URL: https://github.com/apache/flink/pull/6411#issuecomment-522052052
 
 
   Because this PR's code is too far behind master (forked a year ago, from 
Flink 1.5), it can't pass CI. So I pulled the latest Flink project and created 
a new PR: https://github.com/apache/flink/pull/9467




[GitHub] [flink] Lemonjing opened a new pull request #9467: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close

2019-08-16 Thread GitBox
Lemonjing opened a new pull request #9467: [FLINK-9941][ScalaAPI] Flush in 
ScalaCsvOutputFormat before close
URL: https://github.com/apache/flink/pull/9467
 
 
   
   
   ## What is the purpose of the change
   This pull request updates the Scala API's `ScalaCsvOutputFormat` to flush 
before close, avoiding results that are inconsistent with the Java 
`CsvOutputFormat` when auto-flush is not triggered for some I/O streams.
   
   
   ## Brief change log
   Add a flush call before close in `ScalaCsvOutputFormat` for the Scala API.
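For illustration, here is a minimal, hypothetical Java sketch (not Flink's actual `ScalaCsvOutputFormat`) of the failure mode the PR addresses: a buffering writer holds records until it is flushed, so closing the underlying stream without flushing the wrapper first can drop data.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class FlushBeforeClose {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The wrapper buffers records; nothing reaches `sink` until a flush.
        Writer writer = new BufferedWriter(new OutputStreamWriter(sink), 1024);
        writer.write("1,hello\n");

        boolean emptyBeforeFlush = sink.size() == 0;
        writer.flush(); // the fix: flush the wrapper before close
        boolean presentAfterFlush = "1,hello\n".equals(sink.toString());
        writer.close();

        if (!emptyBeforeFlush || !presentAfterFlush) {
            throw new AssertionError("buffered record was lost");
        }
        System.out.println("record flushed before close: " + presentAfterFlush);
    }
}
```

If the underlying stream were closed directly instead, the record still sitting in the `BufferedWriter` buffer would never be written.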
   
   ## Verifying this change
   
   Added `ScalaCsvOutputFormatTest`; the test passes.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes** / no)
 - The serializers: (yes / **no**/ don't know)
 - The runtime per-record code paths (performance sensitive): (yes /  
**no** / don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes /  **no** / don't know)
 - The S3 file system connector: (yes /  **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes /  **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   




[GitHub] [flink] flinkbot edited a comment on issue #9072: [FLINK-11630] Wait for the termination of all running Tasks when shutting down TaskExecutor

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #9072: [FLINK-11630] Wait for the 
termination of all running Tasks when shutting down TaskExecutor
URL: https://github.com/apache/flink/pull/9072#issuecomment-512425985
 
 
   ## CI report:
   
   * cd5ad8d23046c1025f7f9865e60fc3d048fd1f85 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/119513431)
   * 9b6f8121f909e75b1230bb0e220e8b5ac534a5ff : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123389936)
   * ad8ea8a540ba4385548c1eee37fab5d3970d3daa : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123393714)
   * 56241d01bcc82692a8ceb3add3117fe1f8cbb58e : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123410346)
   * 581be904a2775cf6f42013afe0ef6fbf35658b6b : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123478228)
   * 181a7e4a5016f171ced04468996956503564c64f : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123510415)
   




[jira] [Comment Edited] (FLINK-13056) Optimize region failover performance on calculating vertices to restart

2019-08-16 Thread Zhu Zhu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909096#comment-16909096
 ] 

Zhu Zhu edited comment on FLINK-13056 at 8/16/19 2:45 PM:
--

The diff can be found at 
[https://github.com/zhuzhurk/flink/commit/4f7da57b218e9ccd86f468f9ece62ee1e378ceda].

Note that this diff is based on the initial version of 
flip1.RestartPipelinedRegionStrategy, so it cannot be applied to the latest 
flip1.RestartPipelinedRegionStrategy directly: the region building was later 
refactored out of it (for partition releasing).

The perf test case used (RegionFailoverPerfTest#complexPerfTest) can be found 
in the same branch.

Agree that it's better to make this optimization configurable, as it has side 
effects.
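A hedged sketch of the caching idea (names and structures are illustrative; the real flip1.RestartPipelinedRegionStrategy works on execution vertices and regions, which are omitted here): compute the restart set for a region once, then reuse it on later failovers of the same region.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class RegionBoundaryCache {
    private final Map<Integer, Set<Integer>> cache = new HashMap<>();
    private final Function<Integer, Set<Integer>> computeRestartSet;
    int computations = 0; // exposed only for the demo below

    public RegionBoundaryCache(Function<Integer, Set<Integer>> computeRestartSet) {
        this.computeRestartSet = computeRestartSet;
    }

    /** Pays the expensive edge traversal only on the first failover of a region. */
    public Set<Integer> verticesToRestart(int regionId) {
        return cache.computeIfAbsent(regionId, id -> {
            computations++;
            return computeRestartSet.apply(id);
        });
    }

    public static void main(String[] args) {
        RegionBoundaryCache c = new RegionBoundaryCache(id -> Set.of(id, id + 1));
        c.verticesToRestart(7);
        c.verticesToRestart(7); // cache hit, no recomputation
        if (c.computations != 1) {
            throw new AssertionError("expected one computation, got " + c.computations);
        }
        System.out.println("computations = " + c.computations);
    }
}
```

This also hints at the side effect mentioned above: a cache like this must be invalidated whenever the topology changes, which is why making the optimization configurable is sensible.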


was (Author: zhuzh):
The diff can be found at 
[https://github.com/zhuzhurk/flink/commit/4f7da57b218e9ccd86f468f9ece62ee1e378ceda].

Need to mention that this diff is based on the initial version of 
flip1.RestartPipelinedRegionStrategy. So it cannot be applied to latest 
flip1.RestartPipelinedRegionStrategy directly, as the region building was 
refactored out from it later(for partition releasing).

The perf test case(RegionFailoverPerfTest#complexPerfTest) used can be found in 
the same branch.

> Optimize region failover performance on calculating vertices to restart
> ---
>
> Key: FLINK-13056
> URL: https://issues.apache.org/jira/browse/FLINK-13056
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.9.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>
> Currently some region boundary structures are calculated each time of a 
> region failover. This calculation can be heavy as its complexity goes up with 
> execution edge count.
> We tested it in a sample case with 8000 vertices and 16,000,000 edges. It 
> takes ~2.0s to calculate vertices to restart.
> (more details in 
> [https://docs.google.com/document/d/197Ou-01h2obvxq8viKqg4FnOnsykOEKxk3r5WrVBPuA/edit?usp=sharing])
> That's why we'd propose to cache the region boundary structures to improve 
> the region failover performance.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13056) Optimize region failover performance on calculating vertices to restart

2019-08-16 Thread Zhu Zhu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909096#comment-16909096
 ] 

Zhu Zhu commented on FLINK-13056:
-

The diff can be found at 
[https://github.com/zhuzhurk/flink/commit/4f7da57b218e9ccd86f468f9ece62ee1e378ceda].

Note that this diff is based on the initial version of 
flip1.RestartPipelinedRegionStrategy, so it cannot be applied to the latest 
flip1.RestartPipelinedRegionStrategy directly: the region building was later 
refactored out of it (for partition releasing).

The perf test case used (RegionFailoverPerfTest#complexPerfTest) can be found 
in the same branch.

> Optimize region failover performance on calculating vertices to restart
> ---
>
> Key: FLINK-13056
> URL: https://issues.apache.org/jira/browse/FLINK-13056
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.9.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
>
> Currently some region boundary structures are calculated each time of a 
> region failover. This calculation can be heavy as its complexity goes up with 
> execution edge count.
> We tested it in a sample case with 8000 vertices and 16,000,000 edges. It 
> takes ~2.0s to calculate vertices to restart.
> (more details in 
> [https://docs.google.com/document/d/197Ou-01h2obvxq8viKqg4FnOnsykOEKxk3r5WrVBPuA/edit?usp=sharing])
> That's why we'd propose to cache the region boundary structures to improve 
> the region failover performance.





[GitHub] [flink] tzulitai commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
tzulitai commented on issue #9466: [FLINK-13737][flink-dist] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9466#issuecomment-522031403
 
 
   Have verified that this works  
   +1 to merge (as well as the backport)
   
   Thanks @dawidwys!




[jira] [Issue Comment Deleted] (FLINK-13143) Refactor CheckpointExceptionHandler relevant classes

2019-08-16 Thread vinoyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated FLINK-13143:
-
Comment: was deleted

(was: [~pnowojski] you are right. The {{CheckpointExceptionHandler}} and 
{{CheckpointExceptionHandlerFactory}} are legacy classes. They are not 
necessary since FLINK-11662, we can remove them, let the related code more 
clean. cc [~till.rohrmann])

> Refactor CheckpointExceptionHandler relevant classes
> 
>
> Key: FLINK-13143
> URL: https://issues.apache.org/jira/browse/FLINK-13143
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.9.0
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Since FLINK-11662 has been merged, we can clear 
> {{CheckpointExceptionHandler}} relevant classes.
> {{CheckpointExceptionHandler}} used to implement 
> {{setFailOnCheckpointingErrors}}. Now, it has only one implementation which 
> is {{DecliningCheckpointExceptionHandler}}.





[jira] [Commented] (FLINK-12514) Refactor the failure checkpoint counting mechanism with ordered checkpoint id

2019-08-16 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909089#comment-16909089
 ] 

vinoyang commented on FLINK-12514:
--

[~pnowojski] I will give a more detailed description. I said this is a 
refactor because I have already implemented a simple counting mechanism based 
on {{AtomicInteger}}. The idea comes from the PR for FLINK-12364, where Stefan 
proposed it.

Anyway, I totally agree with your comment, and have reworked the title and 
description.

> Refactor the failure checkpoint counting mechanism with ordered checkpoint id
> -
>
> Key: FLINK-12514
> URL: https://issues.apache.org/jira/browse/FLINK-12514
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.9.0
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the checkpoint failure manager uses a simple counting mechanism 
> which does not track the checkpoint id sequence.
> However, a more graceful counting mechanism would be based on the ordered 
> checkpoint id sequence.
> It should be refactored after FLINK-12364 has been merged.
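One way to read "ordered checkpoint id" counting, sketched under stated assumptions (this is not Flink's actual CheckpointFailureManager): track the newest successful checkpoint id so that late, out-of-order callbacks for older checkpoints cannot skew the continuous-failure count.

```java
public class OrderedFailureCounter {
    private long lastSucceededId = -1;
    private int continuousFailures = 0;

    /** A failure only counts if it is newer than the last success. */
    public void onFailure(long checkpointId) {
        if (checkpointId > lastSucceededId) {
            continuousFailures++;
        }
    }

    /** A newer success resets the count; stale successes are ignored. */
    public void onSuccess(long checkpointId) {
        if (checkpointId > lastSucceededId) {
            lastSucceededId = checkpointId;
            continuousFailures = 0;
        }
    }

    public int continuousFailures() {
        return continuousFailures;
    }

    public static void main(String[] args) {
        OrderedFailureCounter counter = new OrderedFailureCounter();
        counter.onFailure(1);
        counter.onFailure(2);
        counter.onSuccess(3);   // newer success resets the count
        counter.onFailure(2);   // stale callback for an old checkpoint, ignored
        if (counter.continuousFailures() != 0) {
            throw new AssertionError("stale failure should not count");
        }
        System.out.println("continuousFailures = " + counter.continuousFailures());
    }
}
```

A plain {{AtomicInteger}} counter, by contrast, would count the stale failure for checkpoint 2 even though a later checkpoint already succeeded.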





[GitHub] [flink] flinkbot commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
flinkbot commented on issue #9466: [FLINK-13737][flink-dist] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9466#issuecomment-522028628
 
 
   ## CI report:
   
   * aec1d92adaaf5fd75eb673d23c51c570c2425587 : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123518971)
   




[GitHub] [flink] flinkbot commented on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
flinkbot commented on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9465#issuecomment-522028577
 
 
   ## CI report:
   
   * 095ba0a98f00367052feb9472c83a292a72fa98a : PENDING 
[Build](https://travis-ci.com/flink-ci/flink/builds/123519029)
   




[GitHub] [flink] 1u0 commented on a change in pull request #9383: [FLINK-13248] [runtime] Adding processing of downstream messages in AsyncWaitOperator's wait loops

2019-08-16 Thread GitBox
1u0 commented on a change in pull request #9383: [FLINK-13248] [runtime] Adding 
processing of downstream messages in AsyncWaitOperator's wait loops
URL: https://github.com/apache/flink/pull/9383#discussion_r314741672
 
 

 ##
 File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/mailbox/TaskMailboxImpl.java
 ##
 @@ -251,4 +278,74 @@ public void quiesce() {
 	public State getState() {
 		return state;
 	}
 +
 +	@Override
 +	public Mailbox getDownstreamMailbox(int priority) {
 +		return new DownstreamMailbox(priority);
 +	}
 +
 +	class DownstreamMailbox implements Mailbox {
 +		private final int priority;
 +
 +		DownstreamMailbox(int priority) {
 +			this.priority = priority;
 +		}
 +
 +		@Override
 +		public boolean hasMail() {
 +			lock.lock();
 +			try {
 +				for (Mail mail : queue) {
 +					if (mail.getOperatorIndex() >= priority) {
 +						return true;
 +					}
 +				}
 +				return false;
 +			} finally {
 +				lock.unlock();
 +			}
 +		}
 +
 +		@Override
 +		public Optional<Runnable> tryTakeMail() throws MailboxStateException {
 +			return tryTakeDownstreamMail(priority);
 +		}
 +
 +		@Nonnull
 +		@Override
 +		public Runnable takeMail() throws InterruptedException, MailboxStateException {
 +			return takeDownstreamMail(priority);
 +		}
 +
 +		@Override
 +		public void putMail(@Nonnull Runnable letter) throws MailboxStateException {
 +			TaskMailboxImpl.this.putMail(letter, priority);
 +		}
 +
 +		@Override
 +		public void putFirst(@Nonnull Runnable priorityLetter) throws MailboxStateException {
 +			TaskMailboxImpl.this.putFirst(priorityLetter, priority);
 +		}
 +	}
 +
 +	/**
 +	 * An executable bound to a specific operator in the chain, such that it can be picked for the downstream mailbox.
 +	 */
 +	static class Mail {
 +		private final Runnable runnable;
 +		private final int operatorIndex;
 
 Review comment:
   Rename the field (and the corresponding method) to `priority`?




[GitHub] [flink] 1u0 commented on a change in pull request #9383: [FLINK-13248] [runtime] Adding processing of downstream messages in AsyncWaitOperator's wait loops

2019-08-16 Thread GitBox
1u0 commented on a change in pull request #9383: [FLINK-13248] [runtime] Adding 
processing of downstream messages in AsyncWaitOperator's wait loops
URL: https://github.com/apache/flink/pull/9383#discussion_r314741224
 
 

 ##
 File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/mailbox/TaskMailboxImpl.java
 ##
 @@ -251,4 +278,74 @@ public void quiesce() {
public State getState() {
return state;
}
+
+   @Override
+   public Mailbox getDownstreamMailbox(int priority) {
+   return new DownstreamMailbox(priority);
 
 Review comment:
   Suggestion:
   * make the `-1` constant (in `private DownstreamMailbox downstreamMailbox = 
new DownstreamMailbox(-1);`) a named `static final` field (for example 
`MIN_PRIORITY`);
   * and add validation check here that `checkArgument(priority > 
MIN_PRIORITY)`.
   
   Similarly, you can also add a `MAX_PRIORITY` constant (for example, it can 
be used for task cancellation letter) and additionally check here, that 
operator's priority is `< MAX_PRIORITY`.
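The suggestion above can be sketched as follows (constant values, names, and the exception type are assumptions for illustration, not taken from the PR):

```java
public class PriorityBounds {
    /** Sentinel for the default view that accepts all letters (the reviewer's `-1`). */
    static final int MIN_PRIORITY = -1;
    /** Sentinel reserved, for example, for task-cancellation letters. */
    static final int MAX_PRIORITY = Integer.MAX_VALUE;

    /** Operator mailbox views must sit strictly between the two sentinels. */
    static int checkPriority(int priority) {
        if (priority <= MIN_PRIORITY || priority >= MAX_PRIORITY) {
            throw new IllegalArgumentException("invalid priority: " + priority);
        }
        return priority;
    }

    public static void main(String[] args) {
        if (checkPriority(0) != 0) {
            throw new AssertionError("valid priority should pass through");
        }
        boolean rejected = false;
        try {
            checkPriority(MIN_PRIORITY); // sentinel must be rejected
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        if (!rejected) {
            throw new AssertionError("sentinel priority should be rejected");
        }
        System.out.println("priority checks passed");
    }
}
```

Named sentinels make the validation self-documenting, exactly the benefit the review comment is after.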




[GitHub] [flink] 1u0 commented on a change in pull request #9383: [FLINK-13248] [runtime] Adding processing of downstream messages in AsyncWaitOperator's wait loops

2019-08-16 Thread GitBox
1u0 commented on a change in pull request #9383: [FLINK-13248] [runtime] Adding 
processing of downstream messages in AsyncWaitOperator's wait loops
URL: https://github.com/apache/flink/pull/9383#discussion_r314742243
 
 

 ##
 File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java
 ##
 @@ -923,6 +919,17 @@ public String toString() {
return getName();
}
 
+   /**
+* Returns a new view on the MailboxExecutor for the given priority.
+* A h
 
 Review comment:
   Unfinished sentence...




[jira] [Commented] (FLINK-13143) Refactor CheckpointExceptionHandler relevant classes

2019-08-16 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909086#comment-16909086
 ] 

vinoyang commented on FLINK-13143:
--

[~pnowojski] you are right. The {{CheckpointExceptionHandler}} and 
{{CheckpointExceptionHandlerFactory}} are legacy classes. They have not been 
necessary since FLINK-11662; we can remove them to make the related code 
cleaner. cc [~till.rohrmann]

> Refactor CheckpointExceptionHandler relevant classes
> 
>
> Key: FLINK-13143
> URL: https://issues.apache.org/jira/browse/FLINK-13143
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.9.0
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Since FLINK-11662 has been merged, we can clear 
> {{CheckpointExceptionHandler}} relevant classes.
> {{CheckpointExceptionHandler}} used to implement 
> {{setFailOnCheckpointingErrors}}. Now, it has only one implementation which 
> is {{DecliningCheckpointExceptionHandler}}.





[GitHub] [flink] flinkbot commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
flinkbot commented on issue #9466: [FLINK-13737][flink-dist] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9466#issuecomment-522026797
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit aec1d92adaaf5fd75eb673d23c51c570c2425587 (Fri Aug 16 
14:22:29 UTC 2019)
   
   **Warnings:**
* **1 pom.xml files were touched**: Check for build and licensing issues.
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[jira] [Commented] (FLINK-13143) Refactor CheckpointExceptionHandler relevant classes

2019-08-16 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909084#comment-16909084
 ] 

vinoyang commented on FLINK-13143:
--

[~pnowojski] you are right. The {{CheckpointExceptionHandler}} and 
{{CheckpointExceptionHandlerFactory}} are legacy classes. They have not been 
necessary since FLINK-11662; we can remove them to make the related code 
cleaner. cc [~till.rohrmann]

> Refactor CheckpointExceptionHandler relevant classes
> 
>
> Key: FLINK-13143
> URL: https://issues.apache.org/jira/browse/FLINK-13143
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.9.0
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Since FLINK-11662 has been merged, we can clear 
> {{CheckpointExceptionHandler}} relevant classes.
> {{CheckpointExceptionHandler}} used to implement 
> {{setFailOnCheckpointingErrors}}. Now, it has only one implementation which 
> is {{DecliningCheckpointExceptionHandler}}.





[GitHub] [flink] dawidwys commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
dawidwys commented on issue #9466: [FLINK-13737][flink-dist] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9466#issuecomment-522026008
 
 
   @tzulitai Could you have a look?




[GitHub] [flink] flinkbot commented on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
flinkbot commented on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9465#issuecomment-522026202
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 095ba0a98f00367052feb9472c83a292a72fa98a (Fri Aug 16 
14:20:56 UTC 2019)
   
   **Warnings:**
* **1 pom.xml files were touched**: Check for build and licensing issues.
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   ## Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[GitHub] [flink] flinkbot edited a comment on issue #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations.

2019-08-16 Thread GitBox
flinkbot edited a comment on issue #8631: [FLINK-12745][ml] add sparse and 
dense vector class, and dense matrix class with basic operations.
URL: https://github.com/apache/flink/pull/8631#issuecomment-517015005
 
 
   ## CI report:
   
   * 98f0cec3deff65ebe316b8d3c13b51470d079b65 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121481703)
   * 159994a4bc63a67609ede71a6465c8c85db4d3a3 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/121716307)
   * 464c2960c6de4bca741d937cadacea0a44541fe6 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/122419396)
   * 4ee4e5fc9e29c75274dec332f2df2ff35bd7c208 : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/122442614)
   * c23d8ac2646f0b1f153d0dfb2950c53830838696 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/122445831)
   * 08cb1e6d6832e3bce5273831494542e39c9d56fd : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/122751123)
   * b3430bab7f70c17284f1db1245e75a0aa27184ed : CANCELED 
[Build](https://travis-ci.com/flink-ci/flink/builds/123504908)
   * 9f68413743a4f2fb6db4ebb6dea7255b7cc26dd0 : FAILURE 
[Build](https://travis-ci.com/flink-ci/flink/builds/123506746)
   




[GitHub] [flink] dawidwys opened a new pull request #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
dawidwys opened a new pull request #9466: [FLINK-13737][flink-dist] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9466
 
 
   ## What is the purpose of the change
   
   This PR adds examples-table to flink-dist dependencies. In 
https://issues.apache.org/jira/browse/FLINK-13558 we added table examples to 
the distribution package, but forgot to add it to the build dependencies of 
flink-dist.
   
   
   ## Verifying this change
   
   Run
   
   ```
   mvn clean install -pl flink-dist -am
   ```
   
   and see that the distribution contains the table examples.
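   For context, the missing build dependency boils down to an entry of roughly 
this shape in flink-dist's pom.xml (the artifact id and property names are 
assumptions based on Flink's usual conventions, not copied from the PR diff):

   ```xml
   <dependency>
       <groupId>org.apache.flink</groupId>
       <artifactId>flink-examples-table_${scala.binary.version}</artifactId>
       <version>${project.version}</version>
   </dependency>
   ```

   Without it, Maven's reactor does not guarantee the examples module is built 
before flink-dist assembles the distribution.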
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (**yes** / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   




[GitHub] [flink] dawidwys commented on issue #9465: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies

2019-08-16 Thread GitBox
dawidwys commented on issue #9465: [FLINK-13737][flink-dist] Added 
examples-table to flink-dist dependencies
URL: https://github.com/apache/flink/pull/9465#issuecomment-522025623
 
 
   @tzulitai Could you have a look?



