[GitHub] [flink] rmetzger commented on issue #9462: [FLINK-13738][blink-table-planner] Fix NegativeArraySizeException in LongHybridHashTable
rmetzger commented on issue #9462: [FLINK-13738][blink-table-planner] Fix NegativeArraySizeException in LongHybridHashTable URL: https://github.com/apache/flink/pull/9462#issuecomment-521969040 I will test this change with my sample job. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #9462: [FLINK-13738][blink-table-planner] Fix NegativeArraySizeException in LongHybridHashTable
flinkbot edited a comment on issue #9462: [FLINK-13738][blink-table-planner] Fix NegativeArraySizeException in LongHybridHashTable URL: https://github.com/apache/flink/pull/9462#issuecomment-521948849 ## CI report: * 01c5d282d80106ec89eb594ab7ba92f28ad857b0 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/123483709)
[GitHub] [flink] flinkbot edited a comment on issue #9460: [FLINK-13747][hive] Remove some TODOs in Hive connector
flinkbot edited a comment on issue #9460: [FLINK-13747][hive] Remove some TODOs in Hive connector URL: https://github.com/apache/flink/pull/9460#issuecomment-521923251 ## CI report: * 0e8bb81aa0a13dd2836ed2ed9b3effe5552fd5e2 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123474060) * 3ef6b6029110e8e353b290d28854770102f95617 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/123498860)
[GitHub] [flink] flinkbot edited a comment on issue #7751: [FLINK-11608] [docs] Translate the "Local Setup Tutorial" page into Chinese
flinkbot edited a comment on issue #7751: [FLINK-11608] [docs] Translate the "Local Setup Tutorial" page into Chinese URL: https://github.com/apache/flink/pull/7751#issuecomment-517135258 ## CI report: * 82afc481b4eb377bbcec2b9bac53d013a1ba1666 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/121522086) * 27b87b30704e4121dfc41adceda8720f465fd9f2 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/123167209) * 627ce02f4b30278e0d67a48e775666bdd4698f6b : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/123483738)
[GitHub] [flink] Lemonjing commented on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close
Lemonjing commented on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close URL: https://github.com/apache/flink/pull/6411#issuecomment-521981874 @rmetzger I noticed that the problem is still unresolved. Could you help review it?
[jira] [Commented] (FLINK-4463) FLIP-3: Restructure Documentation
[ https://issues.apache.org/jira/browse/FLINK-4463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909017#comment-16909017 ] Robert Metzger commented on FLINK-4463: --- I would propose to close all subtasks as Invalid with a comment that FLIP-3 has been completed. > FLIP-3: Restructure Documentation > - > > Key: FLINK-4463 > URL: https://issues.apache.org/jira/browse/FLINK-4463 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: Ufuk Celebi >Priority: Major > > Super issue to track progress for > [FLIP-3|https://cwiki.apache.org/confluence/display/FLINK/FLIP-3+-+Organization+of+Documentation]. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [flink] tillrohrmann closed pull request #9461: [FLINK-9900][tests] Harden ZooKeeperHighAvailabilityITCase
tillrohrmann closed pull request #9461: [FLINK-9900][tests] Harden ZooKeeperHighAvailabilityITCase URL: https://github.com/apache/flink/pull/9461
[GitHub] [flink] xuyang1706 opened a new pull request #9464: [FLINK-13751][ml] Add Built-in vector types
xuyang1706 opened a new pull request #9464: [FLINK-13751][ml] Add Built-in vector types URL: https://github.com/apache/flink/pull/9464 ## What is the purpose of the change The built-in vector types are the TypeInformation of Vector, DenseVector and SparseVector. The class contains the mapping between the TypeInformation and its String representation. ## Brief change log - *Add the class of the built-in vector types* - *Add the test cases* ## Verifying this change This change added tests and can be verified as follows: - run the test cases and verify that they pass ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (yes) - If yes, how is the feature documented? (JavaDocs)
[jira] [Updated] (FLINK-13751) Add Built-in vector types
[ https://issues.apache.org/jira/browse/FLINK-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-13751: --- Labels: pull-request-available (was: ) > Add Built-in vector types > - > > Key: FLINK-13751 > URL: https://issues.apache.org/jira/browse/FLINK-13751 > Project: Flink > Issue Type: Sub-task > Components: Library / Machine Learning >Reporter: Xu Yang >Priority: Major > Labels: pull-request-available > > Built-in vector types is the TypeInformation of Vector, DenseVector and > SparseVector. The class contains the mapping of the TypeInformation and its > String representation. > * Add the class of the Built-in vector types > * Add the test cases
[GitHub] [flink] flinkbot edited a comment on issue #9366: [FLINK-13359][docs] Add documentation for DDL introduction
flinkbot edited a comment on issue #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#issuecomment-518524777 ## CI report: * f99e66ffb4356f8132b48d352b27686a6ad958f5 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/122058269) * 6e8dbc06ec17458f96c429e1c01a06afdf916c94 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/122379544) * 0f7f9a1e9388b136ce42c0b6ea407808b7a91b5d : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/123462820) * bcacb670cbb21eeb6250a6d2bffbac91692e7552 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/123488923) * 7b07fd74f3498b33c1ded1b2f05f0324d14ceec8 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/123492965)
[GitHub] [flink] flinkbot commented on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added examples-table to flink-dist dependencies
flinkbot commented on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added examples-table to flink-dist dependencies URL: https://github.com/apache/flink/pull/9465#issuecomment-522028577 ## CI report: * 095ba0a98f00367052feb9472c83a292a72fa98a : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/123519029)
[GitHub] [flink] flinkbot commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies
flinkbot commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies URL: https://github.com/apache/flink/pull/9466#issuecomment-522028628 ## CI report: * aec1d92adaaf5fd75eb673d23c51c570c2425587 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/123518971)
[GitHub] [flink] tzulitai commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies
tzulitai commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies URL: https://github.com/apache/flink/pull/9466#issuecomment-522031403 Have verified that this works. +1 to merge (as well as the backport). Thanks @dawidwys!
[jira] [Updated] (FLINK-13056) Optimize region failover performance on calculating vertices to restart
[ https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-13056: - Issue Type: Improvement (was: Sub-task) Parent: (was: FLINK-4256) > Optimize region failover performance on calculating vertices to restart > --- > > Key: FLINK-13056 > URL: https://issues.apache.org/jira/browse/FLINK-13056 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.9.0 >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > > Currently some region boundary structures are calculated each time of a > region failover. This calculation can be heavy as its complexity goes up with > execution edge count. > We tested it in a sample case with 8000 vertices and 16,000,000 edges. It > takes ~2.0s to calculate vertices to restart. > (more details in > [https://docs.google.com/document/d/197Ou-01h2obvxq8viKqg4FnOnsykOEKxk3r5WrVBPuA/edit?usp=sharing)] > That's why we'd propose to cache the region boundary structures to improve > the region failover performance.
[jira] [Commented] (FLINK-13056) Optimize region failover performance on calculating vertices to restart
[ https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908955#comment-16908955 ] Chesnay Schepler commented on FLINK-13056: -- I'm removing this as a subtask of FLINK-4256 so we can mark that one as finished (since it's weird if a FLIP is marked as released but the JIRA is still in progress).
[jira] [Closed] (FLINK-4256) Fine-grained recovery
[ https://issues.apache.org/jira/browse/FLINK-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-4256. --- Resolution: Fixed Fix Version/s: 1.9.0 > Fine-grained recovery > - > > Key: FLINK-4256 > URL: https://issues.apache.org/jira/browse/FLINK-4256 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.1.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen >Priority: Major > Fix For: 1.9.0 > > > When a task fails during execution, Flink currently resets the entire > execution graph and triggers complete re-execution from the last completed > checkpoint. This is more expensive than just re-executing the failed tasks. > In many cases, more fine-grained recovery is possible. > The full description and design is in the corresponding FLIP. > https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures > The detailed design for version 1 is > https://docs.google.com/document/d/1_PqPLA1TJgjlqz8fqnVE3YSisYBDdFsrRX_URgRSj74/edit#
[GitHub] [flink] zjuwangg commented on issue #9447: [FLINK-13643][docs]Document the workaround for users with a different minor Hive version
zjuwangg commented on issue #9447: [FLINK-13643][docs]Document the workaround for users with a different minor Hive version URL: https://github.com/apache/flink/pull/9447#issuecomment-521968239 Updated according to the comments. cc @bowenli86 @lirui-apache
[GitHub] [flink] flinkbot commented on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close
flinkbot commented on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close URL: https://github.com/apache/flink/pull/6411#issuecomment-521980643 ## CI report: * 95a9b60b1ece7d248755d92868e682c4ee0fd334 : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/123498093)
[GitHub] [flink] zentol commented on a change in pull request #9386: [FLINK-13601][tests] Harden RegionFailoverITCase by recording info when checkpoint just completed
zentol commented on a change in pull request #9386: [FLINK-13601][tests] Harden RegionFailoverITCase by recording info when checkpoint just completed URL: https://github.com/apache/flink/pull/9386#discussion_r314685717 ## File path: flink-tests/src/test/java/org/apache/flink/test/checkpointing/RegionFailoverITCase.java ## @@ -102,8 +111,14 @@ @Before public void setup() throws Exception { + HighAvailabilityServicesUtilsTest.TestHAFactory.haServices = new TestingHaServices( Review comment: please don't re-use the TestHAFactory class; with its static mutable field, it should be avoided if possible. You can just define your own `HighAvailabilityServicesFactory` with a lambda `HighAvailabilityServicesFactory haServiceFactory = (config, executor) -> new TestingHaServices...`
[GitHub] [flink] Lemonjing commented on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close
Lemonjing commented on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close URL: https://github.com/apache/flink/pull/6411#issuecomment-521980729 @aljoscha you are the original author of this class. Would you mind helping to review it?
[jira] [Commented] (FLINK-13056) Optimize region failover performance on calculating vertices to restart
[ https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909001#comment-16909001 ] Zhu Zhu commented on FLINK-13056: - [~till.rohrmann] we tried to optimize it, at the cost of caching more data and slowing down the region building. Taking the sample case with 8000 vertices and 16,000,000 edges as an example, the failover time is reduced from 1961ms to 110ms, while the region building time increases from 523ms to 5681ms as a side effect. [https://docs.google.com/document/d/1-QLxe4FXqXBuxlYsNmNU-R21euoTkzk1JAS6Lvrd-F4/edit]
[jira] [Commented] (FLINK-13406) MetricConfig.getInteger() always returns null
[ https://issues.apache.org/jira/browse/FLINK-13406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909002#comment-16909002 ] Chesnay Schepler commented on FLINK-13406: -- Not necessarily. So let's go through all scenarios: # An actual cluster, with a job submitted through the CLI, with only the flink-conf.yaml as an input for config parameters. In this case, the Configuration only contains strings, the MetricConfig only contains strings, and everything works as expected. This is the case that the current code was written for. # A local cluster, where a custom Configuration was passed to the ExecutionEnvironment. Here, the Configuration may contain values that are not strings, in turn the MetricConfig may contain values that are not strings, and reporters may run into issues when trying to retrieve a non-string value. This is the originally reported issue. # A MetricConfig is manually created by putting in arbitrary objects. The MetricConfig does not do any type conversions, and reporters may run into the same issues as in 2). This case should mostly apply to unit tests. So, with that said: Changing the Configuration, or the transformation of Configuration -> MetricConfig can only cover case 2). I would regard this option as undesirable for that reason, although I would generally prefer if the Configuration were entirely based on strings. (But IIRC the Streaming API does some funky things with the Configuration, so I don't know how feasible this is). Thus we effectively have to touch the MetricConfig instead, and there are 2 approaches we can opt for: # Sanitize inputs; override all methods that add entries and convert objects to strings if necessary. # Beef up accessors, to account for cases where properties may not be strings. (Basically, if (property is String) -> parse; elif (property is targetType) -> return; else -> fail) Or, we just don't do anything and attribute it to user error. 
> MetricConfig.getInteger() always returns null > - > > Key: FLINK-13406 > URL: https://issues.apache.org/jira/browse/FLINK-13406 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics >Affects Versions: 1.6.3 >Reporter: Ori Popowski >Priority: Minor > > {{MetricConfig}}'s {{getInteger}} will always return the default value. > The reason is that it delegates to Java's {{Properties.getProperty}}, which > returns null if the type of the value is not {{String}}. > h3. Reproduce > # Create a class {{MyReporter}} implementing {{MetricReporter}} > # Implement the {{open()}} method so that you do {{config.getInteger("foo", > null)}} > # Start an {{ExecutionEnvironment}} and give it the following > Configuration object: > {code:java} > configuration.setString("metrics.reporters", "my"); > configuration.setClass("metrics.reporter.my.class", MyReporter.class) > configuration.setInteger("metrics.reporter.my.foo", 42);{code} > # In {{open()}}, the value of {{foo}} is null.
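The root cause described in this thread — `Properties.getProperty` returning null for values that are not Strings — can be reproduced outside of Flink with plain `java.util.Properties`. The following is a minimal sketch (not Flink code; the property keys merely echo the issue's example):

```java
import java.util.Properties;

public class PropertiesPitfall {
    public static void main(String[] args) {
        Properties props = new Properties();

        // Properties extends Hashtable<Object, Object>, so put() accepts any value type...
        props.put("metrics.reporter.my.foo", Integer.valueOf(42));

        // ...but getProperty() only returns values that are actually Strings;
        // a non-String value yields null, so getInteger() falls back to its default.
        System.out.println(props.getProperty("metrics.reporter.my.foo")); // prints "null"

        // Storing the value as a String works as expected.
        props.setProperty("metrics.reporter.my.bar", "42");
        System.out.println(props.getProperty("metrics.reporter.my.bar")); // prints "42"
    }
}
```

This matches scenario 2 above: a Configuration carrying non-String values produces a MetricConfig whose string-based getters silently return the default.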
[jira] [Commented] (FLINK-9941) Flush in ScalaCsvOutputFormat before close method
[ https://issues.apache.org/jira/browse/FLINK-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909010#comment-16909010 ] Ryan Tao commented on FLINK-9941: - This may have been closed by mistake; I am reopening it. [~rmetzger] > Flush in ScalaCsvOutputFormat before close method > - > > Key: FLINK-9941 > URL: https://issues.apache.org/jira/browse/FLINK-9941 > Project: Flink > Issue Type: Improvement > Components: API / Scala >Affects Versions: 1.5.1 >Reporter: Ryan Tao >Assignee: Jiayi Liao >Priority: Major > Labels: pull-request-available > > Because not every stream's close method will flush, in order to ensure the > stability of continuous integration, we need to manually call flush() before > close(). > I noticed that CsvOutputFormat (Java API) has done this. As follows. > {code:java} > //CsvOutputFormat > public void close() throws IOException { > if (wrt != null) { > this.wrt.flush(); > this.wrt.close(); > } > super.close(); > } > {code}
[jira] [Reopened] (FLINK-9941) Flush in ScalaCsvOutputFormat before close method
[ https://issues.apache.org/jira/browse/FLINK-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Tao reopened FLINK-9941: - reopen it.
[jira] [Updated] (FLINK-9941) Flush in ScalaCsvOutputFormat before close method
[ https://issues.apache.org/jira/browse/FLINK-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Tao updated FLINK-9941: Affects Version/s: 1.8.1
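The flush-before-close pattern discussed in this thread can be illustrated with plain `java.io` classes (a minimal sketch, not the actual ScalaCsvOutputFormat code): because a wrapped stream's `close()` is not guaranteed to flush, calling `flush()` explicitly first ensures that buffered records reach the underlying stream.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class FlushBeforeClose {
    // Mirrors the CsvOutputFormat pattern quoted in the issue: flush first, then close.
    static void closeSafely(Writer wrt) throws IOException {
        if (wrt != null) {
            wrt.flush(); // push any buffered data to the underlying stream
            wrt.close();
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter target = new StringWriter();
        Writer buffered = new BufferedWriter(target);
        buffered.write("a,b,c");
        closeSafely(buffered);
        System.out.println(target); // prints "a,b,c"
    }
}
```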
[jira] [Commented] (FLINK-13741) "SHOW FUNCTIONS" should include Flink built-in functions' names
[ https://issues.apache.org/jira/browse/FLINK-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909015#comment-16909015 ] Bowen Li commented on FLINK-13741: -- Thanks [~Terry1897] for the summary of existing systems! Hi [~twalthr], w.r.t. "Once FunctionCatalog has been integrated into the catalog API and built-in functions are stored there as well, this should be easily doable.", currently {{FunctionCatalog}} wraps {{CatalogManager}} and built-in functions can be easily obtained thru {{BuiltInFunctionDefinitions.getDefinitions()}}, thus my understanding is that it's already doable, no? It seems to me that what needs to be done is 1) add a {{listFunctions()}} API to TableEnvironment and FunctionCatalog 2) add a new {{Executor.listFunctions()}} API (I'll just keep the {{Executor.listUserDefinedFunctions()}} API for now in case we want to support showing UDFs only in the future) 3) make {{SHOW FUNCTIONS;}} call {{executor.listFunctions()}} W.r.t. "We should also introduce a syntax SHOW EXTENDED FUNCTION", shall they be {{DESCRIBE FUNCTION ;}} and {{DESCRIBE FUNCTION EXTENDED ;}}? > "SHOW FUNCTIONS" should include Flink built-in functions' names > --- > > Key: FLINK-13741 > URL: https://issues.apache.org/jira/browse/FLINK-13741 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.9.0 >Reporter: Bowen Li >Assignee: Bowen Li >Priority: Critical > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently "SHOW FUNCTIONS;" only returns catalog functions and > FunctionDefinitions registered in memory, but does not include Flink built-in > functions' names. > AFAIK, it's standard for "SHOW FUNCTIONS;" to show all available functions > for use in queries in SQL systems like Hive, Presto, Teradata, etc, thus it > includes built-in functions naturally.
Besides, > {{FunctionCatalog.lookupFunction(name)}} resolves calls to built-in > functions; it doesn't feel right to not display functions that it can > successfully resolve. > It seems to me that the root cause is that the call stack for "SHOW FUNCTIONS;" > has been a bit messy - it calls {{tEnv.listUserDefinedFunctions()}} which > further calls {{FunctionCatalog.getUserDefinedFunctions()}}, and I'm not sure > what's the intention of those two APIs. Are they dedicated to getting all > functions, or just user-defined functions excluding built-in ones? > In the end, I believe "SHOW FUNCTIONS;" should display built-in functions. To > achieve that, we either need to modify and/or rename the existing APIs mentioned > above, or add new APIs to return all functions from FunctionCatalog. > cc [~xuefuz] [~lirui] [~twalthr]
[GitHub] [flink] pnowojski commented on a change in pull request #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager
pnowojski commented on a change in pull request #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager URL: https://github.com/apache/flink/pull/8471#discussion_r314710622 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/io/StreamTwoInputProcessor.java ## @@ -307,21 +307,35 @@ private void processBufferOrEvent(BufferOrEvent bufferOrEvent) throws IOExceptio if (event.getClass() != EndOfPartitionEvent.class) { throw new IOException("Unexpected event: " + event); } + + // release the record deserializer immediately, + // which is very valuable in case of bounded stream + releaseDeserializer(bufferOrEvent.getChannelIndex()); } } public void cleanup() throws IOException { - // clear the buffers first. this part should not ever fail - for (RecordDeserializer deserializer : recordDeserializers) { + // release the deserializers first. this part should not ever fail + for (int channelIndex = 0; channelIndex < recordDeserializers.length; channelIndex++) { + releaseDeserializer(channelIndex); + } + + // cleanup the barrier handler resources + barrierHandler.cleanup(); + } + + private void releaseDeserializer(int channelIndex) { + // recycle buffers and clear the deserializer. + RecordDeserializer deserializer = recordDeserializers[channelIndex]; + if (deserializer != null) { Review comment: Ok, let's drop this, since this seems to be supported only by Eclipse... based on https://stackoverflow.com/questions/32327134/where-does-a-nullable-annotation-refer-to-in-case-of-a-varargs-parameter > When you are using a real Java 8 type annotation, i.e. an annotation with @Target(ElementType.TYPE_USE), the outcome is best explained by using the ordinary array declaration: And based on: https://stackoverflow.com/questions/4963300/which-notnull-java-annotation-should-i-use only Eclipse's `@Nullable` supports `@Target(ElementType.TYPE_USE)`.
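The distinction discussed in the review comment above — a Java 8 type annotation declared with `@Target(ElementType.TYPE_USE)` applying to the component type of a varargs parameter — can be sketched with a hypothetical `@MaybeNull` annotation (illustration only; not an annotation from Flink or any real nullability library):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class TypeUseDemo {
    // A Java 8 type annotation: it may appear on any use of a type,
    // including the component type of a varargs parameter.
    @Target(ElementType.TYPE_USE)
    @Retention(RetentionPolicy.RUNTIME)
    @interface MaybeNull {}

    // Written before the type, the annotation applies to the component type String,
    // i.e. it documents that individual varargs elements may be null.
    static int countNonNull(@MaybeNull String... values) {
        int n = 0;
        for (String v : values) {
            if (v != null) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countNonNull("a", null, "b")); // prints "2"
    }
}
```

An annotation whose target is only `ElementType.PARAMETER`, by contrast, would annotate the varargs parameter (the array) as a whole rather than its elements.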
[GitHub] [flink] pnowojski commented on issue #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager
pnowojski commented on issue #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager URL: https://github.com/apache/flink/pull/8471#issuecomment-522006514 One more note: please also try to include better descriptions in the commit messages, for example by copying the same thing that you include in the PR description (it makes it easier to understand why a change was implemented when looking at the git history).
[GitHub] [flink] wuchong commented on issue #9366: [FLINK-13359][docs] Add documentation for DDL introduction
wuchong commented on issue #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#issuecomment-522009012 Please also add DDL tabs for the formats.
[jira] [Updated] (FLINK-13359) Add documentation for DDL introduction
[ https://issues.apache.org/jira/browse/FLINK-13359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-13359: --- Labels: pull-request-available (was: ) > Add documentation for DDL introduction > -- > > Key: FLINK-13359 > URL: https://issues.apache.org/jira/browse/FLINK-13359 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table SQL / API >Reporter: Jark Wu >Assignee: Danny Chan >Priority: Critical > Labels: pull-request-available > Fix For: 1.9.0 > > > Add documentation for DDL introduction > - “Concepts & Common API”: Add a section to describe how to execute DDL on > TableEnvironment. > - “SQL Client”: Add a section and example in SQL CLI page too? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [flink] danny0405 closed pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 closed pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366
[GitHub] [flink] rmetzger commented on issue #9462: [FLINK-13738][blink-table-planner] Fix NegativeArraySizeException in LongHybridHashTable
rmetzger commented on issue #9462: [FLINK-13738][blink-table-planner] Fix NegativeArraySizeException in LongHybridHashTable URL: https://github.com/apache/flink/pull/9462#issuecomment-521978638 my testing job doesn't fail anymore with this PR applied. I can't say whether this fix is good or not.
[GitHub] [flink] lirui-apache commented on issue #9460: [FLINK-13747][hive] Remove some TODOs in Hive connector
lirui-apache commented on issue #9460: [FLINK-13747][hive] Remove some TODOs in Hive connector URL: https://github.com/apache/flink/pull/9460#issuecomment-521979323 The test failures are related. Just updated to fix them.
[jira] [Closed] (FLINK-13131) add a custom directory for storing configuration files and jar files for user jobs
[ https://issues.apache.org/jira/browse/FLINK-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song closed FLINK-13131. Resolution: Won't Fix > add a custom directory for storing configuration files and jar files for user > jobs > -- > > Key: FLINK-13131 > URL: https://issues.apache.org/jira/browse/FLINK-13131 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN >Affects Versions: 1.9.0 >Reporter: zhangbinzaifendou >Priority: Minor > Labels: patch, pull-request-available > Attachments: addYarnStagingDir.patch, addYarnStagingDir2.patch, > addYarnStagingDir3.patch > > Time Spent: 20m > Remaining Estimate: 0h > > The user-related jars and configuration files can only be placed under > fileSystem.getHomeDirectory(/home/xxx), which is inflexible; a configuration option should be added so that users can customize the directory.
[jira] [Commented] (FLINK-13131) add a custom directory for storing configuration files and jar files for user jobs
[ https://issues.apache.org/jira/browse/FLINK-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908993#comment-16908993 ] Xintong Song commented on FLINK-13131: -- Hi [~zhangbinzaifendou], thanks for reporting this issue. I see there was a discussion on the PR, which seems to have reached a consensus on not merging this change, since you also personally closed the PR. Therefore, I'm closing this jira issue as well. Please reopen it if you still have concerns around this issue. > add a custom directory for storing configuration files and jar files for user > jobs > -- > > Key: FLINK-13131 > URL: https://issues.apache.org/jira/browse/FLINK-13131 > Project: Flink > Issue Type: Improvement > Components: Deployment / YARN >Affects Versions: 1.9.0 >Reporter: zhangbinzaifendou >Priority: Minor > Labels: patch, pull-request-available > Attachments: addYarnStagingDir.patch, addYarnStagingDir2.patch, > addYarnStagingDir3.patch > > Time Spent: 20m > Remaining Estimate: 0h > > The user-related jars and configuration files can only be placed under > fileSystem.getHomeDirectory(/home/xxx), which is inflexible; a configuration option should be added so that users can customize the directory.
[jira] [Comment Edited] (FLINK-13406) MetricConfig.getInteger() always returns null
[ https://issues.apache.org/jira/browse/FLINK-13406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909019#comment-16909019 ] Till Rohrmann edited comment on FLINK-13406 at 8/16/19 12:33 PM: - I think there would be a 3. option: Only store {{Strings}} in {{Properties}}. This effectively means to replace {{put}} calls with {{setProperty}}. That's also what the JavaDocs of {{Properties}} say. It is a bit unfortunate that {{Properties}} extends {{Hashtable}} because it obviously is not a {{Hashtable}}. Consequently, one could argue that we are misusing {{Properties}} and the {{MetricConfig}} by calling https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/Configuration.java#L661. was (Author: till.rohrmann): I think there would be a 3. option: Only store {{Strings}} in {{Properties}}. This effectively means to replace {{put}} calls with {{setProperty}}. That's also what the JavaDocs of {{Properties}} say. It is a bit unfortunate that {{Properties extends {{Hashtable}} because it obviously is not a {{Hashtable}}. Consequently, one could argue that we are misusing {{Properties}} and the {{MetricConfig}} by calling https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/Configuration.java#L661. > MetricConfig.getInteger() always returns null > - > > Key: FLINK-13406 > URL: https://issues.apache.org/jira/browse/FLINK-13406 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics >Affects Versions: 1.6.3 >Reporter: Ori Popowski >Priority: Minor > > {{MetricConfig}}'s {{getInteger}} will always return the default value. > The reason is that it delegates to Java's {{Properties.getProperty}}, which > returns null if the type of the value is not {{String}}. > h3. Reproduce > # Create a class {{MyReporter}} implementing {{MetricReporter}} > # Implement the {{open()}} method so that you do {{config.getInteger("foo", > null)}} > # Start an {{ExecutionEnvironment}} and give it the following > Configuration object: > {code:java} > configuration.setString("metrics.reporters", "my"); > configuration.setClass("metrics.reporter.my.class", MyReporter.class) > configuration.setInteger("metrics.reporter.my.foo", 42);{code} > # In {{open()}} the value of {{foo}} is null.
[jira] [Commented] (FLINK-13406) MetricConfig.getInteger() always returns null
[ https://issues.apache.org/jira/browse/FLINK-13406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909019#comment-16909019 ] Till Rohrmann commented on FLINK-13406: --- I think there would be a 3. option: Only store {{Strings}} in {{Properties}}. This effectively means to replace {{put}} calls with {{setProperty}}. That's also what the JavaDocs of {{Properties}} say. It is a bit unfortunate that {{Properties extends {{Hashtable}} because it obviously is not a {{Hashtable}}. Consequently, one could argue that we are misusing {{Properties}} and the {{MetricConfig}} by calling https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/Configuration.java#L661. > MetricConfig.getInteger() always returns null > - > > Key: FLINK-13406 > URL: https://issues.apache.org/jira/browse/FLINK-13406 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics >Affects Versions: 1.6.3 >Reporter: Ori Popowski >Priority: Minor > > {{MetricConfig}}'s {{getInteger}} will always return the default value. > The reason is that it delegates to Java's {{Properties.getProperty}}, which > returns null if the type of the value is not {{String}}. > h3. Reproduce > # Create a class {{MyReporter}} implementing {{MetricReporter}} > # Implement the {{open()}} method so that you do {{config.getInteger("foo", > null)}} > # Start an {{ExecutionEnvironment}} and give it the following > Configuration object: > {code:java} > configuration.setString("metrics.reporters", "my"); > configuration.setClass("metrics.reporter.my.class", MyReporter.class) > configuration.setInteger("metrics.reporter.my.foo", 42);{code} > # In {{open()}} the value of {{foo}} is null.
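The behavior described in this issue, and the proposed third option of only storing `String`s via `setProperty`, can be reproduced with plain `java.util.Properties` (no Flink classes involved):

```java
import java.util.Properties;

public class PropertiesDemo {
    public static void main(String[] args) {
        Properties props = new Properties();

        // Storing a non-String value via the inherited Hashtable#put ...
        props.put("foo", 42);
        // ... makes getProperty fall back to the default, because
        // getProperty only returns values that are instances of String.
        System.out.println(props.getProperty("foo", "default")); // prints "default"

        // The proposed fix: only store Strings, via setProperty.
        props.setProperty("bar", "42");
        System.out.println(props.getProperty("bar", "default")); // prints "42"
    }
}
```

This is exactly why `MetricConfig.getInteger()` (which parses the result of `getProperty`) always sees `null` when the value was stored as an `Integer`.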
[jira] [Commented] (FLINK-13707) Make max parallelism configurable
[ https://issues.apache.org/jira/browse/FLINK-13707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909020#comment-16909020 ] xuekang commented on FLINK-13707: - ok, thanks for the comments. I will upload a patch soon. > Make max parallelism configurable > -- > > Key: FLINK-13707 > URL: https://issues.apache.org/jira/browse/FLINK-13707 > Project: Flink > Issue Type: New Feature > Components: API / DataStream >Reporter: xuekang >Priority: Minor > > For now, if a user sets a parallelism larger than 128 and does not set the max > parallelism explicitly, the system will compute a max parallelism, which is > 1.5 * parallelism. When the job changes the parallelism and recovers from a > savepoint, there may be problems restoring state from the > savepoint, as the number of key groups has changed. > To avoid this problem, and trying not to modify the code of existing jobs, > we want to configure the default max parallelism in flink-conf.yaml, but it > is not configurable now. > Should we make it configurable? Any comments would be appreciated.
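The default rule described in the issue can be sketched as follows. The rounding to a power of two and the clamping bounds of 128 and 32768 follow Flink's `KeyGroupRangeAssignment` as an assumption here; verify them against the Flink version in use:

```java
public class MaxParallelismDemo {

    // Sketch of the default rule: roughly 1.5 * parallelism, rounded up
    // to the next power of two and clamped to [128, 32768]. The exact
    // bounds are an assumption, not taken from this thread.
    static int computeDefaultMaxParallelism(int parallelism) {
        int candidate = parallelism + (parallelism / 2); // 1.5 * parallelism
        int powerOfTwo = candidate <= 1 ? 1 : Integer.highestOneBit(candidate - 1) << 1;
        return Math.min(Math.max(powerOfTwo, 128), 32768);
    }

    public static void main(String[] args) {
        System.out.println(computeDefaultMaxParallelism(10));  // small jobs keep the 128 floor
        System.out.println(computeDefaultMaxParallelism(200)); // 300 rounded up to 512
    }
}
```

The mismatch the reporter describes arises because this derived value (the number of key groups) is baked into savepoints, so rescaling a job that never pinned its max parallelism can change it.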
[GitHub] [flink] xuyang1706 closed pull request #9463: Add Built-in vector types
xuyang1706 closed pull request #9463: Add Built-in vector types URL: https://github.com/apache/flink/pull/9463
[jira] [Resolved] (FLINK-9900) Failed to testRestoreBehaviourWithFaultyStateHandles (org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase)
[ https://issues.apache.org/jira/browse/FLINK-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann resolved FLINK-9900. -- Resolution: Fixed Fixed via 1.10.0: 3e0e030b41ba7dfcc4d85b7c776ae9627a51f838 1.9.1: fc7aeffd83d390d9a9452e8fcc55aec0c9d23cac > Failed to testRestoreBehaviourWithFaultyStateHandles > (org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase) > --- > > Key: FLINK-9900 > URL: https://issues.apache.org/jira/browse/FLINK-9900 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Tests >Affects Versions: 1.5.1, 1.6.0, 1.9.0 >Reporter: zhangminglei >Assignee: Biao Liu >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.10.0, 1.9.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > https://api.travis-ci.org/v3/job/405843617/log.txt > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 124.598 sec > <<< FAILURE! - in > org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase > > testRestoreBehaviourWithFaultyStateHandles(org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase) > Time elapsed: 120.036 sec <<< ERROR! 
> org.junit.runners.model.TestTimedOutException: test timed out after 12 > milliseconds > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693) > at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323) > at > java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) > at > org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase.testRestoreBehaviourWithFaultyStateHandles(ZooKeeperHighAvailabilityITCase.java:244) > Results : > Tests in error: > > ZooKeeperHighAvailabilityITCase.testRestoreBehaviourWithFaultyStateHandles:244 > » TestTimedOut > Tests run: 1453, Failures: 0, Errors: 1, Skipped: 29
[GitHub] [flink] xuyang1706 opened a new pull request #9463: Add Built-in vector types
xuyang1706 opened a new pull request #9463: Add Built-in vector types URL: https://github.com/apache/flink/pull/9463 ## What is the purpose of the change Built-in vector types are the TypeInformation of Vector, DenseVector and SparseVector. The class contains the mapping between the TypeInformation and its String representation. ## Brief change log - *Add the class of the Built-in vector types* - *Add the test cases* ## Verifying this change This change added tests and can be verified as follows: - run the test cases, which pass ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (yes) - If yes, how is the feature documented? (JavaDocs)
[jira] [Closed] (FLINK-12529) Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager
[ https://issues.apache.org/jira/browse/FLINK-12529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-12529. -- Resolution: Fixed Fix Version/s: 1.10.0 Merged as 7ed0df0 into apache:master > Release record-deserializer buffers timely to improve the efficiency of heap > usage on taskmanager > - > > Key: FLINK-12529 > URL: https://issues.apache.org/jira/browse/FLINK-12529 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Affects Versions: 1.8.0, 1.9.0 >Reporter: Haibo Sun >Assignee: Haibo Sun >Priority: Major > Labels: pull-request-available > Fix For: 1.10.0 > > Time Spent: 20m > Remaining Estimate: 0h > > In input processors (`StreamInputProcessor` and `StreamTwoInputProcessor`), > each input channel has a corresponding record deserializer. Currently, these > record deserializers are cleaned up at the end of the task (look at > `StreamInputProcessor#cleanup()` and `StreamTwoInputProcessor#cleanup()`). > This is not a problem for unbounded streams, but it may reduce the efficiency > of heap memory usage on the taskmanager when the input is a bounded stream. > For example, in case all inputs are bounded streams, some of them end > very early because of the small amount of data, and the others end very late > because of the large amount of data; then the buffers of the record > deserializers corresponding to the input channels that finished early are idle for > a long time and no longer used. > In another case, when both unbounded and bounded streams exist in the inputs, > the buffers of the record deserializers corresponding to the bounded streams > are idle forever (no longer used) after the bounded streams are finished. > Especially when the record size and the parallelism of upstream are large, the > total size of `SpanningWrapper#buffer` is very large.
The size of > `SpanningWrapper#buffer` is allowed to reach up to 5 MB, and if the > parallelism of upstream is 100, the maximum total size will reach 500 MB (in > our production, there are jobs with the record size up to hundreds of KB and > the parallelism of upstream up to 1000). > Overall, after receiving `EndOfPartitionEvent` from the input channel, the > corresponding record deserializer should be cleared immediately to improve > the efficiency of heap memory usage on taskmanager.
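The 500 MB worst case quoted in the issue follows directly from the per-channel buffer cap; as a quick sanity check:

```java
public class BufferFootprint {
    public static void main(String[] args) {
        // Figures from the issue: each SpanningWrapper#buffer may grow to
        // 5 MB, and every upstream channel holds its own deserializer.
        long maxBufferBytes = 5L * 1024 * 1024;
        int upstreamParallelism = 100;

        long totalBytes = maxBufferBytes * upstreamParallelism;
        System.out.println(totalBytes / (1024 * 1024) + " MB"); // prints "500 MB"
    }
}
```

At the production scale mentioned (upstream parallelism up to 1000), the same arithmetic gives up to 5 GB of idle deserializer buffers, which is why releasing them on `EndOfPartitionEvent` matters.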
[jira] [Updated] (FLINK-12529) Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager
[ https://issues.apache.org/jira/browse/FLINK-12529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-12529: --- Affects Version/s: 1.9.0 > Release record-deserializer buffers timely to improve the efficiency of heap > usage on taskmanager > - > > Key: FLINK-12529 > URL: https://issues.apache.org/jira/browse/FLINK-12529 > Project: Flink > Issue Type: Improvement > Components: Runtime / Task >Affects Versions: 1.8.0, 1.9.0 >Reporter: Haibo Sun >Assignee: Haibo Sun >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > In input processors (`StreamInputProcessor` and `StreamTwoInputProcessor`), > each input channel has a corresponding record deserializer. Currently, these > record deserializers are cleaned up at the end of the task (look at > `StreamInputProcessor#cleanup()` and `StreamTwoInputProcessor#cleanup()`). > This is not a problem for unbounded streams, but it may reduce the efficiency > of heap memory usage on the taskmanager when the input is a bounded stream. > For example, in case all inputs are bounded streams, some of them end > very early because of the small amount of data, and the others end very late > because of the large amount of data; then the buffers of the record > deserializers corresponding to the input channels that finished early are idle for > a long time and no longer used. > In another case, when both unbounded and bounded streams exist in the inputs, > the buffers of the record deserializers corresponding to the bounded streams > are idle forever (no longer used) after the bounded streams are finished. > Especially when the record size and the parallelism of upstream are large, the > total size of `SpanningWrapper#buffer` is very large.
The size of > `SpanningWrapper#buffer` is allowed to reach up to 5 MB, and if the > parallelism of upstream is 100, the maximum total size will reach 500 MB (in > our production, there are jobs with the record size up to hundreds of KB and > the parallelism of upstream up to 1000). > Overall, after receiving `EndOfPartitionEvent` from the input channel, the > corresponding record deserializer should be cleared immediately to improve > the efficiency of heap memory usage on taskmanager.
[jira] [Comment Edited] (FLINK-13143) Refactor CheckpointExceptionHandler relevant classes
[ https://issues.apache.org/jira/browse/FLINK-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909081#comment-16909081 ] Piotr Nowojski edited comment on FLINK-13143 at 8/16/19 2:15 PM: - I think [~yanghua] might be right. Defining a special interface, implementation class and test class in order to have this code (current): {code:java} owner.checkpointExceptionHandler.tryHandleCheckpointException(checkpointMetaData, checkpointException); {code} vs a bit simpler, but referring explicitly to the environment: {code:java} owner.getEnvironment().declineCheckpoint(checkpointMetaData.getCheckpointId(), checkpointException); {code} Seems "a bit" excessive. This should be quite an easy change as well, I guess. was (Author: pnowojski): I think [~yanghua] might be right. Defining a special interface, implementation class and test class in order to have this code (current): {code:java} owner.checkpointExceptionHandler.tryHandleCheckpointException(checkpointMetaData, checkpointException); {code} vs a bit simpler, but referring explicitly to the environment: {code:java} owner.getEnvironment().declineCheckpoint(checkpointMetaData.getCheckpointId(), checkpointException); {code} Seems a bit excessive. > Refactor CheckpointExceptionHandler relevant classes > > > Key: FLINK-13143 > URL: https://issues.apache.org/jira/browse/FLINK-13143 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Affects Versions: 1.9.0 >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > > Since FLINK-11662 has been merged, we can clean up the > {{CheckpointExceptionHandler}}-relevant classes. > {{CheckpointExceptionHandler}} used to implement > {{setFailOnCheckpointingErrors}}. Now, it has only one implementation, which > is {{DecliningCheckpointExceptionHandler}}.
[jira] [Commented] (FLINK-13056) Optimize region failover performance on calculating vertices to restart
[ https://issues.apache.org/jira/browse/FLINK-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909096#comment-16909096 ] Zhu Zhu commented on FLINK-13056: - The diff can be found at [https://github.com/zhuzhurk/flink/commit/4f7da57b218e9ccd86f468f9ece62ee1e378ceda]. Need to mention that this diff is based on the initial version of flip1.RestartPipelinedRegionStrategy. So it cannot be applied to the latest flip1.RestartPipelinedRegionStrategy directly, as the region building was refactored out from it later (for partition releasing). The perf test case (RegionFailoverPerfTest#complexPerfTest) used can be found in the same branch. > Optimize region failover performance on calculating vertices to restart > --- > > Key: FLINK-13056 > URL: https://issues.apache.org/jira/browse/FLINK-13056 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Affects Versions: 1.9.0 >Reporter: Zhu Zhu >Assignee: Zhu Zhu >Priority: Major > > Currently some region boundary structures are calculated each time of a > region failover. This calculation can be heavy as its complexity goes up with > execution edge count. > We tested it in a sample case with 8000 vertices and 16,000,000 edges. It > takes ~2.0s to calculate vertices to restart. > (more details in > [https://docs.google.com/document/d/197Ou-01h2obvxq8viKqg4FnOnsykOEKxk3r5WrVBPuA/edit?usp=sharing)] > That's why we'd propose to cache the region boundary structures to improve > the region failover performance.
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314663434 ## File path: docs/dev/table/connect.md ## @@ -652,6 +686,34 @@ connector: path: "file:///path/to/whatever"# required: path to a file or directory {% endhighlight %} + + +{% highlight sql %} +-- CREATE a partitioned CSV table using the CREATE TABLE syntax. +create table csv_table ( + user bigint, + message string, + ts string Review comment: I don't think there is anything confusing, I put one to let users know how to quote the double quotes.
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314664862 ## File path: docs/dev/table/connect.md ## @@ -900,6 +1026,78 @@ connector: connection-path-prefix: "/v1" # optional: prefix string to be added to every REST communication {% endhighlight %} + + +{% highlight sql %} +-- CREATE a version 6 Elasticsearch table. +create table MyUserTable ( + user bigint, + message string, + ts string +) with ( + 'connector.type' = 'elasticsearch', -- required: specify this table type is elasticsearch + + 'connector.version' = '6', -- required: valid connector versions are "6" + + 'format.type' = 'json', -- required: specify which format to deserialize(as table source) + + 'connector.hosts.0.hostname' = 'host_name', -- required: one or more Elasticsearch hosts to connect to + 'connector.hosts.0.port' = '9092', + 'connector.hosts.0.protocol' = 'http', + + 'connector.index' = 'MyUsers', -- required: Elasticsearch index + + 'connector.document-type' = 'user', -- required: Elasticsearch document type + + 'update-mode' = 'append',-- optional: update mode when used as table sink, + -- only support append mode now. + + 'format.derive-schema' = 'true', -- optional: derive the serialize/deserialize format + -- schema from the table schema. + + 'format.json-schema' = '...',-- optional: specify the serialize/deserialize format schema, Review comment: Why?
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314663166 ## File path: docs/dev/table/connect.md ## @@ -276,6 +276,40 @@ tables: type: VARCHAR {% endhighlight %} + + +{% highlight sql %} +-- CREATE a 010 version Kafka table start from the earliest offset(as table source) and append mode(as table sink). +create table MyUserTable ( + user bigint, + message string, + ts string +) with ( + -- declare the external system to connect to + 'connector.type' = 'kafka', + 'connector.version' = '0.10', + 'update-mode' = 'append', Review comment: Moved
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314664613 ## File path: docs/dev/table/connect.md ## @@ -753,6 +815,70 @@ connector: sink-partitioner-class: org.mycompany.MyPartitioner # optional: used in case of sink partitioner custom {% endhighlight %} + + +{% highlight sql %} +-- CREATE a 011 version Kafka table start from the earliest offset(as table source) +-- and append mode(as table sink). +create table MyUserTable ( + user bigint, + message string, + ts string +) with ( + 'connector.type' = 'kafka', + + 'connector.version' = '0.11', -- required: valid connector versions are +-- "0.8", "0.9", "0.10", "0.11", and "universal" + + 'connector.topic' = 'topic_name', -- required: topic name from which the table is read + + 'update-mode' = 'append', -- required: update mode when used as table sink, +-- only support append mode now. + + 'format.type' = 'avro', -- required: specify which format to deserialize(as table source) +-- and serialize(as table sink). +-- Valid format types are : "csv", "json", "avro". 
+ + 'connector.properties.0.key' = 'zookeeper.connect', -- optional: connector specific properties + 'connector.properties.0.value' = 'localhost:2181', + 'connector.properties.1.key' = 'bootstrap.servers', + 'connector.properties.1.value' = 'localhost:9092', + 'connector.properties.2.key' = 'group.id', + 'connector.properties.2.value' = 'testGroup', + 'connector.startup-mode' = 'earliest-offset', -- optional: valid modes are "earliest-offset", "latest-offset", + -- "group-offsets", or "specific-offsets" + + 'connector.specific-offsets.0.partition' = '0', -- optional: used in case of startup mode with specific offsets + 'connector.specific-offsets.0.offset' = '42', + 'connector.specific-offsets.1.partition' = '1', + 'connector.specific-offsets.1.offset' = '300', + + 'connector.sink-partitioner' = '...', -- optional: output partitioning from Flink's partitions + -- into Kafka's partitions valid are "fixed" + -- (each Flink partition ends up in at most one Kafka partition), + -- "round-robin" (a Flink partition is distributed to + -- Kafka partitions round-robin) + -- "custom" (use a custom FlinkKafkaPartitioner subclass) + + 'format.derive-schema' = 'true', -- optional: derive the serialize/deserialize format Review comment: Why???
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314664413 ## File path: docs/dev/table/connect.md ## @@ -652,6 +686,34 @@ connector: path: "file:///path/to/whatever"# required: path to a file or directory {% endhighlight %} + + +{% highlight sql %} +-- CREATE a partitioned CSV table using the CREATE TABLE syntax. +create table csv_table ( + user bigint, + message string, + ts string +) +COMMENT 'This is a csv table.' +PARTITIONED BY(user) +WITH ( + 'connector.type' = 'filesystem', -- required: specify the connector type Review comment: I don't think so, at least we should give out the required properties like `format.type`.
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314664520 ## File path: docs/dev/table/connect.md ## @@ -753,6 +815,70 @@ connector: sink-partitioner-class: org.mycompany.MyPartitioner # optional: used in case of sink partitioner custom {% endhighlight %} + + +{% highlight sql %} +-- CREATE a 011 version Kafka table start from the earliest offset(as table source) +-- and append mode(as table sink). +create table MyUserTable ( + user bigint, + message string, + ts string +) with ( + 'connector.type' = 'kafka', + + 'connector.version' = '0.11', -- required: valid connector versions are +-- "0.8", "0.9", "0.10", "0.11", and "universal" + + 'connector.topic' = 'topic_name', -- required: topic name from which the table is read + + 'update-mode' = 'append', -- required: update mode when used as table sink, +-- only support append mode now. + + 'format.type' = 'avro', -- required: specify which format to deserialize(as table source) Review comment: Why ? it is a required option.
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314666452 ## File path: docs/dev/table/ddl.md ## @@ -0,0 +1,119 @@ +--- +title: "DDL" +nav-parent_id: tableapi +nav-pos: 0 +--- + + +The Table API and SQL are integrated in a joint API. The central concept of this API is a `Table` which serves as input and output of queries. This document shows all the DDL grammar Flink supports, how to register a `Table` (or view) through DDL, and how to drop a `Table` (or view) through DDL. + +* This will be replaced by the TOC +{:toc} + +Create Table +--- +{% highlight sql %} +CREATE [OR REPLACE] TABLE [catalog_name.][db_name.]table_name + [(col_name1 col_type1 [COMMENT col_comment1], ...)] + [COMMENT table_comment] + [PARTITIONED BY (col_name1, col_name2, ...)] + [WITH (key1=val1, key2=val2, ...)] Review comment: I kind of like the grammar form in the Calcite reference page; we don't need any quotes for the literal itself. I have added the notes for the format.
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314664808 ## File path: docs/dev/table/connect.md ## @@ -900,6 +1026,78 @@ connector: connection-path-prefix: "/v1" # optional: prefix string to be added to every REST communication {% endhighlight %} + + +{% highlight sql %} +-- CREATE a version 6 Elasticsearch table. +create table MyUserTable ( + user bigint, + message string, + ts string +) with ( + 'connector.type' = 'elasticsearch', -- required: specify this table type is elasticsearch + + 'connector.version' = '6', -- required: valid connector versions are "6" + + 'format.type' = 'json', -- required: specify which format to deserialize(as table source) Review comment: Why ?
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314664985 ## File path: docs/dev/table/ddl.md ## @@ -0,0 +1,119 @@ +--- +title: "DDL" +nav-parent_id: tableapi +nav-pos: 0 +--- + + +The Table API and SQL are integrated in a joint API. The central concept of this API is a `Table` which serves as input and output of queries. This document shows all the DDL grammar Flink supports, how to register a `Table` (or view) through DDL, and how to drop a `Table` (or view) through DDL. + +* This will be replaced by the TOC +{:toc} + +Create Table +--- +{% highlight sql %} +CREATE [OR REPLACE] TABLE [catalog_name.][db_name.]table_name Review comment: Removed
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314662314 ## File path: docs/dev/table/connect.md ## @@ -276,6 +276,40 @@ tables: type: VARCHAR {% endhighlight %} + + +{% highlight sql %} +-- CREATE a 010 version Kafka table start from the earliest offset(as table source) and append mode(as table sink). +create table MyUserTable ( + user bigint, + message string, Review comment: Changed to varchar
[GitHub] [flink] flinkbot edited a comment on issue #9450: [FLINK-13711][sql-client] Hive array values not properly displayed in…
flinkbot edited a comment on issue #9450: [FLINK-13711][sql-client] Hive array values not properly displayed in… URL: https://github.com/apache/flink/pull/9450#issuecomment-521552936 ## CI report: * c9d99f2866f281298f4217e9ce7543732bece2f8 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/123334919) * 671aa2687e3758d16646c6fbf58e4cc486328a38 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123456040) * 5c25642609614012a78142672e4e11f0b028e2a8 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123488890)
[GitHub] [flink] rmetzger commented on issue #9100: [hotfix] Update `findAndCreateTableSource` method's annotation in TableFactoryUtil class
rmetzger commented on issue #9100: [hotfix] Update `findAndCreateTableSource` method's annotation in TableFactoryUtil class URL: https://github.com/apache/flink/pull/9100#issuecomment-521988465 Please carefully read the contribution guide I linked earlier. There is still something wrong with the PR's commit message.
[GitHub] [flink] flinkbot commented on issue #9464: [FLINK-13751][ml] Add Built-in vector types
flinkbot commented on issue #9464: [FLINK-13751][ml] Add Built-in vector types URL: https://github.com/apache/flink/pull/9464#issuecomment-521998582 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 2d53d67c87d99ebff93d37a732dd50fc100a166e (Fri Aug 16 12:52:46 UTC 2019) **Warnings:** * **1 pom.xml files were touched**: Check for build and licensing issues. * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-13751).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items.
For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] tillrohrmann commented on a change in pull request #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging
tillrohrmann commented on a change in pull request #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging URL: https://github.com/apache/flink/pull/9456#discussion_r314706590 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/tasks/StreamTaskTest.java ## @@ -180,6 +180,25 @@ @Rule public final Timeout timeoutPerTest = Timeout.seconds(30); + /** +* This test checks the async exceptions handling wraps the message and cause as an AsynchronousException +* and propagates this to the environment. +*/ + @Test + public void exceptionReporting() { + Environment e = mock(Environment.class); + RuntimeException expectedException = new RuntimeException("RUNTIME EXCEPTION"); + + SourceStreamTask sut = new SourceStreamTask(e); + sut.handleAsyncException("EXPECTED_ERROR", expectedException); + + ArgumentCaptor c = ArgumentCaptor.forClass(AsynchronousException.class); + verify(e).failExternally(c.capture()); + assertEquals(c.getValue().getMessage(), "EXPECTED_ERROR"); + assertEquals(c.getValue().getCause(), expectedException); + assertEquals(expectedException, "AsynchronousException{EXPECTED_ERROR, caused by RUNTIME EXCEPTION}"); Review comment: Instead of mocking here via Mockito I would suggest to use `MockEnvironment` and call `#setExpectedExternalFailureCause` and `#getActualExternalFailureCause`.
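The review suggestion above—replacing the Mockito mock and `ArgumentCaptor` with a recording test double in the spirit of `MockEnvironment`—can be sketched generically. The class below is an illustrative stand-in, not Flink's actual `MockEnvironment` API:

```java
// A hand-rolled test double: instead of verifying a Mockito mock with an
// ArgumentCaptor, the double simply records the failure cause so the test
// can assert on it directly. Illustrative stand-in, not Flink's MockEnvironment.
class RecordingEnvironment {
    private Throwable actualExternalFailureCause;

    // The task under test would call this when an async error is reported.
    public void failExternally(Throwable cause) {
        this.actualExternalFailureCause = cause;
    }

    public Throwable getActualExternalFailureCause() {
        return actualExternalFailureCause;
    }
}

public class RecordingEnvironmentDemo {
    public static void main(String[] args) {
        RecordingEnvironment env = new RecordingEnvironment();
        RuntimeException boom = new RuntimeException("RUNTIME EXCEPTION");
        env.failExternally(boom); // stands in for handleAsyncException(...)
        System.out.println(env.getActualExternalFailureCause() == boom); // true
    }
}
```

The design point is that a recording double keeps the assertion on the actual captured object, with no mocking framework involved.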
[GitHub] [flink] tillrohrmann commented on a change in pull request #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging
tillrohrmann commented on a change in pull request #9456: FLINK-13588 flink-streaming-java don't throw away exception info in logging URL: https://github.com/apache/flink/pull/9456#discussion_r314705802 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/AsynchronousException.java ## @@ -38,6 +38,10 @@ public AsynchronousException(String message, Throwable cause) { @Override public String toString() { - return "AsynchronousException{" + getCause() + "}"; + if (getMessage() != null) { + return "AsynchronousException{" + getMessage() + ", caused by " + getCause() + "}"; + } else { + return "AsynchronousException{" + getCause() + "}"; + } Review comment: I'd be in favour of simply removing the `toString` method.
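The two branches of the patched `toString()` in the diff above can be demonstrated with a standalone sketch. The class below is a minimal stand-in for Flink's `AsynchronousException`, not the real class; note that the cause-only constructor passes a null message explicitly, since `Exception(Throwable)` would otherwise default the message to the cause's `toString()` and the fallback branch would never fire:

```java
// Minimal stand-in illustrating the patched toString(): include the
// message when present, otherwise fall back to the cause-only form.
class AsynchronousException extends Exception {
    AsynchronousException(Throwable cause) {
        super(null, cause); // keep getMessage() == null for the fallback branch
    }

    AsynchronousException(String message, Throwable cause) {
        super(message, cause);
    }

    @Override
    public String toString() {
        if (getMessage() != null) {
            return "AsynchronousException{" + getMessage() + ", caused by " + getCause() + "}";
        } else {
            return "AsynchronousException{" + getCause() + "}";
        }
    }
}

public class ToStringDemo {
    public static void main(String[] args) {
        RuntimeException cause = new RuntimeException("RUNTIME EXCEPTION");
        // prints AsynchronousException{EXPECTED_ERROR, caused by java.lang.RuntimeException: RUNTIME EXCEPTION}
        System.out.println(new AsynchronousException("EXPECTED_ERROR", cause));
        // prints AsynchronousException{java.lang.RuntimeException: RUNTIME EXCEPTION}
        System.out.println(new AsynchronousException(cause));
    }
}
```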
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314722591 ## File path: docs/dev/table/ddl.md ## @@ -0,0 +1,76 @@ +--- +title: "DDL" +nav-parent_id: tableapi +nav-pos: 0 +--- + + +The Table API and SQL are integrated in a joint API. The central concept of this API is a `Table` which serves as input and output of queries. This document shows all the DDL grammar Flink supports, how to register a `Table` (or view) through DDL, and how to drop a `Table` (or view) through DDL. + +* This will be replaced by the TOC +{:toc} + +Create Table +--- +{% highlight sql %} +CREATE [OR REPLACE] TABLE [catalog_name.][db_name.]table_name + [(col_name1 col_type1 [COMMENT col_comment1], ...)] + [COMMENT table_comment] + [PARTITIONED BY (col_name1, col_name2, ...)] + [WITH (key1=val1, key2=val2, ...)] +{% endhighlight %} + +Create a table with the given table properties. If a table with the same name already exists in the database, an exception is thrown. + +**PARTITIONED BY** + +Partition the created table by the specified columns. A directory is created for each partition if this table is used as a filesystem sink. + +**WITH OPTIONS** + +Table properties used to create a table source/sink. The properties are usually used to find and create the underlying connector. **Notes:** the key and value of the expression `key1=val1` should both be string literals. + +See details in [Connect to External Systems](connect.html) for all the supported table properties of different connectors. + +**Notes:** The table name can be of two formats: 1. `catalog_name.db_name.table_name` 2. `table_name`. For `catalog_name.db_name.table_name`, the table would be registered into the metastore with a catalog named "catalog_name" and a database named "db_name"; for `table_name`, the table would be registered into the current catalog and database of the execution table environment.
+ +{% top %} + +Drop Table +--- +{% highlight sql %} +DROP TABLE [IF EXISTS] [catalog_name.][db_name.]table_name +{% endhighlight %} + +Drop a table with the given table name. If the table to drop does not exist, an exception is thrown. + +**IF EXISTS** + +If the table does not exist, nothing happens. + +{% top %} + +DDL Data Types +--- +For DDLs, we support the full data types defined in the page [Data Types]({{ site.baseurl }}/dev/table/types.html). + +**Notes:** Some of the data types are not supported in the SQL query (the cast expression or literals), e.g. `STRING`, `BYTES`, `TIME(p) WITHOUT TIME ZONE`, `TIME(p) WITH LOCAL TIME ZONE`, `TIMESTAMP(p) WITHOUT TIME ZONE`, `TIMESTAMP(p) WITH LOCAL TIME ZONE`, `ARRAY`, `MULTISET`, `ROW`. Review comment: Okay, I would move it there.
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314722894 ## File path: docs/dev/table/ddl.md ## @@ -0,0 +1,119 @@ +--- +title: "DDL" +nav-parent_id: tableapi +nav-pos: 0 +--- + + +The Table API and SQL are integrated in a joint API. The central concept of this API is a `Table` which serves as input and output of queries. This document shows all the DDL grammar Flink supports, how to register a `Table` (or view) through DDL, and how to drop a `Table` (or view) through DDL. + +* This will be replaced by the TOC +{:toc} + +Create Table +--- +{% highlight sql %} +CREATE [OR REPLACE] TABLE [catalog_name.][db_name.]table_name + [(col_name1 col_type1 [COMMENT col_comment1], ...)] + [COMMENT table_comment] + [PARTITIONED BY (col_name1, col_name2, ...)] + [WITH (key1=val1, key2=val2, ...)] +{% endhighlight %} + +Create a table with the given table properties. If a table with the same name already exists in the database, an exception is thrown unless *IF NOT EXISTS* is declared. Review comment: Removed
[GitHub] [flink] JingsongLi commented on issue #9462: [FLINK-13738][blink-table-planner] Fix NegativeArraySizeException in LongHybridHashTable
JingsongLi commented on issue #9462: [FLINK-13738][blink-table-planner] Fix NegativeArraySizeException in LongHybridHashTable URL: https://github.com/apache/flink/pull/9462#issuecomment-522022698 Thanks @rmetzger for verification, ping @wuchong to review
[GitHub] [flink] flinkbot edited a comment on issue #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations.
flinkbot edited a comment on issue #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations. URL: https://github.com/apache/flink/pull/8631#issuecomment-517015005 ## CI report: * 98f0cec3deff65ebe316b8d3c13b51470d079b65 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/121481703) * 159994a4bc63a67609ede71a6465c8c85db4d3a3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/121716307) * 464c2960c6de4bca741d937cadacea0a44541fe6 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/122419396) * 4ee4e5fc9e29c75274dec332f2df2ff35bd7c208 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/122442614) * c23d8ac2646f0b1f153d0dfb2950c53830838696 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/122445831) * 08cb1e6d6832e3bce5273831494542e39c9d56fd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/122751123) * b3430bab7f70c17284f1db1245e75a0aa27184ed : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/123504908) * 9f68413743a4f2fb6db4ebb6dea7255b7cc26dd0 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123506746)
[GitHub] [flink] dawidwys commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies
dawidwys commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies URL: https://github.com/apache/flink/pull/9466#issuecomment-522026008 @tzulitai Could you have a look?
[GitHub] [flink] dawidwys opened a new pull request #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies
dawidwys opened a new pull request #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies URL: https://github.com/apache/flink/pull/9466 ## What is the purpose of the change This PR adds examples-table to flink-dist dependencies. In https://issues.apache.org/jira/browse/FLINK-13558 we added table examples to the distribution package, but forgot to add it to the build dependencies of flink-dist. ## Verifying this change Run ``` mvn clean install -pl flink-dist -am ``` and see that the distribution contains table examples. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (**yes** / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (yes / **no** / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
[GitHub] [flink] flinkbot commented on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added examples-table to flink-dist dependencies
flinkbot commented on issue #9465: [FLINK-13737][flink-dist][bp-1.9] Added examples-table to flink-dist dependencies URL: https://github.com/apache/flink/pull/9465#issuecomment-522026202 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 095ba0a98f00367052feb9472c83a292a72fa98a (Fri Aug 16 14:20:56 UTC 2019) **Warnings:** * **1 pom.xml files were touched**: Check for build and licensing issues. * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[GitHub] [flink] rmetzger commented on issue #9444: [FLINK-13726][docs] build docs with jekyll 4.0.0.pre.beta1
rmetzger commented on issue #9444: [FLINK-13726][docs] build docs with jekyll 4.0.0.pre.beta1 URL: https://github.com/apache/flink/pull/9444#issuecomment-521968497 I've checked out your PR to test that the Docker setup still works. Are these build times faster than before? ``` Regenerating: 1 file(s) changed at 2019-08-16 09:53:39 release-notes/flink-1.8.md ...done in 108.7836321 seconds. Regenerating: 1 file(s) changed at 2019-08-16 09:55:50 release-notes/flink-1.8.md ...done in 85.8082178 seconds. ```
[jira] [Closed] (FLINK-13585) Test TaskAsyncCallTest#testSetsUserCodeClassLoader() deadlocks
[ https://issues.apache.org/jira/browse/FLINK-13585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Biao Liu closed FLINK-13585. > Test TaskAsyncCallTest#testSetsUserCodeClassLoader() deadlocks > -- > > Key: FLINK-13585 > URL: https://issues.apache.org/jira/browse/FLINK-13585 > Project: Flink > Issue Type: Bug > Components: Runtime / Task, Tests >Affects Versions: 1.10.0 >Reporter: Gary Yao >Assignee: Biao Liu >Priority: Critical > Labels: test-stability > Fix For: 1.10.0, 1.9.1 > > Attachments: log.txt > > > {{TaskAsyncCallTest#testSetsUserCodeClassLoader() deadlocks}} sporadically. > Commit 1ad16bc252f1d3502a29ddb2081fdfdf3436cc55. > *Stacktrace:* > {noformat} > "main" #1 prio=5 os_prio=0 tid=0x7f5e6000b800 nid=0x49bd in Object.wait() > [0x7f5e67e0b000] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0x8f450af8> (a java.lang.Object) > at java.lang.Object.wait(Object.java:502) > at > org.apache.flink.core.testutils.OneShotLatch.await(OneShotLatch.java:63) > - locked <0x8f450af8> (a java.lang.Object) > at > org.apache.flink.runtime.taskmanager.TaskAsyncCallTest.testSetsUserCodeClassLoader(TaskAsyncCallTest.java:201) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[GitHub] [flink] flinkbot edited a comment on issue #9433: [FLINK-13708] [table-planner-blink] transformations should be cleared after execution in blink planner
flinkbot edited a comment on issue #9433: [FLINK-13708] [table-planner-blink] transformations should be cleared after execution in blink planner URL: https://github.com/apache/flink/pull/9433#issuecomment-521131546 ## CI report: * 22d047614613c293a7aca416268449b3cabcad6a : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123164756) * 255e8d57f2eabf7fbfeefe73f10287493e8a5c2d : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123375768) * aacac7867ac81946a8e4427334e91c65c0d3e08f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123451412) * e68d7394eaba76a806020b12bf4d3ea61cb4f8f3 : SUCCESS [Build](https://travis-ci.com/flink-ci/flink/builds/123482934)
[jira] [Commented] (FLINK-9941) Flush in ScalaCsvOutputFormat before close method
[ https://issues.apache.org/jira/browse/FLINK-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909006#comment-16909006 ] Robert Metzger commented on FLINK-9941: --- [~wind_ljy] why did you close the Jira ticket? > Flush in ScalaCsvOutputFormat before close method > - > > Key: FLINK-9941 > URL: https://issues.apache.org/jira/browse/FLINK-9941 > Project: Flink > Issue Type: Improvement > Components: API / Scala >Affects Versions: 1.5.1 >Reporter: Ryan Tao >Assignee: Jiayi Liao >Priority: Major > Labels: pull-request-available > > Because not every stream's close method will flush, in order to ensure the > stability of continuous integration, we need to manually call flush() before > close(). > I noticed that CsvOutputFormat (Java API) has done this. As follows. > {code:java} > //CsvOutputFormat > public void close() throws IOException { > if (wrt != null) { > this.wrt.flush(); > this.wrt.close(); > } > super.close(); > } > {code} >
[jira] [Updated] (FLINK-9941) Flush in ScalaCsvOutputFormat before close method
[ https://issues.apache.org/jira/browse/FLINK-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Tao updated FLINK-9941: Affects Version/s: 1.6.4 1.7.2 > Flush in ScalaCsvOutputFormat before close method > - > > Key: FLINK-9941 > URL: https://issues.apache.org/jira/browse/FLINK-9941 > Project: Flink > Issue Type: Improvement > Components: API / Scala >Affects Versions: 1.5.1, 1.6.4, 1.7.2, 1.8.1 >Reporter: Ryan Tao >Assignee: Jiayi Liao >Priority: Major > Labels: pull-request-available > > Because not every stream's close method will flush, in order to ensure the > stability of continuous integration, we need to manually call flush() before > close(). > I noticed that CsvOutputFormat (Java API) has done this. As follows. > {code:java} > //CsvOutputFormat > public void close() throws IOException { > if (wrt != null) { > this.wrt.flush(); > this.wrt.close(); > } > super.close(); > } > {code} > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
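The flush-before-close pattern quoted above can be sketched standalone. A minimal, hypothetical helper (`SafeCloser` is an illustrative name, not a Flink class) mirroring `CsvOutputFormat#close()`: flush explicitly before closing, since not every `Writer` implementation is guaranteed to flush its buffer on close (the JDK's `BufferedWriter` happens to, but the explicit guard costs nothing):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Illustrative sketch (not Flink code): mirror CsvOutputFormat's close()
// by flushing explicitly before close, since not every Writer is
// guaranteed to flush its buffer when closed.
public class SafeCloser {

    static void closeSafely(Writer wrt) throws IOException {
        if (wrt != null) {
            wrt.flush(); // push buffered records to the underlying stream
            wrt.close();
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        BufferedWriter wrt = new BufferedWriter(sink);
        wrt.write("a,b,c"); // still sitting in the BufferedWriter's buffer
        closeSafely(wrt);   // flush guarantees the data reaches the sink
        if (!"a,b,c".equals(sink.toString())) {
            throw new AssertionError(sink.toString());
        }
        System.out.println(sink.toString()); // a,b,c
    }
}
```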
[jira] [Commented] (FLINK-9900) Failed to testRestoreBehaviourWithFaultyStateHandles (org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase)
[ https://issues.apache.org/jira/browse/FLINK-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909013#comment-16909013 ] Till Rohrmann commented on FLINK-9900: -- Great to hear [~SleePy]. I'll take a look at your PR. > Failed to testRestoreBehaviourWithFaultyStateHandles > (org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase) > --- > > Key: FLINK-9900 > URL: https://issues.apache.org/jira/browse/FLINK-9900 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Tests >Affects Versions: 1.5.1, 1.6.0, 1.9.0 >Reporter: zhangminglei >Assignee: Biao Liu >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.10.0, 1.9.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > https://api.travis-ci.org/v3/job/405843617/log.txt > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 124.598 sec > <<< FAILURE! - in > org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase > > testRestoreBehaviourWithFaultyStateHandles(org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase) > Time elapsed: 120.036 sec <<< ERROR! 
> org.junit.runners.model.TestTimedOutException: test timed out after 12 > milliseconds > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693) > at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323) > at > java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729) > at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) > at > org.apache.flink.test.checkpointing.ZooKeeperHighAvailabilityITCase.testRestoreBehaviourWithFaultyStateHandles(ZooKeeperHighAvailabilityITCase.java:244) > Results : > Tests in error: > > ZooKeeperHighAvailabilityITCase.testRestoreBehaviourWithFaultyStateHandles:244 > » TestTimedOut > Tests run: 1453, Failures: 0, Errors: 1, Skipped: 29 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (FLINK-13406) MetricConfig.getInteger() always returns null
[ https://issues.apache.org/jira/browse/FLINK-13406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909024#comment-16909024 ] Chesnay Schepler commented on FLINK-13406: -- This is not really a new option; I did cover changes to the Configuration that affect how the MetricConfig is assembled in my last comment. > MetricConfig.getInteger() always returns null > - > > Key: FLINK-13406 > URL: https://issues.apache.org/jira/browse/FLINK-13406 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics >Affects Versions: 1.6.3 >Reporter: Ori Popowski >Priority: Minor > > {{MetricConfig}}'s {{getInteger}} will always return the default value. > The reason is that it delegates to Java's {{Properties.getProperty}}, which > returns null if the type of the value is not {{String}}. > h3. Reproduce > # Create a class {{MyReporter}} implementing {{MetricReporter}} > # Implement the {{open()}} method so that you do {{config.getInteger("foo", > null)}} > # Start an {{ExecutionEnvironment}} and give it the following > Configuration object: > {code:java} > configuration.setString("metrics.reporters", "my"); > configuration.setClass("metrics.reporter.my.class", MyReporter.class); > configuration.setInteger("metrics.reporter.my.foo", 42);{code} > # In {{open()}} the value of {{foo}} is null.
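The root cause described in the issue is easy to reproduce outside Flink. A minimal sketch (`PropertiesPitfall` is an illustrative name): `java.util.Properties.getProperty()` returns null whenever the stored value is not a `String`, so any `getInteger()` that delegates to it falls back to its default:

```java
import java.util.Properties;

// Demonstrates the root cause: Properties.getProperty() returns null
// when the stored value is not a String, so a getInteger() delegating
// to it always sees null and returns its default.
public class PropertiesPitfall {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("foo", 42); // stored as Integer, not String

        String viaGetProperty = props.getProperty("foo"); // null: not a String
        Object viaGet = props.get("foo");                 // 42: raw Hashtable access

        System.out.println(viaGetProperty); // null
        System.out.println(viaGet);         // 42
        if (viaGetProperty != null) {
            throw new AssertionError("expected null from getProperty");
        }
        if (!Integer.valueOf(42).equals(viaGet)) {
            throw new AssertionError("expected 42 from get");
        }
    }
}
```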
[jira] [Commented] (FLINK-13472) taskmanager.jvm-exit-on-oom doesn't work reliably with YARN
[ https://issues.apache.org/jira/browse/FLINK-13472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909025#comment-16909025 ] Till Rohrmann commented on FLINK-13472: --- This could be a good improvement. However, there might also be some corner cases where it does not properly work: https://bugs.openjdk.java.net/browse/JDK-8155004 > taskmanager.jvm-exit-on-oom doesn't work reliably with YARN > --- > > Key: FLINK-13472 > URL: https://issues.apache.org/jira/browse/FLINK-13472 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.6.3 >Reporter: Pawel Bartoszek >Priority: Major > > I have added the *taskmanager.jvm-exit-on-oom* flag to the task manager starting > arguments. During my testing (simulating OOM) I noticed that sometimes YARN > containers were still in RUNNING state even though they should have been > killed on OutOfMemory errors with the flag on. > I could find RUNNING containers with last log lines like this. > {code:java} > 2019-07-26 13:32:51,396 ERROR org.apache.flink.runtime.taskmanager.Task > - Encountered fatal error java.lang.OutOfMemoryError - > terminating the JVM > java.lang.OutOfMemoryError: Metaspace > at java.lang.ClassLoader.defineClass1(Native Method) > at java.lang.ClassLoader.defineClass(ClassLoader.java:763) > at > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) > at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) > at java.net.URLClassLoader.access$100(URLClassLoader.java:74) > at java.net.URLClassLoader$1.run(URLClassLoader.java:369){code} > > Does YARN make it tricky to forcefully kill the JVM after an OutOfMemory error? > > *Workaround* > > When using the -XX:+ExitOnOutOfMemoryError JVM flag, containers always get > terminated!
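As a hedged illustration of the workaround mentioned in the report, the JVM flag can be passed to the task manager JVMs via Flink's `env.java.opts` setting in `flink-conf.yaml` (verify the exact key against your Flink version):

```yaml
# flink-conf.yaml: kill the JVM immediately on any OutOfMemoryError,
# rather than relying solely on taskmanager.jvm-exit-on-oom.
env.java.opts: "-XX:+ExitOnOutOfMemoryError"
```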
[GitHub] [flink] flinkbot edited a comment on issue #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations.
flinkbot edited a comment on issue #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations. URL: https://github.com/apache/flink/pull/8631#issuecomment-517015005 ## CI report: * 98f0cec3deff65ebe316b8d3c13b51470d079b65 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/121481703) * 159994a4bc63a67609ede71a6465c8c85db4d3a3 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/121716307) * 464c2960c6de4bca741d937cadacea0a44541fe6 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/122419396) * 4ee4e5fc9e29c75274dec332f2df2ff35bd7c208 : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/122442614) * c23d8ac2646f0b1f153d0dfb2950c53830838696 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/122445831) * 08cb1e6d6832e3bce5273831494542e39c9d56fd : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/122751123) * b3430bab7f70c17284f1db1245e75a0aa27184ed : PENDING [Build](https://travis-ci.com/flink-ci/flink/builds/123504908) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-13418) Avoid InfluxdbReporter to report unnecessary tags
[ https://issues.apache.org/jira/browse/FLINK-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909034#comment-16909034 ] Yun Tang commented on FLINK-13418: -- [~Zentol], the comparison of these two concepts is listed below, summarized from the [influxdb official doc|https://docs.influxdata.com/influxdb/v1.7/concepts/key_concepts/#field-set]: ||concept||required||indexed||counts toward series|| |field|required|not indexed|no| |tag|optional|indexed|yes| Take our {{numBytesInRemotePerSecond}} input channel metric as an example: it is a task-scoped metric with the default naming {{.taskmanager}}. If we want to follow the bytes-in throughput of specific tasks, we need to query with a specific {{task_name}} and {{subtask_index}}. As the [doc|https://docs.influxdata.com/influxdb/v1.7/concepts/key_concepts/#tag-set] says, if {{task_name}} and {{subtask_index}} were fields instead of tags, InfluxDB would have to scan every value of {{task_name}} and {{subtask_index}}; setting them as tags instead optimizes query performance, because tags are indexed in InfluxDB. The value of {{numBytesInRemotePerSecond}}, on the other hand, is appropriate as a field, since in most cases we would not query on conditions over its values. However, in InfluxDB a {{series}} is the collection of data that share a retention policy, measurement, and tag set. In other words, the more tags, the more series. The series count impacts overall index performance, especially with the default in-memory index, and the default limit on series per database is only one million. If we included {{task_attempt_id}} and other unnecessary tags, the total series count would increase dramatically, since {{task_attempt_id}} in particular is a high-cardinality set. That's why [~TheoD] came across the huge memory usage of InfluxDB.
> Avoid InfluxdbReporter to report unnecessary tags > - > > Key: FLINK-13418 > URL: https://issues.apache.org/jira/browse/FLINK-13418 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics >Reporter: Yun Tang >Priority: Major > Fix For: 1.10.0 > > > Currently, when building measurement info within {{InfluxdbReporter}}, it > involves all variables as tags (please see the code > [here|https://github.com/apache/flink/blob/d57741cef9d4773cc487418baa961254d0d47524/flink-metrics/flink-metrics-influxdb/src/main/java/org/apache/flink/metrics/influxdb/MeasurementInfoProvider.java#L54]). > However, while users can adjust their own scope format to drop unnecessary > scopes, {{InfluxdbReporter}} still reports all the scopes as tags to > InfluxDB. > This is because the current {{MetricGroup}} lacks any method to get only the necessary > scopes; it offers only {{#getScopeComponents()}} or {{#getAllVariables()}}. In other > words, InfluxDB needs tag keys and tag values to compose its tags, while we > can only get all variables (without any filtering according to the scope format) or > only the scope components (which could be treated as tag values). I think that's why the > previous implementation had to report all tags. > From our experience with InfluxDB, since the number of tags contributes to the > overall series count, it is never a good idea to include too many > tags, not to mention that the [default limit for series per > database|https://docs.influxdata.com/influxdb/v1.7/troubleshooting/errors/#error-max-series-per-database-exceeded] > is only one million.
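The cardinality argument above can be made concrete with a back-of-the-envelope sketch (all numbers below are hypothetical): each distinct tag set is one InfluxDB series, so the series count is roughly the product of the distinct values per tag, and one high-cardinality tag like `task_attempt_id` multiplies it:

```java
// Back-of-the-envelope sketch of InfluxDB series cardinality: each
// distinct tag-set is one series, so the count is roughly the product
// of distinct values per tag. All numbers below are hypothetical.
public class SeriesCardinality {
    static long seriesCount(long... distinctValuesPerTag) {
        long total = 1;
        for (long n : distinctValuesPerTag) {
            total *= n;
        }
        return total;
    }

    public static void main(String[] args) {
        // task_name x subtask_index: manageable
        long base = seriesCount(50, 128);
        // adding task_attempt_id (a fresh value per restart) multiplies it
        long withAttemptId = seriesCount(50, 128, 1000);

        System.out.println(base);          // 6400
        System.out.println(withAttemptId); // 6400000
        if (base != 6400L || withAttemptId != 6400000L) {
            throw new AssertionError();
        }
    }
}
```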
[GitHub] [flink] xuyang1706 commented on a change in pull request #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations.
xuyang1706 commented on a change in pull request #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations. URL: https://github.com/apache/flink/pull/8631#discussion_r314709236 ## File path: flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/linalg/Vector.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.flink.ml.common.linalg; + +import org.apache.commons.lang3.StringUtils; + +import java.io.Serializable; + +/** + * The Vector class defines some common methods for both DenseVector and + * SparseVector. + */ +public abstract class Vector implements Serializable { + /** +* Get the size of the vector. +*/ + public abstract int size(); + + /** +* Get the i-th element of the vector. +*/ + public abstract double get(int i); + + /** +* Set the i-th element of the vector to value "val". +*/ + public abstract void set(int i, double val); + + /** +* Add the i-th element of the vector by value "val". +*/ + public abstract void add(int i, double val); + + /** +* Return the L1 norm of the vector. +*/ + public abstract double normL1(); + + /** +* Return the Inf norm of the vector. 
+*/ + public abstract double normInf(); + + /** +* Return the L2 norm of the vector. +*/ + public abstract double normL2(); + + /** +* Return the square of L2 norm of the vector. +*/ + public abstract double normL2Square(); + + /** +* Scale the vector by value "v" and create a new vector to store the result. +*/ + public abstract Vector scale(double v); + + /** +* Scale the vector by value "v". +*/ + public abstract void scaleEqual(double v); + + /** +* Normalize the vector. +*/ + public abstract void normalizeEqual(double p); + + /** +* Standardize the vector. +*/ + public abstract void standardizeEqual(double mean, double stdvar); + + /** +* Create a new vector by adding an element to the head of the vector. +*/ + public abstract Vector prefix(double v); + + /** +* Create a new vector by adding an element to the end of the vector. +*/ + public abstract Vector append(double v); + + /** +* Create a new vector by plussing another vector. +*/ + public abstract Vector plus(Vector vec); + + /** +* Create a new vector by subtracting another vector. +*/ + public abstract Vector minus(Vector vec); + + /** +* Compute the dot product with another vector. +*/ + public abstract double dot(Vector vec); + + /** +* Get the iterator of the vector. +*/ + public abstract VectorIterator iterator(); + + /** +* Serialize the vector to a string. +*/ + public abstract String serialize(); + + /** +* Slice the vector. +*/ + public abstract Vector slice(int[] indexes); + + /** +* Compute the outer product with itself. +* +* @return The outer product matrix. +*/ + public abstract DenseMatrix outer(); + + /** +* Parse either a {@link SparseVector} or a {@link DenseVector} from a formatted string. +* +* The format of a dense vector is comma separated values such as "1 2 3 4". +* The format of a sparse vector is comma separated index-value pairs, such as "0:1 2:3 3:4". +* If the sparse vector has determined vector size, the size is prepended to the head. 
For example, +* the string "$4$0:1 2:3 3:4" represents a sparse vector with size 4. +* +* @param str A formatted string representing a vector. +* @return The parsed vector. +*/ + public static Vector parse(String str) { + boolean isSparse = org.apache.flink.util.StringUtils.isNullOrWhitespaceOnly(str) +
[GitHub] [flink] xuyang1706 commented on a change in pull request #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations.
xuyang1706 commented on a change in pull request #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations. URL: https://github.com/apache/flink/pull/8631#discussion_r314708905 ## File path: flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/linalg/DenseVector.java ## @@ -0,0 +1,463 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.flink.ml.common.linalg; + +import org.apache.commons.lang3.StringUtils; + +import java.util.Arrays; +import java.util.Random; + +/** + * A dense vector represented by a values array. + */ +public class DenseVector extends Vector { + /** +* The array holding the vector data. +* +* Package private to allow access from {@link MatVecOp} and {@link BLAS}. +*/ + double[] data; + + /** +* Create a zero size vector. +*/ + public DenseVector() { + this(0); + } + + /** +* Create a size n vector with all elements zero. +* +* @param n Size of the vector. +*/ + public DenseVector(int n) { + this.data = new double[n]; + } + + /** +* Create a dense vector with the user provided data. +* +* @param data The vector data. 
+*/ + public DenseVector(double[] data) { + this.data = data; + } + + /** +* Get the data array. +*/ + public double[] getData() { + return this.data; + } + + /** +* Set the data array. +*/ + public void setData(double[] data) { + this.data = data; + } + + /** +* Create a dense vector with all elements one. +* +* @param n Size of the vector. +* @return The newly created dense vector. +*/ + public static DenseVector ones(int n) { + DenseVector r = new DenseVector(n); + Arrays.fill(r.data, 1.0); + return r; + } + + /** +* Create a dense vector with all elements zero. +* +* @param n Size of the vector. +* @return The newly created dense vector. +*/ + public static DenseVector zeros(int n) { + DenseVector r = new DenseVector(n); + Arrays.fill(r.data, 0.0); + return r; + } + + /** +* Create a dense vector with random values uniformly distributed in the range of [0.0, 1.0]. +* +* @param n Size of the vector. +* @return The newly created dense vector. +*/ + public static DenseVector rand(int n) { + Random random = new Random(); + DenseVector v = new DenseVector(n); + for (int i = 0; i < n; i++) { + v.data[i] = random.nextDouble(); + } + return v; + } + + /** +* Delimiter between elements. +*/ + private static final char ELEMENT_DELIMITER = ' '; + + @Override + public String serialize() { + StringBuilder sbd = new StringBuilder(); + + for (int i = 0; i < data.length; i++) { + sbd.append(data[i]); + if (i < data.length - 1) { + sbd.append(ELEMENT_DELIMITER); + } + } + return sbd.toString(); + } + + /** +* Parse the dense vector from a formatted string. +* +* The format of a dense vector is comma separated values such as "1 2 3 4". +* +* @param str A string of space separated values. +* @return The parsed vector. 
+*/ + public static DenseVector deserialize(String str) { + if (org.apache.flink.util.StringUtils.isNullOrWhitespaceOnly(str)) { + return new DenseVector(); + } + + int len = str.length(); + + int inDataBuffPos = 0; + boolean isInBuff = false; + + for (int i = 0; i < len; ++i) { +
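The `serialize()`/`deserialize()` pair under review above uses a space delimiter between elements. A self-contained sketch of the same round trip (illustrative only; the actual `DenseVector.deserialize` parses character-by-character rather than via `split`):

```java
import java.util.Arrays;

// Standalone sketch of the space-delimited round trip implemented by
// DenseVector.serialize()/deserialize() in the PR above (illustrative,
// not the Flink implementation).
public class DenseVectorFormat {
    static String serialize(double[] data) {
        StringBuilder sbd = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            sbd.append(data[i]);
            if (i < data.length - 1) {
                sbd.append(' '); // element delimiter
            }
        }
        return sbd.toString();
    }

    static double[] deserialize(String str) {
        if (str == null || str.trim().isEmpty()) {
            return new double[0]; // mirrors returning a zero-size vector
        }
        String[] parts = str.trim().split("\\s+");
        double[] data = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            data[i] = Double.parseDouble(parts[i]);
        }
        return data;
    }

    public static void main(String[] args) {
        double[] v = {1.0, 2.0, 3.0};
        String s = serialize(v);
        System.out.println(s); // 1.0 2.0 3.0
        if (!Arrays.equals(v, deserialize(s))) {
            throw new AssertionError();
        }
    }
}
```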
[GitHub] [flink] xuyang1706 commented on a change in pull request #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations.
xuyang1706 commented on a change in pull request #8631: [FLINK-12745][ml] add sparse and dense vector class, and dense matrix class with basic operations. URL: https://github.com/apache/flink/pull/8631#discussion_r314709076 ## File path: flink-ml-parent/flink-ml-lib/src/main/java/org/apache/flink/ml/common/linalg/Vector.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.flink.ml.common.linalg; + +import org.apache.commons.lang3.StringUtils; + +import java.io.Serializable; + +/** + * The Vector class defines some common methods for both DenseVector and + * SparseVector. + */ +public abstract class Vector implements Serializable { + /** +* Get the size of the vector. +*/ + public abstract int size(); + + /** +* Get the i-th element of the vector. +*/ + public abstract double get(int i); + + /** +* Set the i-th element of the vector to value "val". +*/ + public abstract void set(int i, double val); + + /** +* Add the i-th element of the vector by value "val". +*/ + public abstract void add(int i, double val); + + /** +* Return the L1 norm of the vector. +*/ + public abstract double normL1(); + + /** +* Return the Inf norm of the vector. 
+*/ + public abstract double normInf(); + + /** +* Return the L2 norm of the vector. +*/ + public abstract double normL2(); + + /** +* Return the square of L2 norm of the vector. +*/ + public abstract double normL2Square(); + + /** +* Scale the vector by value "v" and create a new vector to store the result. +*/ + public abstract Vector scale(double v); + + /** +* Scale the vector by value "v". +*/ + public abstract void scaleEqual(double v); + + /** +* Normalize the vector. +*/ + public abstract void normalizeEqual(double p); + + /** +* Standardize the vector. +*/ + public abstract void standardizeEqual(double mean, double stdvar); + + /** +* Create a new vector by adding an element to the head of the vector. +*/ + public abstract Vector prefix(double v); + + /** +* Create a new vector by adding an element to the end of the vector. +*/ + public abstract Vector append(double v); + + /** +* Create a new vector by plussing another vector. +*/ + public abstract Vector plus(Vector vec); + + /** +* Create a new vector by subtracting another vector. +*/ + public abstract Vector minus(Vector vec); + + /** +* Compute the dot product with another vector. +*/ + public abstract double dot(Vector vec); + + /** +* Get the iterator of the vector. +*/ + public abstract VectorIterator iterator(); + + /** +* Serialize the vector to a string. +*/ + public abstract String serialize(); + + /** +* Slice the vector. +*/ + public abstract Vector slice(int[] indexes); + + /** +* Compute the outer product with itself. +* +* @return The outer product matrix. +*/ + public abstract DenseMatrix outer(); + + /** +* Parse either a {@link SparseVector} or a {@link DenseVector} from a formatted string. +* +* The format of a dense vector is comma separated values such as "1 2 3 4". Review comment: Thanks. It has been fixed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
[GitHub] [flink] pnowojski merged pull request #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager
pnowojski merged pull request #8471: [FLINK-12529][runtime] Release record-deserializer buffers timely to improve the efficiency of heap usage on taskmanager URL: https://github.com/apache/flink/pull/8471
[GitHub] [flink] flinkbot edited a comment on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close
flinkbot edited a comment on issue #6411: [FLINK-9941][ScalaAPI] Flush in ScalaCsvOutputFormat before close URL: https://github.com/apache/flink/pull/6411#issuecomment-521980643 ## CI report: * 95a9b60b1ece7d248755d92868e682c4ee0fd334 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123498093)
[GitHub] [flink] danny0405 opened a new pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 opened a new pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366 ## What is the purpose of the change Add doc for all the supported DDLs ## Brief change log - Add ddl.md in table dir. - Add links to this page
[GitHub] [flink] danny0405 commented on issue #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on issue #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#issuecomment-522010412 Oops, I mis-clicked, sorry for that ...
[GitHub] [flink] flinkbot edited a comment on issue #9464: [FLINK-13751][ml] Add Built-in vector types
flinkbot edited a comment on issue #9464: [FLINK-13751][ml] Add Built-in vector types URL: https://github.com/apache/flink/pull/9464#issuecomment-52160 ## CI report: * 2d53d67c87d99ebff93d37a732dd50fc100a166e : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/123506707) * b251fc3fa90e7ee381440b2263d514fd29bba008 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123507726)
[jira] [Commented] (FLINK-13500) RestClusterClient requires S3 access when HA is configured
[ https://issues.apache.org/jira/browse/FLINK-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909071#comment-16909071 ] David Judd commented on FLINK-13500: Ok, that's good to know. Thanks for the followup! This isn't a problem for me in the short term - I just gave the client box the needed permissions. I think it would be useful to describe in the docs though. (Also FWIW I think you linked the wrong ticket.) > RestClusterClient requires S3 access when HA is configured > -- > > Key: FLINK-13500 > URL: https://issues.apache.org/jira/browse/FLINK-13500 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Runtime / REST >Affects Versions: 1.8.1 >Reporter: David Judd >Priority: Major > > RestClusterClient initialization calls ClusterClient initialization, which > calls > org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices > In turn, createHighAvailabilityServices calls > BlobUtils.createBlobStoreFromConfig, which in our case tries to talk to S3. > It seems very surprising to me that (a) RestClusterClient needs any form of > access other than to the REST API, and (b) that client initialization would > attempt a write as a side effect. I do not see either of these surprising > facts described in the documentation–are they intentional? -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (FLINK-12514) Refactor the failure checkpoint counting mechanism with ordered checkpoint id
[ https://issues.apache.org/jira/browse/FLINK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-12514: --- Affects Version/s: 1.9.0 > Refactor the failure checkpoint counting mechanism with ordered checkpoint id > - > > Key: FLINK-12514 > URL: https://issues.apache.org/jira/browse/FLINK-12514 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Affects Versions: 1.9.0 >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, the checkpoint failure manager uses a simple counting mechanism > which does not track the checkpoint id sequence. > However, a more graceful counting mechanism would be based on the ordered checkpoint id > sequence. > It should be refactored after FLINK-12364 has been merged.
[jira] [Commented] (FLINK-12514) Refactor the failure checkpoint counting mechanism with ordered checkpoint id
[ https://issues.apache.org/jira/browse/FLINK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909072#comment-16909072 ] Piotr Nowojski commented on FLINK-12514: I've briefly checked your PR and it's adding even more locking. I think whatever we do about this feature, it has to be done after the FLINK-13698 refactoring and after fixing bugs like FLINK-13497 caused by FLINK-12364. Can you also add a better description to this ticket of what this change is about (what it is trying to fix/improve)? Also, I think the title is misleading, as this is not a refactoring but a new feature. > Refactor the failure checkpoint counting mechanism with ordered checkpoint id > - > > Key: FLINK-12514 > URL: https://issues.apache.org/jira/browse/FLINK-12514 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing >Reporter: vinoyang >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Currently, the checkpoint failure manager uses a simple counting mechanism > which does not track the checkpoint id sequence. > However, a more graceful counting mechanism would be based on the ordered checkpoint id > sequence. > It should be refactored after FLINK-12364 has been merged.
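As a hedged sketch of what "ordered checkpoint id" counting could mean here (hypothetical, not Flink's actual CheckpointFailureManager): track the consecutive-failure streak relative to the newest succeeded checkpoint id, so a late failure report for an already-superseded checkpoint does not inflate the count:

```java
// Hypothetical sketch only: count consecutive checkpoint failures using
// the ordered checkpoint id, so that a success with a newer id resets
// the streak and stale failure reports for older ids are ignored.
// Not Flink's actual CheckpointFailureManager.
public class FailureStreak {
    private long lastSucceededId = -1;
    private int consecutiveFailures = 0;

    void onCheckpointFailure(long checkpointId) {
        if (checkpointId > lastSucceededId) {
            consecutiveFailures++; // only count failures newer than last success
        }
    }

    void onCheckpointSuccess(long checkpointId) {
        if (checkpointId > lastSucceededId) {
            lastSucceededId = checkpointId;
            consecutiveFailures = 0; // newer success resets the streak
        }
    }

    int failures() {
        return consecutiveFailures;
    }

    public static void main(String[] args) {
        FailureStreak s = new FailureStreak();
        s.onCheckpointFailure(1);
        s.onCheckpointFailure(2);
        s.onCheckpointSuccess(3);
        s.onCheckpointFailure(2); // stale report for an older id: ignored
        System.out.println(s.failures()); // 0
        if (s.failures() != 0) {
            throw new AssertionError();
        }
    }
}
```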
[GitHub] [flink] pnowojski commented on issue #8871: [FLINK-12976] Bump Kafka client version to 2.3.0 for universal Kafka connector
pnowojski commented on issue #8871: [FLINK-12976] Bump Kafka client version to 2.3.0 for universal Kafka connector URL: https://github.com/apache/flink/pull/8871#issuecomment-522019662 I would suggest to ask @becketqin about this as I'm unfortunately busy with runtime features :( This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-13143) Refactor CheckpointExceptionHandler relevant classes
[ https://issues.apache.org/jira/browse/FLINK-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909086#comment-16909086 ] vinoyang commented on FLINK-13143: -- [~pnowojski] you are right. The {{CheckpointExceptionHandler}} and {{CheckpointExceptionHandlerFactory}} are legacy classes. They have not been necessary since FLINK-11662; we can remove them to make the related code cleaner. cc [~till.rohrmann] > Refactor CheckpointExceptionHandler relevant classes > > > Key: FLINK-13143 > URL: https://issues.apache.org/jira/browse/FLINK-13143 > Project: Flink > Issue Type: Improvement > Components: Runtime / Checkpointing > Affects Versions: 1.9.0 > Reporter: vinoyang > Assignee: vinoyang > Priority: Major > > Since FLINK-11662 has been merged, we can clean up the > {{CheckpointExceptionHandler}} relevant classes. > {{CheckpointExceptionHandler}} used to implement > {{setFailOnCheckpointingErrors}}. Now, it has only one implementation, which > is {{DecliningCheckpointExceptionHandler}}.
[GitHub] [flink] flinkbot commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies
flinkbot commented on issue #9466: [FLINK-13737][flink-dist] Added examples-table to flink-dist dependencies URL: https://github.com/apache/flink/pull/9466#issuecomment-522026797 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit aec1d92adaaf5fd75eb673d23c51c570c2425587 (Fri Aug 16 14:22:29 UTC 2019) **Warnings:** * **1 pom.xml file was touched**: Check for build and licensing issues. * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier
[jira] [Issue Comment Deleted] (FLINK-13143) Refactor CheckpointExceptionHandler relevant classes
[ https://issues.apache.org/jira/browse/FLINK-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang updated FLINK-13143: - Comment: was deleted (was: [~pnowojski] you are right. The {{CheckpointExceptionHandler}} and {{CheckpointExceptionHandlerFactory}} are legacy classes. They are not necessary since FLINK-11662, we can remove them, let the related code more clean. cc [~till.rohrmann])
[jira] [Commented] (FLINK-12514) Refactor the failure checkpoint counting mechanism with ordered checkpoint id
[ https://issues.apache.org/jira/browse/FLINK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909089#comment-16909089 ] vinoyang commented on FLINK-12514: -- [~pnowojski] I will give a more detailed description. I called this a refactor because I had already implemented a simple counting mechanism based on {{AtomicInteger}}; the idea comes from the PR for FLINK-12364, where Stefan proposed it. In any case, I totally agree with your comment and will rework the title and description.
[GitHub] [flink] flinkbot edited a comment on issue #9072: [FLINK-11630] Wait for the termination of all running Tasks when shutting down TaskExecutor
flinkbot edited a comment on issue #9072: [FLINK-11630] Wait for the termination of all running Tasks when shutting down TaskExecutor URL: https://github.com/apache/flink/pull/9072#issuecomment-512425985 ## CI report: * cd5ad8d23046c1025f7f9865e60fc3d048fd1f85 : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/119513431) * 9b6f8121f909e75b1230bb0e220e8b5ac534a5ff : CANCELED [Build](https://travis-ci.com/flink-ci/flink/builds/123389936) * ad8ea8a540ba4385548c1eee37fab5d3970d3daa : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123393714) * 56241d01bcc82692a8ceb3add3117fe1f8cbb58e : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123410346) * 581be904a2775cf6f42013afe0ef6fbf35658b6b : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123478228) * 181a7e4a5016f171ced04468996956503564c64f : FAILURE [Build](https://travis-ci.com/flink-ci/flink/builds/123510415)
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314662058 ## File path: docs/dev/table/connect.md ## @@ -276,6 +276,40 @@ tables: type: VARCHAR {% endhighlight %} + + +{% highlight sql %} +-- CREATE a 010 version Kafka table start from the earliest offset(as table source) and append mode(as table sink). +create table MyUserTable ( + user bigint, Review comment: Yep, thanks.
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314664840 ## File path: docs/dev/table/connect.md ## @@ -900,6 +1026,78 @@ connector: connection-path-prefix: "/v1" # optional: prefix string to be added to every REST communication {% endhighlight %} + + +{% highlight sql %} +-- CREATE a version 6 Elasticsearch table. +create table MyUserTable ( + user bigint, + message string, + ts string +) with ( + 'connector.type' = 'elasticsearch', -- required: specify this table type is elasticsearch + + 'connector.version' = '6', -- required: valid connector versions are "6" + + 'format.type' = 'json', -- required: specify which format to deserialize(as table source) + + 'connector.hosts.0.hostname' = 'host_name', -- required: one or more Elasticsearch hosts to connect to + 'connector.hosts.0.port' = '9092', + 'connector.hosts.0.protocol' = 'http', + + 'connector.index' = 'MyUsers', -- required: Elasticsearch index + + 'connector.document-type' = 'user', -- required: Elasticsearch document type + + 'update-mode' = 'append',-- optional: update mode when used as table sink, + -- only support append mode now. + + 'format.derive-schema' = 'true', -- optional: derive the serialize/deserialize format Review comment: Why ?
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314664109 ## File path: docs/dev/table/connect.md ## @@ -652,6 +686,34 @@ connector: path: "file:///path/to/whatever"# required: path to a file or directory {% endhighlight %} + + +{% highlight sql %} +-- CREATE a partitioned CSV table using the CREATE TABLE syntax. +create table csv_table ( + user bigint, + message string, + ts string +) +COMMENT 'This is a csv table.' Review comment: Okey
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314662011 ## File path: docs/dev/table/connect.md ## @@ -276,6 +276,40 @@ tables: type: VARCHAR {% endhighlight %} + + +{% highlight sql %} +-- CREATE a 010 version Kafka table start from the earliest offset(as table source) and append mode(as table sink). +create table MyUserTable ( Review comment: what's the difference ?
[GitHub] [flink] danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction
danny0405 commented on a change in pull request #9366: [FLINK-13359][docs] Add documentation for DDL introduction URL: https://github.com/apache/flink/pull/9366#discussion_r314669687 ## File path: docs/dev/table/ddl.md ## @@ -0,0 +1,119 @@ +--- +title: "DDL" +nav-parent_id: tableapi +nav-pos: 0 +--- + + +The Table API and SQL are integrated in a joint API. The central concept of this API is a `Table`, which serves as input and output of queries. This document shows all the DDL grammar Flink supports: how to register a `Table` (or view) through DDL and how to drop a `Table` (or view) through DDL. + +* This will be replaced by the TOC +{:toc} + +Create Table +--- +{% highlight sql %} +CREATE [OR REPLACE] TABLE [catalog_name.][db_name.]table_name + [(col_name1 col_type1 [COMMENT col_comment1], ...)] + [COMMENT table_comment] + [PARTITIONED BY (col_name1, col_name2, ...)] + [WITH (key1=val1, key2=val2, ...)] +{% endhighlight %} + +Create a table with the given table properties. If a table with the same name already exists in the database, an exception is thrown unless *IF NOT EXISTS* is declared. + +**OR REPLACE** + +If a table with the same name already exists in the database, replace it if this is declared. **Notes:** The OR REPLACE option is always false now. + +**PARTITIONED BY** + +Partition the created table by the specified columns. A directory is created for each partition if this table is used as a filesystem sink. + +**WITH OPTIONS** + +Table properties used to create a table source/sink. The properties are usually used to find and create the underlying connector. **Notes:** the key and value of the expression `key1=val1` should both be string literals. + +See details in [Connect to External Systems](connect.html) for all the supported table properties of different connectors. + +**Notes:** The table name can be of two formats: 1. `catalog_name.db_name.table_name` 2. `table_name`. 
For `catalog_name.db_name.table_name`, the table would be registered into metastore with catalog named "catalog_name" and database named "db_name"; for `table_name`, the table would be registered into the current catalog and database of the execution table environment. + +{% top %} + +Drop Table +--- +{% highlight sql %} +DROP TABLE [IF EXISTS] [catalog_name.][db_name.]table_name +{% endhighlight %} + +Drop a table with the given table name. If the table to drop does not exist, an exception is thrown. + +**IF EXISTS** + +If the table does not exist, nothing happens. + +{% top %} + +Create View Review comment: Removed
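To make the Create Table and Drop Table grammars under review concrete, here is an illustrative sketch combining the optional clauses. Note this is a hedged example, not a verified configuration: the catalog/database/table names are placeholders, and the `WITH` property keys (`connector.type`, `connector.path`, `format.type`) are patterned after the filesystem/CSV snippets quoted earlier in this thread.

```sql
-- Register a partitioned, CSV-backed table under an explicit catalog and
-- database, using the COMMENT, PARTITIONED BY, and WITH clauses together.
CREATE TABLE mycatalog.mydb.csv_table (
  user bigint,
  message string,
  ts string
)
COMMENT 'This is a csv table.'
PARTITIONED BY (ts)
WITH (
  'connector.type' = 'filesystem',               -- placeholder connector properties
  'connector.path' = 'file:///path/to/whatever',
  'format.type' = 'csv'
);

-- Drop it; IF EXISTS suppresses the exception when the table is missing.
DROP TABLE IF EXISTS mycatalog.mydb.csv_table;
```

Because the table name uses the `catalog_name.db_name.table_name` form, it would be registered into the named catalog and database rather than the current ones of the execution table environment.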