[jira] [Commented] (FLINK-14175) Upgrade KPL version in flink-connector-kinesis to fix application OOM

2019-10-15 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951717#comment-16951717
 ] 

Robert Metzger commented on FLINK-14175:


I would target {{master}} for the upcoming 1.10 release (and maybe the 
{{release-1.9}} branch for the next 1.9.x release).

For backporting FLINK-12847 to older versions: for bugfix releases, we usually 
only upgrade dependency versions for severe bugs or security issues.

Since connectors are only distributed via Maven repositories, you could 
consider backporting the version change yourself, and then releasing the 
connector under a different name to Maven Central? 

> Upgrade KPL version in flink-connector-kinesis to fix application OOM
> -
>
> Key: FLINK-14175
> URL: https://issues.apache.org/jira/browse/FLINK-14175
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.6.3, 1.6.4, 1.6.5, 1.7.2, 1.7.3, 1.8.0, 1.8.1, 1.8.2, 
> 1.9.0
> Environment: [link title|http://example.com][link 
> title|http://example.com]
>Reporter: Abhilasha Seth
>Priority: Major
>
> The [KPL 
> version|https://github.com/apache/flink/blob/release-1.9/flink-connectors/flink-connector-kinesis/pom.xml#L38]
>  (0.12.9) used by flink-connector-kinesis in the affected Flink versions has 
> a thread leak bug that causes applications to run out of memory after 
> frequent restarts:
> KPL Issue - [https://github.com/awslabs/amazon-kinesis-producer/issues/224]
> Fix - [https://github.com/awslabs/amazon-kinesis-producer/pull/225/files]
> Upgrading KPL to 0.12.10 or higher is necessary to avoid this issue. The 
> recommended version to upgrade to would be the latest (0.13.1).
> Note that the KPL version in Flink 1.10.0 has already been updated to the 
> latest version (0.13.1): https://issues.apache.org/jira/browse/FLINK-12847
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13880) The behavior of JobExecutionResult.getAccumulatorResult does not match its java doc

2019-10-02 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13880:
---
Component/s: Runtime / Coordination

> The behavior of JobExecutionResult.getAccumulatorResult does not match its 
> java doc
> ---
>
> Key: FLINK-13880
> URL: https://issues.apache.org/jira/browse/FLINK-13880
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Reporter: Caizhi Weng
>Priority: Minor
>
> The java doc of `JobExecutionResult.getAccumulatorResult` states that 
> "Returns \{@code null}, if no accumulator with that name was produced", but 
> actually an NPE will be triggered if no accumulator with that name is 
> produced.
> I'm going to rewrite the `getAccumulatorResult` method to the following:
> {code:java}
> public <T> T getAccumulatorResult(String accumulatorName) {
>    OptionalFailure<Object> result = 
> this.accumulatorResults.get(accumulatorName);
>if (result != null) {
>   return (T) result.getUnchecked();
>} else {
>   return null;
>}
> }
> {code}
> Please assign this issue to me if this solution is acceptable.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14169) Cleanup expired jobs from history server

2019-10-02 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14169:
---
Component/s: Runtime / Web Frontend

> Cleanup expired jobs from history server
> 
>
> Key: FLINK-14169
> URL: https://issues.apache.org/jira/browse/FLINK-14169
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Reporter: David Moravek
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Clean up jobs that are no longer in the history refresh locations during 
> JobArchiveFetcher::run.
> https://github.com/apache/flink/blob/master/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/history/HistoryServerArchiveFetcher.java#L138
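> A minimal, hypothetical sketch of the intended eviction (the real fetcher 
> would also delete the locally extracted files):
> {code:java}
> import java.util.HashSet;
> import java.util.Set;
> 
> public class ExpiredJobCleanup {
>    // After a refresh pass, evict every job id cached from a previous run
>    // that no longer exists in any monitored refresh location.
>    static Set<String> cleanup(Set<String> cachedJobIds, Set<String> jobIdsInRefreshDirs) {
>       Set<String> stillValid = new HashSet<>(cachedJobIds);
>       stillValid.retainAll(jobIdsInRefreshDirs); // keep only jobs still present upstream
>       return stillValid;
>    }
> }
> {code}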



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14198) Add type options to all flink python API doc

2019-10-02 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14198:
---
Component/s: API / Python

> Add type options to all flink python API doc
> 
>
> Key: FLINK-14198
> URL: https://issues.apache.org/jira/browse/FLINK-14198
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Python
>Reporter: Wei Zhong
>Assignee: Wei Zhong
>Priority: Minor
>
> Currently the ":type:" and ":rtype:" options in Python docstrings are fully 
> supported in 
> Sphinx ([https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html])
>  and 
> PyCharm ([https://www.jetbrains.com/help/pycharm/using-docstrings-to-specify-types.html]).
>  Sphinx will generate Python API documents with type annotations if these 
> options exist in function docstrings. PyCharm also collects the type 
> information in docstrings to detect potential coding mistakes and provide 
> autocomplete support during Python program development. There are already a 
> few interfaces in the Python API that have these options. We should add these 
> options to the rest. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14166) Reuse cache from previous history server run

2019-10-02 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14166:
---
Component/s: Runtime / Web Frontend

> Reuse cache from previous history server run
> 
>
> Key: FLINK-14166
> URL: https://issues.apache.org/jira/browse/FLINK-14166
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Reporter: David Moravek
>Priority: Minor
>
> Currently the history server is not able to reuse the cache from a previous 
> run, even when `historyserver.web.tmpdir` is set. It could simply "warm up" 
> the cached job id set from previously parsed jobs.
> https://github.com/apache/flink/blob/master/flink-runtime-web/src/main/java/org/apache/flink/runtime/webmonitor/history/HistoryServerArchiveFetcher.java#L129
> This should be configurable, so it does not break backward compatibility.
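> A hypothetical sketch of the warm-up (the directory layout is assumed purely 
> for illustration):
> {code:java}
> import java.io.File;
> import java.util.HashSet;
> import java.util.Set;
> 
> public class HistoryServerCacheWarmup {
>    // On startup, treat every job directory already present under
>    // historyserver.web.tmpdir as cached, so it is not fetched and
>    // extracted again.
>    static Set<String> warmUpCachedJobIds(File webTmpDir) {
>       Set<String> cachedJobIds = new HashSet<>();
>       File[] entries = new File(webTmpDir, "jobs").listFiles(File::isDirectory);
>       if (entries != null) {
>          for (File jobDir : entries) {
>             cachedJobIds.add(jobDir.getName()); // directory name == job id
>          }
>       }
>       return cachedJobIds;
>    }
> }
> {code}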



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13703) AvroTypeInfo requires objects to be strict POJOs (mutable, with setters)

2019-10-02 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13703:
---
Component/s: Formats (JSON, Avro, Parquet, ORC, SequenceFile)

> AvroTypeInfo requires objects to be strict POJOs (mutable, with setters)
> 
>
> Key: FLINK-13703
> URL: https://issues.apache.org/jira/browse/FLINK-13703
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Alexander Fedulov
>Priority: Minor
>
> There exists an option to generate Avro sources which represent immutable 
> objects (the `createSetters` option set to false) 
> [\[1\]|https://github.com/commercehub-oss/gradle-avro-plugin] , 
> [\[2\]|https://avro.apache.org/docs/current/api/java/org/apache/avro/mojo/AbstractAvroMojo.html].
>  Those objects still have all-arguments constructors and are handled 
> correctly by Avro. 
>  `AvroTypeInfo` in Flink performs a check to verify that a class complies 
> with the strict POJO requirements (including setters) and throws an 
> IllegalStateException("Expecting type to be a PojoTypeInfo") otherwise. Can 
> this check be relaxed to provide better immutability support?
> +Steps to reproduce:+
> 1) Generate Avro sources from schema using `createSetters` option.
> 2) Use generated class in 
> `ConfluentRegistryAvroDeserializationSchema.forSpecific(GeneratedClass.class, 
> schemaRegistryUrl)`



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13956) Add sequence file format with repeated sync blocks

2019-10-02 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13956:
---
Component/s: Formats (JSON, Avro, Parquet, ORC, SequenceFile)

> Add sequence file format with repeated sync blocks
> --
>
> Key: FLINK-13956
> URL: https://issues.apache.org/jira/browse/FLINK-13956
> Project: Flink
>  Issue Type: Improvement
>  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>Reporter: Arvid Heise
>Priority: Minor
>
> The current {{SerializedOutputFormat}} produces files that are tightly bound 
> to the block size of the filesystem. While this was a somewhat plausible 
> assumption in the old HDFS days, it can lead to [hard to debug issues in 
> other file 
> systems|https://lists.apache.org/thread.html/bdd87cbb5eb7b19ab4be6501940ec5659e8f6ce6c27ccefa2680732c@%3Cdev.flink.apache.org%3E].
> We could implement a file format similar to the current version of Hadoop's 
> SequenceFileFormat: add a sync block in-between records whenever X bytes were 
> written. Hadoop uses 2k, but I'd propose to use 1M.
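> A hypothetical sketch of such a writer (the format details are illustrative):
> {code:java}
> import java.io.DataOutputStream;
> import java.io.IOException;
> import java.nio.ByteBuffer;
> import java.util.UUID;
> 
> public class SyncBlockWriter {
>    // A random 16-byte sync marker is written whenever at least
>    // SYNC_INTERVAL bytes of records were emitted since the last marker,
>    // so readers can resynchronize at arbitrary split offsets without
>    // depending on the filesystem block size.
>    private static final long SYNC_INTERVAL = 1 << 20; // 1 MiB, as proposed above
>    private final byte[] syncMarker;
>    private long bytesSinceSync;
> 
>    public SyncBlockWriter() {
>       UUID id = UUID.randomUUID();
>       this.syncMarker = ByteBuffer.allocate(16)
>             .putLong(id.getMostSignificantBits())
>             .putLong(id.getLeastSignificantBits())
>             .array();
>    }
> 
>    void writeRecord(DataOutputStream out, byte[] record) throws IOException {
>       if (bytesSinceSync >= SYNC_INTERVAL) {
>          out.write(syncMarker); // readers scan for this marker to re-align
>          bytesSinceSync = 0;
>       }
>       out.writeInt(record.length);
>       out.write(record);
>       bytesSinceSync += 4 + record.length;
>    }
> }
> {code}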



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13710) JarListHandler always extract the jar package

2019-10-02 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13710:
---
Component/s: Runtime / Web Frontend

> JarListHandler always extract the jar package
> -
>
> Key: FLINK-13710
> URL: https://issues.apache.org/jira/browse/FLINK-13710
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Reporter: ChengWei Ye
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> // JarListHandler class
> // handleRequest method
> for (String clazz : classes) {
>clazz = clazz.trim();
>PackagedProgram program = null;
>try {
>   // here
>   program = new PackagedProgram(f, clazz, new String[0]);
>} catch (Exception ignored) {
>   // ignore jar files which throw an error upon creating a PackagedProgram
>}
>if (program != null) {
>   JarListInfo.JarEntryInfo jarEntryInfo = new 
> JarListInfo.JarEntryInfo(clazz, program.getDescription());
>   jarEntryList.add(jarEntryInfo);
>}
> }
> {code}
> When I open the submit page of the JobManager web UI 
> ([http://localhost:7081/#/submit|http://localhost:8081/#/submit]), the 
> backend always decompresses the lib directory in the job jar package until 
> the temp directory is full.
> If the jobmanager only needs the jar information, the submit page should not 
> extract the jar package.
> I also think the same jar only needs to be decompressed once; it should not 
> be decompressed every time it is submitted.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14103) StreamTask refactoring: refine and improve exceptions passed to failExternally call

2019-10-02 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14103:
---
Component/s: Runtime / Task

> StreamTask refactoring: refine and improve exceptions passed to 
> failExternally call
> ---
>
> Key: FLINK-14103
> URL: https://issues.apache.org/jira/browse/FLINK-14103
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Alex
>Priority: Minor
>
> The documentation string for the {{AsynchronousException}} says:
> {quote}An exception for wrapping exceptions that are thrown by an operator in 
> threads other than the
> main compute thread of that operator.
> {quote}
> But the usage of the exception has diverged from the initial intent. In 
> particular, some actions that are run in the main task's thread (for example 
> via mailbox) may throw (or pass) instances of this exception.
> Also, some exceptions that are already instances of this class may be wrapped 
> in this exception again.
> It may be enough to just adjust the documentation string of the exception, or 
> to refine and improve the exceptions passed to the exception handling methods.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14184) Provide a stage listener API to be invoked per task manager

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14184:
---
Component/s: Runtime / Coordination

> Provide a stage listener API to be invoked per task manager
> ---
>
> Key: FLINK-14184
> URL: https://issues.apache.org/jira/browse/FLINK-14184
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Stephen Connolly
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Oftentimes a topology has nodes that need to use 3rd party APIs. Not every 
> 3rd party API is written in a good style for usage from within Flink.
> At present, implementing a `Rich___` will provide each stage with the 
> `open(...)` and `close()` callbacks, as the stage is accepted for execution 
> on each task manager.
> There is, however, a need for being able to listen for the first stage being 
> opened on any given task manager as well as the last stage being closed. 
> Critically the last stage being closed is the opportunity to release any 
> resources that are shared across multiple stages in the topology, e.g. 
> Database connection pools, Async HTTP Client thread pools, etc.
> Without such a clean-up hook, the connections and threads can act as GC roots 
> that prevent the topology's classloader from being unloaded and result in a 
> memory and resource leak in the task manager... never mind that if it is a 
> database connection pool, it may also be consuming resources from the 
> database.
> There are three workarounds available at present:
>  # Each stage just allocates its own resources and cleans up afterwards. This 
> is, in many ways, the ideal... however this can result in more database 
> connections than intended, e.g. as each stage that accesses the database 
> needs to have a separate database connection rather than letting the 
> whole topology share the use of one or two connections through a connection 
> pool. Similarly, if the 3rd party library uses a static singleton for the 
> whole classloader there is no way for the independent stages to know when it 
> is safe to shut down the singleton.
>  # Implement a reference counting proxy for the 3rd party API. This is a lot 
> of work, you need to ensure that deserialization of the proxy returns a 
> classloader singleton (so you can maintain the reference counts) and if the 
> count goes wrong you have leaked the resource
>  # Use a ReferenceQueue backed proxy. This is even more complex than 
> implementing reference counting, but has the advantage of not requiring the 
> count be maintained correctly. On the other hand, it does not provide for 
> eager release of the resources.
> If Flink provided a listener contract that could be registered with the 
> execution environment then this would allow the resources to be cleared out. 
> My proposed interface would look something like
> {code:java}
> public interface EnvironmentLocalTopologyListener extends Serializable {
>   /** 
>* Called immediately prior to the first {@link 
> RichFunction#open(Configuration)}
>* being invoked for the topology on the current task manager JVM for this
>* classloader. Will not be called again unless {@link #close()} has been invoked 
> first.
>* Use this method to eagerly initialize any ClassLoader scoped resources 
> that
>* are pooled across the stages of the topology.
>*
>* @param parameters // I am unsure if this makes sense
>*/ 
>   default void open(Configuration parameters) throws Exception {}
>   /**
>* Called after the last {@link RichFunction#close()} has completed and the
>* topology is effectively being stopped (for the current ClassLoader).
>* This method will only be invoked if a call to {@link 
> #open(Configuration)}
>* was attempted, and will be invoked irrespective of whether the call to
>* {@link #open(Configuration)} terminated normally or exceptionally.
>* Use this method to release any ClassLoader scoped resources that have 
> been
>* pooled across the stages of the topology.
>*/
>   default void close() throws Exception {}
>   /**
>* Decorate the threads that are used to invoke the stages of the topology.
>* Use this method, for example, to seed the {@link org.slf4j.MDC} with
>* topology specific details, e.g.
>* <pre>{@code
>* Runnable decorate(Runnable task) {
>*   return () -> {
>* try (MDC.MDCCloseable ctx = MDC.putCloseable("foo", "bar")) {
>*   task.run();
>* }
>*   };
>* }
>* }</pre>
>*
>* @param task // might not be the most appropriate type, I haven't 
>* // checked how Flink implements dispatch. May or may not
>* // want a parameters argument here.
>*/
>   default Runnable decorate(Runnable task) { return task; }
> }
> {code}

[jira] [Updated] (FLINK-14254) Introduce BatchFileSystemSink

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14254:
---
Component/s: Table SQL / Planner

> Introduce BatchFileSystemSink
> -
>
> Key: FLINK-14254
> URL: https://issues.apache.org/jira/browse/FLINK-14254
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>
> Introduce BatchFileSystemSink to support all table file system connectors 
> with partition support.
> BatchFileSystemSink uses a PartitionWriter to write (see the sketch below):
>  # DynamicPartitionWriter
>  # GroupedPartitionWriter
>  # NonPartitionWriter
> BatchFileSystemSink uses a FileCommitter to commit temporary files.
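> A hypothetical sketch of the PartitionWriter contract (names and signatures 
> are illustrative, not the actual interface):
> {code:java}
> public interface PartitionWriter<T> {
> 
>    /** Open the writer for a new transaction / temporary directory. */
>    void open() throws Exception;
> 
>    /** Write a record, routing it to the partition it belongs to. */
>    void write(T record) throws Exception;
> 
>    /** Flush and close all open partition files so they can be committed. */
>    void close() throws Exception;
> }
> {code}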
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13984) Separate on-heap and off-heap managed memory pools

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13984:
---
Component/s: Runtime / Coordination

> Separate on-heap and off-heap managed memory pools
> --
>
> Key: FLINK-13984
> URL: https://issues.apache.org/jira/browse/FLINK-13984
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Assignee: Andrey Zagrebin
>Priority: Major
>
> * Update {{MemoryManager}} to have two separated pools.
>  * Extend {{MemoryManager}} interfaces to specify which pool to allocate 
> memory from.
> Implement this step in common code paths for the legacy / new mode. For the 
> legacy mode, depending on the configured memory type, we can set one of the 
> two pools to the managed memory size and always allocate from this pool, 
> leaving the other pool empty.
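> A hypothetical sketch of the two-pool bookkeeping (the real MemoryManager 
> interfaces differ):
> {code:java}
> public class TwoPoolMemoryManager {
> 
>    public enum MemoryType { ON_HEAP, OFF_HEAP }
> 
>    private final long[] available = new long[2]; // remaining budget per pool, in bytes
> 
>    public TwoPoolMemoryManager(long onHeapBytes, long offHeapBytes) {
>       available[MemoryType.ON_HEAP.ordinal()] = onHeapBytes;
>       available[MemoryType.OFF_HEAP.ordinal()] = offHeapBytes;
>    }
> 
>    // Allocation now has to name the pool it draws from.
>    public synchronized boolean reserve(MemoryType type, long bytes) {
>       if (available[type.ordinal()] < bytes) {
>          return false; // requested pool exhausted; the other pool is untouched
>       }
>       available[type.ordinal()] -= bytes;
>       return true;
>    }
> 
>    public synchronized void release(MemoryType type, long bytes) {
>       available[type.ordinal()] += bytes;
>    }
> }
> {code}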



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13925) ClassLoader in BlobLibraryCacheManager is not using context class loader

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13925:
---
Component/s: Runtime / Coordination

> ClassLoader in BlobLibraryCacheManager is not using context class loader
> 
>
> Key: FLINK-13925
> URL: https://issues.apache.org/jira/browse/FLINK-13925
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.8.1, 1.9.0
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.3, 1.9.2
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Use the thread's current context classloader as the parent class loader of 
> Flink user code class loaders.
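> A minimal illustration of the change (the helper name is made up):
> {code:java}
> import java.net.URL;
> import java.net.URLClassLoader;
> 
> public class UserCodeClassLoaders {
>    // Parent the user-code classloader on the current thread's context
>    // classloader instead of the classloader that loaded the Flink classes.
>    static ClassLoader create(URL[] userJarUrls) {
>       ClassLoader parent = Thread.currentThread().getContextClassLoader();
>       return new URLClassLoader(userJarUrls, parent);
>    }
> }
> {code}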



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14060) Set slot sharing groups according to pipelined regions

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14060:
---
Component/s: Runtime / Coordination

> Set slot sharing groups according to pipelined regions
> --
>
> Key: FLINK-14060
> URL: https://issues.apache.org/jira/browse/FLINK-14060
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Priority: Major
>
> {{StreamingJobGraphGenerator}} sets slot sharing groups for operators at 
> compile time.
>  * Identify pipelined regions, with respect to 
> {{allSourcesInSamePipelinedRegion}}
>  * Set slot sharing groups according to pipelined regions 
>  ** By default, each pipelined region should go into a separate slot sharing 
> group
>  ** If the user sets operators from multiple pipelined regions into the same 
> slot sharing group, it should be respected
> This step should not introduce any behavior changes, given that later 
> scheduled pipelined regions can reuse slots from previously scheduled 
> pipelined regions. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14156) Execute/run processing timer triggers taking into account operator level mailbox loops

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14156:
---
Component/s: Runtime / Task

> Execute/run processing timer triggers taking into account operator level 
> mailbox loops
> --
>
> Key: FLINK-14156
> URL: https://issues.apache.org/jira/browse/FLINK-14156
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Alex
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With FLINK-12481, the timer triggers are executed by the mailbox thread and 
> passed to the mailbox with the maximum priority.
> In case of operators that use {{mailbox.yield()}} (introduced in 
> FLINK-13248), the current approach may execute timer triggers that belong to 
> an upstream operator. Such a timer trigger may potentially call 
> {{processElement|Watermark()}}, which eventually would come back to the 
> current operator. This situation may be similar to FLINK-13063.
> To avoid this, the proposal is to set the mailbox letter priority of a timer 
> trigger to the priority of the operator that the trigger belongs to.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14239) Emitting the max watermark in StreamSource#run may cause it to arrive the downstream early

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14239:
---
Component/s: API / DataStream

> Emitting the max watermark in StreamSource#run may cause it to arrive the 
> downstream early
> --
>
> Key: FLINK-14239
> URL: https://issues.apache.org/jira/browse/FLINK-14239
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Haibo Sun
>Assignee: Haibo Sun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.10.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For {{Source}}, the max watermark is emitted in {{StreamSource#run}} 
> currently. If some records are also output in {{close}} of 
> {{RichSourceFunction}}, then the max watermark will reach the downstream 
> operator before these records.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13982) Implement memory calculation logics

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13982:
---
Component/s: (was: Runtime / Task)
 Runtime / Coordination

> Implement memory calculation logics
> ---
>
> Key: FLINK-13982
> URL: https://issues.apache.org/jira/browse/FLINK-13982
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> * Introduce new configuration options
>  * Introduce data structures and utilities.
>  ** Data structure to store memory / pool sizes of task executor
>  ** Utility for calculating memory / pool sizes from configuration
>  ** Utility for generating dynamic configurations
>  ** Utility for generating JVM parameters
> This step should not introduce any behavior changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13982) Implement memory calculation logics

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13982:
---
Component/s: Runtime / Task

> Implement memory calculation logics
> ---
>
> Key: FLINK-13982
> URL: https://issues.apache.org/jira/browse/FLINK-13982
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> * Introduce new configuration options
>  * Introduce data structures and utilities.
>  ** Data structure to store memory / pool sizes of task executor
>  ** Utility for calculating memory / pool sizes from configuration
>  ** Utility for generating dynamic configurations
>  ** Utility for generating JVM parameters
> This step should not introduce any behavior changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13803) Introduce SpillableHeapKeyedStateBackend and all necessities

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13803:
---
Component/s: Runtime / State Backends

> Introduce SpillableHeapKeyedStateBackend and all necessities
> 
>
> Key: FLINK-13803
> URL: https://issues.apache.org/jira/browse/FLINK-13803
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Yu Li
>Priority: Major
>
> This JIRA aims at introducing a new {{SpillableHeapKeyedStateBackend}} which 
> will reuse most code of the {{HeapKeyedStateBackend}} (probably the only 
> difference is the spill-able one will register a {{HybridStateTable}}), and 
> allow using it in {{FsStateBackend}} and {{MemoryStateBackend}} (only as an 
> option, by default still {{HeapKeyedStateBackend}}) through configuration.
> The related necessities include but are not limited to:
> * A relative backend builder class
> * Relative restore operation classes
> * Necessary configurations for using spill-able backend
> This should be the last JIRA, after which the spill-able heap backend feature 
> will become runnable, regardless of stability and performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13983) Launch task executor with new memory calculation logics

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13983:
---
Component/s: Runtime / Task

> Launch task executor with new memory calculation logics
> ---
>
> Key: FLINK-13983
> URL: https://issues.apache.org/jira/browse/FLINK-13983
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> * Invoke data structures and utilities introduced in Step 2 to generate JVM 
> parameters and dynamic configurations for launching new task executors.
>  ** In startup scripts
>  ** In resource managers
>  * Task executor uses data structures and utilities introduced in Step 2 to 
> set memory pool sizes and slot resource profiles.
>  ** {{MemoryManager}}
>  ** {{ShuffleEnvironment}}
>  ** {{TaskSlotTable}}
> Implement this step as separate code paths only for the new mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13813) metrics is different between overview ui and metric response

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13813:
---
Component/s: Runtime / Web Frontend

> metrics is different between overview ui and metric response
> 
>
> Key: FLINK-13813
> URL: https://issues.apache.org/jira/browse/FLINK-13813
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
> Environment: Flink 1.8.1 with hdfs
>Reporter: lidesheng
>Priority: Major
> Attachments: metrics.png, overview.png
>
>
> After a Flink task is over, I get the metrics by HTTP request. The record 
> count in the response is different from the job overview.
> get http://x.x.x.x:8081/jobs/57969f3978edf3115354fab1a72fd0c8 returns:
> { "jid": "57969f3978edf3115354fab1a72fd0c8", "name": 
> "3349_cjrw_1566177018152", "isStoppable": false, "state": "FINISHED", ... 
> "vertices": [ \{ "id": "d1cdde18b91ef6ce7c6a1cfdfa9e968d", "name": "CHAIN 
> DataSource (3349_cjrw_1566177018152/SOURCE/0) -> Map 
> (3349_cjrw_1566177018152/AUDIT/0)", "parallelism": 1, "status": "FINISHED", 
> ... "metrics": { "read-bytes": 0, "read-bytes-complete": true, "write-bytes": 
> 555941888, "write-bytes-complete": true, "read-records": 0, 
> "read-records-complete": true, "write-records": 500, 
> "write-records-complete": true } ... } ... ] }}
> But, the metrics by 
> http://x.x.x.x:8081/jobs/57969f3978edf3115354fab1a72fd0c8/vertices/d1cdde18b91ef6ce7c6a1cfdfa9e968d/metrics?get=0.numRecordsOut
>  returns
> [\{"id":"0.numRecordsOut","value":"4084803"}]
> The overview record count is different from the task metrics; please see the 
> appendix.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13846) Implement benchmark case on MapState#isEmpty

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13846:
---
Component/s: Benchmarks

> Implement benchmark case on MapState#isEmpty
> 
>
> Key: FLINK-13846
> URL: https://issues.apache.org/jira/browse/FLINK-13846
> Project: Flink
>  Issue Type: Improvement
>  Components: Benchmarks
>Reporter: Yun Tang
>Priority: Major
>
> If FLINK-13034 is merged, we need to implement a benchmark case on 
> {{MapState#isEmpty}} in https://github.com/dataArtisans/flink-benchmarks



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13860) Flink Apache Kudu Connector

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13860:
---
Component/s: Connectors / Common

> Flink Apache Kudu Connector
> ---
>
> Key: FLINK-13860
> URL: https://issues.apache.org/jira/browse/FLINK-13860
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: João Boto
>Priority: Major
>
> Hi..
> I'm the contributor and maintainer of this connector on Bahir-Flink project
> [https://github.com/apache/bahir-flink/tree/master/flink-connector-kudu]
>  
> but it seems that the flink-connectors in that project are less maintained 
> and it is difficult to keep the code up to date, as PRs take a while to be 
> merged and no version has ever been released, which makes it difficult to use 
> easily
>  
> I would like to contribute that code to Flink, allowing others to contribute 
> to and use that connector
>  
> [~fhueske] what do you think?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14099) SQL supports timestamp in Long

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14099:
---
Component/s: Table SQL / Planner

> SQL supports timestamp in Long
> --
>
> Key: FLINK-14099
> URL: https://issues.apache.org/jira/browse/FLINK-14099
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner
>Reporter: Zijie Lu
>Priority: Major
>
> The rowtime only supports sql timestamp but not long. Can the rowtime field 
> support long?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14057) Add Remove Other Timers to TimerService

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14057:
---
Component/s: API / DataStream

> Add Remove Other Timers to TimerService
> ---
>
> Key: FLINK-14057
> URL: https://issues.apache.org/jira/browse/FLINK-14057
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Jesse Anderson
>Priority: Major
>
> The TimerService has the ability to add timers with 
> registerProcessingTimeTimer. This method can be called many times with 
> different timer times.
> If you want to add a new timer and delete the other timers, you have to keep 
> track of all previous timer times and call deleteProcessingTimeTimer for each 
> one. This approach forces you to keep track of all previous (unexpired) timers 
> for a key.
> Instead, I suggest overloading registerProcessingTimeTimer with a second 
> boolean argument that will remove all previous timers and set the new timer.
> Note: although I'm using registerProcessingTimeTimer, this applies to 
> registerEventTimeTimer as well.
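> A hypothetical sketch of the proposed semantics (TimerService has no such 
> overload today):
> {code:java}
> import java.util.TreeSet;
> 
> public class SingleTimerExample {
>    // With replaceExisting == true, all previously registered (unexpired)
>    // timers for the key are deleted before the new one is set, so the
>    // caller no longer has to track them manually.
>    private final TreeSet<Long> registeredTimers = new TreeSet<>();
> 
>    void registerProcessingTimeTimer(long time, boolean replaceExisting) {
>       if (replaceExisting) {
>          registeredTimers.clear(); // stands in for deleteProcessingTimeTimer per entry
>       }
>       registeredTimers.add(time);
>    }
> }
> {code}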



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14172) Implement KubeClient with official Java client library for kubernetes

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14172:
---
Component/s: Deployment / Kubernetes

> Implement KubeClient with official Java client library for kubernetes
> -
>
> Key: FLINK-14172
> URL: https://issues.apache.org/jira/browse/FLINK-14172
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / Kubernetes
>Reporter: Yang Wang
>Priority: Major
>
> The official Java client library for Kubernetes is becoming more and more 
> active. The new features (such as leader election) and some client 
> implementations (informer, lister, cache) are better. So we should use the 
> official Java client for Kubernetes in Flink.
> https://github.com/kubernetes-client/java



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14063) Operators use fractions to decide how many managed memory to allocate

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14063:
---
Component/s: Runtime / Task

> Operators use fractions to decide how many managed memory to allocate
> -
>
> Key: FLINK-14063
> URL: https://issues.apache.org/jira/browse/FLINK-14063
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Xintong Song
>Priority: Major
>
> * Operators allocate memory segments with the amount returned by 
> {{MemoryManager#computeNumberOfPages}}.
>  * Operators reserve memory with the amount returned by 
> {{MemoryManager#computeMemorySize}}. 
> This step activates the new fraction based managed memory.
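> A hypothetical sketch of the arithmetic behind these two calls (signatures 
> are illustrative, not the actual MemoryManager API):
> {code:java}
> public class ManagedMemoryMath {
> 
>    // Share of the total managed memory an operator is entitled to.
>    static long computeMemorySize(long totalManagedBytes, double fraction) {
>       return (long) (totalManagedBytes * fraction);
>    }
> 
>    // The same share expressed in whole memory pages (rounded down).
>    static int computeNumberOfPages(long totalManagedBytes, double fraction, int pageSizeBytes) {
>       return (int) (computeMemorySize(totalManagedBytes, fraction) / pageSizeBytes);
>    }
> }
> {code}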



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14191) Extend SlotManager to support dynamic slot allocation on pending task executors

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14191:
---
Component/s: Runtime / Coordination

> Extend SlotManager to support dynamic slot allocation on pending task 
> executors
> ---
>
> Key: FLINK-14191
> URL: https://issues.apache.org/jira/browse/FLINK-14191
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Priority: Major
>
> * Introduce PendingTaskManagerResources
>  * Create PendingTaskManagerSlot on allocation, from 
> PendingTaskManagerResource
>  * Map registered task executors to matching PendingTaskManagerResources, and 
> allocate slots for corresponding PendingTaskManagerSlots
> Convert registered task executor free slots into equivalent available 
> resources according to default slot resource profiles.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14134) Introduce LimitableTableSource to optimize limit

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14134:
---
Component/s: Connectors / Hive

> Introduce LimitableTableSource to optimize limit
> 
>
> Key: FLINK-14134
> URL: https://issues.apache.org/jira/browse/FLINK-14134
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Hive
>Reporter: Jingsong Lee
>Priority: Major
>
> SQL: select * from t1 limit 1
> Currently the source will scan the full table. If we introduce 
> LimitableTableSource to let the source know the limit, the source can stop 
> after reading just one row.
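> A hypothetical sketch of such a contract (names are illustrative):
> {code:java}
> public interface LimitableTableSource<T> {
> 
>    /** Whether a limit has already been pushed into this source. */
>    boolean isLimitPushedDown();
> 
>    /** Returns a copy of this source that emits at most {@code limit} rows. */
>    LimitableTableSource<T> applyLimit(long limit);
> }
> {code}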



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14062) Set managed memory fractions according to slot sharing groups

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14062:
---
Component/s: Runtime / Task

> Set managed memory fractions according to slot sharing groups
> -
>
> Key: FLINK-14062
> URL: https://issues.apache.org/jira/browse/FLINK-14062
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Xintong Song
>Priority: Major
>
> * For operators with specified {{ResourceSpecs}}, calculate fractions 
> according to operators {{ResourceSpecs}}
>  * For operators with unknown {{ResourceSpecs}}, calculate fractions 
> according to number of operators using managed memory
> This step should not introduce any behavior changes.
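> A hypothetical sketch of the arithmetic for the two cases (names are 
> illustrative):
> {code:java}
> public class ManagedMemoryFractions {
> 
>    // Known ResourceSpecs: proportional share of the slot sharing group's
>    // managed memory.
>    static double fractionFromSpec(long operatorManagedBytes, long groupManagedBytesTotal) {
>       return (double) operatorManagedBytes / groupManagedBytesTotal;
>    }
> 
>    // Unknown ResourceSpecs: an equal split among the operators in the
>    // group that use managed memory.
>    static double fractionForUnknownSpecs(int managedMemoryOperatorsInGroup) {
>       return 1.0 / managedMemoryOperatorsInGroup;
>    }
> }
> {code}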



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14192) Enable the dynamic slot allocation feature.

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14192:
---
Component/s: Runtime / Coordination

> Enable the dynamic slot allocation feature.
> ---
>
> Key: FLINK-14192
> URL: https://issues.apache.org/jira/browse/FLINK-14192
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Priority: Major
>
> * ResourceManager uses the default slot resource profiles registered by the 
> TaskExecutors, instead of those calculated on the RM side.
>  * ResourceManager uses the actually requested resource profiles for slot 
> requests, instead of assuming the default profile for all requests.
>  * TaskExecutor bookkeeps with the requested resource profiles, instead of 
> assuming the default profile for all requests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14061) Introduce managed memory fractions to StreamConfig

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14061:
---
Component/s: Runtime / Task

> Introduce managed memory fractions to StreamConfig
> --
>
> Key: FLINK-14061
> URL: https://issues.apache.org/jira/browse/FLINK-14061
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Xintong Song
>Priority: Major
>
> Introduce {{fracManagedMemOnHeap}} and {{fracManagedMemOffHeap}} in 
> {{StreamConfig}}, so they can be set by {{StreamingJobGraphGenerator}} and 
> used by operators in runtime. 
> This step should not introduce any behavior changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14190) Extend SlotManager to support dynamic slot allocation on registered TaskExecutors.

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14190:
---
Component/s: Runtime / Coordination

> Extend SlotManager to support dynamic slot allocation on registered 
> TaskExecutors.
> --
>
> Key: FLINK-14190
> URL: https://issues.apache.org/jira/browse/FLINK-14190
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Priority: Major
>
> * Bookkeep task manager available resources
>  * Match between slot requests and task executor resources
>  ** Find task executors with matching available resources for slot requests
>  ** Find matching pending slot requests for task executors with new available 
> resources
>  * Create TaskManagerSlot on allocation and remove on free.
>  * Request slot from TaskExecutor with resource profiles.
> Use RM calculated default resource profiles for all slot requests. Convert 
> free slots in SlotReports into equivalent available resources according to 
> default slot resource profiles.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14194) Clean-up of legacy mode.

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14194:
---
Component/s: Runtime / Coordination

> Clean-up of legacy mode.
> 
>
> Key: FLINK-14194
> URL: https://issues.apache.org/jira/browse/FLINK-14194
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14199) Only use dedicated/named classes for mailbox letters.

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14199:
---
Component/s: Runtime / Task

> Only use dedicated/named classes for mailbox letters.
> -
>
> Key: FLINK-14199
> URL: https://issues.apache.org/jira/browse/FLINK-14199
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Task
>Reporter: Arvid Heise
>Priority: Major
> Attachments: Screenshot 2019-08-22 at 12.59.59.png
>
>
> Unnamed lambdas make it difficult to debug the code
>  !Screenshot 2019-08-22 at 12.59.59.png! 
> Note: there are cases of mailbox usage that need a Future to track when a 
> letter is executed and to get the success/exception result. The current 
> implementation (in MailboxExecutor) piggybacks on 
> java.util.concurrent.FutureTask and the fact that it implements the Java 
> Executor API.
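> As an illustration of the suggested fix, a dedicated letter class (the name 
> is made up here) shows up with a readable identity in thread dumps and 
> debuggers, unlike an anonymous lambda:
> {code:java}
> public final class ProcessTimerLetter implements Runnable {
>    private final long timestamp;
> 
>    public ProcessTimerLetter(long timestamp) {
>       this.timestamp = timestamp;
>    }
> 
>    @Override
>    public void run() {
>       // invoke the timer callback for `timestamp` here
>    }
> 
>    @Override
>    public String toString() {
>       return "ProcessTimerLetter{timestamp=" + timestamp + '}';
>    }
> }
> {code}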



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14193) Update RestAPI / Web UI

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14193:
---
Component/s: Runtime / Web Frontend
 Runtime / Coordination

> Update RestAPI / Web UI
> ---
>
> Key: FLINK-14193
> URL: https://issues.apache.org/jira/browse/FLINK-14193
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination, Runtime / Web Frontend
>Reporter: Xintong Song
>Priority: Major
>
> * Update RestAPI / WebUI to properly display information of available 
> resources and allocated slots of task executors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14189) Extend TaskExecutor to support dynamic slot allocation

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14189:
---
Component/s: Runtime / Coordination

> Extend TaskExecutor to support dynamic slot allocation
> --
>
> Key: FLINK-14189
> URL: https://issues.apache.org/jira/browse/FLINK-14189
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Priority: Major
>
> * TaskSlotTable
>  ** Bookkeep task manager available resources
>  ** Add and implement interface for dynamic allocating slot (with resource 
> profile instead of slot index)
>  ** Create slot report with dynamic allocated slots and remaining available 
> resources
>  * TaskExecutor
>  ** Support request slot with resource profile rather than slot id.
> The slot report still contains the status of legacy free slots. When the 
> ResourceManager requests slots with a slot id, convert it to the default slot 
> resource profile for bookkeeping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14188) TaskExecutor derive and register with default slot resource profile

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14188:
---
Component/s: Runtime / Coordination

> TaskExecutor derive and register with default slot resource profile
> ---
>
> Key: FLINK-14188
> URL: https://issues.apache.org/jira/browse/FLINK-14188
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Xintong Song
>Priority: Major
>
> * Introduce config option for defaultSlotFraction
>  * Derive default slot resource profile from the new config option, or the 
> legacy config option "taskmanager.numberOfTaskSlots".
>  * Register task executor with the default slot resource profile.
> This step should not introduce any behavior changes.
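> A hypothetical sketch of the derivation (names are illustrative):
> {code:java}
> public class DefaultSlotProfileDerivation {
> 
>    // Prefer the new defaultSlotFraction option; fall back to an equal
>    // share per slot from the legacy taskmanager.numberOfTaskSlots.
>    static double defaultSlotFraction(Double configuredFraction, int numberOfTaskSlots) {
>       if (configuredFraction != null) {
>          return configuredFraction;
>       }
>       return 1.0 / numberOfTaskSlots;
>    }
> 
>    // The default slot profile is that fraction of the total resource.
>    static long defaultSlotManagedMemoryBytes(long totalManagedBytes, double fraction) {
>       return (long) (totalManagedBytes * fraction);
>    }
> }
> {code}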



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13854) Support Aggregating in Join and CoGroup

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13854:
---
Component/s: API / DataStream

> Support Aggregating in Join and CoGroup
> ---
>
> Key: FLINK-13854
> URL: https://issues.apache.org/jira/browse/FLINK-13854
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream
>Affects Versions: 1.9.0
>Reporter: Jiayi Liao
>Priority: Major
>
> In WindowStream we can use windowStream.aggregate(AggregateFunction, 
> WindowFunction) to aggregate input records in real time.
> I think we should support a similar API in JoinedStreams and CoGroupStreams, 
> because storing the full records log in the state backend is a very high 
> cost, especially when we don't need the specific records.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13986) Clean-up of legacy mode

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13986:
---
Component/s: Runtime / Task

> Clean-up of legacy mode
> ---
>
> Key: FLINK-13986
> URL: https://issues.apache.org/jira/browse/FLINK-13986
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Xintong Song
>Priority: Major
>
> * Fix / update / remove test cases for legacy mode
>  * Deprecate / remove legacy config options.
>  * Remove legacy code paths
>  * Remove the switch for legacy / new mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14146) Introduce Pulsar connector

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14146:
---
Component/s: Connectors / Common

> Introduce Pulsar connector
> --
>
> Key: FLINK-14146
> URL: https://issues.apache.org/jira/browse/FLINK-14146
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Common
>Reporter: Yijie Shen
>Priority: Major
>
> Please see FLIP-72 for detailed information:
> [https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13985) Use unsafe memory for managed memory.

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13985:
---
Component/s: Runtime / Task

> Use unsafe memory for managed memory.
> -
>
> Key: FLINK-13985
> URL: https://issues.apache.org/jira/browse/FLINK-13985
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Task
>Reporter: Xintong Song
>Assignee: Andrey Zagrebin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> * Allocate memory with {{Unsafe.allocateMemory}}
>  ** {{MemoryManager}}
> Implement this issue in common code paths for the legacy / new mode. This 
> should only affect the GC behavior.
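> A standalone illustration of the mechanism (not the MemoryManager code): 
> memory from {{Unsafe.allocateMemory}} is released eagerly via 
> {{freeMemory}}, instead of waiting for the GC to collect a direct ByteBuffer.
> {code:java}
> import java.lang.reflect.Field;
> import sun.misc.Unsafe;
> 
> public class UnsafeMemoryExample {
>    public static void main(String[] args) throws Exception {
>       Field f = Unsafe.class.getDeclaredField("theUnsafe");
>       f.setAccessible(true);
>       Unsafe unsafe = (Unsafe) f.get(null);
> 
>       long address = unsafe.allocateMemory(1024); // 1 KiB of native memory
>       unsafe.putLong(address, 42L);                // raw write
>       System.out.println(unsafe.getLong(address)); // raw read -> 42
>       unsafe.freeMemory(address);                  // deterministic release
>    }
> }
> {code}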



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14100) Introduce OracleDialect

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14100:
---
Component/s: Connectors / JDBC

> Introduce OracleDialect
> ---
>
> Key: FLINK-14100
> URL: https://issues.apache.org/jira/browse/FLINK-14100
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / JDBC
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14102) Introduce DB2Dialect

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14102:
---
Component/s: (was: Table SQL / Planner)
 Connectors / JDBC

> Introduce DB2Dialect
> 
>
> Key: FLINK-14102
> URL: https://issues.apache.org/jira/browse/FLINK-14102
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / JDBC
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14101) Introduce SqlServerDialect

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14101:
---
Component/s: Connectors / JDBC

> Introduce SqlServerDialect
> --
>
> Key: FLINK-14101
> URL: https://issues.apache.org/jira/browse/FLINK-14101
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / JDBC
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-14102) Introduce DB2Dialect

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-14102:
---
Component/s: Table SQL / Planner

> Introduce DB2Dialect
> 
>
> Key: FLINK-14102
> URL: https://issues.apache.org/jira/browse/FLINK-14102
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Canbin Zheng
>Assignee: Canbin Zheng
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13878) Flink CLI fails (TimeoutException) for unknown reason

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13878:
---
Component/s: Command Line Client

> Flink CLI fails (TimeoutException) for unknown reason
> -
>
> Key: FLINK-13878
> URL: https://issues.apache.org/jira/browse/FLINK-13878
> Project: Flink
>  Issue Type: Bug
>  Components: Command Line Client
>Reporter: Mitch Wasson
>Priority: Major
>
> We are running Flink 1.8.1 in HA mode with Zookeeper 3.4.11.
> {{When we list/submit/cancel from the CLI on a job manager, the operation 
> fails with a TimeoutException. Here is an example:}}
> {{ $ ./bin/flink list}}
> {{ Waiting for 
> response...}}
> {{ The program finished with the following 
> exception:}}{{org.apache.flink.util.FlinkException: Failed to retrieve job 
> list.}}
> {{ at org.apache.flink.client.cli.CliFrontend.listJobs(CliFrontend.java:448)}}
> {{ at 
> org.apache.flink.client.cli.CliFrontend.lambda$list$0(CliFrontend.java:430)}}
> {{ at 
> org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:985)}}
> {{ at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:427)}}
> {{ at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)}}
> {{ at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)}}
> {{ at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)}}
> {{ at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)}}
> {{ Caused by: java.util.concurrent.TimeoutException}}
> {{ at 
> org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:943)}}
> {{ at 
> org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)}}
> {{ at 
> org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$11(FutureUtils.java:361)}}
> {{ at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}}
> {{ at java.util.concurrent.FutureTask.run(FutureTask.java:266)}}
> {{ at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)}}
> {{ at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
> {{ at java.lang.Thread.run(Thread.java:748)}}
>  
> {{When the same command is run with strace, we see that the CLI only manages 
> to connect out to zookeeper and hangs on that connection:}}
> {{ $ strace -f -e connect ./bin/flink list}}
> {{ ...}}
> {{ [pid 29445] connect(27, \{sa_family=AF_INET, sin_port=htons(53), 
> sin_addr=inet_addr("gateway ip")}, 16) = 0}}
> {{ ...}}
> {{ [pid 29445] connect(27,\{sa_family=AF_INET, sin_port=htons(2181), 
> sin_addr=inet_addr("zookeeper ip")}, 16) = -1 EINPROGRESS (Operation now in 
> progress)}}
> {{ strace: Process 29448 attached}}
> {{ Waiting for response...}}
> {{ ...}}
> {{ Exception}}
>  
> {{The CLI is able to successfully connect to zookeeper and even appears to 
> submit commands. This is verified by logs from zookeeper:}}
> {{INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - 
> Accepted socket connection from /JOB_MANAGER_IP:52074}}
> {{DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@880] - 
> Session establishment request from client /JOB_MANAGER_IP:52074 client's 
> lastZxid is 0x0}}
> {{INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@938] - 
> Client attempting to establish new session at /JOB_MANAGER_IP:52074}}
> {{DEBUG [FollowerRequestProcessor:3:CommitProcessor@174] - Processing 
> request:: sessionid:0x30037a90dce0001 type:createSession cxid:0x0 
> zxid:0xfffe txntype:unknown reqpath:n/a}}
> {{DEBUG [QuorumPeer[myid=3]/0.0.0.0:2181:CommitProcessor@164] - Committing 
> request:: sessionid:0x30037a90dce0001 type:createSession cxid:0x0 
> zxid:0x1020109 txntype:-10 reqpath:n/a}}
> {{DEBUG [CommitProcessor:3:FinalRequestProcessor@89] - Processing request:: 
> sessionid:0x30037a90dce0001 type:createSession cxid:0x0 zxid:0x1020109 
> txntype:-10 reqpath:n/a}}
> {{DEBUG [CommitProcessor:3:FinalRequestProcessor@161] - 
> sessionid:0x30037a90dce0001 type:createSession cxid:0x0 zxid:0x1020109 
> txntype:-10 reqpath:n/a}}
> {{INFO [CommitProcessor:3:ZooKeeperServer@683] - Established session 
> 0x30037a90dce0001 with negotiated timeout 4 for client 
> /JOB_MANAGER_IP:52074}}
> {{DEBUG [FollowerRequestProcessor:3:CommitProcessor@174] - Processing 
> request:: 

[jira] [Updated] (FLINK-13878) Flink CLI fails (TimeoutException) for unknown reason

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13878:
---
Component/s: Runtime / Coordination

> Flink CLI fails (TimeoutException) for unknown reason
> -
>
> Key: FLINK-13878
> URL: https://issues.apache.org/jira/browse/FLINK-13878
> Project: Flink
>  Issue Type: Bug
>  Components: Command Line Client, Runtime / Coordination
>Reporter: Mitch Wasson
>Priority: Major
>
> We are running Flink 1.8.1 in HA mode with Zookeeper 3.4.11.
> {{When we list/submit/cancel from the CLI on a job manager, the operation 
> fails with a TimeoutException. Here is an example:}}
> {{ $ ./bin/flink list}}
> {{ Waiting for response...}}
> {{ The program finished with the following exception:}}
> {{org.apache.flink.util.FlinkException: Failed to retrieve job list.}}
> {{ at org.apache.flink.client.cli.CliFrontend.listJobs(CliFrontend.java:448)}}
> {{ at 
> org.apache.flink.client.cli.CliFrontend.lambda$list$0(CliFrontend.java:430)}}
> {{ at 
> org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:985)}}
> {{ at org.apache.flink.client.cli.CliFrontend.list(CliFrontend.java:427)}}
> {{ at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1053)}}
> {{ at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)}}
> {{ at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)}}
> {{ at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)}}
> {{ Caused by: java.util.concurrent.TimeoutException}}
> {{ at 
> org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:943)}}
> {{ at 
> org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:211)}}
> {{ at 
> org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$11(FutureUtils.java:361)}}
> {{ at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}}
> {{ at java.util.concurrent.FutureTask.run(FutureTask.java:266)}}
> {{ at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)}}
> {{ at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}}
> {{ at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
> {{ at java.lang.Thread.run(Thread.java:748)}}
>  
> {{When the same command is run with strace, we see that the CLI only manages 
> to connect out to zookeeper and hangs on that connection:}}
> {{ $ strace -f -e connect ./bin/flink list}}
> {{ ...}}
> {{ [pid 29445] connect(27, \{sa_family=AF_INET, sin_port=htons(53), 
> sin_addr=inet_addr("gateway ip")}, 16) = 0}}
> {{ ...}}
> {{ [pid 29445] connect(27,\{sa_family=AF_INET, sin_port=htons(2181), 
> sin_addr=inet_addr("zookeeper ip")}, 16) = -1 EINPROGRESS (Operation now in 
> progress)}}
> {{ strace: Process 29448 attached}}
> {{ Waiting for response...}}
> {{ ...}}
> {{ Exception}}
>  
> {{The CLI is able to successfully connect to zookeeper and even appears to 
> submit commands. This is verified by logs from zookeeper:}}
> {{INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - 
> Accepted socket connection from /JOB_MANAGER_IP:52074}}
> {{DEBUG [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@880] - 
> Session establishment request from client /JOB_MANAGER_IP:52074 client's 
> lastZxid is 0x0}}
> {{INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@938] - 
> Client attempting to establish new session at /JOB_MANAGER_IP:52074}}
> {{DEBUG [FollowerRequestProcessor:3:CommitProcessor@174] - Processing 
> request:: sessionid:0x30037a90dce0001 type:createSession cxid:0x0 
> zxid:0xfffe txntype:unknown reqpath:n/a}}
> {{DEBUG [QuorumPeer[myid=3]/0.0.0.0:2181:CommitProcessor@164] - Committing 
> request:: sessionid:0x30037a90dce0001 type:createSession cxid:0x0 
> zxid:0x1020109 txntype:-10 reqpath:n/a}}
> {{DEBUG [CommitProcessor:3:FinalRequestProcessor@89] - Processing request:: 
> sessionid:0x30037a90dce0001 type:createSession cxid:0x0 zxid:0x1020109 
> txntype:-10 reqpath:n/a}}
> {{DEBUG [CommitProcessor:3:FinalRequestProcessor@161] - 
> sessionid:0x30037a90dce0001 type:createSession cxid:0x0 zxid:0x1020109 
> txntype:-10 reqpath:n/a}}
> {{INFO [CommitProcessor:3:ZooKeeperServer@683] - Established session 
> 0x30037a90dce0001 with negotiated timeout 4 for client 
> /JOB_MANAGER_IP:52074}}
> {{DEBUG [FollowerRequestProcessor:3:CommitProcessor@174] - Processing 
> 

[jira] [Updated] (FLINK-13938) Use yarn public distributed cache to speed up containers launch

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13938:
---
Component/s: Deployment / YARN

> Use yarn public distributed cache to speed up containers launch
> ---
>
> Key: FLINK-13938
> URL: https://issues.apache.org/jira/browse/FLINK-13938
> Project: Flink
>  Issue Type: New Feature
>  Components: Deployment / YARN
>Reporter: Yang Wang
>Assignee: Yang Wang
>Priority: Major
>
> By default, the LocalResourceVisibility is APPLICATION, so resources are 
> downloaded only once and shared by all taskmanager containers of the same 
> application on the same node. However, different applications have to 
> download all jars every time, including flink-dist.jar. I think we could use 
> the YARN public cache to eliminate this unnecessary jar downloading and make 
> launching containers faster (a sketch of the underlying YARN mechanism 
> follows the usage notes below).
>  
> How to use the shared lib feature?
>  # Upload a copy of the Flink release binary to HDFS.
>  # Use the -ysl argument to specify the shared lib:
> {code:java}
> ./bin/flink run -d -m yarn-cluster -p 20 -ysl 
> hdfs:///flink/release/flink-1.9.0/lib examples/streaming/WindowJoin.jar{code}
>  
> -ysl, --yarnsharedLib    Upload a copy of the flink lib beforehand and
>                          specify the path, to use the public visibility
>                          feature of YARN NodeManager resource localization.
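> A minimal sketch of the YARN mechanism this feature relies on (standard 
> Hadoop YARN API, not the actual Flink patch; the method name here is 
> illustrative):
> {code:java}
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.yarn.api.records.LocalResource;
> import org.apache.hadoop.yarn.api.records.LocalResourceType;
> import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
> import org.apache.hadoop.yarn.api.records.URL;
> 
> // Register a jar from the shared HDFS lib directory with PUBLIC visibility,
> // so the NodeManager's localized copy is reused across applications instead
> // of being downloaded once per application (the APPLICATION default).
> public static LocalResource registerSharedJar(FileSystem fs, Path jar) throws java.io.IOException {
>     FileStatus status = fs.getFileStatus(jar);
>     return LocalResource.newInstance(
>         URL.fromPath(jar),
>         LocalResourceType.FILE,
>         LocalResourceVisibility.PUBLIC, // instead of the default APPLICATION
>         status.getLen(),
>         status.getModificationTime());
> }
> {code}
> Note that PUBLIC localization also requires the files on HDFS to be 
> world-readable, otherwise localization of the resource fails.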



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13777) Introduce sql function wrappers and conversion to ExpressionConverter

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13777:
---
Component/s: Table SQL / Planner

> Introduce sql function wrappers and conversion to ExpressionConverter
> -
>
> Key: FLINK-13777
> URL: https://issues.apache.org/jira/browse/FLINK-13777
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>
> To remove the extended Calcite SQL functions, we introduce wrappers around 
> Flink's FunctionDefinition (a rough skeleton of the first wrapper is 
> sketched below):
> 1. Add SqlReturnTypeInferenceWrapper to wrap TypeStrategy
> 2. Add SqlOperandTypeInferenceWrapper to wrap InputTypeStrategy
> 3. Add SqlOperandTypeCheckerWrapper to wrap InputTypeValidator
> 4. Add SqlFunctionWrapper to wrap SqlFunction
> 5. Add SqlFunctionWrapper converter and standard SQL converter
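> A rough skeleton of the first wrapper, with the Calcite-side type 
> conversion elided (the class name is from the list above; the body is only 
> illustrative):
> {code:java}
> import org.apache.calcite.rel.type.RelDataType;
> import org.apache.calcite.sql.SqlOperatorBinding;
> import org.apache.calcite.sql.type.SqlReturnTypeInference;
> import org.apache.flink.table.types.inference.TypeStrategy;
> 
> // Adapts a Flink TypeStrategy to Calcite's SqlReturnTypeInference interface.
> public class SqlReturnTypeInferenceWrapper implements SqlReturnTypeInference {
> 
>     private final TypeStrategy typeStrategy;
> 
>     public SqlReturnTypeInferenceWrapper(TypeStrategy typeStrategy) {
>         this.typeStrategy = typeStrategy;
>     }
> 
>     @Override
>     public RelDataType inferReturnType(SqlOperatorBinding opBinding) {
>         // Delegate to the Flink TypeStrategy, then convert the inferred
>         // DataType into a Calcite RelDataType (conversion elided in this sketch).
>         throw new UnsupportedOperationException("type conversion elided");
>     }
> }
> {code}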



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13788) Document state migration constraints on keys

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13788:
---
Component/s: API / Type Serialization System

> Document state migration constraints on keys
> 
>
> Key: FLINK-13788
> URL: https://issues.apache.org/jira/browse/FLINK-13788
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Type Serialization System, Documentation
>Reporter: Seth Wiesman
>Assignee: Seth Wiesman
>Priority: Major
>
> [https://lists.apache.org/thread.html/79d74334baecbcbd765cead2f90df470d3ffe55b839f208e9695a6e2@%3Cuser.flink.apache.org%3E]
>  
> Key migrations are not allowed, in order to prevent: 
>  
> 1) Key clashes
> 2) Changes in key-group assignment (see the sketch below)
>  
>  
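> For context, a simplified sketch of how Flink assigns a key to a key group 
> (mirroring KeyGroupRangeAssignment); keyed state is additionally stored 
> under the serialized key bytes, so a serializer change can both miss 
> existing entries and make formerly distinct keys collide:
> {code:java}
> import org.apache.flink.util.MathUtils;
> 
> // Simplified sketch: the key group a key lands in is derived from the key's
> // hash, so keys must hash identically before and after a migration.
> public static int assignToKeyGroup(Object key, int maxParallelism) {
>     return MathUtils.murmurHash(key.hashCode()) % maxParallelism;
> }
> {code}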



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-13788) Document state migration constraints on keys

2019-10-01 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13788:
---
Component/s: Documentation

> Document state migration constraints on keys
> 
>
> Key: FLINK-13788
> URL: https://issues.apache.org/jira/browse/FLINK-13788
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Seth Wiesman
>Assignee: Seth Wiesman
>Priority: Major
>
> [https://lists.apache.org/thread.html/79d74334baecbcbd765cead2f90df470d3ffe55b839f208e9695a6e2@%3Cuser.flink.apache.org%3E]
>  
> Key migrations are not allowed, in order to prevent: 
>  
> 1) Key clashes
> 2) Changes in key-group assignment
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-14175) Upgrade KPL version in flink-connector-kinesis to fix application OOM

2019-09-30 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-14175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940729#comment-16940729
 ] 

Robert Metzger commented on FLINK-14175:


[~phoenixjiangnan]: you have recently worked on Kinesis as part of 
https://issues.apache.org/jira/browse/FLINK-12847, could you take a look here?

> Upgrade KPL version in flink-connector-kinesis to fix application OOM
> -
>
> Key: FLINK-14175
> URL: https://issues.apache.org/jira/browse/FLINK-14175
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.6.3, 1.6.4, 1.6.5, 1.7.2, 1.7.3, 1.8.0, 1.8.1, 1.8.2, 
> 1.9.0
> Environment: [link title|http://example.com][link 
> title|http://example.com]
>Reporter: Abhilasha Seth
>Priority: Major
>
> The [KPL 
> version|https://github.com/apache/flink/blob/release-1.9/flink-connectors/flink-connector-kinesis/pom.xml#L38]
>  (0.12.9) used by flink-connector-kinesis in the affected Flink versions has 
> a thread leak bug that causes applications to run out of memory after 
> frequent restarts:
> KPL Issue - [https://github.com/awslabs/amazon-kinesis-producer/issues/224]
> Fix - [https://github.com/awslabs/amazon-kinesis-producer/pull/225/files]
> Upgrading KPL to 0.12.10 or higher is necessary to avoid this issue. The 
> recommended version to upgrade would be the latest (0.13.1)
> Note that KPL version in Flink 1.10.0 has been updated to the latest version 
> (0.13.1): https://issues.apache.org/jira/browse/FLINK-12847
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-13821) Website must link to License etc

2019-09-08 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925166#comment-16925166
 ] 

Robert Metzger commented on FLINK-13821:


Very nice, thank you!

> Website must link to License etc
> 
>
> Key: FLINK-13821
> URL: https://issues.apache.org/jira/browse/FLINK-13821
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Sebb
>Assignee: Robert Metzger
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> ASF project websites must have certain links:
> Apachecon
> License
> Thanks
> Security
> Sponsor/Donate
> Please see:
> https://whimsy.apache.org/site/project/flink



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (FLINK-13821) Website must link to License etc

2019-09-05 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-13821.

Resolution: Fixed

This has been resolved in 
[https://github.com/apache/flink-web/commit/ce569aecbe33f6cf5697d9e8e6214e2011e04875]
 by [~fhueske]

> Website must link to License etc
> 
>
> Key: FLINK-13821
> URL: https://issues.apache.org/jira/browse/FLINK-13821
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Sebb
>Assignee: Robert Metzger
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> ASF project websites must have certain links:
> Apachecon
> License
> Thanks
> Security
> Sponsor/Donate
> Please see:
> https://whimsy.apache.org/site/project/flink



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13978) Evaluate Azure Pipelines as a CI tool for Flink

2019-09-05 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-13978:
--

 Summary: Evaluate Azure Pipelines as a CI tool for Flink
 Key: FLINK-13978
 URL: https://issues.apache.org/jira/browse/FLINK-13978
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Robert Metzger
Assignee: Robert Metzger


See ML discussion: 
[https://lists.apache.org/thread.html/b90aa518fcabce94f8e1de4132f46120fae613db6e95a2705f1bd1ea@%3Cdev.flink.apache.org%3E]
 

We want to try out Azure Pipelines for the following reasons:
 * more mature system (compared to travis)
 * 10 parallel builds with a 6-hour limit for open source projects
 * ability to add custom machines

 

(See also INFRA-17030) 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (FLINK-13821) Website must link to License etc

2019-08-28 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-13821:
--

Assignee: Robert Metzger

> Website must link to License etc
> 
>
> Key: FLINK-13821
> URL: https://issues.apache.org/jira/browse/FLINK-13821
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Sebb
>Assignee: Robert Metzger
>Priority: Major
>
> ASF project websites must have certain links:
> Apachecon
> License
> Thanks
> Security
> Sponsor/Donate
> Please see:
> https://whimsy.apache.org/site/project/flink



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-13821) Website must link to License etc

2019-08-28 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1691#comment-1691
 ] 

Robert Metzger commented on FLINK-13821:


Thanks for opening this ticket, Sebb. I was wondering the same, but never got 
around to doing it.

I'll fix that soon.

> Website must link to License etc
> 
>
> Key: FLINK-13821
> URL: https://issues.apache.org/jira/browse/FLINK-13821
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Sebb
>Priority: Major
>
> ASF project websites must have certain links:
> Apachecon
> License
> Thanks
> Security
> Sponsor/Donate
> Please see:
> https://whimsy.apache.org/site/project/flink



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (FLINK-13821) Website must link to License etc

2019-08-28 Thread Robert Metzger (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13821:
---
Component/s: Project Website

> Website must link to License etc
> 
>
> Key: FLINK-13821
> URL: https://issues.apache.org/jira/browse/FLINK-13821
> Project: Flink
>  Issue Type: Bug
>  Components: Project Website
>Reporter: Sebb
>Priority: Major
>
> ASF project websites must have certain links:
> Apachecon
> License
> Thanks
> Security
> Sponsor/Donate
> Please see:
> https://whimsy.apache.org/site/project/flink



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (FLINK-4463) FLIP-3: Restructure Documentation

2019-08-16 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-4463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909017#comment-16909017
 ] 

Robert Metzger commented on FLINK-4463:
---

I would propose to close all subtasks as Invalid with a comment that FLIP-3 has 
been completed.

> FLIP-3: Restructure Documentation
> -
>
> Key: FLINK-4463
> URL: https://issues.apache.org/jira/browse/FLINK-4463
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Ufuk Celebi
>Priority: Major
>
> Super issue to track progress for 
> [FLIP-3|https://cwiki.apache.org/confluence/display/FLINK/FLIP-3+-+Organization+of+Documentation].



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-9941) Flush in ScalaCsvOutputFormat before close method

2019-08-16 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909006#comment-16909006
 ] 

Robert Metzger commented on FLINK-9941:
---

[~wind_ljy] why did you close the Jira ticket?

> Flush in ScalaCsvOutputFormat before close method
> -
>
> Key: FLINK-9941
> URL: https://issues.apache.org/jira/browse/FLINK-9941
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Scala
>Affects Versions: 1.5.1
>Reporter: Ryan Tao
>Assignee: Jiayi Liao
>Priority: Major
>  Labels: pull-request-available
>
> Because not every stream's close() method flushes, we need to manually call 
> flush() before close() in order to keep continuous integration stable.
> I noticed that CsvOutputFormat (Java API) already does this, as follows (a 
> sketch of the analogous change for ScalaCsvOutputFormat is included after it):
> {code:java}
> // CsvOutputFormat
> public void close() throws IOException {
>     if (wrt != null) {
>         this.wrt.flush();
>         this.wrt.close();
>     }
>     super.close();
> }
> {code}
>  
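> A minimal sketch of the analogous change in ScalaCsvOutputFormat, assuming 
> its writer field matches the Java version:
> {code:java}
> // Hypothetical fix sketch: flush buffered records before closing,
> // exactly as the Java CsvOutputFormat above does.
> @Override
> public void close() throws IOException {
>     if (this.wrt != null) {
>         this.wrt.flush();
>         this.wrt.close();
>     }
>     super.close();
> }
> {code}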



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13739) BinaryRowTest.testWriteString() fails in some environments

2019-08-15 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13739:
---
Issue Type: Bug  (was: Task)

> BinaryRowTest.testWriteString() fails in some environments
> --
>
> Key: FLINK-13739
> URL: https://issues.apache.org/jira/browse/FLINK-13739
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.9.0
> Environment:  
>  
>Reporter: Robert Metzger
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available, test-stability
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
>  
> {code:java}
> Test set: org.apache.flink.table.dataformat.BinaryRowTest
> ---
> Tests run: 26, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.328 s <<< 
> FAILURE! - in org.apache.flink.table.dataformat.BinaryRowTest
> testWriteString(org.apache.flink.table.dataformat.BinaryRowTest)  Time 
> elapsed: 0.05 s  <<< FAILURE!
> org.junit.ComparisonFailure: 
> expected:<[<95><95><95><95><95><88><91><98>
> <90><9A><84><89><88><8C>]> 
> but was:<[?]>
>         at 
> org.apache.flink.table.dataformat.BinaryRowTest.testWriteString(BinaryRowTest.java:189)
> {code}
>  
> This error happens on a Google Cloud n2-standard-16 (16 vCPUs, 64 GB memory) 
> machine.
> {code}$ lsb_release -a
> No LSB modules are available.
> Distributor ID:Debian
> Description:Debian GNU/Linux 9.9 (stretch)
> Release:9.9
> Codename:stretch{code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13738) NegativeArraySizeException in LongHybridHashTable

2019-08-15 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13738:
---
Issue Type: Bug  (was: Task)

> NegativeArraySizeException in LongHybridHashTable
> -
>
> Key: FLINK-13738
> URL: https://issues.apache.org/jira/browse/FLINK-13738
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.9.0
>Reporter: Robert Metzger
>Priority: Major
>
> Executing this (meaningless) query:
> {code:java}
> INSERT INTO sinkTable ( SELECT CONCAT( CAST( id AS VARCHAR), CAST( COUNT(*) 
> AS VARCHAR)) as something, 'const' FROM CsvTable, table1  WHERE sometxt LIKE 
> 'a%' AND id = key GROUP BY id ) {code}
> leads to the following exception:
> {code:java}
> Caused by: java.lang.NegativeArraySizeException
>  at 
> org.apache.flink.table.runtime.hashtable.LongHybridHashTable.tryDenseMode(LongHybridHashTable.java:216)
>  at 
> org.apache.flink.table.runtime.hashtable.LongHybridHashTable.endBuild(LongHybridHashTable.java:105)
>  at LongHashJoinOperator$36.endInput1$(Unknown Source)
>  at LongHashJoinOperator$36.endInput(Unknown Source)
>  at 
> org.apache.flink.streaming.runtime.tasks.OperatorChain.endInput(OperatorChain.java:256)
>  at 
> org.apache.flink.streaming.runtime.io.StreamTwoInputSelectableProcessor.checkFinished(StreamTwoInputSelectableProcessor.java:359)
>  at 
> org.apache.flink.streaming.runtime.io.StreamTwoInputSelectableProcessor.processInput(StreamTwoInputSelectableProcessor.java:193)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.performDefaultAction(StreamTask.java:276)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:298)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:403)
>  at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:687)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:517)
>  at java.lang.Thread.run(Thread.java:748){code}
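> A plausible reading of the failure, not verified against the code: 
> tryDenseMode sizes an array from the span of the long keys, and that span 
> can overflow an int:
> {code:java}
> // Hypothetical illustration of the suspected overflow, not Flink source:
> long minKey = 0L, maxKey = 3_000_000_000L;   // key span wider than 2^31-1
> long range = maxKey - minKey + 1;            // 3_000_000_001
> int size = (int) range;                      // wraps to a negative int
> long[] denseBuckets = new long[size];        // -> NegativeArraySizeException
> {code}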
> This is the plan:
>  
> {code:java}
> == Abstract Syntax Tree ==
> LogicalSink(name=[sinkTable], fields=[f0, f1])
> +- LogicalProject(something=[CONCAT(CAST($0):VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE", CAST($1):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT 
> NULL)], EXPR$1=[_UTF-16LE'const'])
>+- LogicalAggregate(group=[{0}], agg#0=[COUNT()])
>   +- LogicalProject(id=[$1])
>  +- LogicalFilter(condition=[AND(LIKE($0, _UTF-16LE'a%'), =($1, 
> CAST($2):BIGINT))])
> +- LogicalJoin(condition=[true], joinType=[inner])
>:- LogicalTableScan(table=[[default_catalog, default_database, 
> CsvTable, source: [CsvTableSource(read fields: sometxt, id)]]])
>+- LogicalTableScan(table=[[default_catalog, default_database, 
> table1, source: [GeneratorTableSource(key, rowtime, payload)]]])
> == Optimized Logical Plan ==
> Sink(name=[sinkTable], fields=[f0, f1]): rowcount = 1498810.6659336376, 
> cumulative cost = {4.459964319978008E8 rows, 1.879799762133187E10 cpu, 4.8E9 
> io, 8.4E8 network, 1.799524266373455E8 memory}
> +- Calc(select=[CONCAT(CAST(id), CAST($f1)) AS something, _UTF-16LE'const' AS 
> EXPR$1]): rowcount = 1498810.6659336376, cumulative cost = 
> {4.444976213318672E8 rows, 1.8796498810665936E10 cpu, 4.8E9 io, 8.4E8 
> network, 1.799524266373455E8 memory}
>+- HashAggregate(isMerge=[false], groupBy=[id], select=[id, COUNT(*) AS 
> $f1]): rowcount = 1498810.6659336376, cumulative cost = {4.429988106659336E8 
> rows, 1.8795E10 cpu, 4.8E9 io, 8.4E8 network, 1.799524266373455E8 memory}
>   +- Calc(select=[id]): rowcount = 1.575E7, cumulative cost = {4.415E8 
> rows, 1.848E10 cpu, 4.8E9 io, 8.4E8 network, 1.2E8 memory}
>  +- HashJoin(joinType=[InnerJoin], where=[=(id, key0)], select=[id, 
> key0], build=[left]): rowcount = 1.575E7, cumulative cost = {4.2575E8 rows, 
> 1.848E10 cpu, 4.8E9 io, 8.4E8 network, 1.2E8 memory}
> :- Exchange(distribution=[hash[id]]): rowcount = 500.0, 
> cumulative cost = {1.1E8 rows, 8.4E8 cpu, 2.0E9 io, 4.0E7 network, 0.0 memory}
> :  +- Calc(select=[id], where=[LIKE(sometxt, _UTF-16LE'a%')]): 
> rowcount = 500.0, cumulative cost = {1.05E8 rows, 0.0 cpu, 2.0E9 io, 0.0 
> network, 0.0 memory}
> : +- TableSourceScan(table=[[default_catalog, 
> default_database, CsvTable, source: [CsvTableSource(read fields: sometxt, 
> id)]]], fields=[sometxt, id]): rowcount = 1.0E8, cumulative cost = {1.0E8 
> rows, 0.0 cpu, 2.0E9 io, 0.0 network, 0.0 memory}
> +- Exchange(distribution=[hash[key0]]): rowcount = 1.0E8, 
> cumulative cost = {3.0E8 rows, 1.68E10 cpu, 2.8E9 io, 8.0E8 network, 0.0 
> memory}
>+- Calc(select=[CAST(key) AS key0]): rowcount = 1.0E8, 
> cumulative cost = {2.0E8 rows, 0.0 cpu, 2.8E9 io, 0.0 

[jira] [Commented] (FLINK-13739) BinaryRowTest.testWriteString() fails in some environments

2019-08-15 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908205#comment-16908205
 ] 

Robert Metzger commented on FLINK-13739:


Indeed, the machine is using {{US-ASCII}} as the default charset.
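
A minimal illustration of why the default charset matters here (not the test 
code itself): any String/byte[] round trip that omits an explicit charset 
picks up the platform default, so under US-ASCII every non-ASCII character 
collapses to '?':

{code:java}
import java.nio.charset.StandardCharsets;

String s = "\u4f60\u597d";                             // non-ASCII sample text
byte[] platformBytes = s.getBytes();                   // uses the JVM default charset
byte[] utf8Bytes = s.getBytes(StandardCharsets.UTF_8); // environment-independent
String roundTrip = new String(utf8Bytes, StandardCharsets.UTF_8); // equals s
{code}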

> BinaryRowTest.testWriteString() fails in some environments
> --
>
> Key: FLINK-13739
> URL: https://issues.apache.org/jira/browse/FLINK-13739
> Project: Flink
>  Issue Type: Task
>  Components: Table SQL / Runtime
>Affects Versions: 1.9.0
> Environment:  
>  
>Reporter: Robert Metzger
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: test-stability
>
>  
>  
> {code:java}
> Test set: org.apache.flink.table.dataformat.BinaryRowTest
> ---
> Tests run: 26, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.328 s <<< 
> FAILURE! - in org.apache.flink.table.dataformat.BinaryRowTest
> testWriteString(org.apache.flink.table.dataformat.BinaryRowTest)  Time 
> elapsed: 0.05 s  <<< FAILURE!
> org.junit.ComparisonFailure: 
> expected:<[<95><95><95><95><95><88><91><98>
> <90><9A><84><89><88><8C>]> 
> but was:<[?]>
>         at 
> org.apache.flink.table.dataformat.BinaryRowTest.testWriteString(BinaryRowTest.java:189)
> {code}
>  
> This error happens on a Google Cloud n2-standard-16 (16 vCPUs, 64 GB memory) 
> machine.
> {code}$ lsb_release -a
> No LSB modules are available.
> Distributor ID:Debian
> Description:Debian GNU/Linux 9.9 (stretch)
> Release:9.9
> Codename:stretch{code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (FLINK-13739) BinaryRowTest.testWriteString() fails in some environments

2019-08-15 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-13739:
--

Assignee: Jingsong Lee  (was: Robert Metzger)

> BinaryRowTest.testWriteString() fails in some environments
> --
>
> Key: FLINK-13739
> URL: https://issues.apache.org/jira/browse/FLINK-13739
> Project: Flink
>  Issue Type: Task
>  Components: Table SQL / Runtime
>Affects Versions: 1.9.0
> Environment:  
>  
>Reporter: Robert Metzger
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: test-stability
>
>  
>  
> {code:java}
> Test set: org.apache.flink.table.dataformat.BinaryRowTest
> ---
> Tests run: 26, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.328 s <<< 
> FAILURE! - in org.apache.flink.table.dataformat.BinaryRowTest
> testWriteString(org.apache.flink.table.dataformat.BinaryRowTest)  Time 
> elapsed: 0.05 s  <<< FAILURE!
> org.junit.ComparisonFailure: 
> expected:<[<95><95><95><95><95><88><91><98>
> <90><9A><84><89><88><8C>]> 
> but was:<[?]>
>         at 
> org.apache.flink.table.dataformat.BinaryRowTest.testWriteString(BinaryRowTest.java:189)
> {code}
>  
> This error happens on a Google Cloud n2-standard-16 (16 vCPUs, 64 GB memory) 
> machine.
> {code}$ lsb_release -a
> No LSB modules are available.
> Distributor ID:Debian
> Description:Debian GNU/Linux 9.9 (stretch)
> Release:9.9
> Codename:stretch{code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (FLINK-13739) BinaryRowTest.testWriteString() fails in some environments

2019-08-15 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-13739:
--

Assignee: Robert Metzger

> BinaryRowTest.testWriteString() fails in some environments
> --
>
> Key: FLINK-13739
> URL: https://issues.apache.org/jira/browse/FLINK-13739
> Project: Flink
>  Issue Type: Task
>  Components: Table SQL / Runtime
>Affects Versions: 1.9.0
> Environment:  
>  
>Reporter: Robert Metzger
>Assignee: Robert Metzger
>Priority: Major
>  Labels: test-stability
>
>  
>  
> {code:java}
> Test set: org.apache.flink.table.dataformat.BinaryRowTest
> ---
> Tests run: 26, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.328 s <<< 
> FAILURE! - in org.apache.flink.table.dataformat.BinaryRowTest
> testWriteString(org.apache.flink.table.dataformat.BinaryRowTest)  Time 
> elapsed: 0.05 s  <<< FAILURE!
> org.junit.ComparisonFailure: 
> expected:<[<95><95><95><95><95><88><91><98>
> <90><9A><84><89><88><8C>]> 
> but was:<[?]>
>         at 
> org.apache.flink.table.dataformat.BinaryRowTest.testWriteString(BinaryRowTest.java:189)
> {code}
>  
> This error happens on a Google Cloud n2-standard-16 (16 vCPUs, 64 GB memory) 
> machine.
> {code}$ lsb_release -a
> No LSB modules are available.
> Distributor ID:Debian
> Description:Debian GNU/Linux 9.9 (stretch)
> Release:9.9
> Codename:stretch{code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13739) BinaryRowTest.testWriteString() fails in some environments

2019-08-15 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-13739:
--

 Summary: BinaryRowTest.testWriteString() fails in some environments
 Key: FLINK-13739
 URL: https://issues.apache.org/jira/browse/FLINK-13739
 Project: Flink
  Issue Type: Task
  Components: Table SQL / Runtime
Affects Versions: 1.9.0
 Environment:  

 
Reporter: Robert Metzger


 

 
{code:java}
Test set: org.apache.flink.table.dataformat.BinaryRowTest
---
Tests run: 26, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.328 s <<< 
FAILURE! - in org.apache.flink.table.dataformat.BinaryRowTest
testWriteString(org.apache.flink.table.dataformat.BinaryRowTest)  Time elapsed: 
0.05 s  <<< FAILURE!
org.junit.ComparisonFailure: 
expected:<[<95><95><95><95><95><88><91><98>
<90><9A><84><89><88><8C>]> but 
was:<[?]>
        at 
org.apache.flink.table.dataformat.BinaryRowTest.testWriteString(BinaryRowTest.java:189)
{code}
 

This error happens on a Google Cloud n2-standard-16 (16 vCPUs, 64 GB memory) 
machine.

{code}$ lsb_release -a
No LSB modules are available.
Distributor ID:Debian
Description:Debian GNU/Linux 9.9 (stretch)
Release:9.9
Codename:stretch{code}




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13738) NegativeArraySizeException in LongHybridHashTable

2019-08-15 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13738:
---
Description: 
Executing this (meaningless) query:
{code:java}
INSERT INTO sinkTable ( SELECT CONCAT( CAST( id AS VARCHAR), CAST( COUNT(*) AS 
VARCHAR)) as something, 'const' FROM CsvTable, table1  WHERE sometxt LIKE 'a%' 
AND id = key GROUP BY id ) {code}
leads to the following exception:
{code:java}
Caused by: java.lang.NegativeArraySizeException
 at 
org.apache.flink.table.runtime.hashtable.LongHybridHashTable.tryDenseMode(LongHybridHashTable.java:216)
 at 
org.apache.flink.table.runtime.hashtable.LongHybridHashTable.endBuild(LongHybridHashTable.java:105)
 at LongHashJoinOperator$36.endInput1$(Unknown Source)
 at LongHashJoinOperator$36.endInput(Unknown Source)
 at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.endInput(OperatorChain.java:256)
 at 
org.apache.flink.streaming.runtime.io.StreamTwoInputSelectableProcessor.checkFinished(StreamTwoInputSelectableProcessor.java:359)
 at 
org.apache.flink.streaming.runtime.io.StreamTwoInputSelectableProcessor.processInput(StreamTwoInputSelectableProcessor.java:193)
 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.performDefaultAction(StreamTask.java:276)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:298)
 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:403)
 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:687)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:517)
 at java.lang.Thread.run(Thread.java:748){code}
This is the plan:

 
{code:java}
== Abstract Syntax Tree ==
LogicalSink(name=[sinkTable], fields=[f0, f1])
+- LogicalProject(something=[CONCAT(CAST($0):VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE", CAST($1):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL)], 
EXPR$1=[_UTF-16LE'const'])
   +- LogicalAggregate(group=[{0}], agg#0=[COUNT()])
  +- LogicalProject(id=[$1])
 +- LogicalFilter(condition=[AND(LIKE($0, _UTF-16LE'a%'), =($1, 
CAST($2):BIGINT))])
+- LogicalJoin(condition=[true], joinType=[inner])
   :- LogicalTableScan(table=[[default_catalog, default_database, 
CsvTable, source: [CsvTableSource(read fields: sometxt, id)]]])
   +- LogicalTableScan(table=[[default_catalog, default_database, 
table1, source: [GeneratorTableSource(key, rowtime, payload)]]])

== Optimized Logical Plan ==
Sink(name=[sinkTable], fields=[f0, f1]): rowcount = 1498810.6659336376, 
cumulative cost = {4.459964319978008E8 rows, 1.879799762133187E10 cpu, 4.8E9 
io, 8.4E8 network, 1.799524266373455E8 memory}
+- Calc(select=[CONCAT(CAST(id), CAST($f1)) AS something, _UTF-16LE'const' AS 
EXPR$1]): rowcount = 1498810.6659336376, cumulative cost = {4.444976213318672E8 
rows, 1.8796498810665936E10 cpu, 4.8E9 io, 8.4E8 network, 1.799524266373455E8 
memory}
   +- HashAggregate(isMerge=[false], groupBy=[id], select=[id, COUNT(*) AS 
$f1]): rowcount = 1498810.6659336376, cumulative cost = {4.429988106659336E8 
rows, 1.8795E10 cpu, 4.8E9 io, 8.4E8 network, 1.799524266373455E8 memory}
  +- Calc(select=[id]): rowcount = 1.575E7, cumulative cost = {4.415E8 
rows, 1.848E10 cpu, 4.8E9 io, 8.4E8 network, 1.2E8 memory}
 +- HashJoin(joinType=[InnerJoin], where=[=(id, key0)], select=[id, 
key0], build=[left]): rowcount = 1.575E7, cumulative cost = {4.2575E8 rows, 
1.848E10 cpu, 4.8E9 io, 8.4E8 network, 1.2E8 memory}
:- Exchange(distribution=[hash[id]]): rowcount = 500.0, 
cumulative cost = {1.1E8 rows, 8.4E8 cpu, 2.0E9 io, 4.0E7 network, 0.0 memory}
:  +- Calc(select=[id], where=[LIKE(sometxt, _UTF-16LE'a%')]): 
rowcount = 500.0, cumulative cost = {1.05E8 rows, 0.0 cpu, 2.0E9 io, 0.0 
network, 0.0 memory}
: +- TableSourceScan(table=[[default_catalog, default_database, 
CsvTable, source: [CsvTableSource(read fields: sometxt, id)]]], 
fields=[sometxt, id]): rowcount = 1.0E8, cumulative cost = {1.0E8 rows, 0.0 
cpu, 2.0E9 io, 0.0 network, 0.0 memory}
+- Exchange(distribution=[hash[key0]]): rowcount = 1.0E8, 
cumulative cost = {3.0E8 rows, 1.68E10 cpu, 2.8E9 io, 8.0E8 network, 0.0 memory}
   +- Calc(select=[CAST(key) AS key0]): rowcount = 1.0E8, 
cumulative cost = {2.0E8 rows, 0.0 cpu, 2.8E9 io, 0.0 network, 0.0 memory}
  +- TableSourceScan(table=[[default_catalog, default_database, 
table1, source: [GeneratorTableSource(key, rowtime, payload)]]], fields=[key, 
rowtime, payload]): rowcount = 1.0E8, cumulative cost = {1.0E8 rows, 0.0 cpu, 
2.8E9 io, 0.0 network, 0.0 memory}

== Physical Execution Plan ==
Stage 1 : Data Source
content : collect elements with CollectionInputFormat

Stage 2 : Operator
content : CsvTableSource(read fields: sometxt, id)
ship_strategy : REBALANCE

Stage 3 : 

[jira] [Created] (FLINK-13738) NegativeArraySizeException in LongHybridHashTable

2019-08-15 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-13738:
--

 Summary: NegativeArraySizeException in LongHybridHashTable
 Key: FLINK-13738
 URL: https://issues.apache.org/jira/browse/FLINK-13738
 Project: Flink
  Issue Type: Task
  Components: Table SQL / Runtime
Affects Versions: 1.9.0
Reporter: Robert Metzger


Executing this (meaningless) query:
{code:java}
INSERT INTO sinkTable ( SELECT CONCAT( CAST( id AS VARCHAR), CAST( COUNT(*) AS 
VARCHAR)) as something, 'const' FROM CsvTable, table1  WHERE sometxt LIKE 'a%' 
AND id = key GROUP BY id ) {code}
leads to the following exception:
{code:java}
Caused by: java.lang.NegativeArraySizeException
 at 
org.apache.flink.table.runtime.hashtable.LongHybridHashTable.tryDenseMode(LongHybridHashTable.java:216)
 at 
org.apache.flink.table.runtime.hashtable.LongHybridHashTable.endBuild(LongHybridHashTable.java:105)
 at LongHashJoinOperator$36.endInput1$(Unknown Source)
 at LongHashJoinOperator$36.endInput(Unknown Source)
 at 
org.apache.flink.streaming.runtime.tasks.OperatorChain.endInput(OperatorChain.java:256)
 at 
org.apache.flink.streaming.runtime.io.StreamTwoInputSelectableProcessor.checkFinished(StreamTwoInputSelectableProcessor.java:359)
 at 
org.apache.flink.streaming.runtime.io.StreamTwoInputSelectableProcessor.processInput(StreamTwoInputSelectableProcessor.java:193)
 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.performDefaultAction(StreamTask.java:276)
 at org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:298)
 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:403)
 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:687)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:517)
 at java.lang.Thread.run(Thread.java:748){code}
This is the plan:

 
{code:java}
== Abstract Syntax Tree ==
 LogicalSink(name=[sinkTable], fields=[f0, f1])
 +- LogicalProject(something=[CONCAT(CAST($0):VARCHAR(2147483647) CHARACTER SET 
"UTF-16LE", CAST($1):VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL)], 
EXPR$1=[_UTF-16LE'const'])
 +- LogicalAggregate(group=[
{0}
], agg#0=[COUNT()])
 +- LogicalProject(id=[$1])
 +- LogicalFilter(condition=[AND(LIKE($0, _UTF-16LE'a%'), =($1, 
CAST($2):BIGINT))])
 +- LogicalJoin(condition=[true], joinType=[inner])
 :- LogicalTableScan(table=[[default_catalog, default_database, CsvTable, 
source: [CsvTableSource(read fields: sometxt, id)]]])
 +- LogicalTableScan(table=[[default_catalog, default_database, table1, source: 
[GeneratorTableSource(key, rowtime, payload)]]])
== Optimized Logical Plan ==
 Sink(name=[sinkTable], fields=[f0, f1]): rowcount = 1498810.6659336376, 
cumulative cost =
{4.459964319978008E8 rows, 1.879799762133187E10 cpu, 4.8E9 io, 8.4E8 network, 
1.799524266373455E8 memory}
+- Calc(select=[CONCAT(CAST(id), CAST($f1)) AS something, _UTF-16LE'const' AS 
EXPR$1]): rowcount = 1498810.6659336376, cumulative cost =
{4.444976213318672E8 rows, 1.8796498810665936E10 cpu, 4.8E9 io, 8.4E8 network, 
1.799524266373455E8 memory}
+- HashAggregate(isMerge=[false], groupBy=[id], select=[id, COUNT(*) AS $f1]): 
rowcount = 1498810.6659336376, cumulative cost =
{4.429988106659336E8 rows, 1.8795E10 cpu, 4.8E9 io, 8.4E8 network, 
1.799524266373455E8 memory}
+- Calc(select=[id]): rowcount = 1.575E7, cumulative cost =
{4.415E8 rows, 1.848E10 cpu, 4.8E9 io, 8.4E8 network, 1.2E8 memory}
+- HashJoin(joinType=[InnerJoin], where=[=(id, key0)], select=[id, key0], 
build=[left]): rowcount = 1.575E7, cumulative cost =
{4.2575E8 rows, 1.848E10 cpu, 4.8E9 io, 8.4E8 network, 1.2E8 memory}
:- Exchange(distribution=[hash[id]]): rowcount = 500.0, cumulative cost =
{1.1E8 rows, 8.4E8 cpu, 2.0E9 io, 4.0E7 network, 0.0 memory}
: +- Calc(select=[id], where=[LIKE(sometxt, _UTF-16LE'a%')]): rowcount = 
500.0, cumulative cost =
{1.05E8 rows, 0.0 cpu, 2.0E9 io, 0.0 network, 0.0 memory}
: +- TableSourceScan(table=[[default_catalog, default_database, CsvTable, 
source: [CsvTableSource(read fields: sometxt, id)]]], fields=[sometxt, id]): 
rowcount = 1.0E8, cumulative cost =
{1.0E8 rows, 0.0 cpu, 2.0E9 io, 0.0 network, 0.0 memory}
+- Exchange(distribution=[hash[key0]]): rowcount = 1.0E8, cumulative cost =
{3.0E8 rows, 1.68E10 cpu, 2.8E9 io, 8.0E8 network, 0.0 memory}
+- Calc(select=[CAST(key) AS key0]): rowcount = 1.0E8, cumulative cost =
{2.0E8 rows, 0.0 cpu, 2.8E9 io, 0.0 network, 0.0 memory}
+- TableSourceScan(table=[[default_catalog, default_database, table1, source: 
[GeneratorTableSource(key, rowtime, payload)]]], fields=[key, rowtime, 
payload]): rowcount = 1.0E8, cumulative cost =
{1.0E8 rows, 0.0 cpu, 2.8E9 io, 0.0 network, 0.0 memory}
== Physical Execution Plan ==
 Stage 1 : Data Source
 content : collect elements with CollectionInputFormat
Stage 2 : Operator
 content : CsvTableSource(read fields: sometxt, id)
 ship_strategy : REBALANCE
Stage 3 : Operator
 content : 

[jira] [Commented] (FLINK-13700) PubSub connector example not included in flink-dist

2019-08-14 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907039#comment-16907039
 ] 

Robert Metzger commented on FLINK-13700:


[~Xeli] do you want to kick off a short discussion on the dev@ list on what to 
do with the PubSub example? 

> PubSub connector example not included in flink-dist
> ---
>
> Key: FLINK-13700
> URL: https://issues.apache.org/jira/browse/FLINK-13700
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Google Cloud PubSub
>Affects Versions: 1.9.0
>Reporter: Richard Deurwaarder
>Assignee: Richard Deurwaarder
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13700) PubSub connector example not included in flink-dist

2019-08-13 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905948#comment-16905948
 ] 

Robert Metzger commented on FLINK-13700:


According to this discussion: https://issues.apache.org/jira/browse/FLINK-13306 
we need to add a proper NOTICE file now.

> PubSub connector example not included in flink-dist
> ---
>
> Key: FLINK-13700
> URL: https://issues.apache.org/jira/browse/FLINK-13700
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Google Cloud PubSub
>Affects Versions: 1.9.0
>Reporter: Richard Deurwaarder
>Assignee: Richard Deurwaarder
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13306) flink-examples-streaming-gcp-pubsub is missing NOTICE

2019-08-13 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905947#comment-16905947
 ] 

Robert Metzger commented on FLINK-13306:


[~Zentol]: Why was this closed as invalid? There's now a ticket to include the 
example into flink-dist: FLINK-13700.

If we do FLINK-13700, we need to add the missing NOTICE file.

> flink-examples-streaming-gcp-pubsub is missing NOTICE
> -
>
> Key: FLINK-13306
> URL: https://issues.apache.org/jira/browse/FLINK-13306
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Google Cloud PubSub, Examples
>Affects Versions: 1.9.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Blocker
>
> The pubsub example is bundling various dependencies but is missing a NOTICE 
> file.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (FLINK-13700) PubSub connector example not included in flink-dist

2019-08-13 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-13700:
--

Assignee: Richard Deurwaarder

> PubSub connector example not included in flink-dist
> ---
>
> Key: FLINK-13700
> URL: https://issues.apache.org/jira/browse/FLINK-13700
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Google Cloud PubSub
>Affects Versions: 1.9.0
>Reporter: Richard Deurwaarder
>Assignee: Richard Deurwaarder
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Closed] (FLINK-13701) KafkaTableSourceSink doesn't support topic pattern

2019-08-13 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-13701.
--
Resolution: Duplicate

This is a duplicate of FLINK-13340. Closing.

> KafkaTableSourceSink doesn't support topic pattern
> --
>
> Key: FLINK-13701
> URL: https://issues.apache.org/jira/browse/FLINK-13701
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / API
>Affects Versions: 1.8.1, 1.9.0
>Reporter: Shengnan YU
>Priority: Major
>
> The current KafkaTableSourceSink supports only a single topic rather than 
> multiple topics or topic patterns, which is limiting in many cases.
> {code:java}
> .connect(
>   new Kafka()
>     .version("0.11")
>     .topic("...") // Here we can only set a single topic
> )
> {code}
> I looked at the source code of the 1.9-release branch and there is no 
> change. The new DDL feature is also related to the implementation of 
> connectors. I think it is necessary to add topicPattern support; for 
> reference, a sketch of the existing DataStream-level capability follows.
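> For reference, the DataStream-level consumer already accepts a topic regex; 
> a minimal runnable sketch (broker address and topic pattern are examples):
> {code:java}
> import java.util.Properties;
> import java.util.regex.Pattern;
> import org.apache.flink.api.common.serialization.SimpleStringSchema;
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;
> 
> Properties props = new Properties();
> props.setProperty("bootstrap.servers", "localhost:9092");
> 
> // Subscribes to every topic matching the pattern, which is what the
> // table descriptor could delegate to.
> FlinkKafkaConsumer011<String> consumer = new FlinkKafkaConsumer011<>(
>     Pattern.compile("test-flink-.*"),
>     new SimpleStringSchema(),
>     props);
> {code}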



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13701) KafkaTableSourceSink doesn't support topic pattern

2019-08-13 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13701:
---
Component/s: (was: Connectors / Hadoop Compatibility)
 Table SQL / API
 Connectors / Kafka

> KafkaTableSourceSink doesn't support topic pattern
> --
>
> Key: FLINK-13701
> URL: https://issues.apache.org/jira/browse/FLINK-13701
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / API
>Affects Versions: 1.8.1, 1.9.0
>Reporter: Shengnan YU
>Priority: Major
>
> The current KafkaTableSourceSink supports only a single topic rather than 
> multiple topics or topic patterns, which is limiting in many cases.
> {code:java}
> .connect(
>   new Kafka()
>     .version("0.11")
>     .topic("...") // Here we can only set a single topic
> )
> {code}
> I looked at the source code of the 1.9-release branch and there is no 
> change. The new DDL feature is also related to the implementation of 
> connectors. I think it is necessary to add topicPattern support.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13588) StreamTask.handleAsyncException throws away the exception cause

2019-08-12 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13588:
---
Component/s: Runtime / Task

> StreamTask.handleAsyncException throws away the exception cause
> ---
>
> Key: FLINK-13588
> URL: https://issues.apache.org/jira/browse/FLINK-13588
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.8.1
>Reporter: John Lonergan
>Priority: Major
>
> The code below throws the reason 'message' away, making it hard to 
> diagnose, for instance, why a split has failed.
>  
> {code:java}
> https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L909
> @Override
> public void handleAsyncException(String message, Throwable exception) {
>     if (isRunning) {
>         // only fail if the task is still running
>         getEnvironment().failExternally(exception);
>     }
> }
> {code}
>  
> We need to pass the message through so that it shows up in the logs, please 
> (a sketch of such a change follows).
>  
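> A minimal sketch of the requested change; the wrapping exception type is an 
> assumption, any type that carries both message and cause would do:
> {code:java}
> @Override
> public void handleAsyncException(String message, Throwable exception) {
>     if (isRunning) {
>         // only fail if the task is still running; keep the caller's
>         // message instead of dropping it
>         getEnvironment().failExternally(new AsynchronousException(message, exception));
>     }
> }
> {code}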



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (FLINK-13646) Add ARM CI job definition scripts

2019-08-12 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-13646:
--

Assignee: wangxiyuan

> Add ARM CI job definition scripts
> -
>
> Key: FLINK-13646
> URL: https://issues.apache.org/jira/browse/FLINK-13646
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Affects Versions: 2.0.0
>Reporter: wangxiyuan
>Assignee: wangxiyuan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> OpenLab CI github-app now is added to Flink repo. It's a good time to add the 
> related ARM job definition scripts now.
>  OpenLab uses zuul[1] as the CI infrastructure which uses ansible[2] for job 
> execution and definition.
> The ansible scripts for job definition are always contained in a file named 
> *.zuul.yaml*. Here is an example[3].
> So for the Flink ARM support work, I suggest dividing it into steps (we will 
> add the *flink-core*- and *flink-test*-related modules first; other modules 
> can be added later if people want them):
>  # Add the basic *build* script to ensure the CI system and build job work 
> as expected. The job should be marked as non-voting first, meaning a CI test 
> failure won't block a Flink PR from being merged.
>  # Add the *test* script to run unit/integration tests. At this step the 
> *--fn* parameter will be added to *mvn test*. It will run the full set of 
> test cases in Flink, so that we can find which tests fail on ARM.
>  # Fix the test failures one by one.
>  # Once all the tests pass, remove the *--fn* parameter and keep watching 
> the CI status for some days. If bugs arise then, fix them as we usually do 
> for travis-ci.
>  # Once the CI is stable enough, remove the non-voting tag, so that the ARM 
> CI will be the same as travis-ci, i.e. one of the gates for Flink PRs.
>  # Finally, the Flink community can announce and release an ARM version of 
> Flink.
> OpenLab will keep helping with and maintaining the ARM work. If you have any 
> questions or requirements, you are welcome to join the IRC channel: #askopenlab
> Any thought?
> Thanks.
> [1]: [https://zuul-ci.org/docs/zuul/]
>  [2]: [https://www.ansible.com/]
>  [3]: [https://github.com/theopenlab/flink/pull/1/files]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13450) Adjust tests to tolerate arithmetic differences between x86 and ARM

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897178#comment-16897178
 ] 

Robert Metzger commented on FLINK-13450:


Spark had issues with {{Math}}: 
[https://twitter.com/steveloughran/status/1156555133989396481]

> Adjust tests to tolerate arithmetic differences between x86 and ARM
> ---
>
> Key: FLINK-13450
> URL: https://issues.apache.org/jira/browse/FLINK-13450
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.9.0
>Reporter: Stephan Ewen
>Priority: Major
>
> Certain arithmetic operations have different precision/rounding on ARM versus 
> x86.
> Tests using floating point numbers should be changed to tolerate a certain 
> minimal deviation.
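> A minimal sketch of the kind of change meant here, using JUnit's 
> delta-based assertion:
> {code:java}
> // Instead of exact equality, which is brittle across FPU implementations,
> // tolerate a small absolute deviation (example values):
> double expectedValue = 0.3;
> double actualValue = 0.1 + 0.2;
> org.junit.Assert.assertEquals(expectedValue, actualValue, 1e-9);
> {code}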



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13199) ARM support for Flink

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897175#comment-16897175
 ] 

Robert Metzger commented on FLINK-13199:


Maybe we should rename this ticket to relate to build support and make it a 
sub-task of FLINK-13448?

> ARM support for Flink
> -
>
> Key: FLINK-13199
> URL: https://issues.apache.org/jira/browse/FLINK-13199
> Project: Flink
>  Issue Type: New Feature
>  Components: Build System
>Reporter: wangxiyuan
>Priority: Critical
>
> There is no official ARM release for Flink. But based on my local testing, 
> Flink, which is written in Java and Scala, builds and tests well on ARM. So 
> is it possible to support an ARM release officially? I think it may not be a 
> huge amount of work.
>  
> AFAIK, Flink now uses travis-ci, which supports only x86, as the CI gate. Is 
> it possible to add an ARM one? I'm from the OpenLab community[1]. Similar to 
> travis-ci, it is a free, open-source community that provides CI resources 
> and systems for open-source projects, with both ARM and x86 machines. It 
> already helps some communities build their CI, such as OpenStack and CNCF.
>  
> If the Flink community agrees to support ARM, I can spend my full time 
> helping, e.g. with job definitions, CI maintenance, test fixes and so on. If 
> Flink doesn't want to rely on OpenLab, we can donate ARM resources directly 
> as well.
>  
> I have already sent a discussion to the mailing list[2]. Feel free to reply 
> there or here.
>  
> Thanks.
>  
> [1]:[https://openlabtesting.org/]
> [2]:[http://mail-archives.apache.org/mod_mbox/flink-dev/201907.mbox/browser]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13291) Cannot provide AWS Direct Connect ENDPOINT

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13291:
---
Summary: Cannot provide AWS Direct Connect ENDPOINT  (was: Cannot Privide 
AWS Direct Connect ENDPOINT)

> Cannot provide AWS Direct Connect ENDPOINT
> --
>
> Key: FLINK-13291
> URL: https://issues.apache.org/jira/browse/FLINK-13291
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.8.0, 1.8.1
>Reporter: Shakir P Mukkath
>Priority: Minor
>
> The issue is in two places.
> First, *KinesisConfigUtil.validateConsumerConfiguration(Properties config)* 
> prevents providing both REGION and ENDPOINT in the properties. 
> Second, *AWSUtil.createKinesisClient(Properties configProps, 
> ClientConfiguration awsClientConfig)* passes REGION as null when ENDPOINT 
> is provided. 
> Neither case works when an AWS DIRECT CONNECT ENDPOINT is used. A sample 
> direct connect endpoint for the east region is _kinesis-ae1.hdw.r53.feap.pv_, 
> so it does not follow the convention of kinesis.us-east-1.amazonaws.com, 
> where the part after the first "-" is the region.
> Using the above endpoint results in the following error (see the SDK sketch 
> after the stack trace):
> _org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException:
>  Credential should be scoped to a valid region, not 'ae1'. (Service: 
> AmazonKinesis; Status Code: 400; Error Code: InvalidSignatureException; 
> Request ID: ee678308-0c88-ca77-bbc0-54322258672d)_
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1632)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1058)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:2388)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2364)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.executeListShards(AmazonKinesisClient.java:1337)
>  at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.listShards(AmazonKinesisClient.java:1312)
>  at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.listShards(KinesisProxy.java:442)
>  at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardsOfStream(KinesisProxy.java:392)
>  at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getShardList(KinesisProxy.java:282)
>  at org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher.discoverNewShardsToSubscribe(KinesisDataFetcher.java:550)
>  at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.run(FlinkKinesisConsumer.java:270)
>  at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
>  at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
>  at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
>  at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
>  at java.lang.Thread.run(Thread.java:748)
>  
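For illustration, this is roughly the configuration the reporter wants to be accepted (a sketch only; the keys are the usual AWSConfigConstants of flink-connector-kinesis, and this combination is currently rejected by the validation):

{code:java}
Properties consumerConfig = new Properties();
// scope the credentials to a real region ...
consumerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1");
// ... while talking to a Direct Connect endpoint whose hostname does not
// encode the region; providing both keys is what the validation forbids today
consumerConfig.put(AWSConfigConstants.AWS_ENDPOINT, "https://kinesis-ae1.hdw.r53.feap.pv");
{code}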



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13340) Add more Kafka topic option of flink-connector-kafka

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897155#comment-16897155
 ] 

Robert Metzger commented on FLINK-13340:


Thanks a lot for following the process! I hope a committer working on that part 
of Flink will soon take a look!

> Add more Kafka topic option of flink-connector-kafka
> 
>
> Key: FLINK-13340
> URL: https://issues.apache.org/jira/browse/FLINK-13340
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / API
>Affects Versions: 1.8.1
>Reporter: DuBin
>Priority: Major
>  Labels: features
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Currently, only the 'topic' option is implemented in the Kafka connector 
> descriptor, so we can only use it like:
> {code:java}
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> val tableEnv = StreamTableEnvironment.create(env)
> tableEnv
>   .connect(
> new Kafka()
>   .version("0.11")
>   .topic("test-flink-1")
>   .startFromEarliest()
>   .property("zookeeper.connect", "localhost:2181")
>   .property("bootstrap.servers", "localhost:9092"))
>   .withFormat(
> new Json()
>   .deriveSchema()
>   )
>   .withSchema(
> new Schema()
>   .field("name", Types.STRING)
>   .field("age", Types.STRING)
>   ){code}
> but we cannot consume multiple topics or a topic regex pattern.
> Here are my thoughts:
> {code:java}
>   .topic("test-flink-1")
>   //.topics("test-flink-1,test-flink-2") or topics(List<String> topics)
>   //.subscriptionPattern("test-flink-.*") or subscriptionPattern(Pattern pattern)
> {code}
> I have already implemented this in my local environment with the help of the 
> FlinkKafkaConsumer, and it works.
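For reference, the DataStream-level {{FlinkKafkaConsumer}} already accepts a topic list or a {{Pattern}}, so the descriptor options proposed above could delegate to it. A sketch (assuming the universal Kafka connector; {{kafkaProperties}} stands for the usual connection properties):

{code:java}
// consume a fixed list of topics
FlinkKafkaConsumer<String> listConsumer = new FlinkKafkaConsumer<>(
    Arrays.asList("test-flink-1", "test-flink-2"),
    new SimpleStringSchema(),
    kafkaProperties);

// consume every topic matching a regex pattern
FlinkKafkaConsumer<String> patternConsumer = new FlinkKafkaConsumer<>(
    Pattern.compile("test-flink-.*"),
    new SimpleStringSchema(),
    kafkaProperties);
{code}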



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13340) Add more Kafka topic option of flink-connector-kafka

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13340:
---
Component/s: Table SQL / API

> Add more Kafka topic option of flink-connector-kafka
> 
>
> Key: FLINK-13340
> URL: https://issues.apache.org/jira/browse/FLINK-13340
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / API
>Affects Versions: 1.8.1
>Reporter: DuBin
>Priority: Major
>  Labels: features
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Currently, only the 'topic' option is implemented in the Kafka connector 
> descriptor, so we can only use it like:
> {code:java}
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> val tableEnv = StreamTableEnvironment.create(env)
> tableEnv
>   .connect(
> new Kafka()
>   .version("0.11")
>   .topic("test-flink-1")
>   .startFromEarliest()
>   .property("zookeeper.connect", "localhost:2181")
>   .property("bootstrap.servers", "localhost:9092"))
>   .withFormat(
> new Json()
>   .deriveSchema()
>   )
>   .withSchema(
> new Schema()
>   .field("name", Types.STRING)
>   .field("age", Types.STRING)
>   ){code}
> but we cannot consume multiple topics or a topic regex pattern.
> Here are my thoughts:
> {code:java}
>   .topic("test-flink-1")
>   //.topics("test-flink-1,test-flink-2") or topics(List<String> topics)
>   //.subscriptionPattern("test-flink-.*") or subscriptionPattern(Pattern pattern)
> {code}
> I have already implemented this in my local environment with the help of the 
> FlinkKafkaConsumer, and it works.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13406) MetricConfig.getInteger() always returns null

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897138#comment-16897138
 ] 

Robert Metzger commented on FLINK-13406:


Does this mean we can close this ticket?

> MetricConfig.getInteger() always returns null
> -
>
> Key: FLINK-13406
> URL: https://issues.apache.org/jira/browse/FLINK-13406
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.6.3
>Reporter: Ori Popowski
>Priority: Major
>
> {{MetricConfig}}'s {{getInteger}} will always return the default value.
> The reason is that it delegates to Java's {{Properties.getProperty}}, which 
> returns null if the type of the value is not {{String}}.
> h3. Reproduce
>  # Create a class {{MyReporter}} implementing {{MetricReporter}}
>  # Implement the {{open()}} method so that it calls {{config.getInteger("foo", 
> null)}}
>  # Start an {{ExecutionEnvironment}} and give it the following 
> Configuration object:
> {code:java}
> configuration.setString("metrics.reporters", "my");
> configuration.setClass("metrics.reporter.my.class", MyReporter.class);
> configuration.setInteger("metrics.reporter.my.foo", 42);{code}
>  # In {{open()}}, the value of {{foo}} is null.
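The root cause is plain {{java.util.Properties}} behavior and can be reproduced without Flink (a minimal sketch):

{code:java}
import java.util.Properties;

public class PropertiesNullDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hashtable#put stores the raw Integer object
        props.put("foo", 42);
        // getProperty only returns values that are Strings, so a non-String
        // value yields null and the caller falls back to the default
        System.out.println(props.getProperty("foo"));      // prints: null
        System.out.println(props.getProperty("foo", "0")); // prints: 0
    }
}
{code}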



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13459) Violation of the order queue messages from RabbitMQ.

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897129#comment-16897129
 ] 

Robert Metzger commented on FLINK-13459:


Are you writing out the data immediately after the source operator, or is there 
anything in between?

This is how writing the data immediately would look:
{code:java}
DataStream<String> stream = env.addSource(new 
RMQSource<String>(connectionConfig, queueName, true, new SimpleStringSchema()))
.setParallelism(1);
stream.print();{code}
 

> Violation of the order queue messages from RabbitMQ.
> 
>
> Key: FLINK-13459
> URL: https://issues.apache.org/jira/browse/FLINK-13459
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Connectors / RabbitMQ
>Affects Versions: 1.8.1
>Reporter: Dmitry Kharlashko
>Priority: Critical
>
> When receiving an accumulated message queue from RabbitMQ, the message order 
> is disturbed. Messages arrive from RabbitMQ in the correct order, but in the 
> stream they are mixed. The stream is created as described in the 
> documentation:
> DataStream<String> stream = env.addSource(new 
> RMQSource<String>(connectionConfig, queueName, true, new 
> SimpleStringSchema()))
>  .setParallelism(1);
> Example:
> The RabbitMQ message queue is: \{message1,message2,message3,message4...}.
> The order in the Flink stream is: \{message1,message3,message4,message2...}.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13469) Resource used by StateMapSnapshot can not be released if snapshot fails

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897126#comment-16897126
 ] 

Robert Metzger commented on FLINK-13469:


After talking to [~carp84] offline, I assigned the ticket to you.

> Resource used by StateMapSnapshot can not be released if snapshot fails
> ---
>
> Key: FLINK-13469
> URL: https://issues.apache.org/jira/browse/FLINK-13469
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.9.0
>Reporter: PengFei Li
>Assignee: PengFei Li
>Priority: Major
>
> Currently, the resources used by a {{StateMapSnapshot}} are released in 
> {{AbstractStateTableSnapshot#writeStateInKeyGroup}} as follows:
> {code:java}
> public void writeStateInKeyGroup(@Nonnull DataOutputView dov, int keyGroupId) 
> throws IOException {
> StateMapSnapshot<K, N, S, ? extends StateMap<K, N, S>> stateMapSnapshot = 
> getStateMapSnapshotForKeyGroup(keyGroupId);
> stateMapSnapshot.writeState(localKeySerializer, localNamespaceSerializer, 
> localStateSerializer, dov, stateSnapshotTransformer);
> stateMapSnapshot.release();
> }
> {code}
> If an exception happens in {{StateMapSnapshot#writeState}}, the resources used 
> by this and the remaining {{StateMapSnapshot}}s cannot be released.
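A straightforward shape for the fix (a sketch only, not the actual patch) is to guarantee the release in a finally block:

{code:java}
public void writeStateInKeyGroup(@Nonnull DataOutputView dov, int keyGroupId) throws IOException {
    StateMapSnapshot<K, N, S, ? extends StateMap<K, N, S>> stateMapSnapshot =
        getStateMapSnapshotForKeyGroup(keyGroupId);
    try {
        stateMapSnapshot.writeState(localKeySerializer, localNamespaceSerializer,
            localStateSerializer, dov, stateSnapshotTransformer);
    } finally {
        // release even when writeState throws, so the snapshot's resources are not leaked
        stateMapSnapshot.release();
    }
}
{code}

Releasing the snapshots of the remaining key groups on failure would additionally need handling at the call site that iterates over the key groups.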



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (FLINK-13469) Resource used by StateMapSnapshot can not be released if snapshot fails

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reassigned FLINK-13469:
--

Assignee: PengFei Li

> Resource used by StateMapSnapshot can not be released if snapshot fails
> ---
>
> Key: FLINK-13469
> URL: https://issues.apache.org/jira/browse/FLINK-13469
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.9.0
>Reporter: PengFei Li
>Assignee: PengFei Li
>Priority: Major
>
> Currently, the resources used by a {{StateMapSnapshot}} are released in 
> {{AbstractStateTableSnapshot#writeStateInKeyGroup}} as follows:
> {code:java}
> public void writeStateInKeyGroup(@Nonnull DataOutputView dov, int keyGroupId) 
> throws IOException {
> StateMapSnapshot<K, N, S, ? extends StateMap<K, N, S>> stateMapSnapshot = 
> getStateMapSnapshotForKeyGroup(keyGroupId);
> stateMapSnapshot.writeState(localKeySerializer, localNamespaceSerializer, 
> localStateSerializer, dov, stateSnapshotTransformer);
> stateMapSnapshot.release();
> }
> {code}
> If an exception happens in {{StateMapSnapshot#writeState}}, the resources used 
> by this and the remaining {{StateMapSnapshot}}s cannot be released.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13459) Violation of the order queue messages from RabbitMQ.

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13459:
---
Component/s: Connectors / RabbitMQ

> Violation of the order queue messages from RabbitMQ.
> 
>
> Key: FLINK-13459
> URL: https://issues.apache.org/jira/browse/FLINK-13459
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Connectors / RabbitMQ
>Affects Versions: 1.8.1
>Reporter: Dmitry Kharlashko
>Priority: Critical
>
> When receiving an accumulated message queue from RabbitMQ, the message order 
> is disturbed. Messages arrive from RabbitMQ in the correct order, but in the 
> stream they are mixed. The stream is created as described in the 
> documentation:
> DataStream<String> stream = env.addSource(new 
> RMQSource<String>(connectionConfig, queueName, true, new 
> SimpleStringSchema()))
>  .setParallelism(1);
> Example:
> The RabbitMQ message queue is: \{message1,message2,message3,message4...}.
> The order in the Flink stream is: \{message1,message3,message4,message2...}.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13472) taskmanager.jvm-exit-on-oom doesn't work reliably with YARN

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13472:
---
Component/s: (was: core)
 Deployment / YARN

> taskmanager.jvm-exit-on-oom doesn't work reliably with YARN
> ---
>
> Key: FLINK-13472
> URL: https://issues.apache.org/jira/browse/FLINK-13472
> Project: Flink
>  Issue Type: Bug
>  Components: Deployment / YARN
>Affects Versions: 1.6.3
>Reporter: Pawel Bartoszek
>Priority: Major
>
> I have added the *taskmanager.jvm-exit-on-oom* flag to the task manager start 
> arguments. During my testing (simulating OOM), I noticed that sometimes YARN 
> containers were still in RUNNING state even though they should have been 
> killed on OutOfMemory errors with the flag on.
> I could find RUNNING containers whose last log lines looked like this:
> {code:java}
> 2019-07-26 13:32:51,396 ERROR org.apache.flink.runtime.taskmanager.Task   
>   - Encountered fatal error java.lang.OutOfMemoryError - 
> terminating the JVM
> java.lang.OutOfMemoryError: Metaspace
>   at java.lang.ClassLoader.defineClass1(Native Method)
>   at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
>   at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:369){code}
>  
> Does YARN make it tricky to forcefully kill the JVM after an OutOfMemory 
> error?
>  
> *Workaround*
>  
> When using the -XX:+ExitOnOutOfMemoryError JVM flag, containers always get 
> terminated!
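For reference, the workaround can be applied cluster-wide via flink-conf.yaml (a sketch, assuming the standard {{env.java.opts.taskmanager}} option):

{code:java}
# flink-conf.yaml: append the flag to the TaskManager JVM options
env.java.opts.taskmanager: -XX:+ExitOnOutOfMemoryError
{code}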



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13477) Containerized TaskManager killed because of lack of memory overhead

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13477:
---
Component/s: Deployment / YARN

> Containerized TaskManager killed because of lack of memory overhead
> ---
>
> Key: FLINK-13477
> URL: https://issues.apache.org/jira/browse/FLINK-13477
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Mesos, Deployment / YARN
>Affects Versions: 1.9.0
>Reporter: Benoit Hanotte
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the `-XX:MaxDirectMemorySize` parameter is set as:
> `MaxDirectMemorySize = containerMemoryMB - heapSizeMB`
> (see 
> [https://github.com/apache/flink/blob/7fec4392b21b07c69ba15ea554731886f181609e/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ContaineredTaskManagerParameters.java#L162])
> However as explained at
>  https://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html,
> `MaxDirectMemorySize` only sets the maximum amount of memory that can be
> used for direct buffers, thus the amount of off-heap memory used can be
> greater than that value, leading to the container being killed by Mesos
> or Yarn as it exceeds the allocated memory.
> In addition, users might want to allocate off-heap memory through native
> code, in which case they will want to keep some of the container memory
> free and unallocated by Flink.
> To solve this issue, we currently set the following parameter:
> {code:java}
> -Dcontainerized.taskmanager.env.FLINK_ENV_JAVA_OPTS='-XX:MaxDirectMemorySize=600m'
> {code}
> which overrides the value that Flink picks (744M in this case) with a lower 
> one to keep some overhead memory in the TaskManager containers. However, this 
> is an "ugly" hack, as it goes around the clever memory allocation that Flink 
> performs and bypasses the sanity checks done in 
> `ContaineredTaskManagerParameters`.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13482) How can I cleanly shutdown streaming jobs in local mode?

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897093#comment-16897093
 ] 

Robert Metzger commented on FLINK-13482:


I'm not aware of a mechanism for that yet.
There are some plans to provide an API for controlling a running Flink job. 
This API would probably work independently of the deployment mode.

If you are using a custom source, you could implement a mechanism that shuts 
the sources down. Once the {{SourceFunction.run()}} method completes, Flink 
will shut down the topology.
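A minimal sketch of such a source (the class name is hypothetical; the volatile-flag pattern is the usual way to make {{run()}} return):

{code:java}
public class StoppableSource implements SourceFunction<String> {

    private volatile boolean isRunning = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (isRunning) {
            // hold the checkpoint lock while emitting, as required for exactly-once
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect("element");
            }
            Thread.sleep(100);
        }
        // run() returning causes Flink to shut down the topology
    }

    @Override
    public void cancel() {
        isRunning = false;
    }
}
{code}

In local mode, the application itself could flip {{isRunning}} (e.g. from a shutdown hook) once the job should terminate; {{env.execute()}} then returns.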

> How can I cleanly shutdown streaming jobs in local mode?
> 
>
> Key: FLINK-13482
> URL: https://issues.apache.org/jira/browse/FLINK-13482
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Donghui Xu
>Priority: Minor
>
> Currently, streaming jobs can be stopped using the "cancel" and "stop" 
> commands only in cluster mode, not in local mode.
> When users need to explicitly terminate a job, it is necessary to provide a 
> termination mechanism for local-mode streaming jobs.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13482) How can I cleanly shutdown streaming jobs in local mode?

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13482:
---
Component/s: Runtime / Coordination

> How can I cleanly shutdown streaming jobs in local mode?
> 
>
> Key: FLINK-13482
> URL: https://issues.apache.org/jira/browse/FLINK-13482
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Reporter: Donghui Xu
>Priority: Minor
>
> Currently, streaming jobs can be stopped using the "cancel" and "stop" 
> commands only in cluster mode, not in local mode.
> When users need to explicitly terminate a job, it is necessary to provide a 
> termination mechanism for local-mode streaming jobs.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13500) RestClusterClient requires S3 access when HA is configured

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13500:
---
Component/s: Runtime / Coordination

> RestClusterClient requires S3 access when HA is configured
> --
>
> Key: FLINK-13500
> URL: https://issues.apache.org/jira/browse/FLINK-13500
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Runtime / REST
>Affects Versions: 1.8.1
>Reporter: David Judd
>Priority: Major
>
> RestClusterClient initialization calls ClusterClient initialization, which 
> calls
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices
> In turn, createHighAvailabilityServices calls 
> BlobUtils.createBlobStoreFromConfig, which in our case tries to talk to S3.
> It seems very surprising to me that (a) RestClusterClient needs any form of 
> access other than to the REST API, and (b) that client initialization would 
> attempt a write as a side effect. I do not see either of these surprising 
> facts described in the documentation. Are they intentional?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13506) flink1.6.1 can't consume the specified topic list,only a few topic was consumed

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897003#comment-16897003
 ] 

Robert Metzger commented on FLINK-13506:


Hi, I would like to ask you to refer to the user@ mailing list for such support 
questions: https://flink.apache.org/community.html#mailing-lists

But since you are already here: are you sure that splitting the string by "," 
works with the spaces?
Does it work when you set your config file to:
{code:java}
kafka.consumer.topic = user,order,sales
{code}
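The suspicion is easy to verify in isolation (a minimal sketch):

{code:java}
String[] topics = "user, order, sales".split(",");
// yields ["user", " order", " sales"] - the leading spaces make the
// consumer subscribe to topic names that do not exist in Kafka
for (String topic : topics) {
    System.out.println("[" + topic + "] vs trimmed [" + topic.trim() + "]");
}
{code}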


> flink1.6.1 can't consume the specified topic list,only a few topic was 
> consumed 
> 
>
> Key: FLINK-13506
> URL: https://issues.apache.org/jira/browse/FLINK-13506
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: jiangxiaozhi
>Priority: Major
>
> I specified a topic list in my config file, and the Flink program reads 
> messages from these topics.
> Here is my config file:
>  
> {code:java}
> kafka.consumer.topic = user,order,sales
> {code}
>  
> and the Flink program:
> {code:java}
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.enableCheckpointing(5000);
> FlinkKafkaConsumer010<String> kafkaConsumer = new 
> FlinkKafkaConsumer010<>(Arrays.asList(kafka_consumer_topic.split(",")), new 
> SimpleStringSchema(), getKafkaProperties());
> DataStream<String> dataStream = env.addSource(kafkaConsumer);
> {code}
> When I run the Flink program, it can only consume a few of the topics; the 
> others are not consumed. Can anyone help me? The Flink version is 1.6.1.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13506) flink1.6.1 can't consume the specified topic list,only a few topic was consumed

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13506:
---
Component/s: Connectors / Kafka

> flink1.6.1 can't consume the specified topic list,only a few topic was 
> consumed 
> 
>
> Key: FLINK-13506
> URL: https://issues.apache.org/jira/browse/FLINK-13506
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Reporter: jiangxiaozhi
>Priority: Major
>
> I specified a topic list in my config file, and the Flink program reads 
> messages from these topics.
> Here is my config file:
>  
> {code:java}
> kafka.consumer.topic = user,order,sales
> {code}
>  
> and the Flink program:
> {code:java}
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> env.enableCheckpointing(5000);
> FlinkKafkaConsumer010<String> kafkaConsumer = new 
> FlinkKafkaConsumer010<>(Arrays.asList(kafka_consumer_topic.split(",")), new 
> SimpleStringSchema(), getKafkaProperties());
> DataStream<String> dataStream = env.addSource(kafkaConsumer);
> {code}
> When I run the Flink program, it can only consume a few of the topics; the 
> others are not consumed. Can anyone help me? The Flink version is 1.6.1.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Closed] (FLINK-13507) kafka consumer not able to distinguish topic name after add source

2019-07-31 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-13507.
--
Resolution: Duplicate

> kafka consumer not able to distinguish topic name after add source
> --
>
> Key: FLINK-13507
> URL: https://issues.apache.org/jira/browse/FLINK-13507
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Connectors / Kafka
>Reporter: quanduoling
>Priority: Major
>
> Right now, I cannot find a way to distinguish the topic name after adding a 
> Kafka source, while Spark Streaming is able to do this. Can Flink add this 
> feature to make Kafka consuming more powerful and flexible?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (FLINK-13507) kafka consumer not able to distinguish topic name after add source

2019-07-31 Thread Robert Metzger (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896999#comment-16896999
 ] 

Robert Metzger commented on FLINK-13507:


Thanks a lot for opening this ticket. 
This has been solved as part of FLINK-8354. Through the {{ConsumerRecord}} you 
can access the timestamp, topic, offset, etc. of each record in the 
deserialization schema.

I'm closing this ticket.
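For illustration, a deserialization schema that keeps the topic name next to the payload could look roughly like this (a sketch, assuming the {{KafkaDeserializationSchema}} interface introduced by FLINK-8354):

{code:java}
public class TopicAwareSchema implements KafkaDeserializationSchema<Tuple2<String, String>> {

    @Override
    public Tuple2<String, String> deserialize(ConsumerRecord<byte[], byte[]> record) {
        // record.topic() tells us which topic this message came from
        return Tuple2.of(record.topic(), new String(record.value(), StandardCharsets.UTF_8));
    }

    @Override
    public boolean isEndOfStream(Tuple2<String, String> nextElement) {
        return false;
    }

    @Override
    public TypeInformation<Tuple2<String, String>> getProducedType() {
        return TypeInformation.of(new TypeHint<Tuple2<String, String>>() {});
    }
}
{code}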

> kafka consumer not able to distinguish topic name after add source
> --
>
> Key: FLINK-13507
> URL: https://issues.apache.org/jira/browse/FLINK-13507
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, Connectors / Kafka
>Reporter: quanduoling
>Priority: Major
>
> Right now, I cannot find a way to distinguish the topic name after adding a 
> Kafka source, while Spark Streaming is able to do this. Can Flink add this 
> feature to make Kafka consuming more powerful and flexible?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13428) StreamingFileSink allow part file name to be configurable

2019-07-30 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13428:
---
Component/s: Connectors / FileSystem

> StreamingFileSink allow part file name to be configurable
> -
>
> Key: FLINK-13428
> URL: https://issues.apache.org/jira/browse/FLINK-13428
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Joao Boto
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Allow the part file name to be configurable:
>  * a partPrefix and a partSuffix can be passed
>  
> The part prefix allows giving the file a better name.
> The part suffix (if used as an extension) allows systems like Athena or 
> Presto to automatically detect the file type and the compression, if applied.
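A hypothetical usage sketch of the proposed options (the builder method names are illustrative, not an existing API):

{code:java}
StreamingFileSink<String> sink = StreamingFileSink
    .forRowFormat(new Path(outputPath), new SimpleStringEncoder<String>("UTF-8"))
    // hypothetical: a "data" prefix and a ".csv.gz" suffix instead of the
    // hard-coded "part-" naming, so Athena/Presto can infer type and compression
    .withPartPrefix("data")
    .withPartSuffix(".csv.gz")
    .build();
{code}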



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13450) Adjust tests to tolerate arithmetic differences between x86 and ARM

2019-07-30 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13450:
---
Component/s: Tests

> Adjust tests to tolerate arithmetic differences between x86 and ARM
> ---
>
> Key: FLINK-13450
> URL: https://issues.apache.org/jira/browse/FLINK-13450
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 1.9.0
>Reporter: Stephan Ewen
>Priority: Major
>
> Certain arithmetic operations have different precision/rounding on ARM versus 
> x86.
> Tests using floating point numbers should be changed to tolerate a certain 
> minimal deviation.
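The kind of change meant here, sketched with JUnit's delta-based assertion (the values are illustrative):

{code:java}
private static final double EPSILON = 1e-9;

@Test
public void resultToleratesPlatformRounding() {
    double actual = 0.1 + 0.2; // may differ in the last ulps across platforms
    // compare with an explicit delta instead of exact equality
    assertEquals(0.3, actual, EPSILON);
}
{code}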



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13333) Potentially NPE of preview plan functionality

2019-07-30 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13333:
---
Component/s: Command Line Client

> Potentially NPE of preview plan functionality
> -
>
> Key: FLINK-13333
> URL: https://issues.apache.org/jira/browse/FLINK-13333
> Project: Flink
>  Issue Type: Bug
>  Components: Command Line Client
>Affects Versions: 1.10.0
>Reporter: TisonKun
>Priority: Major
>
> {{PackagedProgram#getPreviewPlan}} contains code as below
> {code:java}
> if (isUsingProgramEntryPoint()) {
> previewPlan = Optimizer.createPreOptimizedPlan(getPlan());
> } else if (isUsingInteractiveMode()) {
> // ...
> getPlan().getJobId();
> // 
> }
> {code}
>  
> When the latter {{#getPlan}} is executed, it will eventually execute 
> {{program.getPlan(options)}}, where {{program}} is null.
> To solve this problem, we can replace {{getPlan}} with {{env.getPlan}}, where 
> {{env}} is an instance of {{PreviewPlanEnvironment}}.
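Sketched against the snippet above (hypothetical; the exact accessor on {{PreviewPlanEnvironment}} may differ):

{code:java}
if (isUsingProgramEntryPoint()) {
    previewPlan = Optimizer.createPreOptimizedPlan(getPlan());
} else if (isUsingInteractiveMode()) {
    // ...
    // use the plan captured by the PreviewPlanEnvironment instead of
    // getPlan(), which would dereference the null 'program' field
    previewPlan = Optimizer.createPreOptimizedPlan(env.getPlan());
    // ...
}
{code}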



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (FLINK-13377) Streaming SQL e2e test failed on travis

2019-07-30 Thread Robert Metzger (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-13377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-13377:
---
Component/s: Table SQL / Runtime

> Streaming SQL e2e test failed on travis
> ---
>
> Key: FLINK-13377
> URL: https://issues.apache.org/jira/browse/FLINK-13377
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.9.0
>Reporter: Zhenghua Gao
>Assignee: Zhenghua Gao
>Priority: Blocker
> Fix For: 1.9.0
>
> Attachments: 198.jpg, 495.jpg
>
>
> This is an instance: [https://api.travis-ci.org/v3/job/562011491/log.txt]
> ==============================================================================
> Running 'Streaming SQL end-to-end test'
> ==============================================================================
> TEST_DATA_DIR: /home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-47211990314
> Flink dist directory: /home/travis/build/apache/flink/flink-dist/target/flink-1.9-SNAPSHOT-bin/flink-1.9-SNAPSHOT
> Starting cluster.
> Starting standalonesession daemon on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Starting taskexecutor daemon on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Waiting for dispatcher REST endpoint to come up...
> Waiting for dispatcher REST endpoint to come up...
> Waiting for dispatcher REST endpoint to come up...
> Waiting for dispatcher REST endpoint to come up...
> Waiting for dispatcher REST endpoint to come up...
> Waiting for dispatcher REST endpoint to come up...
> Dispatcher REST endpoint is up.
> [INFO] 1 instance(s) of taskexecutor are already running on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Starting taskexecutor daemon on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> [INFO] 2 instance(s) of taskexecutor are already running on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Starting taskexecutor daemon on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> [INFO] 3 instance(s) of taskexecutor are already running on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Starting taskexecutor daemon on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Starting execution of program
> Program execution finished
> Job with JobID 7c7b66dd4e8dc17e229700b1c746aba6 has finished.
> Job Runtime: 77371 ms
> cat: '/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-47211990314/out/result/20/.part-*': No such file or directory
> cat: '/home/travis/build/apache/flink/flink-end-to-end-tests/test-scripts/temp-test-directory-47211990314/out/result/20/part-*': No such file or directory
> FAIL StreamSQL: Output hash mismatch. Got d41d8cd98f00b204e9800998ecf8427e, expected b29f14ed221a936211202ff65b51ee26.
> head hexdump of actual:
> Stopping taskexecutor daemon (pid: 9983) on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Stopping standalonesession daemon (pid: 8088) on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Skipping taskexecutor daemon (pid: 21571), because it is not running anymore on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Skipping taskexecutor daemon (pid: 22154), because it is not running anymore on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Skipping taskexecutor daemon (pid: 22595), because it is not running anymore on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Skipping taskexecutor daemon (pid: 30622), because it is not running anymore on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Skipping taskexecutor daemon (pid: 3850), because it is not running anymore on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Skipping taskexecutor daemon (pid: 4405), because it is not running anymore on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Skipping taskexecutor daemon (pid: 4839), because it is not running anymore on travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Stopping taskexecutor daemon (pid: 8531) on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Stopping taskexecutor daemon (pid: 9077) on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> Stopping taskexecutor daemon (pid: 9518) on host travis-job-3ac15c01-1a7d-48b2-b4a9-86575f5d4641.
> [FAIL] Test script contains errors. Checking of logs skipped.
> [FAIL] 'Streaming SQL end-to-end test' failed after 1 minutes and 51 seconds!
> Test exited with exit code 1



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


  1   2   3   4   5   6   7   8   9   10   >