[jira] [Created] (FLINK-20340) Use StreamFormat instead of DelimitedInputFormat in DeserializationSchemaAdapter
Jingsong Lee created FLINK-20340: Summary: Use StreamFormat instead of DelimitedInputFormat in DeserializationSchemaAdapter Key: FLINK-20340 URL: https://issues.apache.org/jira/browse/FLINK-20340 Project: Flink Issue Type: Improvement Components: Connectors / FileSystem Affects Versions: 1.12.0 Reporter: Jingsong Lee -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20339) `FileWriter` support to load StreamingFileSink's state.
Guowei Ma created FLINK-20339: - Summary: `FileWriter` support to load StreamingFileSink's state. Key: FLINK-20339 URL: https://issues.apache.org/jira/browse/FLINK-20339 Project: Flink Issue Type: Sub-task Components: API / DataStream Affects Versions: 1.12.0 Reporter: Guowei Ma
[jira] [Created] (FLINK-20338) Make the SinkWriter load previous sink's state.
Guowei Ma created FLINK-20338: - Summary: Make the SinkWriter load previous sink's state. Key: FLINK-20338 URL: https://issues.apache.org/jira/browse/FLINK-20338 Project: Flink Issue Type: Sub-task Components: API / DataStream Affects Versions: 1.12.0 Reporter: Guowei Ma
[jira] [Created] (FLINK-20337) Make migrate `StreamingFileSink` to `FileSink` possible
Guowei Ma created FLINK-20337: - Summary: Make migrate `StreamingFileSink` to `FileSink` possible Key: FLINK-20337 URL: https://issues.apache.org/jira/browse/FLINK-20337 Project: Flink Issue Type: Improvement Components: API / DataStream Affects Versions: 1.12.0 Reporter: Guowei Ma Flink 1.12 introduces the `FileSink` in FLINK-19510, which can guarantee exactly-once semantics in both the streaming and batch execution modes. We need to figure out how users who currently use the `StreamingFileSink` can migrate to the `FileSink`. This PR provides a way to make that migration possible.
[jira] [Created] (FLINK-20336) RequestReplyFunction should not silently ignore UNRECOGNIZED state value mutations types
Tzu-Li (Gordon) Tai created FLINK-20336: --- Summary: RequestReplyFunction should not silently ignore UNRECOGNIZED state value mutations types Key: FLINK-20336 URL: https://issues.apache.org/jira/browse/FLINK-20336 Project: Flink Issue Type: Bug Components: Stateful Functions Affects Versions: statefun-2.2.1, statefun-2.1.0 Reporter: Tzu-Li (Gordon) Tai Assignee: Tzu-Li (Gordon) Tai Fix For: statefun-2.3.0, statefun-2.2.2 If a function's response has a {{PersistedValueMutation}} type that is {{UNRECOGNIZED}}, we currently just silently ignore that mutation: https://github.com/apache/flink-statefun/blob/master/statefun-flink/statefun-flink-core/src/main/java/org/apache/flink/statefun/flink/core/reqreply/PersistedRemoteFunctionValues.java#L84 This is incorrect. The {{UNRECOGNIZED}} enum constant is pre-defined by the Protobuf Java SDK to represent a value that could not be deserialized (because the serialized value does not match any constant defined in the Protobuf message). Therefore, it should be handled by throwing an exception, preferably one indicating a version mismatch between the function's Protobuf message definitions and StateFun's Protobuf message definitions (i.e. most likely a mismatch in the invocation protocol versions).
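The fix described above amounts to adding an explicit {{UNRECOGNIZED}} branch that fails loudly. A minimal sketch of the idea, using a stand-in enum rather than the actual Protobuf-generated {{PersistedValueMutation.MutationType}} (class, method, and message names here are illustrative, not StateFun's real API):

```java
public class MutationHandling {

    // Protobuf-generated Java enums carry an extra UNRECOGNIZED constant for
    // wire values that do not match any known enum number; this stand-in enum
    // mimics that shape.
    enum MutationType { MODIFY, DELETE, UNRECOGNIZED }

    static String apply(MutationType type) {
        switch (type) {
            case MODIFY:
                return "applied";
            case DELETE:
                return "deleted";
            case UNRECOGNIZED:
            default:
                // Instead of silently ignoring the mutation, fail loudly: an
                // unrecognized constant most likely means a protocol version
                // mismatch between the function SDK and StateFun.
                throw new IllegalStateException(
                        "Unrecognized state mutation type; the remote function is "
                                + "likely using an incompatible invocation protocol version.");
        }
    }

    public static void main(String[] args) {
        System.out.println(apply(MutationType.MODIFY)); // applied
    }
}
```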
[jira] [Created] (FLINK-20335) Remove support for eager state specifications in module YAML definitions
Tzu-Li (Gordon) Tai created FLINK-20335: --- Summary: Remove support for eager state specifications in module YAML definitions Key: FLINK-20335 URL: https://issues.apache.org/jira/browse/FLINK-20335 Project: Flink Issue Type: Sub-task Components: Stateful Functions Reporter: Tzu-Li (Gordon) Tai Fix For: statefun-2.3.0 With FLINK-20265, we now support declaring state in StateFun functions, and those declarations can change dynamically without any system downtime. It would be confusing for users if we continued to support the legacy way of statically declaring state specifications in the module YAML definitions. Therefore, we propose to completely remove that support by:
* No longer supporting module YAML format versions <= 2.0.
* Removing the {{PersistedRemoteFunctionValues}} constructor that accepts a list of eager state specifications.

This would be a breaking change:
* Users upgrading to version 2.3.0 have to rewrite their module YAMLs to conform to format version 3.0.
* They also have to correspondingly update their functions to use SDKs of version 2.3.0.
[jira] [Created] (FLINK-20334) Introduce function endpoint path templating in module YAML specifications
Tzu-Li (Gordon) Tai created FLINK-20334: --- Summary: Introduce function endpoint path templating in module YAML specifications Key: FLINK-20334 URL: https://issues.apache.org/jira/browse/FLINK-20334 Project: Flink Issue Type: Sub-task Reporter: Tzu-Li (Gordon) Tai Fix For: statefun-2.3.0 In the current module specifications, function endpoints are defined like so:
{code}
functions:
  - function:
      meta:
        kind: http
        type: com.foo/world
      spec:
        endpoint: http://localhost:5959/statefun
{code}
A list of functions and their corresponding service endpoints is defined statically in the module specification file, which is loaded once on system startup. The system may only route messages to functions that have been defined. This prevents users from adding new functions to the application without restarting the system to reload new module specifications. We propose that instead of specifying individual functions, users should specify a "family" of function endpoints, like so:
{code}
functionEndpoints:
  - functionEndpoint:
      meta:
        kind: http
      spec:
        target:
          typename:
            namespace: com.foo.bar
            function: "*" # (can be the wildcard * or a specific name)
        urlPathTemplate: "https://bar.foo.com:8000/{typename.function}"
        connectTimeout: 1min
        # ... (other connection-related configs that are shared for this endpoint family)
{code}
Note how users no longer define eager state per individual function. This is made possible by FLINK-20265, where state is now defined in the functions instead of in the module specifications.
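To make the templating idea concrete, here is a minimal sketch of how such an endpoint template could be expanded per function type. The class and method names are made up for illustration; only the `{typename.function}` placeholder comes from the proposal above.

```java
// Hypothetical illustration of urlPathTemplate expansion; not StateFun code.
public class UrlPathTemplate {

    // Substitute the concrete function name into the endpoint template.
    static String expand(String template, String functionName) {
        return template.replace("{typename.function}", functionName);
    }

    public static void main(String[] args) {
        // A message addressed to com.foo.bar/greeter would be routed to the
        // expanded URL, without "greeter" ever being listed in the module YAML.
        System.out.println(
                expand("https://bar.foo.com:8000/{typename.function}", "greeter"));
        // -> https://bar.foo.com:8000/greeter
    }
}
```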
[jira] [Created] (FLINK-20333) Flink standalone cluster throws metaspace OOM after submitting multiple PyFlink UDF jobs.
Wei Zhong created FLINK-20333: - Summary: Flink standalone cluster throws metaspace OOM after submitting multiple PyFlink UDF jobs. Key: FLINK-20333 URL: https://issues.apache.org/jira/browse/FLINK-20333 Project: Flink Issue Type: Bug Components: API / Python Reporter: Wei Zhong Currently the Flink standalone cluster throws a metaspace OOM after multiple PyFlink UDF jobs are submitted. The root cause is that there are many soft references and finalizers in memory, which prevent garbage collection of the classloaders of finished PyFlink jobs. Because of them, multiple full GCs are needed to reclaim the classloader of a completed job. If only one full GC is performed before the metaspace becomes insufficient, an OOM will occur.
[jira] [Created] (FLINK-20332) Add workers recovered from previous attempt to pending resources
Xintong Song created FLINK-20332: Summary: Add workers recovered from previous attempt to pending resources Key: FLINK-20332 URL: https://issues.apache.org/jira/browse/FLINK-20332 Project: Flink Issue Type: Improvement Components: Runtime / Coordination Reporter: Xintong Song Assignee: Xintong Song For active deployments (native K8s/Yarn/Mesos), after a JM failover, workers from the previous attempt should register with the new JM. Depending on the order in which slot requests and TM registrations arrive at the RM, it can happen that the RM allocates unnecessary new resources while there are recovered resources that could be reused. A potential improvement is to add recovered workers to the pending resources, so that the RM knows which resources are expected to become available soon and can decide whether to allocate new resources accordingly. See also the discussion in FLINK-20249.
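The proposed bookkeeping can be sketched roughly as follows. This is a toy model, not actual Flink ResourceManager code, and all names are invented for illustration: recovered workers count toward pending resources, so only the shortfall triggers new allocations.

```java
// Toy sketch of "recovered workers count as pending resources".
public class PendingResources {
    private int pendingRecovered;  // recovered workers expected to re-register
    private int pendingRequested;  // newly requested workers not yet started

    PendingResources(int recoveredWorkers) {
        this.pendingRecovered = recoveredWorkers;
    }

    /** Returns how many NEW workers must be requested for the given demand. */
    int request(int requiredWorkers) {
        // Demand already covered by recovered or previously requested workers.
        int covered = Math.min(requiredWorkers, pendingRecovered + pendingRequested);
        int toRequest = requiredWorkers - covered;
        pendingRequested += toRequest;
        return toRequest;
    }

    /** Called when a worker from the previous attempt actually registers. */
    void recoveredWorkerRegistered() {
        if (pendingRecovered > 0) {
            pendingRecovered--;
        }
    }

    public static void main(String[] args) {
        // 3 workers survive the JM failover; 5 slots are requested.
        PendingResources rm = new PendingResources(3);
        System.out.println(rm.request(5)); // only 2 new workers are allocated
    }
}
```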
[jira] [Created] (FLINK-20331) UnalignedCheckpointITCase.execute failed with "Sequence number for checkpoint 20 is not known (it was likely been overwritten by a newer checkpoint 21)"
Dian Fu created FLINK-20331: --- Summary: UnalignedCheckpointITCase.execute failed with "Sequence number for checkpoint 20 is not known (it was likely been overwritten by a newer checkpoint 21)" Key: FLINK-20331 URL: https://issues.apache.org/jira/browse/FLINK-20331 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.12.0 Reporter: Dian Fu Fix For: 1.12.0 https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10059=logs=119bbba7-f5e3-5e08-e72d-09f1529665de=7dc1f5a9-54e1-502e-8b02-c7df69073cfc
{code}
[ERROR] execute[parallel pipeline with mixed channels, p = 20](org.apache.flink.test.checkpointing.UnalignedCheckpointITCase) Time elapsed: 7.901 s <<< ERROR!
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
	at org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$2(MiniClusterJobClient.java:119)
	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
	at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:229)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
	at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:996)
	at akka.dispatch.OnComplete.internal(Future.scala:264)
	at akka.dispatch.OnComplete.internal(Future.scala:261)
	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
	at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
	at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
	at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
	at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
	at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
	at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
	at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
	at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
	at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
	at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
	...
{code}
[jira] [Created] (FLINK-20330) Flink connector has error in support hive external tables (hbase or es)
liu created FLINK-20330: --- Summary: Flink connector has error in support hive external tables (hbase or es) Key: FLINK-20330 URL: https://issues.apache.org/jira/browse/FLINK-20330 Project: Flink Issue Type: Bug Components: Connectors / Hive Affects Versions: 1.11.2, 1.11.1, 1.11.0 Environment: Test code like this:
{code}
CREATE EXTERNAL TABLE hive_to_es (
  key string,
  value string
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES(
  'es.resource' = 'hive_to_es/_doc',
  'es.index.auto.create' = 'TRUE',
  'es.nodes'='192.168.1.111:9200,192.168.1.112:9200,192.168.1.113:9200'
);
insert into hive_to_es (key, value) values ('name','tom');
insert into hive_to_es (key, value) values ('yes','aaa');
select * from hive_to_es;
{code}
!image-2020-11-25-09-51-00-100.png|width=807,height=134! Reporter: liu Attachments: image-2020-11-25-09-42-13-102.png, image-2020-11-25-09-51-00-100.png [ERROR] Could not execute SQL statement. Reason: org.apache.flink.connectors.hive.FlinkHiveException: Unable to instantiate the hadoop input format !image-2020-11-25-09-42-13-102.png|width=384,height=288!
We added a patch like this in flink-connector-hive_2.12-1.11.2.jar. In org/apache/flink/connectors/hive/HiveTableSink.java, line 134, add:
{code:java}
if (sd.getOutputFormat() == null
        && "org.apache.hadoop.hive.hbase.HBaseSerDe".equals(sd.getSerdeInfo().getSerializationLib())) {
    sd.setOutputFormat("org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat");
}
if (sd.getOutputFormat() == null
        && "org.elasticsearch.hadoop.hive.EsSerDe".equals(sd.getSerdeInfo().getSerializationLib())) {
    sd.setOutputFormat("org.elasticsearch.hadoop.hive.EsHiveOutputFormat");
}
{code}
In org/apache/flink/connectors/hive/read/HiveTableInputFormat.java, line 305, add:
{code:java}
if (sd.getInputFormat() == null
        && "org.apache.hadoop.hive.hbase.HBaseSerDe".equals(sd.getSerdeInfo().getSerializationLib())) {
    sd.setInputFormat("org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat");
    jobConf.set("hbase.table.name", partition.getTableProps().getProperty("hbase.table.name"));
    jobConf.set("hbase.columns.mapping", partition.getTableProps().getProperty("hbase.columns.mapping"));
}
if (sd.getInputFormat() == null
        && "org.elasticsearch.hadoop.hive.EsSerDe".equals(sd.getSerdeInfo().getSerializationLib())) {
    sd.setInputFormat("org.elasticsearch.hadoop.hive.EsHiveInputFormat");
    jobConf.set("location", sd.getLocation());
    for (Enumeration<?> en = partition.getTableProps().keys(); en.hasMoreElements();) {
        String key = en.nextElement().toString();
        if (key.startsWith("es.")) {
            jobConf.set(key, partition.getTableProps().getProperty(key));
        }
    }
}
{code}
[jira] [Created] (FLINK-20329) Elasticsearch7DynamicSinkITCase hangs
Dian Fu created FLINK-20329: --- Summary: Elasticsearch7DynamicSinkITCase hangs Key: FLINK-20329 URL: https://issues.apache.org/jira/browse/FLINK-20329 Project: Flink Issue Type: Bug Components: Connectors / ElasticSearch Affects Versions: 1.12.0 Reporter: Dian Fu Fix For: 1.12.0 https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10052=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=03dca39c-73e8-5aaf-601d-328ae5c35f20
{code}
[INFO] Running org.apache.flink.streaming.connectors.elasticsearch.table.Elasticsearch7DynamicSinkITCase
==
Process produced no output for 900 seconds.
==
==
The following Java processes are running (JPS)
==
Picked up JAVA_TOOL_OPTIONS: -XX:+HeapDumpOnOutOfMemoryError
16192 surefirebooter5057948964630155904.jar
18566 Jps
959 Launcher
==
Printing stack trace of Java process 16192
==
Picked up JAVA_TOOL_OPTIONS: -XX:+HeapDumpOnOutOfMemoryError
2020-11-24 16:19:26
Full thread dump OpenJDK 64-Bit Server VM (25.275-b01 mixed mode):

"Attach Listener" #32 daemon prio=9 os_prio=0 tid=0x7fc148001000 nid=0x48e7 waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

"testcontainers-ryuk" #31 daemon prio=5 os_prio=0 tid=0x7fc251232000 nid=0x3fb0 in Object.wait() [0x7fc1012c4000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at org.testcontainers.utility.ResourceReaper.lambda$null$1(ResourceReaper.java:142)
	- locked <0x8e2bd2d0> (a java.util.ArrayList)
	at org.testcontainers.utility.ResourceReaper$$Lambda$93/1981729428.run(Unknown Source)
	at org.rnorth.ducttape.ratelimits.RateLimiter.doWhenReady(RateLimiter.java:27)
	at org.testcontainers.utility.ResourceReaper.lambda$start$2(ResourceReaper.java:133)
	at org.testcontainers.utility.ResourceReaper$$Lambda$92/40191541.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:748)

"process reaper" #24 daemon prio=10 os_prio=0 tid=0x7fc0f803b800 nid=0x3f92 waiting on condition [0x7fc10296e000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for <0x8814bf30> (a java.util.concurrent.SynchronousQueue$TransferStack)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

"surefire-forkedjvm-ping-30s" #23 daemon prio=5 os_prio=0 tid=0x7fc25040c000 nid=0x3f8f waiting on condition [0x7fc1037c6000]
	...
{code}
[jira] [Created] (FLINK-20328) UnalignedCheckpointITCase.execute failed with "Insufficient number of network buffers: required 1, but only 0 available"
Dian Fu created FLINK-20328: --- Summary: UnalignedCheckpointITCase.execute failed with "Insufficient number of network buffers: required 1, but only 0 available" Key: FLINK-20328 URL: https://issues.apache.org/jira/browse/FLINK-20328 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.12.0 Reporter: Dian Fu Fix For: 1.12.0 [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10008=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=f508e270-48d6-5f1e-3138-42a17e0714f0]
{code}
[ERROR] execute[non-parallel pipeline with local channels](org.apache.flink.test.checkpointing.UnalignedCheckpointITCase) Time elapsed: 4.338 s <<< ERROR!
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
	at org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$2(MiniClusterJobClient.java:119)
	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
	at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:229)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
	at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:996)
	at akka.dispatch.OnComplete.internal(Future.scala:264)
	at akka.dispatch.OnComplete.internal(Future.scala:261)
	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
	at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
	at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
	at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
	at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
	at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
	at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
	at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
	at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
	at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
	at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
{code}
[jira] [Created] (FLINK-20327) The Hive's read/write page should redirect to SQL Fileystem connector
Dawid Wysakowicz created FLINK-20327: Summary: The Hive's read/write page should redirect to SQL Fileystem connector Key: FLINK-20327 URL: https://issues.apache.org/jira/browse/FLINK-20327 Project: Flink Issue Type: Improvement Components: Connectors / Hive, Documentation Reporter: Dawid Wysakowicz Fix For: 1.12.0 Right now Hive's read/write page redirects to the SQL filesystem connector with the note ??Please see the StreamingFileSink for a full list of available configurations.??, but that page actually has no configuration options. We should instead link to https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/filesystem.html which covers the SQL-related configuration.
[jira] [Created] (FLINK-20326) Broken links to "How to define LookupableTableSource"
Dawid Wysakowicz created FLINK-20326: Summary: Broken links to "How to define LookupableTableSource" Key: FLINK-20326 URL: https://issues.apache.org/jira/browse/FLINK-20326 Project: Flink Issue Type: Bug Components: Documentation, Table SQL / Ecosystem Reporter: Dawid Wysakowicz Fix For: 1.12.0 I found at least two pages that have a link to a non-existing anchor:
* https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/joins.html#join-with-a-temporal-table
* https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/temporal_tables.html#defining-temporal-table
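A broken anchor means the target page was rendered without an element whose `id` matches the link fragment. A naive, purely illustrative sketch of the kind of check that catches this (the real tooling is the `check_links.sh` script in the Flink repo; the class and method names here are invented):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Regex-based anchor check: collect id="..." attributes from a rendered page
// and verify that a link's fragment actually exists among them.
public class AnchorCheck {

    static Set<String> anchors(String html) {
        Set<String> ids = new HashSet<>();
        Matcher m = Pattern.compile("id=\"([^\"]+)\"").matcher(html);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    static boolean hasAnchor(String html, String fragment) {
        return anchors(html).contains(fragment);
    }

    public static void main(String[] args) {
        String page = "<h2 id=\"join-with-a-temporal-table\">Join with a Temporal Table</h2>";
        System.out.println(hasAnchor(page, "join-with-a-temporal-table")); // true
        System.out.println(hasAnchor(page, "defining-temporal-table"));    // false
    }
}
```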
Re: [REMINDER] Build docs before merging
Yes. I have opened a pull request: https://github.com/apache/flink/pull/14204

Best,
Jark

On Tue, 24 Nov 2020 at 17:38, Robert Metzger wrote:
> Yes, that should be very easy to do. Just move the docs_404_check to the
> "ci" stage in "build-apache-repo.yml". Maybe it makes sense to extend the
> build_properties.sh check to test if the change includes any docs changes.
> If not, we could skip the check to save some resources.
>
> Do you want to open a PR for this?
>
> On Tue, Nov 24, 2020 at 10:03 AM Jark Wu wrote:
> > Is it possible to move it to PR CI?
> >
> > I have seen many documentation link issues these days.
> >
> > Best,
> > Jark
> >
> > On Tue, 24 Nov 2020 at 15:56, Robert Metzger wrote:
> > > Actually, the check_links.sh script is called in the nightly CI runs
> > > https://github.com/apache/flink/blob/master/tools/azure-pipelines/build-apache-repo.yml#L145
> > > --> https://github.com/apache/flink/blob/master/tools/ci/docs.sh#L30+
> > >
> > > It was just broken for a while, but fixed by Dian 2 weeks ago.
> > >
> > > On Wed, Nov 18, 2020 at 6:09 PM Till Rohrmann wrote:
> > > > I think automating the check is a good idea since everything which is not
> > > > automated will be forgotten at some point.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Wed, Nov 18, 2020 at 12:56 PM Jark Wu wrote:
> > > > > +1 for this. Would be better to run the `check_links.sh` for broken links.
> > > > > Btw, could we add the docs build and check into PR CI.
> > > > > I think it would be better to guarantee this in the process.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > On Wed, 18 Nov 2020 at 18:08, Till Rohrmann wrote:
> > > > > > Hi everyone,
> > > > > >
> > > > > > I noticed in the last couple of days that our docs were broken. The main
> > > > > > reason was invalid links using the {% link %} tag. The best way to avoid
> > > > > > this situation is to build the docs before merging them [1]. That way, we
> > > > > > can ensure that they are not broken.
> > > > > >
> > > > > > [1] https://github.com/apache/flink/tree/master/docs#build
> > > > > >
> > > > > > Cheers,
> > > > > > Till
[jira] [Created] (FLINK-20325) Move docs_404_check to CI stage
Jark Wu created FLINK-20325: --- Summary: Move docs_404_check to CI stage Key: FLINK-20325 URL: https://issues.apache.org/jira/browse/FLINK-20325 Project: Flink Issue Type: Test Components: Build System / Azure Pipelines, Build System / CI Reporter: Jark Wu Assignee: Jark Wu Fix For: 1.12.0
[jira] [Created] (FLINK-20324) Support customizing of containers for native kubernetes setup
Boris Lublinsky created FLINK-20324: --- Summary: Support customizing of containers for native kubernetes setup Key: FLINK-20324 URL: https://issues.apache.org/jira/browse/FLINK-20324 Project: Flink Issue Type: Improvement Components: Deployment / Kubernetes Affects Versions: 1.11.2 Environment: Kubernetes Reporter: Boris Lublinsky Fix For: 1.12.0, 1.11.2 A common requirement for Flink applications is the use of custom resources (environment variables, PVCs, Secrets, ConfigMaps, etc.). For example, NFS-based checkpointing requires mounting NFS volumes, access to databases might require environment variables and secrets, etc. An implementation of such support is provided in this pull request: https://github.com/apache/flink/pull/14005
[jira] [Created] (FLINK-20323) CorrelateSortToRankRule cannot deal with multiple groupings
Timo Walther created FLINK-20323: Summary: CorrelateSortToRankRule cannot deal with multiple groupings Key: FLINK-20323 URL: https://issues.apache.org/jira/browse/FLINK-20323 Project: Flink Issue Type: Improvement Components: Table SQL / Planner Reporter: Timo Walther Fix the following test case in {{CorrelateSortToRankRuleTest}}:
{code}
@Test
// TODO: this is a valid case to support
def testMultipleGroupingsNotSupported(): Unit = {
  val query =
    s"""
       |SELECT f0, f2
       |FROM
       |  (SELECT DISTINCT f0, f1 FROM t1) t2,
       |  LATERAL (
       |    SELECT f2
       |    FROM t1
       |    WHERE f0 = t2.f0 AND f1 = t2.f1
       |    ORDER BY f2 DESC
       |    LIMIT 3
       |  )
     """.stripMargin
  util.verifyPlan(query)
}
{code}
Currently, we only support a single equality condition, not {{f0 = t2.f0 AND f1 = t2.f1}}.
[jira] [Created] (FLINK-20322) Rename table.optimizer.union-all-as-breakpoint-disabled to table.optimizer.union-all-as-breakpoint.enabled
godfrey he created FLINK-20322: -- Summary: Rename table.optimizer.union-all-as-breakpoint-disabled to table.optimizer.union-all-as-breakpoint.enabled Key: FLINK-20322 URL: https://issues.apache.org/jira/browse/FLINK-20322 Project: Flink Issue Type: Improvement Components: Table SQL / API Affects Versions: 1.12.0 Reporter: godfrey he Fix For: 1.12.0 {{table.optimizer.union-all-as-breakpoint-disabled}} is defined in {{RelNodeBlockPlanBuilder}} and is an internal, experimental config. However, the {{disabled}} naming with {{false}} as the default value is quite obscure. I suggest changing {{table.optimizer.union-all-as-breakpoint-disabled}} to {{table.optimizer.union-all-as-breakpoint-enabled}} with {{true}} as the default value, which is easier to understand. As this config is internal and experimental, we do not mark the old name as deprecated.
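The rename flips the polarity of the flag, which is easy to get wrong. A small sketch of the flipped semantics (a hypothetical helper, not Flink code; since the issue explicitly plans no deprecation, the old-key fallback branch is purely illustrative):

```java
import java.util.Map;

// Illustrative sketch of reading the renamed option with flipped polarity:
// old key "…-disabled" defaulted to false (i.e. breakpoint behavior ON),
// new key "…-enabled" defaults to true, preserving the same behavior.
public class BreakpointConfig {
    static final String OLD_KEY = "table.optimizer.union-all-as-breakpoint-disabled";
    static final String NEW_KEY = "table.optimizer.union-all-as-breakpoint-enabled";

    /** Returns whether UNION ALL is treated as a breakpoint. */
    static boolean unionAllAsBreakpointEnabled(Map<String, String> conf) {
        if (conf.containsKey(NEW_KEY)) {
            return Boolean.parseBoolean(conf.get(NEW_KEY));
        }
        if (conf.containsKey(OLD_KEY)) {
            // The old key had inverted semantics: disabled=false meant enabled.
            return !Boolean.parseBoolean(conf.get(OLD_KEY));
        }
        return true; // new default: enabled
    }

    public static void main(String[] args) {
        System.out.println(unionAllAsBreakpointEnabled(Map.of())); // true
    }
}
```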
Re: [DISCUSS] Move license check utility to a new repository to share it with flink-statefun
Thanks a lot for your response, and sorry for my late reply.

> how do you intend downstream CIs in flink / flink-statefun to be using
> this? Would there be released artifacts from `flink-project-utils` to
> expose each tool (e.g. the `LicenseChecker`)?

My idea was that flink and flink-statefun would just clone the flink-project-utils repository, build the tool locally and run it, without releasing the artifacts anywhere. The build takes just a few seconds, and the maintenance overhead is a little lower with this approach.

I also considered building a maven plugin, but I guess forking the maven-shade-plugin is the only feasible option: adding a custom maven-shade-transformer is not feasible, since we won't have access to the required data. All this sounded too complicated, so I discarded the idea.

Thanks a lot for the list of copied utilities. I guess if we are adding more sister projects to Flink, or splitting the main repo in the future, we need to look into coming up with a properly written set of shared release utilities. I have the feeling that currently the effort for creating a separate repository is not justified for just checking one module in flink-statefun. Let's revisit this in the future.

On Fri, Nov 6, 2020 at 4:28 AM Tzu-Li (Gordon) Tai wrote:
> Hi Robert,
>
> I think this could be useful in flink-statefun.
>
> StateFun currently has two modules that bundle dependencies, most
> importantly the `flink-statefun-distribution` module, which currently
> bundles some Flink dependencies as well as Flink connectors (Kafka,
> Kinesis).
> Upgrading the Flink version in StateFun typically involves a bulk update
> of the NOTICE of that module, so some automatic validation in CI could
> help with that.
> The other module that bundles dependencies is the StateFun examples,
> which we've been thinking about stopping to release Maven artifacts for.
>
> On Thu, Nov 5, 2020 at 9:54 PM Robert Metzger wrote:
> > 1. Is this relevant for flink-statefun?
> So, really there is only one module that would benefit from this tool
> (which could possibly be enough already for sharing to make sense).
> To justify the efforts for sharing this nice utility, I'd like to have a
> better idea of: how do you intend downstream CIs in flink / flink-statefun
> to be using this? Would there be released artifacts from
> `flink-project-utils` to expose each tool (e.g. the `LicenseChecker`)?
> It almost looks as if it would be easiest to reuse this tool if it was
> provided as a Maven plugin, though I'm not sure how possible that is and
> probably out-of-scope for this discussion.
>
> > 2. For the repository name, what do you think about "flink-project-utils"?
> > I'd like to use a generic name so that we can potentially share other
> > internal utilities.
>
> I like the idea of sharing internal utilities in general across the two
> repos.
>
> To gauge how useful this repo would be in the end, here's some info on what
> StateFun has copied already to flink-statefun:
>
> - About to copy checking for dead links in docs [1]
> - Several release-related scripts for creating source bundles, deploying
>   staging jars, updating branch version, etc. [2]. However, these have
>   somewhat evolved in StateFun to tailor it for flink-statefun, so I guess
>   it would not make sense to share release-related tooling.
> - Utility around building the documentation (currently flink and
>   flink-statefun share the same Jekyll setup).
>
> Best,
> Gordon
>
> [1] https://github.com/apache/flink-statefun/pull/171
> [2] https://github.com/apache/flink-statefun/tree/master/tools/releasing
[jira] [Created] (FLINK-20321) Get NPE when using AvroDeserializationSchema to deserialize null input
Shengkai Fang created FLINK-20321: - Summary: Get NPE when using AvroDeserializationSchema to deserialize null input Key: FLINK-20321 URL: https://issues.apache.org/jira/browse/FLINK-20321 Project: Flink Issue Type: Bug Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Affects Versions: 1.12.0 Reporter: Shengkai Fang You can reproduce the bug by adding the following code to {{AvroDeserializationSchemaTest}}:

{code:java}
@Test
public void testSpecificRecord2() throws Exception {
    DeserializationSchema deserializer =
        AvroDeserializationSchema.forSpecific(Address.class);
    Address deserializedAddress = deserializer.deserialize(null);
    assertEquals(null, deserializedAddress);
}
{code}

Exception stack:

{code:java}
java.lang.NullPointerException
    at org.apache.flink.formats.avro.utils.MutableByteArrayInputStream.setBuffer(MutableByteArrayInputStream.java:43)
    at org.apache.flink.formats.avro.AvroDeserializationSchema.deserialize(AvroDeserializationSchema.java:131)
    at org.apache.flink.formats.avro.AvroDeserializationSchemaTest.testSpecificRecord2(AvroDeserializationSchemaTest.java:69)
{code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
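The stack trace above shows the null input reaching MutableByteArrayInputStream.setBuffer() unchecked. A minimal stand-alone sketch of the kind of null guard that would avoid the NPE (the class and decoding step below are illustrative stand-ins, not Flink's actual AvroDeserializationSchema):

```java
// Hypothetical simplified deserializer: return null for null input instead
// of forwarding it to the byte-buffer handling code, which dereferences it.
import java.nio.charset.StandardCharsets;

public class NullSafeDeserializer {
    /** Returns null for null input; otherwise decodes the bytes. */
    public String deserialize(byte[] message) {
        if (message == null) {
            return null; // guard added before touching the byte buffer
        }
        // Stand-in for the real Avro decoding step.
        return new String(message, StandardCharsets.UTF_8);
    }
}
```

With this guard, deserialize(null) simply yields null, matching the assertion in the test case above.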
[jira] [Created] (FLINK-20320) Support starting SQL Client with an initialization SQL file
Jark Wu created FLINK-20320: --- Summary: Support starting SQL Client with an initialization SQL file Key: FLINK-20320 URL: https://issues.apache.org/jira/browse/FLINK-20320 Project: Flink Issue Type: New Feature Components: Table SQL / Client Reporter: Jark Wu As discussed in FLINK-20260, it would be helpful to predefine some meta information when starting the SQL CLI. This could replace the YAML file that currently defines such meta information. Hive supports starting with an initialization SQL file ("hive -i init.sql"): https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli I think we can support that as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-20319) Improving the visitor pattern for operations
Ingo Bürk created FLINK-20319: - Summary: Improving the visitor pattern for operations Key: FLINK-20319 URL: https://issues.apache.org/jira/browse/FLINK-20319 Project: Flink Issue Type: Improvement Affects Versions: 1.11.2 Reporter: Ingo Bürk The *OperationVisitor interfaces (which are not public API) don't always implement the visitor pattern correctly, and some features that would be useful are missing. Some things I discovered:

# CatalogSinkModifyOperation doesn't accept() its child. It's likely that other operations have this problem as well, but I haven't checked further.
# The base Operation interface doesn't have an accept() method at all. This is potentially intentional, since this interface actually is public API?
# There's a catch-all QueryOperationVisitor#visit(QueryOperation other) that would be nice to split up into its subtypes (PlannerQueryOperation, …)

-- This message was sent by Atlassian Jira (v8.3.4#803005)
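A minimal sketch of point 1 above: in a correctly implemented visitor pattern, a composite operation's accept() forwards the visitor to its child. The class names below (Operation, ScanOp, SinkOp) are illustrative only, not Flink's actual operation classes.

```java
// Hypothetical minimal operation hierarchy with a properly forwarding accept().
interface OperationVisitor { void visit(Operation op); }

interface Operation { void accept(OperationVisitor v); }

class ScanOp implements Operation {
    @Override
    public void accept(OperationVisitor v) { v.visit(this); }
}

class SinkOp implements Operation {
    private final Operation child;
    SinkOp(Operation child) { this.child = child; }

    @Override
    public void accept(OperationVisitor v) {
        v.visit(this);
        // The forwarding step the issue reports as missing in
        // CatalogSinkModifyOperation: without it the visitor never
        // sees the sink's input subtree.
        child.accept(v);
    }
}
```

Visiting {{new SinkOp(new ScanOp())}} then reaches both nodes; dropping the child.accept(v) call would silently hide the scan from every visitor.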
[jira] [Created] (FLINK-20318) Fix cast issue for the properties() method in the Kafka ConnectorDescriptor
hehuiyuan created FLINK-20318: - Summary: Fix cast issue for the properties() method in the Kafka ConnectorDescriptor Key: FLINK-20318 URL: https://issues.apache.org/jira/browse/FLINK-20318 Project: Flink Issue Type: Bug Components: Connectors / Kafka Reporter: hehuiyuan This Jira fixes the Kafka connector. There is a cast problem when using the properties() method:

{code:java}
Properties props = new Properties();
props.put("enable.auto.commit", "false");
props.put("fetch.max.wait.ms", "3000");
props.put("flink.poll-timeout", 5000);
props.put("flink.partition-discovery.interval-millis", false);

kafka = new Kafka()
    .version("0.11")
    .topic(topic)
    .properties(props);
{code}

{code:java}
Exception in thread "main" java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.String
Exception in thread "main" java.lang.ClassCastException: java.lang.Boolean cannot be cast to java.lang.String
{code}

Proposed change: replace the {{(String) v}} cast with {{String.valueOf(v)}} in *Kafka.java*.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
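The root cause can be reproduced without Flink: {{java.util.Properties}} stores Object values, so put() accepts an Integer or Boolean, and a blind {{(String)}} cast on retrieval then throws ClassCastException. A hedged sketch of the broken and fixed conversions (the helper class is hypothetical, not Flink code):

```java
// Hypothetical illustration of the reported cast problem and the fix.
import java.util.Properties;

public class PropsConversion {
    // Mirrors the buggy pattern: fails for non-String values.
    static String unsafe(Properties props, String key) {
        return (String) props.get(key);
    }

    // The proposed fix: String.valueOf() handles any value type
    // (Integer, Boolean, String, ...).
    static String safe(Properties props, String key) {
        return String.valueOf(props.get(key));
    }
}
```

With {{props.put("flink.poll-timeout", 5000)}}, the unsafe variant throws ClassCastException while the safe variant returns "5000".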
Re: [REMINDER] Build docs before merging
Yes, that should be very easy to do. Just move the docs_404_check to the "ci" stage in "build-apache-repo.yml".

Maybe it makes sense to extend the build_properties.sh check to test if the change includes any docs changes. If not, we could skip the check to save some resources.

Do you want to open a PR for this?

On Tue, Nov 24, 2020 at 10:03 AM Jark Wu wrote:
> Is it possible to move it to PR CI?
>
> I have seen many documentation link issues these days.
>
> Best,
> Jark
>
> On Tue, 24 Nov 2020 at 15:56, Robert Metzger wrote:
> > Actually, the check_links.sh script is called in the nightly CI runs:
> > https://github.com/apache/flink/blob/master/tools/azure-pipelines/build-apache-repo.yml#L145
> > --> https://github.com/apache/flink/blob/master/tools/ci/docs.sh#L30+
> >
> > It was just broken for a while, but fixed by Dian 2 weeks ago.
> >
> > On Wed, Nov 18, 2020 at 6:09 PM Till Rohrmann wrote:
> > > I think automating the check is a good idea since everything which is
> > > not automated will be forgotten at some point.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Nov 18, 2020 at 12:56 PM Jark Wu wrote:
> > > > +1 for this. Would be better to run the `check_links.sh` for broken
> > > > links. Btw, could we add the docs build and check into PR CI?
> > > > I think it would be better to guarantee this in the process.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > On Wed, 18 Nov 2020 at 18:08, Till Rohrmann wrote:
> > > > > Hi everyone,
> > > > >
> > > > > I noticed in the last couple of days that our docs were broken. The
> > > > > main reason was invalid links using the {% link %} tag. The best way
> > > > > to avoid this situation is to build the docs before merging them [1].
> > > > > That way, we can ensure that they are not broken.
> > > > >
> > > > > [1] https://github.com/apache/flink/tree/master/docs#build
> > > > >
> > > > > Cheers,
> > > > > Till
[jira] [Created] (FLINK-20317) Update Format Overview page to mention the supported connector for upsert-kafka
Jark Wu created FLINK-20317: --- Summary: Update Format Overview page to mention the supported connector for upsert-kafka Key: FLINK-20317 URL: https://issues.apache.org/jira/browse/FLINK-20317 Project: Flink Issue Type: Task Components: Connectors / Kafka, Documentation, Table SQL / Ecosystem Reporter: Jark Wu Assignee: Shengkai Fang Fix For: 1.12.0 Currently, the Format Overview page only mentions and links "Apache Kafka". We should update the table for "upsert-kafka" connector. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [REMINDER] Build docs before merging
Is it possible to move it to PR CI?

I have seen many documentation link issues these days.

Best,
Jark

On Tue, 24 Nov 2020 at 15:56, Robert Metzger wrote:
> Actually, the check_links.sh script is called in the nightly CI runs:
> https://github.com/apache/flink/blob/master/tools/azure-pipelines/build-apache-repo.yml#L145
> --> https://github.com/apache/flink/blob/master/tools/ci/docs.sh#L30+
>
> It was just broken for a while, but fixed by Dian 2 weeks ago.
>
> On Wed, Nov 18, 2020 at 6:09 PM Till Rohrmann wrote:
> > I think automating the check is a good idea since everything which is not
> > automated will be forgotten at some point.
> >
> > Cheers,
> > Till
> >
> > On Wed, Nov 18, 2020 at 12:56 PM Jark Wu wrote:
> > > +1 for this. Would be better to run the `check_links.sh` for broken
> > > links. Btw, could we add the docs build and check into PR CI?
> > > I think it would be better to guarantee this in the process.
> > >
> > > Best,
> > > Jark
> > >
> > > On Wed, 18 Nov 2020 at 18:08, Till Rohrmann wrote:
> > > > Hi everyone,
> > > >
> > > > I noticed in the last couple of days that our docs were broken. The
> > > > main reason was invalid links using the {% link %} tag. The best way
> > > > to avoid this situation is to build the docs before merging them [1].
> > > > That way, we can ensure that they are not broken.
> > > >
> > > > [1] https://github.com/apache/flink/tree/master/docs#build
> > > >
> > > > Cheers,
> > > > Till