[jira] [Created] (FLINK-30002) Change the alignmentTimeout to alignedCheckpointTimeout
Rui Fan created FLINK-30002: --- Summary: Change the alignmentTimeout to alignedCheckpointTimeout Key: FLINK-30002 URL: https://issues.apache.org/jira/browse/FLINK-30002 Project: Flink Issue Type: Improvement Components: Runtime / Checkpointing Affects Versions: 1.16.0 Reporter: Rui Fan Fix For: 1.17.0 The alignmentTimeout was renamed to alignedCheckpointTimeout in FLINK-23041, but some fields and methods still use alignmentTimeout. They should be renamed to alignedCheckpointTimeout. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30001) sql-client.sh start failed
xiaohang.li created FLINK-30001: --- Summary: sql-client.sh start failed Key: FLINK-30001 URL: https://issues.apache.org/jira/browse/FLINK-30001 Project: Flink Issue Type: Bug Components: Command Line Client Affects Versions: 1.15.2, 1.16.0 Reporter: xiaohang.li

[hadoop@master flink-1.15.0]$ ./bin/sql-client.sh
Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR or HADOOP_CLASSPATH was set.
Setting HBASE_CONF_DIR=/etc/hbase/conf because no HBASE_CONF_DIR was set.
Exception in thread "main" org.apache.flink.table.client.SqlClientException: Unexpected exception. This is a bug. Please consider filing an issue.
	at org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:201)
	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:161)
Caused by: org.apache.flink.table.api.TableException: Could not instantiate the executor. Make sure a planner module is on the classpath
	at org.apache.flink.table.client.gateway.context.ExecutionContext.lookupExecutor(ExecutionContext.java:163)
	at org.apache.flink.table.client.gateway.context.ExecutionContext.createTableEnvironment(ExecutionContext.java:111)
	at org.apache.flink.table.client.gateway.context.ExecutionContext.<init>(ExecutionContext.java:66)
	at org.apache.flink.table.client.gateway.context.SessionContext.create(SessionContext.java:247)
	at org.apache.flink.table.client.gateway.local.LocalContextUtils.buildSessionContext(LocalContextUtils.java:87)
	at org.apache.flink.table.client.gateway.local.LocalExecutor.openSession(LocalExecutor.java:87)
	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:88)
	at org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:187)
	... 1 more
Caused by: org.apache.flink.table.api.TableException: Unexpected error when trying to load service provider for factories.
	at org.apache.flink.table.factories.FactoryUtil.lambda$discoverFactories$19(FactoryUtil.java:813)
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	at org.apache.flink.table.factories.FactoryUtil.discoverFactories(FactoryUtil.java:799)
	at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:517)
	at org.apache.flink.table.client.gateway.context.ExecutionContext.lookupExecutor(ExecutionContext.java:154)
	... 8 more
Caused by: java.util.ServiceConfigurationError: org.apache.flink.table.factories.Factory: Provider org.apache.flink.table.planner.loader.DelegateExecutorFactory could not be instantiated
	at java.util.ServiceLoader.fail(ServiceLoader.java:232)
	at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
	at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
	at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
	at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
	at org.apache.flink.table.factories.ServiceLoaderUtil.load(ServiceLoaderUtil.java:42)
	at org.apache.flink.table.factories.FactoryUtil.discoverFactories(FactoryUtil.java:798)
	... 10 more
Caused by: java.lang.ExceptionInInitializerError
	at org.apache.flink.table.planner.loader.PlannerModule.getInstance(PlannerModule.java:135)
	at org.apache.flink.table.planner.loader.DelegateExecutorFactory.<init>(DelegateExecutorFactory.java:34)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.lang.Class.newInstance(Class.java:442)
	at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
	... 14 more
Caused by: org.apache.flink.table.api.TableException: Could not initialize the table planner components loader.
	at org.apache.flink.table.planner.loader.PlannerModule.<init>(PlannerModule.java:123)
	at org.apache.flink.table.planner.loader.PlannerModule.<init>(PlannerModule.java:52)
	at org.apache.flink.table.planner.loader.PlannerModule$PlannerComponentsHolder.<clinit>(PlannerModule.java:131)
	... 22 more
Caused by: java.nio.file.FileAlreadyExistsException: /tmp
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
	at java.nio.file.Files.createDirectory(Files.java:674)
	at
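The root cause sits at the bottom of the trace: `Files.createDirectory` throws when the target path already exists, which is how the planner loader fails here with `FileAlreadyExistsException: /tmp`. A minimal stdlib-only reproduction of that failure mode (not Flink code); whether pointing `java.io.tmpdir` or Flink's temp directory setting at a fresh writable location fixes it depends on the environment, so treat the workaround as something to try:

```java
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CreateDirDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for the extraction directory the planner loader tries to create.
        Path existing = Files.createTempDirectory("flink-planner-demo");
        try {
            // Files.createDirectory throws if the target already exists --
            // the same failure as "FileAlreadyExistsException: /tmp" above.
            Files.createDirectory(existing);
        } catch (FileAlreadyExistsException e) {
            System.out.println("FileAlreadyExistsException: " + e.getFile());
        }
    }
}
```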
Re: [DISCUSS] Issue tracking workflow
Hi everyone, Unfortunately ASF Infra has already implemented the change and new Jira users can't sign up. I think there is consensus that we shouldn't move from Jira now. My proposal would be to set up a separate mailing list to which users can send their request for an account, which gets sent to the PMC so they can create accounts for them. I don't see any other short-term solution. If agreed, let's open up a vote thread on this. Thanks, Martijn On Thu, Nov 3, 2022 at 04:51, Xintong Song wrote: > Thanks all for the valuable feedback, opinions and suggestions. > > # Option 1. > I know this is the first choice for pretty much everyone. Many people from > the Flink community (including myself) have shared their opinion with > Infra. However, based on the feedback so far, TBH I don't think things > would turn out the way we want. I don't see what else we can do. Does > anyone have more suggestions on this option? Or we probably have to > scratch it out of the list. > > # Option 4. > Seems there are also quite some concerns on using solely GH issues: limited > features (thus the significant changes to the current issue/release > management processes), migration cost, source of truth, etc. I think I'm > also convinced that this may not be a good choice. > > # Option 2 & 3. > Between the two options, I'm leaning towards option 2. > - IMO, making it as easy as possible for users to report issues should be a > top priority. Having to wait for a human response for creating an account > does not meet that requirement. That makes a strong objection to option 3 > from my side. > - Using GH issues for consumer-facing issues and reflecting the valid ones > back to Jira (either manually by committers or by bot) sounds good to me. > The status (open/closed) and labels should make tracking the issues easier > compared to in mailing lists / slack, in terms of whether an issue has been > taken care of / reflected to Jira / closed as invalid. 
That does not mean > we should not reflect things from mailing lists / slack to Jira. Ideally, > we leverage every possible channel for collecting user issues / feedback, > while guiding / suggesting users to choose GH issues over the others. > - For new contributors, they still need to request an account from a PMC > member. They can even make that request on GH issues, if they do not mind > posting the email address publicly. > - I would not be worried very much about the privacy issue, if the Jira > account creation is restricted to contributors. Contributors are exposing > their email addresses publicly anyway, in dev@ mailing list and commit > history. I'm also not strongly against creating a dedicated mailing list > though. > > Best, > > Xintong > > > > On Wed, Nov 2, 2022 at 9:16 PM Chesnay Schepler > wrote: > > > Calcite just requested a separate mailing list for users to request a > > JIRA account. > > > > > > I think I'd try going a similar route. While I prefer the openness of > > github issues, they are really limited, and while some things can be > > replicated with labels (like fix versions / components), things like > > release notes can't. > > We'd also lose a central place for collecting issues, since we'd have to > > (?) scope issues per repo. > > > > I wouldn't want to import everything into GH issues (it's just a flawed > > approach in the long-term imo), but on the other hand I don't know if > > the auto linker even works if it has to link to either jira or a GH > issue. > > > > Given that we need to change workflows in any case, I think I'd prefer > > sticking to JIRA. > > For reported bugs I'd wager that in most cases we can file the tickets > > ourselves and communicate with users on slack/MLs to gather all the > > information. I'd argue that if we'd had been more pro-active with filing > > tickets for user issues (instead of relying on them to do it) we > > would've addressed several issues way sooner. 
> > > > Additionally, since either option would be a sort of experiment, then > > JIRA is a safer option. We have to change less and there aren't any > > long-term ramifications (like having to re-import GH tickets into JIRA). > > > > On 28/10/2022 16:49, Piotr Nowojski wrote: > > > Hi, > > > > > > I'm afraid of the migration cost to only github issues and lack of many > > > features that we are currently using. That would be very disruptive and > > > annoying. For me github issues are far worse compared to using the > Jira. > > > > > > I would strongly prefer Option 1. over the others. Option 4 I would > like > > > the least. I would be fine with Option 3, and Option 2 but assuming > that > > > Jira would stay the source of truth. > > > For option 2, maybe we could have a bot that would backport/copy user > > > created issues in github to Jira (and link them together)? Discussions > > > could still happen in the github, but we could track all of the issues > as > > > we are doing right now. Bot could also sync it the other way around > (like > > > marking
Re: [DISCUSS] Allow sharing (RocksDB) memory between slots
Hi John, Yun, Thank you for your feedback. @John > It seems like operators would either choose isolation for the cluster’s jobs > or they would want to share the memory between jobs. > I’m not sure I see the motivation to reserve only part of the memory for sharing > and allowing jobs to choose whether they will share or be isolated. I see two related questions here: 1) Whether to allow mixed workloads within the same cluster. I agree that most likely all the jobs will have the same "sharing" requirement. So we can drop "state.backend.memory.share-scope" from the proposal. 2) Whether to allow different memory consumers to use shared or exclusive memory. Currently, only RocksDB is proposed to use shared memory. For python, it's non-trivial because it is job-specific. So we have to partition managed memory into shared/exclusive and therefore can NOT replace "taskmanager.memory.managed.shared-fraction" with some boolean flag. I think your question was about (1), just wanted to clarify why the shared-fraction is needed. @Yun > I am just curious whether this could really bring benefits to our users with such complex configuration logic. I agree, and configuration complexity seems a common concern. I hope that removing "state.backend.memory.share-scope" (as proposed above) reduces the complexity. Please share any ideas of how to simplify it further. > Could you share some real experimental results? I did an experiment to verify that the approach is feasible, i.e. multiple jobs can share the same memory/block cache. But I guess that's not what you mean here? Do you have any experiments in mind? > BTW, as talked before, I am not sure whether different lifecycles of RocksDB state-backends > would affect the memory usage of block cache & write buffer manager in RocksDB. > Currently, all instances would start and destroy nearly simultaneously, > this would change after we introduce this feature with jobs running at different scheduler times. 
IIUC, the concern is that closing a RocksDB instance might close the BlockCache. I checked that manually and it seems to work as expected. Closing the cache on instance close would also contradict the sharing concept, as described in the documentation [1]. [1] https://github.com/facebook/rocksdb/wiki/Block-Cache Regards, Roman On Wed, Nov 9, 2022 at 3:50 AM Yanfei Lei wrote: > Hi Roman, > Thanks for the proposal, this allows State Backend to make better use of > memory. > > After reading the ticket, I'm curious about some points: > > 1. Is shared-memory only for the state backend? If both > "taskmanager.memory.managed.shared-fraction: >0" and > "state.backend.rocksdb.memory.managed: false" are set at the same time, > will the shared-memory be wasted? > 2. It's said that "Jobs 4 and 5 will use the same 750Mb of unmanaged memory > and will compete with each other" in the example, how is the memory size of > unmanaged part calculated? > 3. For fine-grained-resource-management, the control > of cpuCores, taskHeapMemory can still work, right? And I am a little > worried that too many memory-related configuration options are complicated > for users to understand. > > Regards, > Yanfei > > On Tue, Nov 8, 2022 at 11:22 PM, Roman Khachatryan wrote: > > > Hi everyone, > > > > I'd like to discuss sharing RocksDB memory across slots as proposed in > > FLINK-29928 [1]. > > > > Since 1.10 / FLINK-7289 [2], it is possible to: > > - share these objects among RocksDB instances of the same slot > > - bound the total memory usage by all RocksDB instances of a TM > > > > However, the memory is divided between the slots equally (unless using > > fine-grained resource control). This is sub-optimal if some slots contain > > more memory-intensive tasks than the others. > > Using fine-grained resource control is also often not an option because > the > > workload might not be known in advance. > > > > The proposal is to widen the scope of sharing memory to TM, so that it > can > > be shared across all RocksDB instances of that TM. 
That would reduce the > > overall memory consumption in exchange for resource isolation. > > > > Please see FLINK-29928 [1] for more details. > > > > Looking forward to feedback on that proposal. > > > > [1] > > https://issues.apache.org/jira/browse/FLINK-29928 > > [2] > > https://issues.apache.org/jira/browse/FLINK-7289 > > > > Regards, > > Roman > > >
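For context, a sketch of how the options discussed in this thread might look in flink-conf.yaml. The shared-fraction and share-scope names come from the FLINK-29928 proposal and are not released options, so treat them as assumptions; only the last line is an existing option:

```yaml
# Hypothetical configuration under the FLINK-29928 proposal (names not final):
taskmanager.memory.managed.size: 1gb
# Fraction of managed memory carved out as a TM-wide shared pool that all
# RocksDB instances on the task manager would draw from:
taskmanager.memory.managed.shared-fraction: 0.75
# Existing option: keep RocksDB under managed-memory accounting.
state.backend.rocksdb.memory.managed: true
```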
Re: SQL Gateway and SQL Client
Hi Shengkai, I think there is an additional case where a proxy is between the client and gateway. In that case, being able to pass headers would allow for additional options / features. I see several PRs from Yu Zelin. Is there a first one to review? Cheers, Jim On Thu, Nov 10, 2022 at 9:42 PM Shengkai Fang wrote: > Hi, Jim. > > > how to pass additional headers when sending REST requests > > Could you share which headers you want to send when using SQL Client? I > think there are two cases we need to consider. Please correct me if I am > wrong. > > # Case 1 > > If a user wants to connect to the SQL Gateway with a password, I think they > should type > ``` > ./sql-client.sh --user xxx --password xxx > ``` > in the terminal and the OpenSessionRequest should be enough. > > # Case 2 > > If a user wants to modify the execution config, they should type > ``` > Flink SQL> SET `` = ``; > ``` > in the terminal. The Client can send ExecuteStatementRequest to the > Gateway. > > > As you have FLIPs or PRs, feel free to let me, Jamie, and Alexey know. > > It would be nice if you can join us to finish the feature. I think the > modifications on the SQL Gateway side are ready to review. > > Best, > Shengkai > > > On Fri, Nov 11, 2022 at 05:19, Jim Hughes wrote: > > > Hi Yu Zelin, > > > > I have read through your draft and it looks good. I am new to Flink, so > I > > haven't learned about everything which needs to be done yet. > > > > One of the considerations that I'm interested in understanding is how to > > pass additional headers when sending REST requests. From looking at the > > code, it looks like a custom `OutboundChannelHandlerFactory` could be > > created to read additional configuration and set headers. Does that make > > sense? > > > > Thank you very much for sharing the proof-of-concept code and the > > document. As you have FLIPs or PRs, feel free to let me, Jamie, and > Alexey > > know. We'll be happy to review them. 
> > > > Cheers, > > > > Jim > > > > On Wed, Nov 9, 2022 at 11:43 PM yu zelin wrote: > > > > > Hi, all > > > Sorry for the late response. As Shengkai mentioned, currently I’m working > > with > > > him on the SQL Client, dedicated to implementing the Remote Mode of the SQL > > Client. I > > > have written a draft of the implementation plan and will write a FLIP about > > it > > > ASAP. If you are interested, please take a look at the draft, and it > > > would be nice if you gave me some feedback. > > > The doc is at: > > > > > > https://docs.google.com/document/d/14cS4VBSamMUnlM_PZuK6QKLfriUuQU51iqET5oiYy_c/edit?usp=sharing > > > > > > > On Nov 7, 2022 at 11:19, Shengkai Fang wrote: > > > > > > > > Hi, all. Sorry for the late reply. > > > > > > > > > Is the gateway mode planned to be supported for SQL Client in 1.17? > > > > > Do you have anything you can already share so we can start with > your > > > work or just play around with it. > > > > > > > > Yes. @yzl is working on it and he will list the implementation plan > > > later and share the progress. I think the change is not very large and > I > > > think it's not a big problem to finish this in the release-1.17. I will > > > join to develop this in mid-November. > > > > > > > > Best, > > > > Shengkai > > > > > > > > > > > > > > > > > > > > On Sat, Nov 5, 2022 at 00:48, Jamie Grier <jgr...@apache.org> > > > wrote: > > > >> Hi Shengkai, > > > >> > > > >> We're doing more and more Flink development at Confluent these days > > and > > > we're currently trying to bootstrap a prototype that relies on the SQL > > > Client and Gateway. We will be using the components in some of our > > > projects and would like to co-develop them with you and the rest of the > > > Flink community. > > > >> > > > >> As of right now it's a pretty big blocker for our upcoming milestone > > > that the SQL Client has not yet been modified to talk to the SQL > Gateway > > > and we want to help with this effort ASAP! 
We would be even willing to > > > take over the work if it's not yet started but I suspect it already is. > > > >> > > > >> Anyway, rather than start working immediately on the SQL Client and > > > adding the new Gateway mode ourselves, we wanted to start a > conversation > > > with you and see where you're at with things and offer to help. > > > >> > > > >> Do you have anything you can already share so we can start with your > > > work or just play around with it? Like I said, we just want to get > > started > > > and are very able to help in this area. We see both the SQL Client and > > > Gateway being very important for us and have a good team to help > develop > > it. > > > >> > > > >> Let me know if there is a branch you can share, etc. It would be > much > > > appreciated! > > > >> > > > >> -Jamie Grier > > > >> > > > >> > > > >> On 2022/10/28 06:06:49 Shengkai Fang wrote: > > > >> > Hi. > > > >> > > > > >> > > Is there a possibility for us to get engaged and at least > > introduce > > > >> > initial changes to support authentication/authorization? > > > >> > >
[jira] [Created] (FLINK-30000) Introduce FileSystemFactory to create FileSystem from custom configuration
Jingsong Lee created FLINK-30000: Summary: Introduce FileSystemFactory to create FileSystem from custom configuration Key: FLINK-30000 URL: https://issues.apache.org/jira/browse/FLINK-30000 Project: Flink Issue Type: Improvement Components: Table Store Reporter: Jingsong Lee Fix For: table-store-0.3.0 Currently, table store uses the static Flink FileSystem. This cannot support: 1. Using a FileSystem different from the checkpoint FileSystem. 2. Using a FileSystem in Hive and Spark built from custom configuration instead of using FileSystem.initialize.
Re: [DISCUSS] Remove FlinkKafkaConsumer and FlinkKafkaProducer in the master for 1.17 release
Hi all, Thank you all for the informative feedback. I figure there is a requirement to improve the documentation w.r.t. the migration from FlinkKafkaConsumer to KafkaSource. I've filed a ticket [1] and connected it with [2]. This shouldn't be the blocker for removing FlinkKafkaConsumer. Given there will be some ongoing SinkV2 upgrades, I will start a vote limited to the FlinkKafkaConsumer elimination and graduation of the related APIs. As a follow-up task, I will sync with Yun Gao before the coding freeze of the 1.17 release to check if we can start the second vote to remove FlinkKafkaProducer with 1.17. Best regards, Jing [1] https://issues.apache.org/jira/browse/FLINK-29999 [2] https://issues.apache.org/jira/browse/FLINK-28302 On Wed, Nov 2, 2022 at 11:39 AM Martijn Visser wrote: > Hi David, > > I believe that for the DataStream this is indeed documented [1] but it > might be missed given that there is a lot of documentation and you need to > know that your problem is related to idleness. For the Table API I think > this is never mentioned, so it should definitely be at least documented > there. > > Thanks, > > Martijn > > [1] > > https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#idleness > > On Wed, Nov 2, 2022 at 11:28 AM David Anderson > wrote: > > > > > > > For the partition > > > idleness problem could you elaborate more about it? I assume both > > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide > > > whether to mark the partition as idle. > > > > > > As a matter of fact, no, that's not the case -- which is why I mentioned > > it. > > > > The FlinkKafkaConsumer automatically treats all initially empty (or > > non-existent) partitions as idle, while the KafkaSource only does this if > > the WatermarkStrategy specifies that idleness handling is desired by > > configuring withIdleness. This can be a source of confusion for folks > > upgrading to the new connector. 
It most often shows up in situations > where > > the number of Kafka partitions is less than the parallelism of the > > connector, which is a rather common occurrence in development and testing > > environments. > > > > I believe this change in behavior was made deliberately, so as to create > a > > more consistent experience across all FLIP-27 connectors. This isn't > > something that needs to be fixed, but does need to be communicated more > > clearly. Unfortunately, the whole idleness mechanism remained > significantly > > broken until 1.16 (considering the impact of [1] and [2]), further > > complicating the situation. Because of FLINK-28975 [2], users with > > partitions that are initially empty may have problems with versions > before > > 1.15.3 (still unreleased) and 1.16.0. See [3] for an example of this > > confusion. > > > > [1] https://issues.apache.org/jira/browse/FLINK-18934 (idleness didn't > > work > > with connected streams) > > [2] https://issues.apache.org/jira/browse/FLINK-28975 (idle streams > could > > never become active again) > > [3] > > > > > https://stackoverflow.com/questions/70096166/parallelism-in-flink-kafka-source-causes-nothing-to-execute/70101290#70101290 > > > > Best, > > David > > > > On Wed, Nov 2, 2022 at 5:26 AM Qingsheng Ren wrote: > > > > > Thanks Jing for starting the discussion. > > > > > > +1 for removing FlinkKafkaConsumer, as KafkaSource has evolved for many > > > release cycles and should be stable enough. I have some concerns about > > the > > > new Kafka sink based on sink v2, as sink v2 still has some ongoing work > > in > > > 1.17 (maybe Yun Gao could provide some inputs). Also we found some > issues > > > of KafkaSink related to the internal mechanism of sink v2, like > > > FLINK-29492. > > > > > > @David > > > About the ability of DeserializationSchema#isEndOfStream, FLIP-208 is > > > trying to complete this piece of the puzzle, and Hang Ruan ( > > > ruanhang1...@gmail.com) plans to work on it in 1.17. 
For the partition > > > idleness problem could you elaborate more about it? I assume both > > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide > > > whether to mark the partition as idle. > > > > > > Best, > > > Qingsheng > > > Ververica (Alibaba) > > > > > > On Thu, Oct 27, 2022 at 8:06 PM Jing Ge wrote: > > > > > > > Hi Dev, > > > > > > > > I'd like to start a discussion about removing FlinkKafkaConsumer and > > > > FlinkKafkaProducer in 1.17. > > > > > > > > Back in the past, it was originally announced to remove it with Flink > > > 1.15 > > > > after Flink 1.14 had been released[1]. And then postponed to the next > > > 1.15 > > > > release which meant to remove it with Flink 1.16 but forgot to change > > the > > > > doc[2]. I have created a PRs to fix it. Since the 1.16 release branch > > has > > > > code freeze, it makes sense to, first of all, update the doc to say > > that > > > > FlinkKafkaConsumer will be removed with Flink 1.17 [3][4] and second > > > start > > > > the discussion about removing them
[jira] [Created] (FLINK-29999) Improve documentation on how a user should migrate from FlinkKafkaConsumer to KafkaSource
Jing Ge created FLINK-29999: --- Summary: Improve documentation on how a user should migrate from FlinkKafkaConsumer to KafkaSource Key: FLINK-29999 URL: https://issues.apache.org/jira/browse/FLINK-29999 Project: Flink Issue Type: Improvement Components: Documentation Affects Versions: 1.16.0 Reporter: Jing Ge We have described how to migrate a job from FlinkKafkaConsumer to KafkaSource at [https://nightlies.apache.org/flink/flink-docs-master/release-notes/flink-1.14/#flink-24055]. But there are more things to take care of beyond that; one example is idleness handling. The related documentation should be improved.
Re: [DISCUSS] FLIP-268: Rack Awareness for Kafka Sources
Yanfei, You caught a mistake on my part. I referenced the wrong KIP. That one is for Kafka Streams; KIP-36 is the one for the regular client. I'll update the referenced FLIP and its JIRA. This feature is already available in our client; my team at work implemented it against the old Kafka Consumer API earlier this year. We'd like to donate the implementation on the new API. KIP-36 https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=61321028#content/view/61321028 On Fri, Nov 11, 2022, 2:55 AM Martijn Visser wrote: > Hi Yanfei, > > The version for Kafka Clients has been upgraded to 3.2.3 since 1.16 via > FLINK-29513 [1] > > Best regards, > > Martijn > > [1] https://issues.apache.org/jira/browse/FLINK-29513 > > On Fri, Nov 11, 2022 at 4:11 AM Yanfei Lei wrote: > > > Hi Jeremy, > > > > Thanks for the proposal. > > I'm curious if this feature can adapt to the current kafka-client version > > in the connector. The rack awareness feature requires a version>=3.2.0 to > > use, while the kafka-client version of the Kafka connector is 2.8.1. Will we > > upgrade the Kafka-client version first? > > > > Best, > > Yanfei > > > > On Fri, Nov 11, 2022 at 05:35, Jeremy DeGroot wrote: > > > Kafka has a rack awareness feature that allows brokers and consumers to > > > communicate about the rack (or AWS Availability Zone) they're located > in. > > > Reading from a local broker can save money in bandwidth and improve > > latency > > > for your consumers. > > > > > > This improvement proposes that a Kafka Consumer could be configured > with > > a > > > callback that could be run when it's being configured on the task > > manager, > > > that will set the appropriate value at runtime if a value is provided. 
> > > > > > More detail about this proposal can be found at > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-268%3A+Kafka+Rack+Awareness > > > > > > > > > More information about the Kafka rack awareness feature is at > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams > > > > > > > > > Best, > > > > > > Jeremy > > > > > >
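On the consumer side, the rack hint is carried by the standard Kafka client property `client.rack`; the callback described above would compute that value at runtime on the task manager. A small sketch of assembling such properties, where the broker address and the environment-variable name are illustrative assumptions:

```java
import java.util.Properties;

public class RackAwareConsumerProps {

    // Builds Kafka consumer properties with the rack hint set.
    // "client.rack" is the standard Kafka consumer config key; the
    // bootstrap address and env-var lookup below are illustrative only.
    public static Properties withRack(String bootstrapServers, String rackId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        if (rackId != null && !rackId.isEmpty()) {
            props.setProperty("client.rack", rackId);
        }
        return props;
    }

    public static void main(String[] args) {
        // e.g. resolve the availability zone at runtime on the task manager:
        String rack = System.getenv().getOrDefault("AWS_AZ", "us-east-1a");
        System.out.println(withRack("broker:9092", rack).getProperty("client.rack"));
    }
}
```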
[jira] [Created] (FLINK-29998) Make the backpressure tab sortable by busy percent
Yun Tang created FLINK-29998: Summary: Make the backpressure tab sortable by busy percent Key: FLINK-29998 URL: https://issues.apache.org/jira/browse/FLINK-29998 Project: Flink Issue Type: Sub-task Components: Runtime / Web Frontend Reporter: Yun Tang Fix For: 1.17.0 Currently, we cannot sort the backpressure tab to see which task is busiest.
[jira] [Created] (FLINK-29997) Link to the taskmanager page in the expired sub-task of checkpoint tab
Yun Tang created FLINK-29997: Summary: Link to the taskmanager page in the expired sub-task of checkpoint tab Key: FLINK-29997 URL: https://issues.apache.org/jira/browse/FLINK-29997 Project: Flink Issue Type: Sub-task Components: Runtime / Web Frontend Reporter: Yun Tang Fix For: 1.17.0 Currently, when we debug why some sub-tasks cannot complete their checkpoints in time, it takes several steps to find which task manager contains the relevant logs. This could be simplified via a direct link.
[jira] [Created] (FLINK-29996) Link to the task manager's thread dump page in the backpressure tab
Yun Tang created FLINK-29996: Summary: Link to the task manager's thread dump page in the backpressure tab Key: FLINK-29996 URL: https://issues.apache.org/jira/browse/FLINK-29996 Project: Flink Issue Type: Sub-task Components: Runtime / Web Frontend Reporter: Yun Tang Fix For: 1.17.0 Currently, finding the thread dump of backpressured tasks takes several steps; this could be simplified with a link in the backpressure tab of the web UI.
[jira] [Created] (FLINK-29995) Improve the usability of job analysis for backpressure, expired checkpoints and exceptions
Yun Tang created FLINK-29995: Summary: Improve the usability of job analysis for backpressure, expired checkpoints and exceptions Key: FLINK-29995 URL: https://issues.apache.org/jira/browse/FLINK-29995 Project: Flink Issue Type: Improvement Components: Runtime / Web Frontend Reporter: Yun Tang Assignee: Yun Tang Fix For: 1.17.0
[jira] [Created] (FLINK-29994) Update official document about Lookup Join.
Hang HOU created FLINK-29994: Summary: Update official document about Lookup Join. Key: FLINK-29994 URL: https://issues.apache.org/jira/browse/FLINK-29994 Project: Flink Issue Type: Improvement Components: Table Store Affects Versions: 1.16.0 Reporter: Hang HOU A period is missing in the description of "lookup.cache-rows". [About rocksdb's configuration|https://nightlies.apache.org/flink/flink-table-store-docs-release-0.2/docs/development/lookup-join/#rocksdboptions]
[jira] [Created] (FLINK-29993) Provide more convenient way to programmatically configure reporters
Chesnay Schepler created FLINK-29993: Summary: Provide more convenient way to programmatically configure reporters Key: FLINK-29993 URL: https://issues.apache.org/jira/browse/FLINK-29993 Project: Flink Issue Type: Improvement Components: Runtime / Metrics, Tests Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.17.0 Configuring reporters programmatically is an uncommon task in tests, but is currently not convenient and error-prone since you need to manually assemble the config key, as it must contain the reporter name. Add some factory methods to `MetricOptions` to generate config options at runtime given a reporter name.
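The key being assembled follows Flink's per-reporter pattern `metrics.reporter.<name>.<option>`. A sketch of the kind of helper the ticket proposes; the class and method names here are assumptions (the real factory methods would live on `MetricOptions`):

```java
// Hypothetical helper mirroring what FLINK-29993 proposes: build the
// reporter-scoped config key at runtime instead of hand-assembling strings.
public class ReporterOptionKeys {
    private static final String PREFIX = "metrics.reporter.";

    public static String keyFor(String reporterName, String optionSuffix) {
        return PREFIX + reporterName + "." + optionSuffix;
    }

    public static void main(String[] args) {
        // e.g. the factory class option for a reporter named "my_jmx":
        System.out.println(keyFor("my_jmx", "factory.class"));
    }
}
```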
[jira] [Created] (FLINK-29992) Join execution plan parsing error
HunterXHunter created FLINK-29992: - Summary: Join execution plan parsing error Key: FLINK-29992 URL: https://issues.apache.org/jira/browse/FLINK-29992 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.16.0, 1.17.0 Reporter: HunterXHunter

{code:java}
tableEnv.executeSql(" CREATE CATALOG hive WITH (\n" +
        " 'type' = 'hive',\n" +
        " 'default-database' = 'flinkdebug',\n" +
        " 'hive-conf-dir' = '/programe/hadoop/hive-3.1.2/conf'\n" +
        " )");

tableEnv.executeSql("create table datagen_tbl (\n" +
        "id STRING\n" +
        ",name STRING\n" +
        ",age bigint\n" +
        ",ts bigint\n" +
        ",`par` STRING\n" +
        ",pro_time as PROCTIME()\n" +
        ") with (\n" +
        " 'connector'='datagen'\n" +
        ",'rows-per-second'='10'\n" +
        ")");

String dml1 = "select * " +
        " from datagen_tbl as p " +
        " join hive.flinkdebug.default_hive_src_tbl " +
        " FOR SYSTEM_TIME AS OF p.pro_time AS c" +
        " ON p.id = c.id";
// Execution succeeded
System.out.println(tableEnv.explainSql(dml1));

String dml2 = "select p.id " +
        " from datagen_tbl as p " +
        " join hive.flinkdebug.default_hive_src_tbl " +
        " FOR SYSTEM_TIME AS OF p.pro_time AS c" +
        " ON p.id = c.id";
// Throws an exception
System.out.println(tableEnv.explainSql(dml2));
{code}

{code:java}
org.apache.flink.table.api.TableException: Cannot generate a valid execution plan for the given query:

FlinkLogicalCalc(select=[id])
+- FlinkLogicalJoin(condition=[=($0, $1)], joinType=[inner])
   :- FlinkLogicalCalc(select=[id])
   :  +- FlinkLogicalTableSourceScan(table=[[default_catalog, default_database, datagen_tbl]], fields=[id, name, age, ts, par])
   +- FlinkLogicalSnapshot(period=[$cor1.pro_time])
      +- FlinkLogicalTableSourceScan(table=[[hive, flinkdebug, default_hive_src_tbl, project=[id]]], fields=[id])

This exception indicates that the query uses an unsupported SQL feature. Please check the documentation for the set of currently supported SQL features.
	at org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:70)
	at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.$anonfun$optimize$1(FlinkChainedProgram.scala:59)
{code}
[jira] [Created] (FLINK-29991) KinesisFirehoseSinkTest#firehoseSinkFailsWhenUnableToConnectToRemoteService failed
Martijn Visser created FLINK-29991: -- Summary: KinesisFirehoseSinkTest#firehoseSinkFailsWhenUnableToConnectToRemoteService failed Key: FLINK-29991 URL: https://issues.apache.org/jira/browse/FLINK-29991 Project: Flink Issue Type: Bug Components: Connectors / Kinesis Affects Versions: 1.15.2 Reporter: Martijn Visser

{code:java}
Nov 10 10:22:53 [ERROR] org.apache.flink.connector.firehose.sink.KinesisFirehoseSinkTest.firehoseSinkFailsWhenUnableToConnectToRemoteService  Time elapsed: 7.394 s  <<< FAILURE!
Nov 10 10:22:53 java.lang.AssertionError:
Nov 10 10:22:53
Nov 10 10:22:53 Expecting throwable message:
Nov 10 10:22:53   "An OperatorEvent from an OperatorCoordinator to a task was lost. Triggering task failover to ensure consistency. Event: '[NoMoreSplitEvent]', targetTask: Source: Sequence Source -> Map -> Map -> Sink: Writer (15/32) - execution #0"
Nov 10 10:22:53 to contain:
Nov 10 10:22:53   "Received an UnknownHostException when attempting to interact with a service."
Nov 10 10:22:53 but did not.
{code}

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=43017=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=ed165f3f-d0f6-524b-5279-86f8ee7d0e2d=44513