[jira] [Created] (FLINK-30002) Change the alignmentTimeout to alignedCheckpointTimeout

2022-11-11 Thread Rui Fan (Jira)
Rui Fan created FLINK-30002:
---

 Summary: Change the alignmentTimeout to alignedCheckpointTimeout
 Key: FLINK-30002
 URL: https://issues.apache.org/jira/browse/FLINK-30002
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing
Affects Versions: 1.16.0
Reporter: Rui Fan
 Fix For: 1.17.0


The alignmentTimeout has been changed to alignedCheckpointTimeout in FLINK-23041.

But some fields or methods still use alignmentTimeout. They should be renamed 
to alignedCheckpointTimeout.
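
A minimal sketch of the kind of rename this asks for (illustrative class and field names, not the actual Flink code):

{code:java}
import java.time.Duration;

class CheckpointConfigSketch {
    // before: private Duration alignmentTimeout;
    private Duration alignedCheckpointTimeout = Duration.ofSeconds(30);

    // before: Duration getAlignmentTimeout()
    Duration getAlignedCheckpointTimeout() {
        return alignedCheckpointTimeout;
    }
}
{code}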



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30001) sql-client.sh start failed

2022-11-11 Thread xiaohang.li (Jira)
xiaohang.li created FLINK-30001:
---

 Summary: sql-client.sh start failed
 Key: FLINK-30001
 URL: https://issues.apache.org/jira/browse/FLINK-30001
 Project: Flink
  Issue Type: Bug
  Components: Command Line Client
Affects Versions: 1.15.2, 1.16.0
Reporter: xiaohang.li


[hadoop@master flink-1.15.0]$ ./bin/sql-client.sh 
Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR or 
HADOOP_CLASSPATH was set.
Setting HBASE_CONF_DIR=/etc/hbase/conf because no HBASE_CONF_DIR was set.


Exception in thread "main" org.apache.flink.table.client.SqlClientException: 
Unexpected exception. This is a bug. Please consider filing an issue.
        at 
org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:201)
        at org.apache.flink.table.client.SqlClient.main(SqlClient.java:161)
Caused by: org.apache.flink.table.api.TableException: Could not instantiate the 
executor. Make sure a planner module is on the classpath
        at 
org.apache.flink.table.client.gateway.context.ExecutionContext.lookupExecutor(ExecutionContext.java:163)
        at 
org.apache.flink.table.client.gateway.context.ExecutionContext.createTableEnvironment(ExecutionContext.java:111)
        at 
org.apache.flink.table.client.gateway.context.ExecutionContext.<init>(ExecutionContext.java:66)
        at 
org.apache.flink.table.client.gateway.context.SessionContext.create(SessionContext.java:247)
        at 
org.apache.flink.table.client.gateway.local.LocalContextUtils.buildSessionContext(LocalContextUtils.java:87)
        at 
org.apache.flink.table.client.gateway.local.LocalExecutor.openSession(LocalExecutor.java:87)
        at org.apache.flink.table.client.SqlClient.start(SqlClient.java:88)
        at 
org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:187)
        ... 1 more
Caused by: org.apache.flink.table.api.TableException: Unexpected error when 
trying to load service provider for factories.
        at 
org.apache.flink.table.factories.FactoryUtil.lambda$discoverFactories$19(FactoryUtil.java:813)
        at java.util.ArrayList.forEach(ArrayList.java:1259)
        at 
org.apache.flink.table.factories.FactoryUtil.discoverFactories(FactoryUtil.java:799)
        at 
org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:517)
        at 
org.apache.flink.table.client.gateway.context.ExecutionContext.lookupExecutor(ExecutionContext.java:154)
        ... 8 more
Caused by: java.util.ServiceConfigurationError: 
org.apache.flink.table.factories.Factory: Provider 
org.apache.flink.table.planner.loader.DelegateExecutorFactory could not be 
instantiated
        at java.util.ServiceLoader.fail(ServiceLoader.java:232)
        at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
        at 
java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
        at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
        at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
        at 
org.apache.flink.table.factories.ServiceLoaderUtil.load(ServiceLoaderUtil.java:42)
        at 
org.apache.flink.table.factories.FactoryUtil.discoverFactories(FactoryUtil.java:798)
        ... 10 more
Caused by: java.lang.ExceptionInInitializerError
        at 
org.apache.flink.table.planner.loader.PlannerModule.getInstance(PlannerModule.java:135)
        at 
org.apache.flink.table.planner.loader.DelegateExecutorFactory.<init>(DelegateExecutorFactory.java:34)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.lang.Class.newInstance(Class.java:442)
        at 
java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
        ... 14 more
Caused by: org.apache.flink.table.api.TableException: Could not initialize the 
table planner components loader.
        at 
org.apache.flink.table.planner.loader.PlannerModule.<init>(PlannerModule.java:123)
        at 
org.apache.flink.table.planner.loader.PlannerModule.<init>(PlannerModule.java:52)
        at 
org.apache.flink.table.planner.loader.PlannerModule$PlannerComponentsHolder.<clinit>(PlannerModule.java:131)
        ... 22 more
Caused by: java.nio.file.FileAlreadyExistsException: /tmp
        at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:88)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at 
sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
        at java.nio.file.Files.createDirectory(Files.java:674)
        at 
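
The root cause is the final FileAlreadyExistsException: /tmp. A hedged, self-contained reproduction of just that JDK behavior (plain Java, not Flink code):

{code:java}
import java.nio.file.Files;
import java.nio.file.Paths;

public class CreateDirRepro {
    public static void main(String[] args) throws Exception {
        // Files.createDirectory throws if the target already exists;
        // Files.createDirectories tolerates an existing directory.
        Files.createDirectory(Paths.get("/tmp")); // FileAlreadyExistsException: /tmp
    }
}
{code}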

Re: [DISCUSS] Issue tracking workflow

2022-11-11 Thread Martijn Visser
Hi everyone,

Unfortunately ASF Infra has already implemented the change and new Jira
users can't sign up.

I think there is consensus that we shouldn't move away from Jira now. My
proposal would be to set up a separate mailing list to which users can send
their account requests; these would go to the PMC, who can then create
accounts for them. I don't see any other short-term solution.

If agreed, let's open up a vote thread on this.

Thanks, Martijn


On Thu, Nov 3, 2022 at 04:51, Xintong Song wrote:

> Thanks all for the valuable feedback, opinions and suggestions.
>
> # Option 1.
> I know this is the first choice for pretty much everyone. Many people from
> the Flink community (including myself) have shared their opinion with
> Infra. However, based on the feedback so far, TBH I don't think things
> would turn out the way we want. I don't see what else we can do. Does
> anyone have more suggestions on this option? Otherwise, we probably have to
> scratch it off the list.
>
> # Option 4.
> Seems there are also quite some concerns on using solely GH issues: limited
> features (thus the significant changes to the current issue/release
> management processes), migration cost, source of truth, etc. I think I'm
> also convinced that this may not be a good choice.
>
> # Option 2 & 3.
> Between the two options, I'm leaning towards option 2.
> - IMO, making it as easy as possible for users to report issues should be a
> top priority. Having to wait for a human response for creating an account
> does not meet that requirement. That makes a strong objection to option 3
> from my side.
> - Using GH issues for consumer-facing issues and reflecting the valid ones
> back to Jira (either manually by committers or by bot) sounds good to me.
> The status (open/closed) and labels should make tracking the issues easier
> compared to in mailing lists / slack, in terms of whether an issue has been
> taken care of / reflected to Jira / closed as invalid. That does not mean
> we should not reflect things from mailing lists / slack to Jira. Ideally,
> we leverage every possible channel for collecting user issues / feedback,
> while guiding / suggesting users to choose GH issues over the others.
> - For new contributors, they still need to request an account from a PMC
> member. They can even make that request on GH issues, if they do not mind
> posting the email address publicly.
> - I would not be worried very much about the privacy issue, if the Jira
> account creation is restricted to contributors. Contributors are exposing
> their email addresses publicly anyway, in dev@ mailing list and commit
> history. I'm also not strongly against creating a dedicated mailing list
> though.
>
> Best,
>
> Xintong
>
>
>
> On Wed, Nov 2, 2022 at 9:16 PM Chesnay Schepler 
> wrote:
>
> > Calcite just requested a separate mailing list for users to request a
> > JIRA account.
> >
> >
> > I think I'd try going a similar route. While I prefer the openness of
> > github issues, they are really limited, and while some things can be
> > replicated with labels (like fix versions / components), things like
> > release notes can't.
> > We'd also lose a central place for collecting issues, since we'd have to
> > (?) scope issues per repo.
> >
> > I wouldn't want to import everything into GH issues (it's just a flawed
> > approach in the long-term imo), but on the other hand I don't know if
> > the auto linker even works if it has to link to either jira or a GH
> issue.
> >
> > Given that we need to change workflows in any case, I think I'd prefer
> > sticking to JIRA.
> > For reported bugs I'd wager that in most cases we can file the tickets
> > ourselves and communicate with users on slack/MLs to gather all the
> > information. I'd argue that if we had been more proactive with filing
> > tickets for user issues (instead of relying on them to do it) we
> > would've addressed several issues way sooner.
> >
> > Additionally, since either option would be a sort of experiment, then
> > JIRA is a safer option. We have to change less and there aren't any
> > long-term ramifications (like having to re-import GH tickets into JIRA).
> >
> > On 28/10/2022 16:49, Piotr Nowojski wrote:
> > > Hi,
> > >
> > > I'm afraid of the migration cost to only github issues and lack of many
> > > features that we are currently using. That would be very disruptive and
> > > annoying. For me github issues are far worse compared to using Jira.
> > >
> > > I would strongly prefer Option 1 over the others. Option 4 I would like
> > > the least. I would be fine with Option 3, and Option 2, but assuming that
> > > Jira would stay the source of truth.
> > > For option 2, maybe we could have a bot that would backport/copy
> > > user-created issues in github to Jira (and link them together)? Discussions
> > > could still happen on github, but we could track all of the issues as
> > > we are doing right now. The bot could also sync it the other way around
> > > (like marking 

Re: [DISCUSS] Allow sharing (RocksDB) memory between slots

2022-11-11 Thread Roman Khachatryan
Hi John, Yun,

Thank you for your feedback

@John

> It seems like operators would either choose isolation for the cluster’s
jobs
> or they would want to share the memory between jobs.
> I’m not sure I see the motivation to reserve only part of the memory for
sharing
> and allowing jobs to choose whether they will share or be isolated.

I see two related questions here:

1) Whether to allow mixed workloads within the same cluster.
I agree that most likely all the jobs will have the same "sharing"
requirement.
So we can drop "state.backend.memory.share-scope" from the proposal.

2) Whether to allow different memory consumers to use shared or exclusive
memory.
Currently, only RocksDB is proposed to use shared memory. For python, it's
non-trivial because it is job-specific.
So we have to partition managed memory into shared/exclusive and therefore
can NOT replace "taskmanager.memory.managed.shared-fraction" with some
boolean flag.

I think your question was about (1), just wanted to clarify why the
shared-fraction is needed.
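
For example (my reading of the proposed semantics; the numbers are only
illustrative): with taskmanager.memory.managed.size = 1000mb and
taskmanager.memory.managed.shared-fraction = 0.25, 250mb would form the
TM-wide shared pool, while the remaining 750mb stays exclusive and is divided
between slots as today.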

@Yun

> I am just curious whether this could really bring benefits to our users
with such complex configuration logic.
I agree; configuration complexity seems to be a common concern.
I hope that removing "state.backend.memory.share-scope" (as proposed above)
reduces the complexity.
Please share any ideas of how to simplify it further.

> Could you share some real experimental results?
I did an experiment to verify that the approach is feasible,
i.e. multiple jobs can share the same memory/block cache.
But I guess that's not what you mean here? Do you have any experiments in
mind?

> BTW, as talked before, I am not sure whether different lifecycles of
RocksDB state-backends
> would affect the memory usage of block cache & write buffer manager in
RocksDB.
> Currently, all instances would start and destroy nearly simultaneously,
> this would change after we introduce this feature with jobs running at
different scheduler times.
IIUC, the concern is that closing a RocksDB instance might close the
BlockCache.
I checked that manually and it seems to work as expected: the shared cache is
not closed. Closing it would also contradict the sharing concept, as described
in the documentation [1].

[1]
https://github.com/facebook/rocksdb/wiki/Block-Cache
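
For reference, a hedged sketch of this kind of check using the RocksDB Java
API directly (a standalone experiment, not Flink's actual wiring):

```
import org.rocksdb.*;

public class SharedBlockCacheCheck {
    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        // One cache and one write buffer manager shared by all instances.
        Cache sharedCache = new LRUCache(256 * 1024 * 1024);
        WriteBufferManager wbm = new WriteBufferManager(64 * 1024 * 1024, sharedCache);
        Options options = new Options()
                .setCreateIfMissing(true)
                .setWriteBufferManager(wbm)
                .setTableFormatConfig(new BlockBasedTableConfig().setBlockCache(sharedCache));
        RocksDB db1 = RocksDB.open(options, "/tmp/db1");
        RocksDB db2 = RocksDB.open(options, "/tmp/db2");
        db1.close(); // must not invalidate the shared cache still used by db2
        db2.close();
    }
}
```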

Regards,
Roman


On Wed, Nov 9, 2022 at 3:50 AM Yanfei Lei  wrote:

> Hi Roman,
> Thanks for the proposal, this allows State Backend to make better use of
> memory.
>
> After reading the ticket, I'm curious about some points:
>
> 1. Is shared-memory only for the state backend? If both
> "taskmanager.memory.managed.shared-fraction: >0" and
> "state.backend.rocksdb.memory.managed: false" are set at the same time,
> will the shared-memory be wasted?
> 2. It's said that "Jobs 4 and 5 will use the same 750Mb of unmanaged memory
> and will compete with each other" in the example, how is the memory size of
> unmanaged part calculated?
> 3. For fine-grained-resource-management, the control
> of cpuCores, taskHeapMemory can still work, right?  And I am a little
> worried that too many memory-related configuration options are complicated
> for users to understand.
>
> Regards,
> Yanfei
>
> Roman Khachatryan  wrote on Tue, Nov 8, 2022 at 23:22:
>
> > Hi everyone,
> >
> > I'd like to discuss sharing RocksDB memory across slots as proposed in
> > FLINK-29928 [1].
> >
> > Since 1.10 / FLINK-7289 [2], it is possible to:
> > - share these objects among RocksDB instances of the same slot
> > - bound the total memory usage by all RocksDB instances of a TM
> >
> > However, the memory is divided between the slots equally (unless using
> > fine-grained resource control). This is sub-optimal if some slots contain
> > more memory-intensive tasks than the others.
> > Using fine-grained resource control is also often not an option because
> the
> > workload might not be known in advance.
> >
> > The proposal is to widen the scope of sharing memory to TM, so that it
> can
> > be shared across all RocksDB instances of that TM. That would reduce the
> > overall memory consumption at the cost of resource isolation.
> >
> > Please see FLINK-29928 [1] for more details.
> >
> > Looking forward to feedback on that proposal.
> >
> > [1]
> > https://issues.apache.org/jira/browse/FLINK-29928
> > [2]
> > https://issues.apache.org/jira/browse/FLINK-7289
> >
> > Regards,
> > Roman
> >
>


Re: SQL Gateway and SQL Client

2022-11-11 Thread Jim Hughes
Hi Shengkai,

I think there is an additional case where a proxy is between the client and
gateway.  In that case, being able to pass headers would allow for
additional options / features.
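
To make that concrete, a hedged sketch of the header-injection part in plain
Netty (handler only; how a custom OutboundChannelHandlerFactory would install
it from configuration is exactly the open question):

```
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelOutboundHandlerAdapter;
import io.netty.channel.ChannelPromise;
import io.netty.handler.codec.http.HttpRequest;

// Stamps a configured header onto every outgoing REST request.
public class HeaderInjectingHandler extends ChannelOutboundHandlerAdapter {
    private final String name;
    private final String value;

    public HeaderInjectingHandler(String name, String value) {
        this.name = name;
        this.value = value;
    }

    @Override
    public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise)
            throws Exception {
        if (msg instanceof HttpRequest) {
            ((HttpRequest) msg).headers().set(name, value);
        }
        super.write(ctx, msg, promise);
    }
}
```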

I see several PRs from Yu Zelin.  Is there a first one to review?

Cheers,

Jim

On Thu, Nov 10, 2022 at 9:42 PM Shengkai Fang  wrote:

> Hi, Jim.
>
> > how to pass additional headers when sending REST requests
>
> Could you share which headers you want to send when using the SQL Client? I
> think there are two cases we need to consider. Please correct me if I am
> wrong.
>
> # Case 1
>
> If a user wants to connect to the SQL Gateway with a password, I think they
> should type
> ```
> ./sql-client.sh --user xxx --password xxx
> ```
> in the terminal and the OpenSessionRequest should be enough.
>
> # Case 2
>
> If users want to modify the execution config, they should type
> ```
> Flink SQL> SET  `` = ``;
> ```
> in the terminal. The Client can send ExecuteStatementRequest to the
> Gateway.
>
> > As you have FLIPs or PRs, feel free to let me, Jamie, and Alexey know.
>
> It would be nice if you could join us to finish the feature. I think the
> modification on the SQL Gateway side is ready for review.
>
> Best,
> Shengkai
>
>
> Jim Hughes  wrote on Fri, Nov 11, 2022 at 05:19:
>
> > Hi Yu Zelin,
> >
> > I have read through your draft and it looks good. I am new to Flink, so I
> > haven't learned about everything that needs to be done yet.
> >
> > One of the considerations that I'm interested in understanding is how to
> > pass additional headers when sending REST requests.  From looking at the
> > code, it looks like a custom `OutboundChannelHandlerFactory` could be
> > created to read additional configuration and set headers.  Does that make
> > sense?
> >
> > Thank you very much for sharing the proof of concept code and the
> > document.  As you have FLIPs or PRs, feel free to let me, Jamie, and
> Alexey
> > know.  We'll be happy to review them.
> >
> > Cheers,
> >
> > Jim
> >
> > On Wed, Nov 9, 2022 at 11:43 PM yu zelin  wrote:
> >
> > > Hi, all
> > > Sorry for the late response. As Shengkai mentioned, I'm currently working
> > > with him on the SQL Client, dedicated to implementing its Remote Mode. I
> > > have written a draft implementation plan and will write a FLIP about it
> > > ASAP. If you are interested, please take a look at the draft; it would be
> > > nice if you could give me some feedback.
> > > The doc is at:
> > >
> >
> https://docs.google.com/document/d/14cS4VBSamMUnlM_PZuK6QKLfriUuQU51iqET5oiYy_c/edit?usp=sharing
> > >
> > > > On Nov 7, 2022, at 11:19, Shengkai Fang wrote:
> > > >
> > > > Hi, all. Sorry for the late reply.
> > > >
> > > > > Is the gateway mode planned to be supported for SQL Client in 1.17?
> > > > > Do you have anything you can already share so we can start with your
> > > > > work or just play around with it?
> > > >
> > > > Yes. @yzl is working on it and he will list the implementation plan
> > > > later and share the progress. I think the change is not very large, so
> > > > it should not be a big problem to finish this in release 1.17. I will
> > > > join the development in mid-November.
> > > >
> > > > Best,
> > > > Shengkai
> > > >
> > > >
> > > >
> > > >
> > > > Jamie Grier <jgr...@apache.org> wrote on Sat, Nov 5, 2022 at 00:48:
> > > >> Hi Shengkai,
> > > >>
> > > >> We're doing more and more Flink development at Confluent these days and
> > > >> we're currently trying to bootstrap a prototype that relies on the SQL
> > > >> Client and Gateway.  We will be using the components in some of our
> > > >> projects and would like to co-develop them with you and the rest of the
> > > >> Flink community.
> > > >>
> > > >> As of right now it's a pretty big blocker for our upcoming milestone
> > > >> that the SQL Client has not yet been modified to talk to the SQL Gateway
> > > >> and we want to help with this effort ASAP!  We would even be willing to
> > > >> take over the work if it's not yet started, but I suspect it already is.
> > > >>
> > > >> Anyway, rather than start working immediately on the SQL Client and
> > > >> adding the new Gateway mode ourselves, we wanted to start a conversation
> > > >> with you and see where you're at with things and offer to help.
> > > >>
> > > >> Do you have anything you can already share so we can start with your
> > > >> work or just play around with it?  Like I said, we just want to get started
> > > >> and are very able to help in this area.  We see both the SQL Client and
> > > >> Gateway being very important for us and have a good team to help develop it.
> > > >>
> > > >> Let me know if there is a branch you can share, etc.  It would be much
> > > >> appreciated!
> > > >>
> > > >> -Jamie Grier
> > > >>
> > > >>
> > > >> On 2022/10/28 06:06:49 Shengkai Fang wrote:
> > > >> > Hi.
> > > >> >
> > > >> > > Is there a possibility for us to get engaged and at least introduce
> > > >> > > initial changes to support authentication/authorization?
> > > >> >
> 

[jira] [Created] (FLINK-30000) Introduce FileSystemFactory to create FileSystem from custom configuration

2022-11-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-30000:


 Summary: Introduce FileSystemFactory to create FileSystem from 
custom configuration
 Key: FLINK-30000
 URL: https://issues.apache.org/jira/browse/FLINK-30000
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Reporter: Jingsong Lee
 Fix For: table-store-0.3.0


Currently, table store uses the static Flink FileSystem. This cannot support:
1. Using a FileSystem different from the checkpoint FileSystem.
2. Using a FileSystem in Hive and Spark built from custom configuration instead 
of via FileSystem.initialize.
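
A hedged sketch of what such a factory could look like (names and shape are
assumptions, not the final API):

{code:java}
import java.io.IOException;
import java.net.URI;
import java.util.Map;

import org.apache.flink.core.fs.FileSystem;

/** Creates a FileSystem from per-job options instead of static global state. */
public interface FileSystemFactory {

    /** Scheme this factory serves, e.g. "s3" or "oss". */
    String scheme();

    /** Builds an isolated FileSystem from custom options, bypassing FileSystem.initialize. */
    FileSystem create(URI fsUri, Map<String, String> options) throws IOException;
}
{code}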



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Remove FlinkKafkaConsumer and FlinkKafkaProducer in the master for 1.17 release

2022-11-11 Thread Jing Ge
Hi all,

Thank you all for the informative feedback. There is clearly a need to improve
the documentation regarding the migration from FlinkKafkaConsumer to
KafkaSource. I've filed a ticket [1] and linked it with [2]. This shouldn't be
a blocker for removing FlinkKafkaConsumer.

Given there will be some ongoing SinkV2 upgrades, I will start a vote only
limited to FlinkKafkaConsumer elimination and related APIs graduation. As a
follow-up task, I will sync with Yun Gao before the coding freeze of 1.17
release to check if we can start the second vote to remove
FlinkKafkaProducer with 1.17.

Best regards,
Jing

[1] https://issues.apache.org/jira/browse/FLINK-29999
[2] https://issues.apache.org/jira/browse/FLINK-28302
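
For context, a hedged sketch of the migration in question (standard KafkaSource
builder API; broker/topic names are placeholders, and `env` is assumed to be a
StreamExecutionEnvironment):

```
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;

KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")
        .setTopics("input-topic")
        .setGroupId("my-group")
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();

// Unlike FlinkKafkaConsumer, initially empty partitions are NOT treated as
// idle unless withIdleness(...) is configured (see David's point below).
env.fromSource(
        source,
        WatermarkStrategy.<String>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withIdleness(Duration.ofMinutes(1)),
        "Kafka Source");
```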


On Wed, Nov 2, 2022 at 11:39 AM Martijn Visser 
wrote:

> Hi David,
>
> I believe that for the DataStream this is indeed documented [1] but it
> might be missed given that there is a lot of documentation and you need to
> know that your problem is related to idleness. For the Table API I think
> this is never mentioned, so it should definitely be at least documented
> there.
>
> Thanks,
>
> Martijn
>
> [1]
>
> https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/datastream/kafka/#idleness
>
> On Wed, Nov 2, 2022 at 11:28 AM David Anderson 
> wrote:
>
> > >
> > > For the partition
> > > idleness problem could you elaborate more about it? I assume both
> > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide
> > > whether to mark the partition as idle.
> >
> >
> > As a matter of fact, no, that's not the case -- which is why I mentioned
> > it.
> >
> > The FlinkKafkaConsumer automatically treats all initially empty (or
> > non-existent) partitions as idle, while the KafkaSource only does this if
> > the WatermarkStrategy specifies that idleness handling is desired by
> > configuring withIdleness. This can be a source of confusion for folks
> > upgrading to the new connector. It most often shows up in situations
> where
> > the number of Kafka partitions is less than the parallelism of the
> > connector, which is a rather common occurrence in development and testing
> > environments.
> >
> > I believe this change in behavior was made deliberately, so as to create
> a
> > more consistent experience across all FLIP-27 connectors. This isn't
> > something that needs to be fixed, but does need to be communicated more
> > clearly. Unfortunately, the whole idleness mechanism remained
> significantly
> > broken until 1.16 (considering the impact of [1] and [2]), further
> > complicating the situation. Because of FLINK-28975 [2], users with
> > partitions that are initially empty may have problems with versions
> before
> > 1.15.3 (still unreleased) and 1.16.0. See [3] for an example of this
> > confusion.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-18934 (idleness didn't
> > work
> > with connected streams)
> > [2] https://issues.apache.org/jira/browse/FLINK-28975 (idle streams
> could
> > never become active again)
> > [3]
> >
> >
> https://stackoverflow.com/questions/70096166/parallelism-in-flink-kafka-source-causes-nothing-to-execute/70101290#70101290
> >
> > Best,
> > David
> >
> > On Wed, Nov 2, 2022 at 5:26 AM Qingsheng Ren  wrote:
> >
> > > Thanks Jing for starting the discussion.
> > >
> > > +1 for removing FlinkKafkaConsumer, as KafkaSource has evolved for many
> > > release cycles and should be stable enough. I have some concerns about
> > the
> > > new Kafka sink based on sink v2, as sink v2 still has some ongoing work
> > in
> > > 1.17 (maybe Yun Gao could provide some inputs). Also we found some
> issues
> > > of KafkaSink related to the internal mechanism of sink v2, like
> > > FLINK-29492.
> > >
> > > @David
> > > About the ability of DeserializationSchema#isEndOfStream, FLIP-208 is
> > > trying to complete this piece of the puzzle, and Hang Ruan (
> > > ruanhang1...@gmail.com) plans to work on it in 1.17. For the partition
> > > idleness problem could you elaborate more about it? I assume both
> > > FlinkKafkaConsumer and KafkaSource need a WatermarkStrategy to decide
> > > whether to mark the partition as idle.
> > >
> > > Best,
> > > Qingsheng
> > > Ververica (Alibaba)
> > >
> > > On Thu, Oct 27, 2022 at 8:06 PM Jing Ge  wrote:
> > >
> > > > Hi Dev,
> > > >
> > > > I'd like to start a discussion about removing FlinkKafkaConsumer and
> > > > FlinkKafkaProducer in 1.17.
> > > >
> > > > It was originally announced that it would be removed with Flink 1.15,
> > > > after Flink 1.14 had been released [1]. This was then postponed to the
> > > > next release, i.e. removal with Flink 1.16, but the doc was not updated
> > > > [2]. I have created a PR to fix it. Since the 1.16 release branch has
> > > > code freeze, it makes sense to, first of all, update the doc to say that
> > > > FlinkKafkaConsumer will be removed with Flink 1.17 [3][4], and second,
> > > > start the discussion about removing them 

[jira] [Created] (FLINK-29999) Improve documentation on how a user should migrate from FlinkKafkaConsumer to KafkaSource

2022-11-11 Thread Jing Ge (Jira)
Jing Ge created FLINK-29999:
---

 Summary: Improve documentation on how a user should migrate from 
FlinkKafkaConsumer to KafkaSource
 Key: FLINK-29999
 URL: https://issues.apache.org/jira/browse/FLINK-29999
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.16.0
Reporter: Jing Ge


We have described how to migrate a job from FlinkKafkaConsumer to KafkaSource at 
[https://nightlies.apache.org/flink/flink-docs-master/release-notes/flink-1.14/#flink-24055httpsissuesapacheorgjirabrowseflink-24055]
But there are more things to take care of beyond that; one example is idleness 
handling. The related documentation should be improved.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-268: Rack Awareness for Kafka Sources

2022-11-11 Thread Jeremy DeGroot
Yanfei,

You caught a mistake on my part. I referenced the wrong KIP. That one is
for Kafka Streams; KIP-36 is the one for the regular client. I'll update
the referenced FLIP and its JIRA. This feature is already available in our
client; my team at work implemented it against the old Kafka consumer API
earlier this year. We'd like to donate the implementation on the new API.

KIP-36

https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=61321028#content/view/61321028
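
For illustration, a hedged sketch of what the runtime-configured value could
look like with today's KafkaSource builder ("client.rack" is Kafka's standard
consumer property; the environment variable is a placeholder):

```
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;

KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")
        .setTopics("events")
        .setGroupId("rack-aware-group")
        // Static today; FLIP-268 proposes resolving this via a callback
        // evaluated on the task manager at runtime instead.
        .setProperty("client.rack", System.getenv().getOrDefault("AZ_ID", ""))
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();
```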

On Fri, Nov 11, 2022, 2:55 AM Martijn Visser 
wrote:

> Hi Yanfei,
>
> The version for Kafka Clients has been upgraded to 3.2.3 since 1.16 via
> FLINK-29513 [1]
>
> Best regards,
>
> Martijn
>
> [1] https://issues.apache.org/jira/browse/FLINK-29513
>
> On Fri, Nov 11, 2022 at 4:11 AM Yanfei Lei  wrote:
>
> > Hi Jeremy,
> >
> > Thanks for the proposal.
> > I'm curious if this feature can adapt to the current kafka-client version
> > in the connector. The rack awareness feature requires a version>=3.2.0 to
> > use, while the kafka-client version of Kafka connector is 2.8.1.  Will we
> > upgrade the Kafka-client version first?
> >
> > Best,
> > Yanfei
> >
> > Jeremy DeGroot  wrote on Fri, Nov 11, 2022 at 05:35:
> >
> > > Kafka has a rack awareness feature that allows brokers and consumers to
> > > communicate about the rack (or AWS Availability Zone) they're located
> in.
> > > Reading from a local broker can save money in bandwidth and improve
> > latency
> > > for your consumers.
> > >
> > > This improvement proposes that a Kafka consumer could be configured with
> > > a callback, run when the consumer is being configured on the task manager,
> > > that sets the appropriate value at runtime if one is provided.
> > >
> > > More detail about this proposal can be found at
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-268%3A+Kafka+Rack+Awareness
> > >
> > >
> > > More information about the Kafka rack awareness feature is at
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams
> > >
> > >
> > > Best,
> > >
> > > Jeremy
> > >
> >
>


[jira] [Created] (FLINK-29998) Make the backpressure tab sortable by busy percent

2022-11-11 Thread Yun Tang (Jira)
Yun Tang created FLINK-29998:


 Summary: Make the backpressure tab sortable by busy percent
 Key: FLINK-29998
 URL: https://issues.apache.org/jira/browse/FLINK-29998
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Reporter: Yun Tang
 Fix For: 1.17.0


Currently, we cannot sort the backpressure tab to see which task is busiest.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29997) Link to the taskmanager page in the expired sub-task of checkpoint tab

2022-11-11 Thread Yun Tang (Jira)
Yun Tang created FLINK-29997:


 Summary: Link to the taskmanager page in the expired sub-task of 
checkpoint tab
 Key: FLINK-29997
 URL: https://issues.apache.org/jira/browse/FLINK-29997
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Reporter: Yun Tang
 Fix For: 1.17.0


Currently, when we debug why some of the sub-tasks cannot complete their 
checkpoints in time, it takes several complex steps to find which task manager 
contains the relevant logs. This could be simplified via a direct link.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29996) Link to the task manager's thread dump page in the backpressure tab

2022-11-11 Thread Yun Tang (Jira)
Yun Tang created FLINK-29996:


 Summary: Link to the task manager's thread dump page in the 
backpressure tab
 Key: FLINK-29996
 URL: https://issues.apache.org/jira/browse/FLINK-29996
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Reporter: Yun Tang
 Fix For: 1.17.0


Currently, it takes several complex steps to find the thread dump of 
backpressured tasks; this could be simplified with a link in the backpressure 
tab of the web UI.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29995) Improve the usability of job analysis for backpressure, expired checkpoints and exceptions

2022-11-11 Thread Yun Tang (Jira)
Yun Tang created FLINK-29995:


 Summary: Improve the usability of job analysis for backpressure, 
expired checkpoints and exceptions
 Key: FLINK-29995
 URL: https://issues.apache.org/jira/browse/FLINK-29995
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Yun Tang
Assignee: Yun Tang
 Fix For: 1.17.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29994) Update official document about Lookup Join.

2022-11-11 Thread Hang HOU (Jira)
Hang HOU created FLINK-29994:


 Summary: Update official document about Lookup Join.
 Key: FLINK-29994
 URL: https://issues.apache.org/jira/browse/FLINK-29994
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Affects Versions: 1.16.0
Reporter: Hang HOU


A period is missing in the description of "lookup.cache-rows".
[About RocksDB's 
configuration|https://nightlies.apache.org/flink/flink-table-store-docs-release-0.2/docs/development/lookup-join/#rocksdboptions]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29993) Provide more convenient way to programmatically configure reporters

2022-11-11 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-29993:


 Summary: Provide more convenient way to programmatically configure 
reporters
 Key: FLINK-29993
 URL: https://issues.apache.org/jira/browse/FLINK-29993
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.17.0


Configuring reporters programmatically is an uncommon task in tests, but it is 
currently inconvenient and error-prone, since you need to manually assemble the 
config key (it must contain the reporter name).

Add factory methods to `MetricOptions` that generate config options at runtime 
given a reporter name.
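
A hedged sketch of such a helper (method and class names are assumptions, not
the final API):

{code:java}
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

final class MetricOptionsSketch {

    /** Builds a "metrics.reporter.<name>.<suffix>" option without hand-assembled key strings. */
    static ConfigOption<String> reporterOption(String reporterName, String suffix) {
        return ConfigOptions.key("metrics.reporter." + reporterName + "." + suffix)
                .stringType()
                .noDefaultValue();
    }
}

// usage in a test:
// config.set(MetricOptionsSketch.reporterOption("my_reporter", "factory.class"),
//            MyReporterFactory.class.getName());
{code}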



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29992) Join execution plan parsing error

2022-11-11 Thread HunterXHunter (Jira)
HunterXHunter created FLINK-29992:
-

 Summary: Join execution plan parsing error
 Key: FLINK-29992
 URL: https://issues.apache.org/jira/browse/FLINK-29992
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.16.0, 1.17.0
Reporter: HunterXHunter


{code:java}
//
tableEnv.executeSql(" CREATE CATALOG hive WITH (\n"
+ "  'type' = 'hive',\n"
+ " 'default-database' = 'flinkdebug',\n"
+ " 'hive-conf-dir' = '/programe/hadoop/hive-3.1.2/conf'\n"
+ " )");
tableEnv.executeSql("create table datagen_tbl (\n"
+ "id STRING\n"
+ ",name STRING\n"
+ ",age bigint\n"
+ ",ts bigint\n"
+ ",`par` STRING\n"
+ ",pro_time as PROCTIME()\n"
+ ") with (\n"
+ "  'connector'='datagen'\n"
+ ",'rows-per-second'='10'\n"
+ " \n"
+ ")");
String dml1 = "select * "
+ " from datagen_tbl as p "
+ " join hive.flinkdebug.default_hive_src_tbl "
+ " FOR SYSTEM_TIME AS OF p.pro_time AS c"
+ " ON p.id = c.id";
// Execution succeeded
  System.out.println(tableEnv.explainSql(dml1));
String dml2 = "select p.id "
+ " from datagen_tbl as p "
+ " join hive.flinkdebug.default_hive_src_tbl "
+ " FOR SYSTEM_TIME AS OF p.pro_time AS c"
+ " ON p.id = c.id";
// Throw an exception
 System.out.println(tableEnv.explainSql(dml2)); {code}
{code:java}
org.apache.flink.table.api.TableException: Cannot generate a valid execution plan for the given query:

FlinkLogicalCalc(select=[id])
+- FlinkLogicalJoin(condition=[=($0, $1)], joinType=[inner])
   :- FlinkLogicalCalc(select=[id])
   :  +- FlinkLogicalTableSourceScan(table=[[default_catalog, default_database, datagen_tbl]], fields=[id, name, age, ts, par])
   +- FlinkLogicalSnapshot(period=[$cor1.pro_time])
      +- FlinkLogicalTableSourceScan(table=[[hive, flinkdebug, default_hive_src_tbl, project=[id]]], fields=[id])

This exception indicates that the query uses an unsupported SQL feature. Please check the documentation for the set of currently supported SQL features.
    at org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:70)
    at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.$anonfun$optimize$1(FlinkChainedProgram.scala:59)
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29991) KinesisFirehoseSinkTest#firehoseSinkFailsWhenUnableToConnectToRemoteService failed

2022-11-11 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-29991:
--

 Summary: 
KinesisFirehoseSinkTest#firehoseSinkFailsWhenUnableToConnectToRemoteService 
failed 
 Key: FLINK-29991
 URL: https://issues.apache.org/jira/browse/FLINK-29991
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kinesis
Affects Versions: 1.15.2
Reporter: Martijn Visser


{code:java}
Nov 10 10:22:53 [ERROR] 
org.apache.flink.connector.firehose.sink.KinesisFirehoseSinkTest.firehoseSinkFailsWhenUnableToConnectToRemoteService
  Time elapsed: 7.394 s  <<< FAILURE!
Nov 10 10:22:53 java.lang.AssertionError: 
Nov 10 10:22:53 
Nov 10 10:22:53 Expecting throwable message:
Nov 10 10:22:53   "An OperatorEvent from an OperatorCoordinator to a task was 
lost. Triggering task failover to ensure consistency. Event: 
'[NoMoreSplitEvent]', targetTask: Source: Sequence Source -> Map -> Map -> 
Sink: Writer (15/32) - execution #0"
Nov 10 10:22:53 to contain:
Nov 10 10:22:53   "Received an UnknownHostException when attempting to interact 
with a service."
Nov 10 10:22:53 but did not.
{code}

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=43017&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=ed165f3f-d0f6-524b-5279-86f8ee7d0e2d&l=44513



--
This message was sent by Atlassian Jira
(v8.20.10#820010)