Re: Flink job manager conditional start of flink jobs

2023-04-12 Thread Shammon FY
Hi

Does the job in ns2 have permission to stop the job in ns1? How about
managing the relationship in your `Job Submission Service`, if one exists?
The service could check and stop the job in ns1 before submitting the job
to ns2. What do you think?
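A minimal sketch of that check-and-stop step, assuming the submission service can reach the ns1 JobManager's REST API. The host name below is a placeholder; `/jobs/overview` and `/jobs/<jobid>/stop` are the standard JobManager REST endpoints:

```python
import json
from urllib import request

def running_job_ids(jobs_overview):
    """IDs of jobs still in RUNNING state, from a /jobs/overview response body."""
    return [j["jid"] for j in jobs_overview.get("jobs", [])
            if j.get("state") == "RUNNING"]

def drain_namespace(base_url):
    """Stop every running job on the given JobManager (stop-with-savepoint
    via POST /jobs/<jid>/stop) and return the stopped job IDs."""
    with request.urlopen(f"{base_url}/jobs/overview") as resp:
        overview = json.load(resp)
    stopped = []
    for jid in running_job_ids(overview):
        req = request.Request(f"{base_url}/jobs/{jid}/stop", data=b"{}",
                              method="POST",
                              headers={"Content-Type": "application/json"})
        request.urlopen(req)
        stopped.append(jid)
    return stopped

# The submission service would call this before submitting to ns2, e.g.:
# drain_namespace("http://flink-jobmanager.ns1.svc:8081")
```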

Best,
Shammon FY


On Thu, Apr 13, 2023 at 10:50 AM naga sudhakar 
wrote:

> Hi,
> Thanks for your reply.
> It is slightly different; I would be happy to hear any suggestion for the
> scenario you mentioned.
> My scenario: I have 2 namespaces, say ns1 and ns2, and I have to make sure
> only one of them runs my Flink jobs. Say ns1 is initially running the Flink
> jobs, and later I plan to move them to ns2. When I start in ns2, I can make
> an API call to the ns1 JobManager to check for running jobs, and only start
> in ns2 if there are none. I can put this logic inside the Flink job's Java
> main method, where my whole streaming logic lives. But if I detect that ns1
> is still active, how can I stop the job that has already been initiated?
>
> Thanks,
> Nagasudhakar.
>
> On Thu, 13 Apr, 2023, 7:08 am Shammon FY,  wrote:
>
>> Hi naga
>>
>> Could you provide a specific description of your scenario? It sounds like
>> you need a uniqueness check to ensure that there are no multiple identical
>> jobs running simultaneously, right?
>>
>> Best,
>> Shammon FY
>>
>> On Wed, Apr 12, 2023 at 4:08 PM naga sudhakar 
>> wrote:
>>
>>> Thanks for your email.
>>> I am looking at this more in terms of running these Flink jobs in a
>>> multi-namespace environment and making sure only one namespace's Flink
>>> jobs are running. So when I try to start a Flink job, the JobManager has
>>> to check whether it is allowed to run in this namespace; accordingly, the
>>> Flink job should enter the RUNNING state or otherwise cancel itself.
>>>
>>>
>>>
>>> On Wed, 12 Apr, 2023, 12:06 pm Gen Luo,  wrote:
>>>
 Hi,

 Is the job you want to start running or already finished?
If the job is running, this is simply a failover (or JM failover) case.
But if the job has finished, there is no feature that can restart the job
automatically, AFAIK. The job has to be submitted again.

 On Wed, Apr 12, 2023 at 10:58 AM naga sudhakar 
 wrote:

> Hi Team,
> Greetings!!
> Just wanted to know: when the JobManager or a TaskManager is restarted, is
> there a way to run the existing Flink jobs based on a condition? The same
> question applies when starting a Flink job fresh.
>
> Please let me know if any more information is required from my side.
>
> Thanks & Regards
> Nagasudhakar  Sajja.
>



Re: Flink job manager conditional start of flink jobs

2023-04-12 Thread naga sudhakar
Hi,
Thanks for your reply.
It is slightly different; I would be happy to hear any suggestion for the
scenario you mentioned.
My scenario: I have 2 namespaces, say ns1 and ns2, and I have to make sure
only one of them runs my Flink jobs. Say ns1 is initially running the Flink
jobs, and later I plan to move them to ns2. When I start in ns2, I can make
an API call to the ns1 JobManager to check for running jobs, and only start
in ns2 if there are none. I can put this logic inside the Flink job's Java
main method, where my whole streaming logic lives. But if I detect that ns1
is still active, how can I stop the job that has already been initiated?
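One way to sidestep stopping an already-initiated job is to run the check at the very top of the main method, before the pipeline is built, so the job never reaches RUNNING at all. A rough Python sketch of that guard; the endpoint is a placeholder, and the same idea can be written in the Java main method:

```python
import json
import sys
from urllib import request

def other_namespace_active(jobs_overview):
    """True if the /jobs/overview response body lists any job still RUNNING."""
    return any(j.get("state") == "RUNNING"
               for j in jobs_overview.get("jobs", []))

def main():
    # Ask the ns1 JobManager for its running jobs before doing anything else.
    url = "http://flink-jobmanager.ns1.svc:8081/jobs/overview"  # placeholder
    with request.urlopen(url) as resp:
        if other_namespace_active(json.load(resp)):
            # Failing fast here means the ns2 job is never submitted for
            # execution, so there is nothing to cancel afterwards.
            sys.exit("ns1 still owns the jobs; refusing to start in ns2")
    # ... build and execute the streaming pipeline here ...
```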

Thanks,
Nagasudhakar.

On Thu, 13 Apr, 2023, 7:08 am Shammon FY,  wrote:

> Hi naga
>
> Could you provide a specific description of your scenario? It sounds like
> you need a uniqueness check to ensure that there are no multiple identical
> jobs running simultaneously, right?
>
> Best,
> Shammon FY
>
> On Wed, Apr 12, 2023 at 4:08 PM naga sudhakar 
> wrote:
>
>> Thanks for your email.
>> I am looking at this more in terms of running these Flink jobs in a
>> multi-namespace environment and making sure only one namespace's Flink
>> jobs are running. So when I try to start a Flink job, the JobManager has
>> to check whether it is allowed to run in this namespace; accordingly, the
>> Flink job should enter the RUNNING state or otherwise cancel itself.
>>
>>
>>
>> On Wed, 12 Apr, 2023, 12:06 pm Gen Luo,  wrote:
>>
>>> Hi,
>>>
>>> Is the job you want to start running or already finished?
>>> If the job is running, this is simply a failover (or JM failover) case.
>>> But if the job has finished, there is no feature that can restart the job
>>> automatically, AFAIK. The job has to be submitted again.
>>>
>>> On Wed, Apr 12, 2023 at 10:58 AM naga sudhakar 
>>> wrote:
>>>
 Hi Team,
 Greetings!!
Just wanted to know: when the JobManager or a TaskManager is restarted, is
there a way to run the existing Flink jobs based on a condition? The same
question applies when starting a Flink job fresh.

 Please let me know if any more information is required from my side.

 Thanks & Regards
 Nagasudhakar  Sajja.

>>>


Re: Table API function and expression vs SQL

2023-04-12 Thread liu ron
Hi,
Flink SQL follows standard SQL, so I think the SQL syntax will only become
richer in higher versions; existing syntax will not be changed.

Best,
Ron

liu ron wrote on Thu, Apr 13, 2023, at 10:24:

> Hi,
> Flink SQL follows standard SQL, so I think the SQL syntax will only become
> richer in higher versions; existing syntax will not be changed.
>
> Best,
> Ron
>
> Shammon FY wrote on Thu, Apr 13, 2023, at 09:57:
>
>> Hi
>>
>> Currently, Calcite supports standard SQL, so I think the main SQL syntax
>> will remain unchanged or backward compatible even when Flink upgrades its
>> Calcite version.
>>
>> Best,
>> Shammon FY
>>
>>
>> On Tue, Apr 11, 2023 at 12:00 PM ravi_suryavanshi.yahoo.com via user <
>> user@flink.apache.org> wrote:
>>
>>> Hi,
>>> We have decided to use the Table API with Flink SQL syntax (not Java).
>>> Can the SQL syntax change in a higher version? As per the docs, "SQL
>>> support is based on Apache Calcite, which implements the SQL standard."
>>>
>>> Thanks & Regards,
>>> Ravi
>>> On Saturday, 25 March, 2023 at 06:21:49 pm IST, Mate Czagany <
>>> czmat...@gmail.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> Please also keep in mind that restoring existing Table API jobs from
>>> savepoints when upgrading to a newer minor version of Flink, e.g. 1.16 ->
>>> 1.17 is not supported as the topology might change between these versions
>>> due to optimizer changes.
>>>
>>> See here for more information:
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/overview/#stateful-upgrades-and-evolution
>>>
>>> Regards,
>>> Mate
>>>
>>> Hang Ruan wrote on Sat, Mar 25, 2023, at 13:38:
>>>
>>> Hi,
>>>
>>> I think the SQL option is better. Flink SQL jobs can easily be shared
>>> with others for debugging, and SQL is more suitable for stream-batch
>>> unification. For the small number of jobs that cannot be expressed in
>>> SQL, we choose the DataStream API.
>>>
>>> Best,
>>> Hang
>>>
>>> ravi_suryavanshi.yahoo.com via user wrote on Fri, Mar 24, 2023, at 17:25:
>>>
>>> Hello Team,
>>> I need your advice on which method is recommended, considering that I
>>> don't want to change my query code when Flink is upgraded to a higher
>>> version.
>>>
>>> I am deciding between writing queries in Java code (Table API functions
>>> and expressions) and using pure SQL.
>>>
>>> I am assuming that SQL will not be impacted by an upgrade to a higher
>>> version.
>>>
>>> Thanks and Regards,
>>> Ravi
>>>
>>>


Re: Table API function and expression vs SQL

2023-04-12 Thread Shammon FY
Hi

Currently, Calcite supports standard SQL, so I think the main SQL syntax
will remain unchanged or backward compatible even when Flink upgrades its
Calcite version.

Best,
Shammon FY


On Tue, Apr 11, 2023 at 12:00 PM ravi_suryavanshi.yahoo.com via user <
user@flink.apache.org> wrote:

> Hi,
> We have decided to use the Table API with Flink SQL syntax (not Java).
> Can the SQL syntax change in a higher version? As per the docs, "SQL
> support is based on Apache Calcite, which implements the SQL standard."
>
> Thanks & Regards,
> Ravi
> On Saturday, 25 March, 2023 at 06:21:49 pm IST, Mate Czagany <
> czmat...@gmail.com> wrote:
>
>
> Hi,
>
> Please also keep in mind that restoring existing Table API jobs from
> savepoints when upgrading to a newer minor version of Flink, e.g. 1.16 ->
> 1.17 is not supported as the topology might change between these versions
> due to optimizer changes.
>
> See here for more information:
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/overview/#stateful-upgrades-and-evolution
>
> Regards,
> Mate
>
> Hang Ruan wrote on Sat, Mar 25, 2023, at 13:38:
>
> Hi,
>
> I think the SQL option is better. Flink SQL jobs can easily be shared
> with others for debugging, and SQL is more suitable for stream-batch
> unification. For the small number of jobs that cannot be expressed in
> SQL, we choose the DataStream API.
>
> Best,
> Hang
>
> ravi_suryavanshi.yahoo.com via user wrote on Fri, Mar 24, 2023, at 17:25:
>
> Hello Team,
> I need your advice on which method is recommended, considering that I
> don't want to change my query code when Flink is upgraded to a higher
> version.
>
> I am deciding between writing queries in Java code (Table API functions
> and expressions) and using pure SQL.
>
> I am assuming that SQL will not be impacted by an upgrade to a higher
> version.
>
> Thanks and Regards,
> Ravi
>
>


Re: Requirements for POJO serialization

2023-04-12 Thread Shammon FY
Hi Alexis

Flink will recognize the POJO class and use PojoTypeInfo to serialize and
deserialize it. For specific constraints on POJO classes, please refer to
[1].

Users can also define serialization methods for their own classes, as you
mentioned in your email.


[1]
https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/java/typeutils/PojoTypeInfo.java

Best,
Shammon FY


On Wed, Apr 12, 2023 at 1:15 AM Alexis Sarda-Espinosa <
sarda.espin...@gmail.com> wrote:

> Hello,
>
> According to the documentation, a POJO must have a no-arg constructor and
> either public fields or public getters and setters with conventional
> naming. I recently realized that if I create an explicit TypeInfoFactory
> that provides Types.POJO and all other required details, the getters and
> setters aren't needed. Is this an official feature?
>
> I ask because this means some classes could have an "immutable contract",
> so to speak. I'm guessing final fields might still be unsupported, but I
> haven't validated.
>
> Regards,
> Alexis.
>
>


Re: Flink job manager conditional start of flink jobs

2023-04-12 Thread Shammon FY
Hi naga

Could you provide a specific description of your scenario? It sounds like
you need a uniqueness check to ensure that there are no multiple identical
jobs running simultaneously, right?

Best,
Shammon FY

On Wed, Apr 12, 2023 at 4:08 PM naga sudhakar 
wrote:

> Thanks for your email.
> I am looking at this more in terms of running these Flink jobs in a
> multi-namespace environment and making sure only one namespace's Flink
> jobs are running. So when I try to start a Flink job, the JobManager has
> to check whether it is allowed to run in this namespace; accordingly, the
> Flink job should enter the RUNNING state or otherwise cancel itself.
>
>
>
> On Wed, 12 Apr, 2023, 12:06 pm Gen Luo,  wrote:
>
>> Hi,
>>
>> Is the job you want to start running or already finished?
>> If the job is running, this is simply a failover (or JM failover) case.
>> But if the job has finished, there is no feature that can restart the job
>> automatically, AFAIK. The job has to be submitted again.
>>
>> On Wed, Apr 12, 2023 at 10:58 AM naga sudhakar 
>> wrote:
>>
>>> Hi Team,
>>> Greetings!!
>>> Just wanted to know: when the JobManager or a TaskManager is restarted,
>>> is there a way to run the existing Flink jobs based on a condition? The
>>> same question applies when starting a Flink job fresh.
>>>
>>> Please let me know if any more information is required from my side.
>>>
>>> Thanks & Regards
>>> Nagasudhakar  Sajja.
>>>
>>


Re: emitValueWithRetract issue

2023-04-12 Thread Feng Jin
Hi Adam,

As far as I know, there is currently no similar API available, but I
believe this feature was accidentally removed and should be added back.
I have created a Jira issue to track it:
https://issues.apache.org/jira/browse/FLINK-31788



On Tue, Apr 11, 2023 at 12:10 AM Adam Augusta  wrote:

> Many thanks for the sanity check, Feng.
>
> It’s a shame this well-documented feature was silently removed.
> emitValue() creates an unreasonable amount of unnecessary and disruptive
> chatter on the changelog stream, as evidenced by putting a print table
> after the flatAggregate. Lots of -D/+I RowData pairs with identical fields.
>
> Is there any clean way to set up a stateful group aggregation in the 1.18
> Table API that doesn’t misbehave in this fashion?
>
> On Mon, Apr 10, 2023 at 11:43 AM Feng Jin  wrote:
>
>> Hi Adam,
>>
>> I have checked the code and indeed this feature is not available in the
>> latest version of Flink code.
>>
>> This feature was originally implemented in the old planner:
>> 
>> https://github.com/apache/flink/pull/8550/files
>>
>> However, this logic was not implemented in the new planner (the Blink
>> planner).
>>
>> With the removal of the old planner in version 1.14
>> https://github.com/apache/flink/pull/16080 , this code was also removed.
>>
>>
>>
>> Best
>>
>> Feng
>>
>> On Sat, Apr 8, 2023 at 4:17 AM Adam Augusta  wrote:
>>
>>> The TableAggregateFunction javadocs indicate that either "emitValue" or
>>> "emitUpdateWithRetract" is required.
>>>
>>> But if I implement my TableAggregateFunction with
>>> "emitUpdateWithRetract", I get a validation error. If I implement both
>>> methods it works, but emitUpdateWithRetract is not used.
>>>
>>> Peering into the Flink source code, I see that
>>> ImperativeAggCodeGen validates the presence of emitValue, but is agnostic
>>> to emitUpdateWithRetract.
>>> More curiously, Flink's source code doesn't have a single test with a
>>> TableAggregateFunction that uses emitUpdateWithRetract.
>>>
>>> Is this a ghost feature?
>>>
>>> Thanks,
>>> Adam
>>>
>>


Re: Flink Job across Data Centers

2023-04-12 Thread Andrew Otto
Hi, I asked a similar question in this thread, which might have some
relevant info.
might have some relevant info.

On Wed, Apr 12, 2023 at 7:23 AM Chirag Dewan via user 
wrote:

> Hi,
>
> Can anyone share any experience on running Flink jobs across data centers?
>
> I am trying to create a multi-site/geo-replicated Kafka cluster, and I
> want my Flink job to be closely colocated with it. If the Flink job is
> bound to a single data center, I believe we will observe a lot of client
> latency from accessing brokers in another DC.
>
> Rather, if I can make my Flink Kafka connectors rack-aware and start
> fetching data from the closest Kafka broker, I should get better results.
>
> I will be deploying Flink 1.16 on Kubernetes with Strimzi managed Apache
> Kafka.
>
> Thanks.
>
>


Flink Job across Data Centers

2023-04-12 Thread Chirag Dewan via user
Hi,
Can anyone share any experience on running Flink jobs across data centers?
I am trying to create a multi-site/geo-replicated Kafka cluster, and I want
my Flink job to be closely colocated with it. If the Flink job is bound to a
single data center, I believe we will observe a lot of client latency from
accessing brokers in another DC.
Rather, if I can make my Flink Kafka connectors rack-aware and start fetching
data from the closest Kafka broker, I should get better results.
I will be deploying Flink 1.16 on Kubernetes with Strimzi managed Apache Kafka.
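As a sketch of the rack-aware idea (KIP-392 follower fetching): the consumer advertises its location via the `client.rack` property, and the brokers must run with `replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector`. Flink's KafkaSource passes arbitrary consumer properties through, so the same property applies there. The bootstrap address, group ID, and `NODE_ZONE` variable below are placeholders:

```python
import os

def consumer_properties(rack=None):
    """Kafka consumer properties with an optional client.rack, so fetches
    can be served by the nearest replica (requires KIP-392 on the brokers)."""
    props = {
        "bootstrap.servers": "kafka.ns1.svc:9092",  # placeholder address
        "group.id": "flink-job",                    # placeholder group
    }
    # e.g. the zone injected into the pod via the Kubernetes downward API
    rack = rack or os.environ.get("NODE_ZONE")
    if rack:
        props["client.rack"] = rack
    return props
```

In a Flink job these properties would be handed to the Kafka source, e.g. via `KafkaSource.builder().setProperty("client.rack", ...)` on the Java side.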
Thanks.


Re: Prometheus monitoring Flink frequently runs OOM

2023-04-12 Thread 17610775726
Hi 


This is configurable; see filter.includes [1] in the official docs to filter
down to the metrics you actually want.


[1]https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/metric_reporters/#filter-includes
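A hedged example of what that could look like as a flink-conf.yaml fragment; the reporter name, interval, and filter pattern below are illustrative, so check the exact filter syntax in [1] before using it:

```yaml
# Report through the PushGateway reporter, but less often than every 30 s
# and only for metric names the dashboards actually use.
metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporterFactory
metrics.reporter.prom.interval: 60 SECONDS
# Illustrative include filter: keep only numRecords* metrics in any scope.
metrics.reporter.prom.filter.includes: "*:numRecords*"
```

Fewer reported series and a longer reporting interval both directly reduce Prometheus memory pressure.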


Best
JasonLee


---- Original message ----
From: casel.chen
Date: 2023-03-22 12:08
To: user-zh@flink.apache.org
Subject: Prometheus monitoring Flink frequently runs OOM

We report metrics to Prometheus via pushgateway with a 30-second reporting
interval. The real-time platform has 200+ jobs, and even a Prometheus
instance with 50 GB of memory can't keep up. Metrics are retained for 1 day
and kept in memory for 2 hours before being written to disk. The largest
metric already has 370,000 entries. Is there any way to solve this? Can we
select which metrics get reported?

Re: Flink job manager conditional start of flink jobs

2023-04-12 Thread naga sudhakar
Thanks for your email.
I am looking at this more in terms of running these Flink jobs in a
multi-namespace environment and making sure only one namespace's Flink
jobs are running. So when I try to start a Flink job, the JobManager has
to check whether it is allowed to run in this namespace; accordingly, the
Flink job should enter the RUNNING state or otherwise cancel itself.



On Wed, 12 Apr, 2023, 12:06 pm Gen Luo,  wrote:

> Hi,
>
> Is the job you want to start running or already finished?
> If the job is running, this is simply a failover (or JM failover) case.
> But if the job has finished, there is no feature that can restart the job
> automatically, AFAIK. The job has to be submitted again.
>
> On Wed, Apr 12, 2023 at 10:58 AM naga sudhakar 
> wrote:
>
>> Hi Team,
>> Greetings!!
>> Just wanted to know: when the JobManager or a TaskManager is restarted,
>> is there a way to run the existing Flink jobs based on a condition? The
>> same question applies when starting a Flink job fresh.
>>
>> Please let me know if any more information is required from my side.
>>
>> Thanks & Regards
>> Nagasudhakar  Sajja.
>>
>


Re: Flink job manager conditional start of flink jobs

2023-04-12 Thread Gen Luo
Hi,

Is the job you want to start running or already finished?
If the job is running, this is simply a failover (or JM failover) case.
But if the job has finished, there is no feature that can restart the job
automatically, AFAIK. The job has to be submitted again.
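Resubmission can itself be automated against the JobManager REST API: `POST /jars/<jarid>/run` starts a new job from a previously uploaded jar. A rough Python sketch; the base URL, jar ID, and entry class below are placeholders:

```python
import json
from urllib import request

def run_request(base_url, jar_id, entry_class=None, parallelism=None):
    """Build the URL and JSON body for POST /jars/<jarid>/run, which
    submits a previously uploaded jar as a new job."""
    body = {}
    if entry_class:
        body["entryClass"] = entry_class
    if parallelism:
        body["parallelism"] = parallelism
    return f"{base_url}/jars/{jar_id}/run", json.dumps(body).encode()

def resubmit(base_url, jar_id, entry_class=None):
    """Submit the jar and return the ID of the freshly started job."""
    url, body = run_request(base_url, jar_id, entry_class)
    req = request.Request(url, data=body, method="POST",
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["jobid"]
```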

On Wed, Apr 12, 2023 at 10:58 AM naga sudhakar 
wrote:

> Hi Team,
> Greetings!!
> Just wanted to know: when the JobManager or a TaskManager is restarted,
> is there a way to run the existing Flink jobs based on a condition? The
> same question applies when starting a Flink job fresh.
>
> Please let me know if any more information is required from my side.
>
> Thanks & Regards
> Nagasudhakar  Sajja.
>