Re: Configure operator based on key

2021-02-21 Thread yidan zhao
You can build this yourself using keyedStream.window(GlobalWindows.create())
.trigger(yourSelfDefinedTrigger).
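For illustration, such a trigger can read the per-key count from partitioned state and pick its firing threshold from the element's key. A minimal sketch (the class name, key values "abcd"/"bfgh", and thresholds 100/50 are made up; the element type is assumed to be Tuple2<key, value>):

```java
import org.apache.flink.api.common.state.ReducingState;
import org.apache.flink.api.common.state.ReducingStateDescriptor;
import org.apache.flink.api.common.typeutils.base.LongSerializer;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;

// Fires when the per-key element count reaches a key-dependent threshold.
public class PerKeyCountTrigger extends Trigger<Tuple2<String, Long>, GlobalWindow> {

    private final ReducingStateDescriptor<Long> countDesc =
            new ReducingStateDescriptor<>("count", Long::sum, LongSerializer.INSTANCE);

    @Override
    public TriggerResult onElement(Tuple2<String, Long> element, long timestamp,
                                   GlobalWindow window, TriggerContext ctx) throws Exception {
        ReducingState<Long> count = ctx.getPartitionedState(countDesc);
        count.add(1L);
        // Key-dependent window size: e.g. X elements for "abcd", Y for "bfgh"
        // (the values 100 / 50 are placeholders).
        long threshold = "abcd".equals(element.f0) ? 100L : 50L;
        if (count.get() >= threshold) {
            count.clear();
            return TriggerResult.FIRE_AND_PURGE;
        }
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, GlobalWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE; // purely count-based, ignore timers
    }

    @Override
    public TriggerResult onEventTime(long time, GlobalWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(GlobalWindow window, TriggerContext ctx) throws Exception {
        ctx.getPartitionedState(countDesc).clear();
    }
}
```

It would then be wired in as keyedStream.window(GlobalWindows.create()).trigger(new PerKeyCountTrigger()).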

Abhinav Sharma wrote on Sun, Feb 21, 2021 at 3:57 PM:


> Hi,
>
> Is there some way that I can configure an operator based on the key in a
> stream?
> Eg: If the key is 'abcd', then create a window of size X counts, if the
> key is 'bfgh', then create a window of size Y counts.
>
> Is this scenario possible in Flink?
>
>


Re: Run the code in the UI

2021-02-21 Thread Tzu-Li (Gordon) Tai
Hi,

Could you re-elaborate what exactly you mean?

If you wish to run a Flink job within the IDE, but also have the web UI
running for it, you can use
`StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(Configuration)`
to create the execution environment.
The default port 8081 will be used unless specified via `rest.port` in the
configuration.
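A minimal sketch of what Gordon describes (note this needs the flink-runtime-web dependency on the classpath for the UI to actually come up):

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.RestOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalWebUiJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Overrides the default port 8081 via the `rest.port` option.
        conf.setInteger(RestOptions.PORT, 8082);

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);

        // ... build the job on `env` as usual, then:
        // env.execute("my-job");
    }
}
```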

Cheers,
Gordon



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: [Statefun] Dynamic behavior

2021-02-21 Thread Tzu-Li (Gordon) Tai
Hi,

FWIW, there is this JIRA that is tracking a pubsub / broadcast messaging
primitive in StateFun:
https://issues.apache.org/jira/browse/FLINK-16319

This is probably what you are looking for. And I do agree, in the case that
the control stream (which updates the application logic) is high volume,
redeploying functions may not work well.

I don't think there really is a "recommended" way of doing the "broadcast
control stream, join with main stream" pattern with StateFun at the moment,
at least without FLINK-16319.
On the other hand, it could be possible to use stateful functions to
implement a pub-sub model in user space for the time being. I've actually
left some ideas for implementing that in the comments of FLINK-16319.
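
One rough shape such a user-space pub/sub could take with the embedded Java SDK (a sketch only; the function type, the subscribe-by-sending-an-Address protocol, and all names below are made up for illustration):

```java
import java.util.ArrayList;

import org.apache.flink.statefun.sdk.Address;
import org.apache.flink.statefun.sdk.Context;
import org.apache.flink.statefun.sdk.FunctionType;
import org.apache.flink.statefun.sdk.StatefulFunction;
import org.apache.flink.statefun.sdk.annotations.Persisted;
import org.apache.flink.statefun.sdk.state.PersistedValue;

// A "topic" function: functions subscribe once by sending their Address;
// every other message is then forwarded to all registered subscribers.
public class BroadcastTopicFn implements StatefulFunction {

    public static final FunctionType TYPE = new FunctionType("example", "topic");

    @Persisted
    private final PersistedValue<ArrayList> subscribers =
            PersistedValue.of("subscribers", ArrayList.class);

    @Override
    @SuppressWarnings("unchecked")
    public void invoke(Context context, Object input) {
        ArrayList<Address> subs = subscribers.getOrDefault(new ArrayList<>());
        if (input instanceof Address) {
            // Treat an Address message as a subscribe request.
            subs.add((Address) input);
            subscribers.set(subs);
        } else {
            // Fan the payload out to every subscriber.
            for (Address sub : subs) {
                context.send(sub, input);
            }
        }
    }
}
```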

Cheers,
Gordon


On Mon, Feb 22, 2021 at 6:38 AM Miguel Araújo  wrote:

> Hi everyone,
>
> What is the recommended way of achieving the equivalent of a broadcast in
> Flink when using Stateful Functions?
>
> For instance, assume we are implementing something similar to Flink's
> demo fraud detection, but in Stateful Functions - how can one dynamically
> update the application's logic then?
> There was a similar question in this mailing list in the past where it was
> recommended to move the dynamic logic to a remote function, so
> that one could achieve that by deploying a new container. I think that's
> not very realistic as updates might happen with a frequency that's not
> compatible with that approach (e.g., sticking to the fraud detection
> example, updating fraud detection rules every hour is not unusual), nor
> should one be deploying a new container when data (not code) changes.
>
> Is there a way of, for example, modifying FunctionProviders on the fly?
>
> Thanks,
> Miguel
>


[Statefun] Dynamic behavior

2021-02-21 Thread Miguel Araújo
Hi everyone,

What is the recommended way of achieving the equivalent of a broadcast in
Flink when using Stateful Functions?

For instance, assume we are implementing something similar to Flink's demo
fraud detection, but
in Stateful Functions - how can one dynamically update the application's
logic then?
There was a similar question in this mailing list in the past where it was
recommended to move the dynamic logic to a remote function, so
that one could achieve that by deploying a new container. I think that's
not very realistic as updates might happen with a frequency that's not
compatible with that approach (e.g., sticking to the fraud detection
example, updating fraud detection rules every hour is not unusual), nor
should one be deploying a new container when data (not code) changes.

Is there a way of, for example, modifying FunctionProviders on the fly?

Thanks,
Miguel


Union fields with time attributes have different types

2021-02-21 Thread Sebastián Magrí
I'm using a query like this

WITH aggs_1m AS (
  SELECT
`evt`,
`startts`,
`endts`,
SUM(`value`) AS `value`
  FROM aggregates_per_minute
), aggs_3m AS (
  SELECT
`evt`,
TUMBLE_START(`endts`, INTERVAL '3' MINUTE) AS `startts`,
TUMBLE_END(`endts`, INTERVAL '3' MINUTE) AS `endts`,
SUM(`c`) AS `value`
  FROM aggregates_per_minute
  GROUP BY `evt`, TUMBLE(`endts`, INTERVAL '3' MINUTE)
)
SELECT `evt`, `value`, `startts`, `endts`
FROM aggs_1m
UNION
SELECT `evt`, `value`, `startts`, `endts`
FROM aggs_3m

But it's throwing this exception

org.apache.flink.table.api.ValidationException: Union fields with time
attributes have different types.

Doesn't TUMBLE_START(somets, ...) return a TIMESTAMP of the same type?

-- 
Sebastián Ramírez Magrí


Datastream Lag Windowing function

2021-02-21 Thread s_penakalap...@yahoo.com
Hi All,
I am using Flink 1.12. I am trying to read realtime data from a Kafka topic,
and as per the requirement I need to implement a windowing LAG function. The
approach I followed is below:

DataStream vData = env.addSource(...)
vData.keyBy(Id)
create a temporary view, then apply Flink SQL.

My sample data is like below; the vTime field contains the timestamp when the
event was generated, and vNumSeq is the unique number for a particular group Id.

I tried the LAG function ordering by the vSeq field (long datatype); the job
failed with "OVER windows' ordering in stream mode must be defined on a time
attribute". I even tried using the vTime field (eventTS, also a long datatype),
and tried converting this field to sql.Timestamp, still no luck: the job
failed with the same error.

The documents I referred to suggested using proctime/rowtime, so I modified
the query to use proctime(). The job succeeded, but with wrong results.

Kindly help with a simple example, I am badly stuck. I am OK with using even
the DataStream API to implement the lag functionality.

Lag query:

select vdata.f0 as id, vdata.f1 as name, vdata.f2 as vTime, vdata.f3 as vSeq,
       vdata.f4 as currentSal,
       LAG(vdata.f4, 1, 0) OVER (partition BY vdata.f0 ORDER BY proctime()) AS prevSal
from VData vData

Wrong output :

Expected:

Regards,
Sunitha.
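

[Editorially, the "must be defined on a time attribute" error above usually means the view needs a rowtime column declared from the stream's event timestamps. A rough Flink 1.12 sketch, with the tuple layout and field names assumed from the sample query (f2 = vTime as epoch millis); the watermark delay and sample rows are placeholders:]

```java
import static org.apache.flink.table.api.Expressions.$;

import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple5;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class LagOverRowtime {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Placeholder for the Kafka source: (id, name, vTime millis, vSeq, sal).
        DataStream<Tuple5<Long, String, Long, Long, Double>> vData = env.fromElements(
                Tuple5.of(1L, "a", 1_000L, 1L, 100.0),
                Tuple5.of(1L, "a", 2_000L, 2L, 120.0));

        // Assign event timestamps and watermarks from the vTime field first.
        DataStream<Tuple5<Long, String, Long, Long, Double>> withTs =
                vData.assignTimestampsAndWatermarks(
                        WatermarkStrategy
                                .<Tuple5<Long, String, Long, Long, Double>>forBoundedOutOfOrderness(
                                        Duration.ofSeconds(5))
                                .withTimestampAssigner((row, ts) -> row.f2));

        // `.rowtime()` turns f2 into a time attribute usable in OVER windows.
        tEnv.createTemporaryView("VData", withTs,
                $("f0"), $("f1"), $("f2").rowtime().as("vTime"), $("f3"), $("f4"));

        tEnv.executeSql(
                "SELECT f0 AS id, f4 AS currentSal, " +
                "LAG(f4) OVER (PARTITION BY f0 ORDER BY vTime) AS prevSal " +
                "FROM VData").print();
    }
}
```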