Re: [DISCUSS] FLIP-193: Snapshots ownership

2021-11-18 Thread Konstantin Knauf
Hi Dawid,

Thanks for working on this FLIP. Clarifying the differences and
guarantees around savepoints and checkpoints will make it easier and safer
for users and downstream projects and platforms to work with them.

+1 to changing the current (undefined) behavior when recovering from
retained checkpoints. Users can now choose between claiming and not
claiming, which I think will make the current mixed behavior obsolete.

Cheers,

Konstantin

On Fri, Nov 19, 2021 at 8:19 AM Dawid Wysakowicz 
wrote:

> Hi devs,
>
> I'd like to bring up for discussion a proposal to clean up the ownership
> of snapshots, both checkpoints and savepoints.
>
> The goal here is to make it clear who is responsible for deleting
> checkpoint/savepoint files and when that can be done in a safe manner.
>
> Looking forward to your feedback!
>
> Best,
>
> Dawid
>
> [1] https://cwiki.apache.org/confluence/x/bIyqCw
>
>
>

-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


[DISCUSS] FLIP-193: Snapshots ownership

2021-11-18 Thread Dawid Wysakowicz
Hi devs,

I'd like to bring up for discussion a proposal to clean up the ownership
of snapshots, both checkpoints and savepoints.

The goal here is to make it clear who is responsible for deleting
checkpoint/savepoint files and when that can be done in a safe manner.

Looking forward to your feedback!

Best,

Dawid

[1] https://cwiki.apache.org/confluence/x/bIyqCw






[jira] [Created] (FLINK-24960) YARNSessionCapacitySchedulerITCase.testVCoresAreSetCorrectlyAndJobManagerHostnameAreShownInWebInterfaceAndDynamicPropertiesAndYarnApplicationNameAndTaskManagerSlots hang

2021-11-18 Thread Yun Gao (Jira)
Yun Gao created FLINK-24960:
---

 Summary: 
YARNSessionCapacitySchedulerITCase.testVCoresAreSetCorrectlyAndJobManagerHostnameAreShownInWebInterfaceAndDynamicPropertiesAndYarnApplicationNameAndTaskManagerSlots
 hangs on azure
 Key: FLINK-24960
 URL: https://issues.apache.org/jira/browse/FLINK-24960
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.15.0
Reporter: Yun Gao


{code:java}
Nov 18 22:37:08 

Nov 18 22:37:08 Test 
testVCoresAreSetCorrectlyAndJobManagerHostnameAreShownInWebInterfaceAndDynamicPropertiesAndYarnApplicationNameAndTaskManagerSlots(org.apache.flink.yarn.YARNSessionCapacitySchedulerITCase)
 is running.
Nov 18 22:37:08 

Nov 18 22:37:25 22:37:25,470 [main] INFO  
org.apache.flink.yarn.YARNSessionCapacitySchedulerITCase [] - Extracted 
hostname:port: 5718b812c7ab:38622
Nov 18 22:52:36 
==
Nov 18 22:52:36 Process produced no output for 900 seconds.
Nov 18 22:52:36 
==
 {code}
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=26722=logs=f450c1a5-64b1-5955-e215-49cb1ad5ec88=cc452273-9efa-565d-9db8-ef62a38a0c10=36395



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-24959) Add a BitMap function to FlinkSQL

2021-11-18 Thread ZhuoYu Chen (Jira)
ZhuoYu Chen created FLINK-24959:
---

 Summary: Add a BitMap function to FlinkSQL
 Key: FLINK-24959
 URL: https://issues.apache.org/jira/browse/FLINK-24959
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Affects Versions: 1.15.0
Reporter: ZhuoYu Chen


bitmap_and: computes the intersection of two input bitmaps and returns the 
new bitmap.

bitmap_andnot: computes the difference set, i.e. the elements that are in A 
but not in B.

Bitmap functions related to join operations, etc.
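As an illustration of the requested semantics, here is a minimal sketch using `java.util.BitSet` (the class and method names are hypothetical stand-ins, not the proposed FlinkSQL API):

```java
import java.util.BitSet;

public class BitmapFunctions {

    // bitmap_and: intersection of two input bitmaps, returned as a new bitmap.
    static BitSet bitmapAnd(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    // bitmap_andnot: the set of bits that are in A but not in B.
    static BitSet bitmapAndNot(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.andNot(b);
        return result;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(1); a.set(2); a.set(3);
        BitSet b = new BitSet();
        b.set(2); b.set(4);
        System.out.println(bitmapAnd(a, b));    // {2}
        System.out.println(bitmapAndNot(a, b)); // {1, 3}
    }
}
```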





Re: [ANNOUNCE] New Apache Flink Committer - Yingjie Cao

2021-11-18 Thread Yuan Mei
well deserved! Congratulations, Yingjie!

On Fri, Nov 19, 2021 at 12:39 PM Yu Li  wrote:

> Congrats and welcome, Yingjie!
>
> Best Regards,
> Yu
>
>
> On Thu, 18 Nov 2021 at 19:01, Yun Tang  wrote:
>
> > Congratulations, Yingjie!
> >
> > Best
> > Yun Tang
> >
> > On 2021/11/18 08:01:44 Martijn Visser wrote:
> > > Congratulations!
> > >
> > > On Thu, 18 Nov 2021 at 02:44, Leonard Xu  wrote:
> > >
> > > > Congratulations, Yingjie!
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > > > 在 2021年11月18日,01:40,Till Rohrmann  写道:
> > > > >
> > > > > Congratulations Yingjie!
> > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Yingjie Cao

2021-11-18 Thread Yu Li
Congrats and welcome, Yingjie!

Best Regards,
Yu


On Thu, 18 Nov 2021 at 19:01, Yun Tang  wrote:

> Congratulations, Yingjie!
>
> Best
> Yun Tang
>
> On 2021/11/18 08:01:44 Martijn Visser wrote:
> > Congratulations!
> >
> > On Thu, 18 Nov 2021 at 02:44, Leonard Xu  wrote:
> >
> > > Congratulations, Yingjie!
> > >
> > > Best,
> > > Leonard
> > >
> > > > 在 2021年11月18日,01:40,Till Rohrmann  写道:
> > > >
> > > > Congratulations Yingjie!
> > >
> > >
> >
>


[jira] [Created] (FLINK-24958) correct the example and link for temporal table function documentation

2021-11-18 Thread zoucao (Jira)
zoucao created FLINK-24958:
--

 Summary: correct the example and link for temporal table function 
documentation 
 Key: FLINK-24958
 URL: https://issues.apache.org/jira/browse/FLINK-24958
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.14.0
Reporter: zoucao
 Fix For: 1.15.0


correct the example and link for temporal table function documentation 





[jira] [Created] (FLINK-24957) show `host:port` information in `subtasks` tab

2021-11-18 Thread Xianxun Ye (Jira)
Xianxun Ye created FLINK-24957:
--

 Summary: show `host:port` information in `subtasks` tab
 Key: FLINK-24957
 URL: https://issues.apache.org/jira/browse/FLINK-24957
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / REST
Affects Versions: 1.14.0
Reporter: Xianxun Ye
 Attachments: image-2021-11-19-10-45-41-395.png

Help users locate the container and find its logs via the subtask id when 
there are multiple containers running on the same host.

 

!image-2021-11-19-10-45-41-395.png!





Re: [DISCUSS] Shall casting functions return null or throw exceptions for invalid input

2021-11-18 Thread Kurt Young
Hi Timo,

Regarding CAST, I think no one denies the standard behavior, which should
raise errors on failure. The only question is how we solve it, given that
lots of users already rely on the current, more tolerant behavior. A
violation of the standard that is still acceptable behavior doesn't deserve
a breaking change in a Flink minor version IMO; I'm more comfortable fixing
it in a version like Flink 2.0.

Best,
Kurt


On Thu, Nov 18, 2021 at 11:44 PM Timo Walther  wrote:

> Hi everyone,
>
>
> thanks for finally having this discussion on the mailing list. As both a
> contributor and user, I have experienced a couple of issues around
> nullability coming out of nowhere in a pipeline. This discussion should
> not only cover CAST but failure handling in general.
>
> Let me summarize my opinion:
>
> 1) CAST vs. TRY_CAST
>
> CAST is a SQL standard core operation with well-defined semantics across
> all major SQL vendors. There should be no discussion whether it returns
> NULL or an error. The semantics are already defined externally. I don't
> agree with "Streaming computing is a resident program ... users do not
> want it to frequently fail", the same argument is also true for nightly
> batch jobs. A batch job can also get stuck through a SQL statement that
> is not defined leniently enough by the user.
>
> An option that restores the old behavior and TRY_CAST for the future
> should solve this use case and make all parties happy.
>
> 2) TO_TIMESTAMP / TO_DATE
>
> We should distinguish between CASTING and CONVERSION / PARSING. As a
> user, I would expect that parsing can fail and have to deal with this
> accordingly. Therefore, I'm fine with returning NULL in TO_ or CONVERT_
> functions. This is also consistent with other vendors. Take PARSE of SQL
> Server as an example [1]: "If a parameter with a null value is passed at
> run time, then a null is returned, to avoid canceling the whole batch.".
> Here we can be more flexible with the semantics because users need to
> read the docs anyway.
>
> 3) Null at other locations
>
> In general, we should stick to our data type constraints. Everything
> else will mess up the architecture of functions/connectors and their
> return types. Take the rowtime (event-time timestamp) attribute as an
> example: PRs like the one for FLINK-24885 are just the tip of the
> iceberg. If we allowed rowtime columns to be NULL, we would need to
> check all time-based operators and implement additional handling logic
> for this.
>
> It would be better to define unified error-handling for operators and
> maybe drop rows if the per-element processing failed. We should have a
> unified approach for how to log/side-output such records.
>
> Until this is in place, I would suggest we spend some time on rules that
> can be enabled with an option for modifying the plan and wrap frequently
> failing expressions with a generic TRY() function. In this case, we
> don't need to deal with NULL in all built-in functions, we can throw
> helpful errors during development, and can return NULL even though the
> return type is NOT NULL. It would also make the NULL returning explicit
> in the plan.
>
> Regards,
> Timo
>
>
>
>
>
> [1]
>
> https://docs.microsoft.com/en-us/sql/t-sql/functions/parse-transact-sql?view=sql-server-ver15
> [2] https://issues.apache.org/jira/browse/FLINK-24885
>
>
>
>
>
> On 18.11.21 11:34, Kurt Young wrote:
> > Sorry, I forgot to add the user ML. I would also like to gather some
> > user feedback on this, since I haven't received any feedback on this
> > topic from users before.
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Nov 18, 2021 at 6:33 PM Kurt Young  wrote:
> >
> >> (added user ML to this thread)
> >>
> >> HI all,
> >>
> >> I would like to raise a different opinion about this change. I agree
> >> with Ingo that we should not just break some existing behavior, and
> >> even if we introduce an option to control the behavior, I would
> >> propose to set the default value to the current behavior.
> >>
> >> I want to mention one angle to assess whether we should change it or
> >> not, which is "what could users benefit from the changes". To me, it
> >> looks like:
> >>
> >> * new users: happy about the behavior
> >> * existing users: suffer from the change; it either forces them to
> >> modify their SQL or gets them a call late at night reporting that
> >> their online job crashed and couldn't be restarted.
> >>
> >> I would like to quote another breaking change we made when we adjusted
> >> the time-related functions in FLIP-162 [1]. In that case, both new
> >> users and existing users suffered from the *incorrectly* implemented
> >> time function behavior, and we saw a lot of feedback and complaints
> >> from various channels. After we fixed it, we never saw related
> >> problems again.
> >>
> >> Back to this topic, have we ever seen a user complain about the
> >> current CAST behavior? From my side, no.
> >>
> >> To summarize:
> >>
> >> +1 to introduce TRY_CAST to better prepare for the future.

[jira] [Created] (FLINK-24956) SqlSnapshot throws NullPointerException when used in conjunction with CTE

2021-11-18 Thread Yuval Itzchakov (Jira)
Yuval Itzchakov created FLINK-24956:
---

 Summary: SqlSnapshot throws NullPointerException when used in 
conjunction with CTE
 Key: FLINK-24956
 URL: https://issues.apache.org/jira/browse/FLINK-24956
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.13.3, 1.14.0
Reporter: Yuval Itzchakov


Executing the following program will fail with a NullPointerException:

 
{code:java}
package foo.bar

import org.apache.flink.api.scala.createTypeInformation
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.DataTypes
import org.apache.flink.table.api.Schema
import org.apache.flink.table.api.bridge.scala.StreamTableEnvironment
object Test {
  final case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val ee = StreamExecutionEnvironment.getExecutionEnvironment
    val te = StreamTableEnvironment.create(ee)
    val personSchema = Schema.newBuilder()
      .column("name", DataTypes.STRING())
      .column("age", DataTypes.INT())
      .build()
    val x = ee.fromCollection(List(Person("a", 1)))
    te.createTemporaryView(
      "my_table",
      x,
      personSchema
    )
    val y = ee.fromCollection(List(Person("b", 2)))
    te.createTemporaryView(
      "my_table_2",
      y,
      personSchema
    )
    val res =
      te.executeSql("""
                      |WITH A AS (
                      |  select name, age + 1 from my_table
                      |),
                      |B AS (
                      |  select name, age + 2 from my_table_2
                      |)
                      |
                      |SELECT A.name, B.age
                      |FROM A
                      |JOIN B
                      |FOR SYSTEM_TIME AS OF PROCTIME() on (A.name = B.name)
                      |""".stripMargin)
    res.print()
  }
}
{code}
Stacktrace:
{code:java}
Caused by: java.lang.NullPointerException
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSnapshot(SqlValidatorImpl.java:4714)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:986)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3085)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:3133)
    at 
org.apache.flink.table.planner.calcite.FlinkCalciteSqlValidator.validateJoin(FlinkCalciteSqlValidator.java:117)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3076)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3335)
    at 
org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
    at 
org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:997)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:975)
    at 
org.apache.calcite.sql.validate.WithNamespace.validateImpl(WithNamespace.java:57)
    at 
org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:997)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateWith(SqlValidatorImpl.java:3744)
    at org.apache.calcite.sql.SqlWith.validate(SqlWith.java:71)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:952)
    at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:704)
    at 
org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:159)
 {code}
The reason this fails is that SqlValidatorImpl, when validating the 
SqlSnapshot, always assumes it's operating on a node which has an underlying 
table directly:
{code:java}
            if (!ns.getTable().isTemporal()) {
                List<String> qualifiedName = ns.getTable().getQualifiedName();
                String tableName = qualifiedName.get(qualifiedName.size() - 1);
                throw newValidationError(
                        snapshot.getTableRef(),
                        Static.RESOURCE.notTemporalTable(tableName));
            }
 {code}
This is not always the case, e.g. with a CTE.

A simple fix for this would be to first check `ns.getTable` against null, and 
only then check its temporality.

The issue here is that this bug is inside Calcite's validator.

Would love some guidance on how to fix this issue.
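To make the suggested fix concrete, here is a self-contained sketch. The `Table` and `Namespace` interfaces below are hypothetical stand-ins for Calcite's `SqlValidatorTable` and `SqlValidatorNamespace`; only the null-guard pattern is the point:

```java
public class SnapshotValidationSketch {
    // Illustrative stand-ins for Calcite's validator types.
    interface Table { boolean isTemporal(); String name(); }
    interface Namespace { Table getTable(); } // returns null for a CTE alias

    // Original logic dereferences getTable() unconditionally, so a CTE
    // (where no underlying table exists) triggers a NullPointerException.
    // Suggested fix: check for null first, and only then check temporality.
    static void validateSnapshot(Namespace ns) {
        Table t = ns.getTable();
        if (t != null && !t.isTemporal()) {
            throw new IllegalStateException(
                    "Table '" + t.name() + "' is not temporal");
        }
        // t == null (e.g. a CTE): leave further validation to other rules
    }

    public static void main(String[] args) {
        validateSnapshot(() -> null); // CTE case: no NPE with the guard
        System.out.println("ok");
    }
}
```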





Re: [DISCUSS] Shall casting functions return null or throw exceptions for invalid input

2021-11-18 Thread Timo Walther

Hi everyone,


thanks for finally having this discussion on the mailing list. As both a 
contributor and user, I have experienced a couple of issues around 
nullability coming out of nowhere in a pipeline. This discussion should 
not only cover CAST but failure handling in general.


Let me summarize my opinion:

1) CAST vs. TRY_CAST

CAST is a SQL standard core operation with well-defined semantics across 
all major SQL vendors. There should be no discussion whether it returns 
NULL or an error. The semantics are already defined externally. I don't 
agree with "Streaming computing is a resident program ... users do not
want it to frequently fail", the same argument is also true for nightly 
batch jobs. A batch job can also get stuck through a SQL statement that 
is not defined leniently enough by the user.


An option that restores the old behavior and TRY_CAST for the future 
should solve this use case and make all parties happy.
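A minimal sketch of the two behaviors for a string-to-integer cast (plain Java, purely illustrative of the semantics under discussion, not Flink's implementation):

```java
public class CastSemantics {

    // CAST: standard SQL semantics -- invalid input raises an error.
    static int cast(String s) {
        return Integer.parseInt(s); // throws NumberFormatException on bad input
    }

    // TRY_CAST: lenient semantics -- invalid input yields NULL (here: null).
    static Integer tryCast(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCast("42"));  // 42
        System.out.println(tryCast("abc")); // null
    }
}
```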


2) TO_TIMESTAMP / TO_DATE

We should distinguish between CASTING and CONVERSION / PARSING. As a 
user, I would expect that parsing can fail and have to deal with this 
accordingly. Therefore, I'm fine with returning NULL in TO_ or CONVERT_ 
functions. This is also consistent with other vendors. Take PARSE of SQL 
Server as an example [1]: "If a parameter with a null value is passed at 
run time, then a null is returned, to avoid canceling the whole batch.". 
Here we can be more flexible with the semantics because users need to 
read the docs anyway.
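Under that reading, a TO_DATE-style function swallows parse failures and returns NULL. An illustrative Java sketch of these semantics (not Flink's actual implementation):

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

public class ParseSemantics {

    // TO_DATE-style parsing: returns null on unparseable input
    // instead of failing the whole job.
    static LocalDate toDate(String s) {
        if (s == null) return null;
        try {
            return LocalDate.parse(s); // ISO-8601, e.g. "2021-11-18"
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toDate("2021-11-18")); // 2021-11-18
        System.out.println(toDate("not-a-date")); // null
    }
}
```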


3) Null at other locations

In general, we should stick to our data type constraints. Everything 
else will mess up the architecture of functions/connectors and their 
return types. Take the rowtime (event-time timestamp) attribute as an 
example: PRs like the one for FLINK-24885 are just the tip of the 
iceberg. If we allowed rowtime columns to be NULL, we would need to 
check all time-based operators and implement additional handling logic 
for this.


It would be better to define unified error-handling for operators and 
maybe drop rows if the per-element processing failed. We should have a 
unified approach for how to log/side-output such records.


Until this is in place, I would suggest we spend some time on rules that 
can be enabled with an option for modifying the plan and wrap frequently 
failing expressions with a generic TRY() function. In this case, we 
don't need to deal with NULL in all built-in functions, we can throw 
helpful errors during development, and can return NULL even though the 
return type is NOT NULL. It would also make the NULL returning explicit 
in the plan.
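A generic TRY() wrapper of this kind could be sketched as follows (assuming null as the failure marker; an illustrative sketch only, not a proposed API):

```java
import java.util.function.Supplier;

public class TryWrapper {

    // Generic TRY(): evaluate an expression, return null if it fails,
    // even if the declared return type is NOT NULL.
    static <T> T tryEval(Supplier<T> expr) {
        try {
            return expr.get();
        } catch (RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryEval(() -> Integer.parseInt("7")));    // 7
        System.out.println(tryEval(() -> Integer.parseInt("oops"))); // null
    }
}
```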


Regards,
Timo





[1] 
https://docs.microsoft.com/en-us/sql/t-sql/functions/parse-transact-sql?view=sql-server-ver15

[2] https://issues.apache.org/jira/browse/FLINK-24885





On 18.11.21 11:34, Kurt Young wrote:

Sorry, I forgot to add the user ML. I would also like to gather some user
feedback on this, since I haven't received any feedback on this topic from
users before.

Best,
Kurt


On Thu, Nov 18, 2021 at 6:33 PM Kurt Young  wrote:


(added user ML to this thread)

HI all,

I would like to raise a different opinion about this change. I agree with
Ingo that we should not just break some existing behavior, and even if we
introduce an option to control the behavior, I would propose to set the
default value to the current behavior.

I want to mention one angle to assess whether we should change it or not,
which is "what could users benefit from the changes". To me, it looks like:

* new users: happy about the behavior
* existing users: suffer from the change; it either forces them to modify
their SQL or gets them a call late at night reporting that their online job
crashed and couldn't be restarted.

I would like to quote another breaking change we made when we adjusted the
time-related functions in FLIP-162 [1]. In that case, both new users and
existing users suffered from the *incorrectly* implemented time function
behavior, and we saw a lot of feedback and complaints from various
channels. After we fixed it, we never saw related problems again.

Back to this topic, have we ever seen a user complain about the current
CAST behavior? From my side, no.

To summarize:

+1 to introduce TRY_CAST to better prepare for the future.
-1 to modify the default behavior.
+0 to introduce a config option, but with the default value set to the
existing behavior. It's +0 because it seems unnecessary if I'm -1 on
changing the default behavior, and I also don't see an urgent need to
modify it.


[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior

Best,
Kurt


On Thu, Nov 18, 2021 at 4:26 PM Ingo Bürk  wrote:


Hi,

first of all, thanks for the summary of both sides, and for bringing up
the
discussion on this.
I think it is obvious that this is not something we can just "break", so
the config option seems mandatory to me.

Overall I agree with Martijn and Till that throwing errors is the more
expected behavior. I mostly think this is valuable 

[GitHub] [flink-connectors] AHeise closed pull request #1: Fix tests

2021-11-18 Thread GitBox


AHeise closed pull request #1:
URL: https://github.com/apache/flink-connectors/pull/1


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (FLINK-24955) Add One-hot Encoder to Flink ML

2021-11-18 Thread Yunfeng Zhou (Jira)
Yunfeng Zhou created FLINK-24955:


 Summary: Add One-hot Encoder to Flink ML
 Key: FLINK-24955
 URL: https://issues.apache.org/jira/browse/FLINK-24955
 Project: Flink
  Issue Type: New Feature
  Components: Library / Machine Learning
Reporter: Yunfeng Zhou


Add One-hot Encoder to Flink ML
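For context, one-hot encoding maps a categorical value to a vector with a single 1.0 at the category's index. A minimal sketch of the transformation (illustrative only, not the proposed Flink ML API):

```java
import java.util.Arrays;
import java.util.List;

public class OneHotEncoderSketch {

    // Encode a categorical index as a one-hot vector of the given size.
    static double[] encode(int index, int numCategories) {
        double[] v = new double[numCategories];
        v[index] = 1.0;
        return v;
    }

    public static void main(String[] args) {
        List<String> categories = Arrays.asList("red", "green", "blue");
        double[] v = encode(categories.indexOf("green"), categories.size());
        System.out.println(Arrays.toString(v)); // [0.0, 1.0, 0.0]
    }
}
```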





Re: [DISCUSS] Improve the name and structure of job vertex and operator name for job

2021-11-18 Thread Yun Tang
Hi Wenlong,

Thanks for bringing up this discussion; I believe many people have suffered 
from long and unreadable operator names for a long time.

I have another suggestion, inspired by Aitozi: we could add a hint indicating 
the vertex index, e.g. turn the pipeline "source --> flatMap --> sink" into 
"[vertex-0] source --> [vertex-1] flatMap --> [vertex-2] sink". This would 
make it much easier for users and developers to know which vertex is failing 
when meeting exceptions.
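The suggested naming scheme could be produced along these lines (an illustrative sketch, not the proposed implementation):

```java
public class VertexNameSketch {

    // Prefix each vertex name with its index, as suggested:
    // "[vertex-0] source --> [vertex-1] flatMap --> [vertex-2] sink"
    static String describePipeline(String... vertices) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < vertices.length; i++) {
            if (i > 0) sb.append(" --> ");
            sb.append("[vertex-").append(i).append("] ").append(vertices[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describePipeline("source", "flatMap", "sink"));
    }
}
```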

Best
Yun Tang

On 2021/11/17 07:42:28 godfrey he wrote:
> Hi Wenlong, I'm fine with the config options.
> 
> Best,
> Godfrey
> 
> wenlong.lwl  于2021年11月17日周三 下午3:13写道:
> 
> >
> > Hi Chesnay and Konstantin,
> > thanks for your feedback; I have added a section to the doc about how we
> > support setting the description in the DataStream API.
> >
> >
> > Bests,
> > Wenlong
> >
> > On Tue, 16 Nov 2021 at 21:05, Konstantin Knauf  wrote:
> >
> > > Hi everyone,
> > >
> > > Thanks for starting this discussion. I am in favor of solving this for
> > > DataStream and Table API at the same time, using the same configuration
> > > keys. IMO we shouldn't introduce any additional fragmentation if we can
> > > avoid it.
> > >
> > > Cheers,
> > >
> > > Konstantin
> > >
> > > On Tue, Nov 16, 2021 at 1:50 PM wenlong.lwl 
> > > wrote:
> > >
> > > > Hi Chesnay, we focus on SQL first because the operators and topology
> > > > of SQL jobs are generated by the engine, which causes most of the
> > > > naming problems: not only because the names are long but also because
> > > > the topology can be more complex than in DataStream.
> > > >
> > > > The case in DataStream is much better: most of the names in the
> > > > DataStream API are quite concise except for the windowing you
> > > > mentioned, and the topology is usually simpler. What's more, we can
> > > > easily expose this to the DataStream API as a second step once the
> > > > foundational implementation is done. If necessary, we can also cover
> > > > the changes to the DataStream API now, maybe taking windowing first
> > > > as an example?
> > > >
> > > > Best,
> > > > Wenlong
> > > >
> > > > On Tue, 16 Nov 2021 at 19:14, Chesnay Schepler 
> > > wrote:
> > > >
> > > > > Why should this be specific to the table API? The datastream API has
> > > > > similar issues with long operator names (like windowing).
> > > > >
> > > > > On 16/11/2021 11:22, wenlong.lwl wrote:
> > > > > > Thanks Godfrey for the suggestion.
> > > > > > Regarding 1, how about
> > > table.optimizer.simplify-operator-name-enabled,
> > > > > > which means that we would simplify the name of operator and keep the
> > > > > > details in description only.
> > > > > > "table.optimizer.operator-name.description-enabled" can not describe
> > > > what
> > > > > > it means I think.
> > > > > > Regarding 2, I agree that it is better to use enum instead of
> > > boolean.
> > > > > For
> > > > > > key I think you are meaning "pipeline.vertex-description-pattern"
> > > > instead
> > > > > > of "pipeline.vertex-name-pattern", and I would like to choose
> > > > > DEFAULT/TREE
> > > > > > for values.
> > > > > >
> > > > > > Best,
> > > > > > Wenlong
> > > > > >
> > > > > > On Tue, 16 Nov 2021 at 17:28, godfrey he 
> > > wrote:
> > > > > >
> > > > > >> Thanks for creating this FLIP Wenlong.
> > > > > >>
> > > > > >> The FLIP already looks pretty solid, I think the config options can
> > > be
> > > > > >> improved a little:
> > > > > >> 1) about table.optimizer.separate-name-and-description, I think
> > > > > >> "operator-name" should be considered in the option,
> > > > > >> how about table.optimizer.operator-name.description-enabled ?
> > > > > >> 2) about pipeline.tree-mode-vertex-description, I think we can make
> > > > > >> the mode accept string value,
> > > > > >> which is more flexible. How about pipeline.vertex-name-pattern, the
> > > > > >> default value is "TREE",
> > > > > >> another option is "CASCADE" (or "DEFAULT", which is more simple)
> > > > > >>
> > > > > >> What do you think?
> > > > > >>
> > > > > >> Best,
> > > > > >> Godfrey
> > > > > >>
> > > > > >> wenlong.lwl  于2021年11月15日周一 下午6:36写道:
> > > > > >>
> > > > > >>> Hi, all, FYI the FLIP doc has been created :
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-195%3A+Improve+the+name+and+structure+of+vertex+and+operator+name+for+sql+job
> > > > > >>> Best,
> > > > > >>> Wenlong
> > > > > >>>
> > > > > >>> On Mon, 15 Nov 2021 at 11:41, wenlong.lwl  > > >
> > > > > >> wrote:
> > > > >  Hi all,
> > > > >  Thanks for the feedback, It seems that the proposal is accepted 
> > > > >  by
> > > > all
> > > > > >> of
> > > > >  you guys. I will prepare a formal FLIP document and then go ahead
> > > to
> > > > > >> the
> > > > >  vote stage.
> > > > >  If any one has any other comments or suggestions, please let me
> > > > know,
> > > > >  thanks.
> > > > > 
> > > > >  Best,
> > > > > 

Re: [DISCUSS] Definition of Done for Apache Flink

2021-11-18 Thread Yun Tang
Hi Joe,

Thanks for bringing this to our attention.

In general, I agree with Chesnay's reply on the PR [1]. For rule 3, we have 
indeed sometimes created a separate PR to add documentation in the past. I 
think enforcing that the documentation is included in the same PR could 
benefit the review process, so I am not against this rule.

For the rule related to the PR description, the current flinkbot lets 
committers run commands like "@flinkbot approve description". However, many 
committers do not leverage this, which makes the bot useless most of the 
time. This discussion draws attention to whether we should strictly follow 
the review process via flinkbot or still not force committers to leverage it.

[1] https://github.com/apache/flink/pull/17801#issuecomment-970048058

Best
Yun Tang

On 2021/11/16 10:38:39 Ingo Bürk wrote:
> > On the other hand I am a silent fan of the current PR template because
> > it also provides a summary of the PR to make it easier for committers
> > to determine the impacts.
> 
> I 100% agree that part of a PR (and thus the template) should be the
> summary of the what, why, and how of the changes. I also see value in
> marking a PR as a breaking change if the author is aware of it being one
> (of course a committer needs to verify this nonetheless).
> 
> But apart from that, there's a lot of questions in there that no one seems
> to care about, and e.g. the question of how a change can be verified seems
> fairly useless to me: if tests have been changed, that can trivially be
> seen in the PR. The CI runs on top of that anyway as well. So I never
> really understood why I need to manually list all the tests I have touched
> here (or maybe I misunderstood this question the entire time?).
> 
> If the template is supposed to be useful for the committer rather than the
> author, it would have to be mandatory to fill it out, which de-facto it
> isn't.
> 
> Also, even if we keep all the same information, I would still love to see
> it converted into checkboxes. I know it's a small detail, but it's much
> less annoying than the current template. Something like
> 
> ```
> - [ ] This pull requests changes the public API (i.e., any class annotated
> with `@Public(Evolving)`)
> - [ ] This pull request adds, removes, or updates dependencies
> - [ ] I have updated the documentation to reflect the changes made in this
> pull request
> ```
> 
> On Tue, Nov 16, 2021 at 10:28 AM Fabian Paul  wrote:
> 
> > Hi all,
> >
> > Maybe I am the devil's advocate but I see the stability of master and
> > the definition of done as disjunct properties. I think it is more a
> > question of prioritization that test instabilities are treated as
> > critical tickets and have to be addressed before continuing any other
> > work. It will always happen that we merge code that is not 100%
> > stable; that is probably the nature of software development. I agree
> > when it comes to documentation that PRs are only mergeable if the
> > documentation has also been updated.
> >
> > On the other hand I am a silent fan of the current PR template because
> > it also provides a summary of the PR to make it easier for committers
> > to determine the impacts. It also reminds the contributors of our
> > principles i.e. how do you verify the change should probably not be
> > answered with "test were not possible".
> >
> > I agree with @Martijn Visser that we can improve the CI i.e.
> > performance regression test, execute s3 test but these things should
> > be addressed in another discussion.
> >
> > So I would prefer to keep the current PR template.
> >
> > Best,
> > Fabian
> >
> > On Tue, Nov 16, 2021 at 10:17 AM Martijn Visser 
> > wrote:
> > >
> > > Hi all,
> > >
> > > Thanks for bringing this up for this discussion, because I think it's an
> > > important aspect.
> > >
> > > From my perspective, a 'definition of done' serves two purposes:
> > > 1. It informs the contributor on what's expected when making a
> > contribution
> > > in the form of a PR
> > > 2. It instructs the committer on what to check before accepting/merging
> > a PR
> > >
> > > I would use a Github template primarily to deal with the first purpose. I
> > > think that should be short and easily understandable, preferably with as
> > > many automated checks as possible.
> > >
> > > I would propose something like this to condense information.
> > >
> > > 1. It is following the code contribution process, including code style
> > and
> > > quality guide https://flink.apache.org/contributing/contribute-code.html
> > > 2. It is covered by tests and all tests have passed
> > > 3. If it has user facing changes the documentation has been updated
> > > according to the documentation style guide
> > >
> > > These 3 DoD items can probably be broken down into multiple automated checks:
> > >
> > > * Run a spotless check
> > > * Run a license check
> > > * Compile application
> > > * Run tests
> > > * Run E2E tests

Re: [ANNOUNCE] New Apache Flink Committer - Yingjie Cao

2021-11-18 Thread Yun Tang
Congratulations, Yingjie!

Best
Yun Tang

On 2021/11/18 08:01:44 Martijn Visser wrote:
> Congratulations!
> 
> On Thu, 18 Nov 2021 at 02:44, Leonard Xu  wrote:
> 
> > Congratulations, Yingjie!
> >
> > Best,
> > Leonard
> >
> > > On 2021-11-18, at 01:40, Till Rohrmann wrote:
> > >
> > > Congratulations Yingjie!
> >
> >
> 


Re: [DISCUSS] Shall casting functions return null or throw exceptions for invalid input

2021-11-18 Thread Kurt Young
Sorry, I forgot to add the user ML. I would also like to gather some user
feedback on this,
since I didn't get any feedback on this topic from users before.

Best,
Kurt


On Thu, Nov 18, 2021 at 6:33 PM Kurt Young  wrote:

> (added user ML to this thread)
>
> HI all,
>
> I would like to raise a different opinion about this change. I agree with
> Ingo that
> we should not just break some existing behavior, and even if we introduce
> an
> > option to control the behavior, I would propose to set the default value
> to current
> behavior.
>
> I want to mention one angle to assess whether we should change it or not,
> which
> is "what could users benefit from the changes". To me, it looks like:
>
> * new users: happy about the behavior
> > > * existing users: suffer from the change; it either forces them to modify
> > > their SQL or
> > > gets them a call late at night reporting that their online job crashed
> > > and couldn't
> > > be restarted.
>
> > > I would like to quote another breaking change we did when we adjusted the
> > > time-related
> > > function in FLIP-162 [1]. In that case, both new users and existing users
> > > suffered
> > > from *incorrectly* implemented time function behavior, and we saw a lot
> > > of feedback and
> > > complaints from various channels. After we fixed that, we never saw related
> > > problems again.
>
> > > Back to this topic, have we ever seen a user complain about the current CAST
> > > behavior? From my
> > > side, no.
>
> To summarize:
>
> +1 to introduce TRY_CAST to better prepare for the future.
> -1 to modify the default behavior.
> > > +0 to introduce a config option, but with the default value set to the
> > > existing behavior. It's +0 because it
> > > seems unnecessary if I'm -1 on changing the default behavior and also
> > > don't see an urgent need to modify it.
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior
>
> Best,
> Kurt
>
>
> On Thu, Nov 18, 2021 at 4:26 PM Ingo Bürk  wrote:
>
>> Hi,
>>
>> first of all, thanks for the summary of both sides, and for bringing up
>> the
>> discussion on this.
>> I think it is obvious that this is not something we can just "break", so
>> the config option seems mandatory to me.
>>
>> Overall I agree with Martijn and Till that throwing errors is the more
>> expected behavior. I mostly think this is valuable default behavior
>> because
>> it allows developers to find mistakes early and diagnose them much easier
>> compared to having to "work backwards" and figure out that it is the CAST
>> that failed. It also means that pipelines using TRY_CAST are
>> self-documenting because using that can signal "we might receive broken
>> data here".
>>
>>
>> Best
>> Ingo
>>
>> On Thu, Nov 18, 2021 at 9:11 AM Till Rohrmann 
>> wrote:
>>
>> > Hi everyone,
>> >
>> > personally I would also prefer the system telling me that something is
>> > wrong instead of silently ignoring records. If there is a TRY_CAST
>> function
>> > that has the old behaviour, people can still get the old behaviour. For
>> > backwards compatibility reasons it is a good idea to introduce a switch
>> to
>> > get back the old behaviour. By default we could set it to the new
>> > behaviour, though. Of course, we should explicitly document this new
>> > behaviour so that people are aware of it before running their jobs for
>> days
>> > and then encountering an invalid input.
>> >
>> > Cheers,
>> > Till
>> >
>> > On Thu, Nov 18, 2021 at 9:02 AM Martijn Visser 
>> > wrote:
>> >
>> > > Hi Caizhi,
>> > >
>> > > Thanks for bringing this up for discussion. I think the important
>> part is
>> > > what do developers expect as the default behaviour of a CAST function
>> > when
>> > > casting fails. If I look at Postgres [1] or MSSQL [2], the default
>> > > behaviour of a CAST failing would be to return an error, which would
>> be
>> > the
>> > > new behaviour. Returning a value when a CAST fails can lead to users
>> not
>> > > understanding immediately where that value comes from. So, I would be
>> in
>> > > favor of the new behaviour by default, but including a configuration
>> flag
>> > > to maintain the old behaviour to avoid that you need to rewrite all
>> these
>> > > jobs.
>> > >
>> > > Best regards,
>> > >
>> > > Martijn
>> > >
>> > > [1] https://www.postgresql.org/docs/current/sql-createcast.html
>> > > [2]
>> > >
>> > >
>> >
>> https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver15
>> > >
>> > > On Thu, 18 Nov 2021 at 03:17, Caizhi Weng 
>> wrote:
>> > >
>> > > > Hi devs!
>> > > >
>> > > > We're discussing the behavior of casting functions (including cast,
>> > > > to_timestamp, to_date, etc.) for invalid input in
>> > > > https://issues.apache.org/jira/browse/FLINK-24924. As this topic is
>> > > > crucial
>> > > > to compatibility and usability we'd like to continue discussing this
>> > > > publicly in the mailing list.
>> > > >
>> > > > The main topic is to discuss that shall casting functions return
>> null
>> > > (keep
>> > > > its current behavior) or throw 

Re: [DISCUSS] Shall casting functions return null or throw exceptions for invalid input

2021-11-18 Thread Kurt Young
(added user ML to this thread)

HI all,

I would like to raise a different opinion about this change. I agree with
Ingo that
we should not just break some existing behavior, and even if we introduce an
option to control the behavior, I would propose to set the default value to
current
behavior.

I want to mention one angle to assess whether we should change it or not,
which
is "what could users benefit from the changes". To me, it looks like:

* new users: happy about the behavior
* existing users: suffer from the change; it either forces them to modify
their SQL or
gets them a call late at night reporting that their online job crashed and
couldn't be
restarted.

I would like to quote another breaking change we did when we adjusted the
time-related
function in FLIP-162 [1]. In that case, both new users and existing users
suffered
from *incorrectly* implemented time function behavior, and we saw a lot of
feedback and
complaints from various channels. After we fixed that, we never saw related
problems again.

Back to this topic, have we ever seen a user complain about the current CAST
behavior? From my
side, no.

To summarize:

+1 to introduce TRY_CAST to better prepare for the future.
-1 to modify the default behavior.
+0 to introduce a config option, but with the default value set to the existing
behavior. It's +0 because it
seems unnecessary if I'm -1 on changing the default behavior and also don't
see an urgent need to modify it.


[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior

Best,
Kurt


On Thu, Nov 18, 2021 at 4:26 PM Ingo Bürk  wrote:

> Hi,
>
> first of all, thanks for the summary of both sides, and for bringing up the
> discussion on this.
> I think it is obvious that this is not something we can just "break", so
> the config option seems mandatory to me.
>
> Overall I agree with Martijn and Till that throwing errors is the more
> expected behavior. I mostly think this is valuable default behavior because
> it allows developers to find mistakes early and diagnose them much easier
> compared to having to "work backwards" and figure out that it is the CAST
> that failed. It also means that pipelines using TRY_CAST are
> self-documenting because using that can signal "we might receive broken
> data here".
>
>
> Best
> Ingo
>
> On Thu, Nov 18, 2021 at 9:11 AM Till Rohrmann 
> wrote:
>
> > Hi everyone,
> >
> > personally I would also prefer the system telling me that something is
> > wrong instead of silently ignoring records. If there is a TRY_CAST
> function
> > that has the old behaviour, people can still get the old behaviour. For
> > backwards compatibility reasons it is a good idea to introduce a switch
> to
> > get back the old behaviour. By default we could set it to the new
> > behaviour, though. Of course, we should explicitly document this new
> > behaviour so that people are aware of it before running their jobs for
> days
> > and then encountering an invalid input.
> >
> > Cheers,
> > Till
> >
> > On Thu, Nov 18, 2021 at 9:02 AM Martijn Visser 
> > wrote:
> >
> > > Hi Caizhi,
> > >
> > > Thanks for bringing this up for discussion. I think the important part
> is
> > > what do developers expect as the default behaviour of a CAST function
> > when
> > > casting fails. If I look at Postgres [1] or MSSQL [2], the default
> > > behaviour of a CAST failing would be to return an error, which would be
> > the
> > > new behaviour. Returning a value when a CAST fails can lead to users
> not
> > > understanding immediately where that value comes from. So, I would be
> in
> > > favor of the new behaviour by default, but including a configuration
> flag
> > > to maintain the old behaviour to avoid that you need to rewrite all
> these
> > > jobs.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > [1] https://www.postgresql.org/docs/current/sql-createcast.html
> > > [2]
> > >
> > >
> >
> https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver15
> > >
> > > On Thu, 18 Nov 2021 at 03:17, Caizhi Weng 
> wrote:
> > >
> > > > Hi devs!
> > > >
> > > > We're discussing the behavior of casting functions (including cast,
> > > > to_timestamp, to_date, etc.) for invalid input in
> > > > https://issues.apache.org/jira/browse/FLINK-24924. As this topic is
> > > > crucial
> > > > to compatibility and usability we'd like to continue discussing this
> > > > publicly in the mailing list.
> > > >
> > > > The main topic is to discuss that shall casting functions return null
> > > (keep
> > > > its current behavior) or throw exceptions (introduce a new behavior).
> > I'm
> > > > trying to conclude the ideas on both sides. Correct me if I miss
> > > something.
> > > >
> > > > *From the devs who support throwing exceptions (new behavior):*
> > > >
> > > > The main concern is that if we silently return null then unexpected
> > > results
> > > > or exceptions (mainly NullPointerException) may be produced. However,
> > it
> > > > will be hard 

[jira] [Created] (FLINK-24954) Reset read buffer request timeout on buffer recycling for sort-shuffle

2021-11-18 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-24954:
---

 Summary: Reset read buffer request timeout on buffer recycling for 
sort-shuffle
 Key: FLINK-24954
 URL: https://issues.apache.org/jira/browse/FLINK-24954
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: Yingjie Cao
 Fix For: 1.15.0


Currently, the read buffer request timeout implementation of sort-shuffle is a 
little aggressive. As reported in the mailing list: 
[https://lists.apache.org/thread/bd3s5bqfg9oxlb1g1gg3pxs3577lhf88]. The 
TimeoutException may be triggered if there is data skew and the downstream task 
is slow. Actually, we can further improve this case by resetting the request 
timeout on buffer recycling.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
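The proposed reset can be sketched as follows. This is a hypothetical, simplified model (not Flink's actual buffer pool classes): the request timeout is measured from the last observed progress, and recycling a buffer counts as progress, so a slow downstream task no longer trips the timeout as long as buffers keep flowing back.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a read buffer pool whose request timeout resets
// whenever a buffer is served or recycled (i.e. whenever progress is made).
class ReadBufferPool {
    private final long timeoutMillis;
    private final Deque<Object> freeBuffers = new ArrayDeque<>();
    private long lastProgressMillis;

    ReadBufferPool(int size, long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastProgressMillis = nowMillis;
        for (int i = 0; i < size; i++) {
            freeBuffers.add(new Object());
        }
    }

    /** Returns a buffer, or null while the pool is empty but not yet timed out. */
    Object requestBuffer(long nowMillis) {
        Object buffer = freeBuffers.poll();
        if (buffer != null) {
            lastProgressMillis = nowMillis; // progress: a request was served
            return buffer;
        }
        if (nowMillis - lastProgressMillis >= timeoutMillis) {
            throw new IllegalStateException("Buffer request timed out");
        }
        return null;
    }

    /** Recycling also counts as progress and therefore resets the timeout. */
    void recycle(Object buffer, long nowMillis) {
        freeBuffers.add(buffer);
        lastProgressMillis = nowMillis;
    }
}
```

With this model, a skewed workload that slowly but steadily recycles buffers keeps pushing the deadline back instead of failing with a TimeoutException.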


[jira] [Created] (FLINK-24953) Optimize hive parallelism inference

2021-11-18 Thread xiangqiao (Jira)
xiangqiao created FLINK-24953:
-

 Summary: Optimize hive parallelism inference
 Key: FLINK-24953
 URL: https://issues.apache.org/jira/browse/FLINK-24953
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive
Affects Versions: 1.14.0, 1.13.0
Reporter: xiangqiao


Currently, when I disable hive table source parallelism inference and set 
parallelism.default: 100 using the following configuration: 
{code:java}
table.exec.hive.infer-source-parallelism: false 
parallelism.default: 100{code}
The result is that the parallelism of the hive table source is {*}1{*}; the 
default parallelism configuration does not take effect.

I will fix this problem. In the future, when hive table source parallelism 
inference is disabled, the parallelism of the hive table source will be 
determined in the following order:
 
1. If table.exec.resource.default-parallelism is set, the configured value will 
be used

2. If parallelism.default is set, the configured value is used

3. If the above two configuration items are not set, the default value is 1

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
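The fallback order above can be sketched as a plain lookup. The option keys are Flink's real configuration names, but the code itself is a simplified illustration, not the planner's implementation:

```java
import java.util.Map;

class SourceParallelism {
    // Sketch of the proposed fallback order when source parallelism
    // inference is disabled; configuration is modeled as a plain map.
    static int decide(Map<String, String> conf) {
        String v = conf.get("table.exec.resource.default-parallelism");
        if (v != null) {
            return Integer.parseInt(v); // 1. table-level default wins
        }
        v = conf.get("parallelism.default");
        if (v != null) {
            return Integer.parseInt(v); // 2. then the cluster-wide default
        }
        return 1;                       // 3. otherwise fall back to 1
    }
}
```

For the reporter's case (only parallelism.default: 100 set), step 2 would apply and the source would run with parallelism 100 instead of 1.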


[jira] [Created] (FLINK-24952) Rowtime attributes must not be in the input rows of a regular join. As a workaround you can cast the time attributes of input tables to TIMESTAMP before

2021-11-18 Thread wangbaohua (Jira)
wangbaohua created FLINK-24952:
--

 Summary: Rowtime attributes must not be in the input rows of a 
regular join. As a workaround you can cast the time attributes of input tables 
to TIMESTAMP before
 Key: FLINK-24952
 URL: https://issues.apache.org/jira/browse/FLINK-24952
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.13.1
 Environment: public void test() throws Exception {

final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);
env.enableCheckpointing(5000);  // checkpoint every 5000 ms
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

Properties browseProperties = new Properties();
browseProperties.put("bootstrap.servers", "192.168.1.25:9093");
browseProperties.put("group.id", "temporal");
browseProperties.put("auto.offset.reset", "latest");

PropTransformMap.getInstance().readConfigMap("./conf/cfg.properties");
Map<String, String> configMap = new HashMap<>();
configMap.put(Constants.DB_JDBC_USER, "root");
configMap.put(Constants.DB_JDBC_PASSWD, "1qazXSW@3edc");
configMap.put(Constants.DB_JDBC_URL, 
"jdbc:mysql://192.168.1.25:3306/SSA?useUnicode=true=utf-8");
configMap.put(Constants.DB_JDBC_DRIVER, 
"com.mysql.jdbc.Driver");
configMap.put(Constants.INITAL_POOL_SIZE, "10");
configMap.put(Constants.MIN_POOL_SIZE, "5");
configMap.put(Constants.MAX_IDLE_TIME, "50");
configMap.put(Constants.MAX_STATE_ELEMENTS, "100");
configMap.put(Constants.MAX_IDLE_TIME, "60");
DbFetcher dbFetcher = new DbFetcher(configMap);
List listRule = RuleReader.readRules(dbFetcher);
System.out.println("ListRule::" + listRule.size());

final String RULE_SBROAD_CAST_STATE = "RulesBroadcastState";

RuleParse ruleParse = new RuleParse();
Map properties = new HashMap();
ruleParse.parseData("./conf/cfg.json");

// 1. Read configuration messages from MySQL
DataStream> conf = env.addSource(new 
MysqlSourceFunction1(dbFetcher));

// 2. Create the MapStateDescriptor describing the data types of the broadcast data
MapStateDescriptor<String, List<String>> ruleStateDescriptor = 
new MapStateDescriptor<>(RULE_SBROAD_CAST_STATE
, BasicTypeInfo.STRING_TYPE_INFO
, new ListTypeInfo<>(String.class));
// 3. Broadcast conf, returning a BroadcastStream
final BroadcastStream> confBroadcast = 
conf.broadcast(ruleStateDescriptor);

//DataStream dataStream = 
env.fromElements("{\"ORG_ID\":\"1\",\"RAW_MSG\":\"useradd,su - 
root\",\"EVENT_THREE_TYPE\":\"40001\",\"EVENT_TWO_TYPE\":\"40001\",\"SRC_PORT\":\"123\",\"DST_PORT\":\"124\",\"DST_IP\":\"10.16.254.11\",\"SRC_IP\":\"50.115.134.50\",\"CREATE_TIME\":\"2021-07-09
 
18:15:21.001\",\"DEVICE_PARENT_TYPE\":\"LINUX\",\"SNOW_ID\":\"85512\",\"EVENT_THREE_TYPE_DESC\":\"暴力破解失败\",\"ts\":\"2021-05-27
 
16:06:58\",\"ACCOUNT\":\"asap\",\"collectionName\":\"bwdOMS\",\"eRuleId\":\"0\",\"RULE_TJ_COUNT\":11,\"TAGS\":{\"EVENT_ONE_TYPE\":\"2\",\"DIRECTION\":\"内部\",\"EVENT_TWO_TYPE\":\"10015\",\"EVENT_THREE_TYPE\":\"20101\"},\"DEVICE_TYPE\":\"OSM\",\"DIRECTION\":\"0\"}\n");
DataStream dataStream = 
env.fromElements("{\"DIRECTION\":\"0\",\"ATTACK_STAGE\":\"命令控制\",\"DEVICE_PARENT_TYPE\":\"IPS\",\"URL\":\"www.baidu.com\",\"SRC_PORT\":\"58513\",\"DST_PORT\":\"31177\",\"RISK_LEVEL\":\"99\",\"SRC_ASSET_TYPE\":\"4\",\"SRC_ASSET_SUB_TYPE\":\"412\",\"DST_ASSET_TYPE\":\"4\",\"DST_ASSET_SUB_TYPE\":\"412\",\"SRC_POST\":\"1\",\"DST_POST\":\"0\",\"INSERT_TIME\":\"2021-05-01
 
00:00:00.000\",\"DST_ASSET_NAME\":\"ddde\",\"SRC_ASSET_NAME\":\"wangwu\",\"SCENE_ID\":-5216633060008277343,\"SOURCE\":\"4\",\"ASSET_IP\":\"73.243.143.114\",\"TENANT_ID\":\"-1\",\"ORG_ID\":\"1\",\"DST_IP\":\"192.118.8.218\",\"EVENT_TYPE\":\"1008\",\"SRC_IP\":\"153.79.42.45\",\"CUSTOM_VALUE1\":\"187.36.226.184\",\"CUSTOM_VALUE2\":\"41.68.25.104\",\"SRC_PROVINCE\":\"日本\",\"SNOW_ID\":\"469260998\",\"DEVICE_TYPE\":\"TDA\",\"MESSAGE\":\"\",\"CHARACTER\":\"\",\"CUSTOM_LABEL1\":\"控制IP\",\"CREATE_TIME\":\"2021-04-26
 
17:04:18.000\",\"CUSTOM_LABEL2\":\"受控IP\",\"TYPE\":\"未知\",\"MALWARE_TYPE\":\"其他\",\"collectTime\":\"2021-04-29
 
19:40:36.000\",\"RULE_ID\":\"180607832\",\"DEVICE_IP\":\"239.150.69.203\",\"SRC_CITY\":\"日本\",\"recordTime\":\"2021-04-30
 

[jira] [Created] (FLINK-24951) Allow watch bookmarks to mitigate frequent watcher rebuilding

2021-11-18 Thread Yangze Guo (Jira)
Yangze Guo created FLINK-24951:
--

 Summary: Allow watch bookmarks to mitigate frequent watcher 
rebuilding
 Key: FLINK-24951
 URL: https://issues.apache.org/jira/browse/FLINK-24951
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Kubernetes
Affects Versions: 1.15.0
Reporter: Yangze Guo
 Fix For: 1.15.0


In some production environments, there are a massive number of pods being created 
and deleted. Thus the global resource version is updated very quickly, which may cause 
frequent watcher rebuilding because of "too old resource version" errors. To avoid this, 
K8s provides a bookmark mechanism [1].

I propose to enable bookmarks by default.

[1] https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
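A rough sketch of why bookmarks help (hypothetical types, not the actual Kubernetes client API): a BOOKMARK event carries only a resourceVersion, and remembering it lets a rebuilt watcher resume from a recent version instead of one the API server has already compacted away.

```java
// Hypothetical watcher bookkeeping: every event, including bookmarks,
// advances the resourceVersion to resume from after a watch is rebuilt.
class WatchTracker {
    enum EventType { ADDED, MODIFIED, DELETED, BOOKMARK }

    private String lastResourceVersion;

    void onEvent(EventType type, String resourceVersion) {
        lastResourceVersion = resourceVersion;
        if (type == EventType.BOOKMARK) {
            return; // bookmark: no object payload to process
        }
        // ... handle the pod change itself ...
    }

    /** Version to pass when rebuilding the watch after a disconnect. */
    String resumeFrom() {
        return lastResourceVersion;
    }
}
```

Without bookmarks, a watcher that sees few relevant events would try to resume from an old version and hit "too old resource version" whenever unrelated pods churn the global version quickly.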


Re: [DISCUSS] Shall casting functions return null or throw exceptions for invalid input

2021-11-18 Thread Ingo Bürk
Hi,

first of all, thanks for the summary of both sides, and for bringing up the
discussion on this.
I think it is obvious that this is not something we can just "break", so
the config option seems mandatory to me.

Overall I agree with Martijn and Till that throwing errors is the more
expected behavior. I mostly think this is valuable default behavior because
it allows developers to find mistakes early and diagnose them much easier
compare to having to "work backwards" and figure out that it is the CAST
that failed. It also means that pipelines using TRY_CAST are
self-documenting because using that can signal "we might receive broken
data here".


Best
Ingo

On Thu, Nov 18, 2021 at 9:11 AM Till Rohrmann  wrote:

> Hi everyone,
>
> personally I would also prefer the system telling me that something is
> wrong instead of silently ignoring records. If there is a TRY_CAST function
> that has the old behaviour, people can still get the old behaviour. For
> backwards compatibility reasons it is a good idea to introduce a switch to
> get back the old behaviour. By default we could set it to the new
> behaviour, though. Of course, we should explicitly document this new
> behaviour so that people are aware of it before running their jobs for days
> and then encountering an invalid input.
>
> Cheers,
> Till
>
> On Thu, Nov 18, 2021 at 9:02 AM Martijn Visser 
> wrote:
>
> > Hi Caizhi,
> >
> > Thanks for bringing this up for discussion. I think the important part is
> > what do developers expect as the default behaviour of a CAST function
> when
> > casting fails. If I look at Postgres [1] or MSSQL [2], the default
> > behaviour of a CAST failing would be to return an error, which would be
> the
> > new behaviour. Returning a value when a CAST fails can lead to users not
> > understanding immediately where that value comes from. So, I would be in
> > favor of the new behaviour by default, but including a configuration flag
> > to maintain the old behaviour to avoid that you need to rewrite all these
> > jobs.
> >
> > Best regards,
> >
> > Martijn
> >
> > [1] https://www.postgresql.org/docs/current/sql-createcast.html
> > [2]
> >
> >
> https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver15
> >
> > On Thu, 18 Nov 2021 at 03:17, Caizhi Weng  wrote:
> >
> > > Hi devs!
> > >
> > > We're discussing the behavior of casting functions (including cast,
> > > to_timestamp, to_date, etc.) for invalid input in
> > > https://issues.apache.org/jira/browse/FLINK-24924. As this topic is
> > > crucial
> > > to compatibility and usability we'd like to continue discussing this
> > > publicly in the mailing list.
> > >
> > > The main topic is to discuss that shall casting functions return null
> > (keep
> > > its current behavior) or throw exceptions (introduce a new behavior).
> I'm
> > > trying to conclude the ideas on both sides. Correct me if I miss
> > something.
> > >
> > > *From the devs who support throwing exceptions (new behavior):*
> > >
> > > The main concern is that if we silently return null then unexpected
> > results
> > > or exceptions (mainly NullPointerException) may be produced. However,
> it
> > > will be hard for users to reason about this because there is no detailed
> > > message. If we throw exceptions in the first place, then it's much
> easier
> > > to catch the exception with nice detailed messages explaining what is
> > going
> > > wrong. Especially for this case of DATE/TIME/TIMESTAMP it's very
> helpful
> > to
> > > have a detailed error and see where and why the parsing broke.
> > >
> > > For compatibility concerns, we can provide a TRY_CAST function which is
> > > exactly the same as the current CAST function by returning nulls for
> > > invalid input.
> > >
> > > *From the devs who support return null (current behavior):*
> > >
> > > The main concern is compatibility and usability.
> > >
> > > On usability: The upstream system may occasionally produce invalid data
> > and
> > > if we throw exceptions when seeing this it will fail the job again and
> > > again even after restart (because the invalid data is always
> > > there). Streaming computing is a resident program and users do not want
> > it
> > > to frequently fail and cannot automatically recover. Most users are
> > willing
> > > to just skip that record and continue processing. Imagine an online job
> > > running for a couple of weeks and suddenly fails due to some unexpected
> > > dirty data. What choices do users have to quickly resume the job?
> > >
> > > On compatibility: There are currently thousands of users and tens of
> > > thousands of jobs relying on the current behavior to filter out invalid
> > > input. If we change the behavior it will be a disaster for users as
> they
> > > have to rewrite and check their SQL very carefully.
> > >
> > >
> > > What do you think? We're looking forward to your feedback.
> > >
> >
>


Re: [DISCUSS] Shall casting functions return null or throw exceptions for invalid input

2021-11-18 Thread Till Rohrmann
Hi everyone,

personally I would also prefer the system telling me that something is
wrong instead of silently ignoring records. If there is a TRY_CAST function
that has the old behaviour, people can still get the old behaviour. For
backwards compatibility reasons it is a good idea to introduce a switch to
get back the old behaviour. By default we could set it to the new
behaviour, though. Of course, we should explicitly document this new
behaviour so that people are aware of it before running their jobs for days
and then encountering an invalid input.

Cheers,
Till

On Thu, Nov 18, 2021 at 9:02 AM Martijn Visser 
wrote:

> Hi Caizhi,
>
> Thanks for bringing this up for discussion. I think the important part is
> what do developers expect as the default behaviour of a CAST function when
> casting fails. If I look at Postgres [1] or MSSQL [2], the default
> behaviour of a CAST failing would be to return an error, which would be the
> new behaviour. Returning a value when a CAST fails can lead to users not
> understanding immediately where that value comes from. So, I would be in
> favor of the new behaviour by default, but including a configuration flag
> to maintain the old behaviour to avoid that you need to rewrite all these
> jobs.
>
> Best regards,
>
> Martijn
>
> [1] https://www.postgresql.org/docs/current/sql-createcast.html
> [2]
>
> https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver15
>
> On Thu, 18 Nov 2021 at 03:17, Caizhi Weng  wrote:
>
> > Hi devs!
> >
> > We're discussing the behavior of casting functions (including cast,
> > to_timestamp, to_date, etc.) for invalid input in
> > https://issues.apache.org/jira/browse/FLINK-24924. As this topic is
> > crucial
> > to compatibility and usability we'd like to continue discussing this
> > publicly in the mailing list.
> >
> > The main topic is to discuss that shall casting functions return null
> (keep
> > its current behavior) or throw exceptions (introduce a new behavior). I'm
> > trying to conclude the ideas on both sides. Correct me if I miss
> something.
> >
> > *From the devs who support throwing exceptions (new behavior):*
> >
> > The main concern is that if we silently return null then unexpected
> results
> > or exceptions (mainly NullPointerException) may be produced. However, it
> > will be hard for users to reason about this because there is no detailed
> > message. If we throw exceptions in the first place, then it's much easier
> > to catch the exception with nice detailed messages explaining what is
> going
> > wrong. Especially for this case of DATE/TIME/TIMESTAMP it's very helpful
> to
> > have a detailed error and see where and why the parsing broke.
> >
> > For compatibility concerns, we can provide a TRY_CAST function which is
> > exactly the same as the current CAST function by returning nulls for
> > invalid input.
> >
> > *From the devs who support return null (current behavior):*
> >
> > The main concern is compatibility and usability.
> >
> > On usability: The upstream system may occasionally produce invalid data
> and
> > if we throw exceptions when seeing this it will fail the job again and
> > again even after restart (because the invalid data is always
> > there). Streaming computing is a resident program and users do not want
> it
> > to frequently fail and cannot automatically recover. Most users are
> willing
> > to just skip that record and continue processing. Imagine an online job
> > running for a couple of weeks and suddenly fails due to some unexpected
> > dirty data. What choices do users have to quickly resume the job?
> >
> > On compatibility: There are currently thousands of users and tens of
> > thousands of jobs relying on the current behavior to filter out invalid
> > input. If we change the behavior it will be a disaster for users as they
> > have to rewrite and check their SQL very carefully.
> >
> >
> > What do you think? We're looking forward to your feedback.
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Yingjie Cao

2021-11-18 Thread Martijn Visser
Congratulations!

On Thu, 18 Nov 2021 at 02:44, Leonard Xu  wrote:

> Congratulations, Yingjie!
>
> Best,
> Leonard
>
> > On 2021-11-18, at 01:40, Till Rohrmann wrote:
> >
> > Congratulations Yingjie!
>
>


Re: [DISCUSS] Shall casting functions return null or throw exceptions for invalid input

2021-11-18 Thread Martijn Visser
Hi Caizhi,

Thanks for bringing this up for discussion. I think the important part is
what do developers expect as the default behaviour of a CAST function when
casting fails. If I look at Postgres [1] or MSSQL [2], the default
behaviour of a CAST failing would be to return an error, which would be the
new behaviour. Returning a value when a CAST fails can lead to users not
understanding immediately where that value comes from. So, I would be in
favor of the new behaviour by default, but including a configuration flag
to maintain the old behaviour to avoid that you need to rewrite all these
jobs.

Best regards,

Martijn

[1] https://www.postgresql.org/docs/current/sql-createcast.html
[2]
https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver15

On Thu, 18 Nov 2021 at 03:17, Caizhi Weng  wrote:

> Hi devs!
>
> We're discussing the behavior of casting functions (including cast,
> to_timestamp, to_date, etc.) for invalid input in
> https://issues.apache.org/jira/browse/FLINK-24924. As this topic is
> crucial
> to compatibility and usability we'd like to continue discussing this
> publicly in the mailing list.
>
> The main topic is to discuss that shall casting functions return null (keep
> its current behavior) or throw exceptions (introduce a new behavior). I'm
> trying to conclude the ideas on both sides. Correct me if I miss something.
>
> *From the devs who support throwing exceptions (new behavior):*
>
> The main concern is that if we silently return null then unexpected results
> or exceptions (mainly NullPointerException) may be produced. However, it
> will be hard for users to reason about this because there is no detailed
> message. If we throw exceptions in the first place, then it's much easier
> to catch the exception with nice detailed messages explaining what is going
> wrong. Especially for this case of DATE/TIME/TIMESTAMP it's very helpful to
> have a detailed error and see where and why the parsing broke.
>
> For compatibility concerns, we can provide a TRY_CAST function which is
> exactly the same as the current CAST function by returning nulls for
> invalid input.
>
> *From the devs who support return null (current behavior):*
>
> The main concern is compatibility and usability.
>
> On usability: The upstream system may occasionally produce invalid data and
> if we throw exceptions when seeing this it will fail the job again and
> again even after restart (because the invalid data is always
> there). Streaming computing is a resident program and users do not want it
> to frequently fail and cannot automatically recover. Most users are willing
> to just skip that record and continue processing. Imagine an online job
> running for a couple of weeks and suddenly fails due to some unexpected
> dirty data. What choices do users have to quickly resume the job?
>
> On compatibility: There are currently thousands of users and tens of
> thousands of jobs relying on the current behavior to filter out invalid
> input. If we change the behavior it will be a disaster for users as they
> have to rewrite and check their SQL very carefully.
>
>
> What do you think? We're looking forward to your feedback.
>
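For illustration, the two behaviors under discussion can be sketched with a plain string-to-int cast. This is only a semantics sketch of the proposed contract (CAST throws, TRY_CAST returns null), not Flink's implementation:

```java
// Semantics sketch: CAST under the proposed new behavior fails loudly on
// invalid input, while TRY_CAST keeps the old behavior of returning NULL.
class CastSemantics {
    static int cast(String s) {
        // Invalid input surfaces as an exception with a detailed message.
        return Integer.parseInt(s);
    }

    static Integer tryCast(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return null; // invalid input silently becomes NULL
        }
    }
}
```

A pipeline that expects dirty data would use tryCast and filter the nulls; everything else gets the early, diagnosable failure.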