[jira] [Created] (FLINK-36007) cdc load and register factory in once search

2024-08-08 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-36007:
-

 Summary: cdc load and register factory in once search
 Key: FLINK-36007
 URL: https://issues.apache.org/jira/browse/FLINK-36007
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Affects Versions: cdc-3.1.1
Reporter: Hongshun Wang
 Fix For: cdc-3.2.0
 Attachments: image-2024-08-08-17-02-34-553.png, 
image-2024-08-08-17-03-20-893.png

Currently, CDC searches for the factory to generate the source and sink, and 
then searches again to get the URL, which is unnecessary.
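
A minimal sketch of the idea (class and method names below are made up, not the actual CDC utilities): resolve the factory once and keep both the instance and the URL of the jar it was loaded from, so no second search is needed.

{code:java}
// Hypothetical sketch: one discovery pass returns both the factory and its jar URL.
public final class FactoryDiscovery {

    public static final class Result<T> {
        public final T factory;
        public final java.net.URL location;

        Result(T factory, java.net.URL location) {
            this.factory = factory;
            this.location = location;
        }
    }

    public static <T> Result<T> discover(Class<T> factoryClass, String identifier) {
        for (T candidate : java.util.ServiceLoader.load(factoryClass)) {
            if (matches(candidate, identifier)) {
                // Remember where the factory came from so a second search is unnecessary.
                java.net.URL location =
                        candidate.getClass().getProtectionDomain().getCodeSource().getLocation();
                return new Result<>(candidate, location);
            }
        }
        throw new IllegalArgumentException("No factory found for identifier: " + identifier);
    }

    private static boolean matches(Object candidate, String identifier) {
        // Placeholder matching logic; the real implementation would check the factory identifier.
        return candidate.getClass().getSimpleName().toLowerCase().contains(identifier.toLowerCase());
    }
}
{code}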

 

!image-2024-08-08-17-03-20-893.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34936) Register shared state files to FileMergingSnapshotManager

2024-03-25 Thread Jinzhong Li (Jira)
Jinzhong Li created FLINK-34936:
---

 Summary: Register shared state files to FileMergingSnapshotManager
 Key: FLINK-34936
 URL: https://issues.apache.org/jira/browse/FLINK-34936
 Project: Flink
  Issue Type: New Feature
Reporter: Jinzhong Li
 Fix For: 1.20.0


The shared state files should be registered with the 
FileMergingSnapshotManager so that these files can be properly cleaned up 
when a checkpoint is aborted or subsumed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Meet our keynote speakers and register to Community Over Code EU!

2023-12-22 Thread Ryan Skraba
[Note: You're receiving this email because you are subscribed to one or
more project dev@ mailing lists at the Apache Software Foundation.]

Merge with the ASF EUniverse! The registration for Community Over Code Europe is
finally open! Get your tickets now and save your spot!

We are happy to announce that we have confirmed the first featured speakers!
- Asim Hussain, Executive Director at Green Software Foundation
- Dirk-Willem Van Gulik, VP of Public Policy at The Apache Software Foundation
- Ruth Ikega, Community Lead at CHAOSS Africa

Visit our website to learn more about this amazing lineup.

CFP is open. We are looking forward to hearing all you have to share with the
Apache Community. Please submit your talk proposal before January 12, 2024.

Interested in boosting your brand? Take a look at our prospectus and find out
the opportunities we have for you.

Be one step ahead and book your room at the hotel venue. We have a special rate
for you at the Radisson Blu Carlton, the hotel that will hold Community Over
Code EU. Learn more about the location and venue and book your accommodation.

Should you have any questions, please do not hesitate to contact us. We wish you
Happy Holidays in the company of your loved ones! See you in Bratislava next year!

Community Over Code EU Organizer Committee


[jira] [Created] (FLINK-33924) Register for INFRA's self-hosted runner trial

2023-12-21 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33924:
-

 Summary: Register for INFRA's self-hosted runner trial
 Key: FLINK-33924
 URL: https://issues.apache.org/jira/browse/FLINK-33924
 Project: Flink
  Issue Type: Sub-task
Reporter: Matthias Pohl






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-30126) Delay registration of the catalog, register the catalog as needed

2022-11-21 Thread melin (Jira)
melin created FLINK-30126:
-

 Summary: Delay registration of the catalog, register the catalog 
as needed
 Key: FLINK-30126
 URL: https://issues.apache.org/jira/browse/FLINK-30126
 Project: Flink
  Issue Type: New Feature
Reporter: melin


Our data platform has registered many relational database data sources such as 
MySQL, and the data source code is used as the catalog name. We are not sure which data 
sources need to register a catalog in Flink, so we hope that the required 
catalog can be loaded dynamically when SQL is executed, with Flink providing the 
interface. Users could then customize when a catalog is registered.
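
A minimal sketch of the lazy-registration idea, assuming a hypothetical provider callback (this is not an existing Flink API): the catalog is only created and registered the first time a statement references it.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.Catalog;

// Hypothetical sketch: catalogs are created on demand instead of being registered up front.
public class LazyCatalogRegistry {

    private final TableEnvironment tEnv;
    private final Function<String, Catalog> provider; // user-supplied: catalog name -> Catalog
    private final Map<String, Catalog> registered = new ConcurrentHashMap<>();

    public LazyCatalogRegistry(TableEnvironment tEnv, Function<String, Catalog> provider) {
        this.tEnv = tEnv;
        this.provider = provider;
    }

    // Called just before executing SQL that references "catalogName".
    public void ensureRegistered(String catalogName) {
        registered.computeIfAbsent(catalogName, name -> {
            Catalog catalog = provider.apply(name);
            tEnv.registerCatalog(name, catalog);
            return catalog;
        });
    }
}
{code}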



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28866) Use DDL instead of legacy method to register the test source in JoinITCase

2022-08-08 Thread godfrey he (Jira)
godfrey he created FLINK-28866:
--

 Summary: Use DDL instead of legacy method to register the test 
source in JoinITCase
 Key: FLINK-28866
 URL: https://issues.apache.org/jira/browse/FLINK-28866
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Reporter: godfrey he





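The summary asks to replace the legacy registration with DDL; a hedged sketch of what DDL-based registration of a test source can look like (table name, schema, and connector options below are made up, not the actual JoinITCase setup):

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DdlRegisteredSourceExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

        // Register the test source via DDL instead of the legacy registerTableSource(...) call.
        tEnv.executeSql(
                "CREATE TEMPORARY TABLE T1 (\n"
                        + "  a INT,\n"
                        + "  b BIGINT,\n"
                        + "  c STRING\n"
                        + ") WITH (\n"
                        + "  'connector' = 'datagen',\n"
                        + "  'number-of-rows' = '10'\n"
                        + ")");

        tEnv.executeSql("SELECT * FROM T1").print();
    }
}
{code}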

--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28621) Register JavaTimeModule in all internal object mappers

2022-07-21 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-28621:


 Summary: Register JavaTimeModule in all internal object mappers
 Key: FLINK-28621
 URL: https://issues.apache.org/jira/browse/FLINK-28621
 Project: Flink
  Issue Type: Improvement
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Reporter: Chesnay Schepler
 Fix For: 1.16.0, 1.15.2


In FLINK-25588 we extended flink-shaded-jackson to also bundle the jackson 
extensions for handling Java 8 time classes, but barely any of the internal 
object mappers were adjusted to register said module.

As a result it is for example not possible to read an Instant from a CSV with 
the CsvReaderFormat.
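
For reference, registering the module is a one-liner on the mapper; a minimal sketch with plain Jackson (not Flink's shaded internal mappers or the CsvReaderFormat) that reads an ISO-8601 string back as an Instant:

{code:java}
import java.time.Instant;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;

public class JavaTimeModuleExample {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        mapper.registerModule(new JavaTimeModule()); // without this, Instant (de)serialization fails

        Instant instant = mapper.readValue("\"2022-07-21T10:15:30Z\"", Instant.class);
        System.out.println(instant);
    }
}
{code}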



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-27658) Introduce MutableURLClassLoader allow to register and remove user jar dynamically

2022-05-16 Thread dalongliu (Jira)
dalongliu created FLINK-27658:
-

 Summary: Introduce MutableURLClassLoader allow to register and 
remove user jar dynamically
 Key: FLINK-27658
 URL: https://issues.apache.org/jira/browse/FLINK-27658
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.16.0
Reporter: dalongliu
 Fix For: 1.16.0






--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-26775) PyFlink WindowOperator#process_element register wrong cleanup timer

2022-03-21 Thread Juntao Hu (Jira)
Juntao Hu created FLINK-26775:
-

 Summary: PyFlink WindowOperator#process_element register wrong 
cleanup timer
 Key: FLINK-26775
 URL: https://issues.apache.org/jira/browse/FLINK-26775
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.15.0
Reporter: Juntao Hu
 Fix For: 1.15.0


In window_operator.py line 378, when dealing with merging window assigner:
{code:python}
self.register_cleanup_timer(window)
{code}
This should register a cleanup timer for `actual_window`; however, it does not 
cause window-emitting bugs as long as the session window trigger is implemented 
correctly.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25862) Refactor SharedStateRegistry to limit StreamStateHandle to register/unregister

2022-01-28 Thread Yun Tang (Jira)
Yun Tang created FLINK-25862:


 Summary: Refactor SharedStateRegistry to limit StreamStateHandle 
to register/unregister
 Key: FLINK-25862
 URL: https://issues.apache.org/jira/browse/FLINK-25862
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing, Runtime / State Backends
Reporter: Yun Tang


The current implementation of SharedStateRegistry uses `StreamStateHandle` to 
register and unregister state. This limits the usage for other components, such 
as changelog state backend handles.
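
A rough sketch of the direction, with hypothetical names (this is not the actual SharedStateRegistry interface): keying the registry on a generic handle type instead of StreamStateHandle would let other handle kinds participate.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a registry keyed by an abstract handle type rather than StreamStateHandle.
interface SharedState {
    void discard() throws Exception; // release the underlying resource (file, changelog segment, ...)
}

class GenericSharedStateRegistry<K, H extends SharedState> {

    private final Map<K, H> registered = new ConcurrentHashMap<>();

    // Returns the already-registered handle if one exists, otherwise keeps the new one.
    H registerReference(K key, H handle) throws Exception {
        H existing = registered.putIfAbsent(key, handle);
        if (existing != null && existing != handle) {
            handle.discard(); // duplicate registration: drop the redundant handle
            return existing;
        }
        return handle;
    }

    void unregister(K key) throws Exception {
        H handle = registered.remove(key);
        if (handle != null) {
            handle.discard();
        }
    }
}
{code}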



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25669) Support register operator coordinators for newly initialized ExecutionJobVertex.

2022-01-17 Thread Lijie Wang (Jira)
Lijie Wang created FLINK-25669:
--

 Summary: Support register operator coordinators for newly 
initialized ExecutionJobVertex.
 Key: FLINK-25669
 URL: https://issues.apache.org/jira/browse/FLINK-25669
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Lijie Wang






--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-24086) Do not re-register SharedStateRegistry to reduce the recovery time of the job

2021-08-31 Thread ming li (Jira)
ming li created FLINK-24086:
---

 Summary: Do not re-register SharedStateRegistry to reduce the 
recovery time of the job
 Key: FLINK-24086
 URL: https://issues.apache.org/jira/browse/FLINK-24086
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing
Reporter: ming li


At present, we only recover the {{CompletedCheckpointStore}} when the 
{{JobManager}} starts, so it seems that we do not need to re-register the 
{{SharedStateRegistry}} when the task restarts.


The reason for this issue is that in our production environment, we discard 
part of the data and state so that only the failed task is restarted, but we found 
that it may take several seconds to register the {{SharedStateRegistry}} (thousands of 
tasks and dozens of TB of state). When there are a large number of task failures 
at the same time, this may take several minutes (number of tasks * several 
seconds).

 

Therefore, if the {{SharedStateRegistry}} can be reused, the time for task 
recovery can be reduced.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter

2021-06-24 Thread 云华
 @Timo Walther Thanks for the great reply. 

1. TIMESTAMP_WITH_TIME_ZONE issue: 
https://issues.apache.org/jira/browse/FLINK-23145 will track it.

2. Unsupported types of other systems:
I basically agree with providing modules to load specific types on demand. But my 
concerns are: 
2.1 The conversions for different data sources each have their own implementation (JDBC 
in 
org.apache.flink.connector.jdbc.internal.converter.AbstractJdbcRowConverter#createInternalConverter,
 Postgres in 
org.apache.flink.connector.jdbc.internal.converter.PostgresRowConverter#createInternalConverter).
 
2.2 SQL extension types have no dynamic registration mechanism (e.g. CREATE 
TYPE user_enum AS ENUM ('enum1', 'enum2', 'enum3');
CREATE TABLE user_enum_tb( enum_column user_enum);).

In summary, Flink DataType lets users define the custom types they want, but it 
lacks an extension point from data source types to Flink data types, and the 
reverse. I suggest that we provide the basic type conversions for the most common 
cases, let different connectors load the type conversions they need, and at the 
same time provide an extension mechanism for them to register custom types or 
override a type conversion.


--
From: Timo Walther 
Sent: Thursday, June 24, 2021 21:06
To: dev 
Subject: Re: Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter

Hi Jack,

thanks for sharing your proposal with us. I totally understand the 
issues that you are trying to solve. Having more flexible type support 
in the connectors is definitely a problem that we would like to address 
in the mid term. It is already considered in our internal roadmap 
planning.

I haven't taken a deeper look at your current proposal but will do so 
soon. Until then, let me give you some general feedback.

I see a couple of orthogonal issues that we need to solve:

1) The TIMESTAMP_WITH_TIME_ZONE problem: this is one of the easier 
issues that we simply need to fix on the runtime side. We are planning 
to support this type because it is one of the core data structures that 
you need in basically every pipeline.

2) Unsupported types of other systems: As Jark said, we offer support 
for RAW types and also user-defined structured types. Since most of the 
prerequisite work has been done for user-defined types (e.g. a central 
type registry), I could imagine that we are able to extend Flink's type 
system soon. My idea would be to provide modules via Flink's module 
system to load Postgres- or MySQL-specific types that could then be used 
at all regular locations such as DDL or functions.

3) Add connector-specific type information in DDL: We should allow 
enriching the automatic schema conversion step when translating DDL into 
other systems' types. This is where your proposal might make sense.


Regards,
Timo


On 24.06.21 14:19, 云华 wrote:
> 
> @Jark Wu thanks for the reply. However, there are several cases I want to cover:
> 
> 1. Unknown type CITEXT:
> Flink SQL cannot execute "CREATE TABLE string_table (pk SERIAL, vc 
> VARCHAR(2), vcv CHARACTER VARYING(2), ch CHARACTER(4), c CHAR(3), t TEXT, b 
> BYTEA, bnn BYTEA NOT NULL, ct CITEXT, PRIMARY KEY(pk));".
> This is because 
> org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType cannot 
> support CITEXT.
> 
> 2. TIMESTAMP_WITH_TIME_ZONE unsupported: 
> org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal 
> cannot support TIMESTAMP_WITH_TIME_ZONE.
> 3. Unsupported types (MySQL): 
> org.apache.flink.connector.jdbc.dialect.MySQLDialect#unsupportedTypes provides 
> the unsupported types.
> 4. Unsupported types (Postgres): 
> org.apache.flink.connector.jdbc.dialect.PostgresDialect#unsupportedTypes 
> provides the unsupported types.
> 5. (Postgres) Parts of the type implementations are referenced from Postgres: 
> https://www.postgresql.org/docs/12/datatype.html
> 6. (MySQL) Parts of the type implementations are referenced from MySQL: 
> https://dev.mysql.com/doc/refman/8.0/en/data-types.html
> 
> 
> Please let me know if you have any suggestions.
> 
> 
> ------
> From: Jark Wu 
> Sent: Wednesday, June 23, 2021 23:13
> To: dev ; 云华 
> Subject: Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter
> 
> Hi,
> 
> `TIMESTAMP_WITH_TIME_ZONE` is not supported in the Flink SQL engine,
>   even though it is listed in the type API.
> 
> I think what you are looking for is the RawValueType which can be used as
> user-defined type. You can use `DataTypes.RAW(TypeInformation)` to define
>   a Raw type with the given TypeInformation which includes the serializer
> and deserializer.
> 
> Best,
> Jark
> On Wed, 23 Jun 2021 at 21:09, 云华  wrote:
> 
>   Hi everyone,
>   I want to rework type conversion system in connect

Re: 回复:[DISCUSS] [FLINK-23122] Provide the Dynamic register converter

2021-06-24 Thread Timo Walther

Hi Jack,

thanks for sharing your proposal with us. I totally understand the 
issues that you are trying to solve. Having more flexible type support 
in the connectors is definitely a problem that we would like to address 
in the mid term. It is already considered in our internal roadmap 
planning.


I haven't taken a deeper look at your current proposal but will do so 
soon. Until then, let me give you some general feedback.


I see a couple of orthogonal issues that we need to solve:

1) The TIMESTAMP_WITH_TIME_ZONE problem: this is one of the easier 
issues that we simply need to fix on the runtime side. We are planning 
to support this type because it is one of the core data structures that 
you need in basically every pipeline.


2) Unsupported types of other systems: As Jark said, we offer support 
for RAW types and also user-defined structured types. Since most of the 
prerequisite work has been done for user-defined types (e.g. a central 
type registry), I could imagine that we are able to extend Flink's type 
system soon. My idea would be to provide modules via Flink's module 
system to load Postgres- or MySQL-specific types that could then be used 
at all regular locations such as DDL or functions.


3) Add connector-specific type information in DDL: We should allow 
enriching the automatic schema conversion step when translating DDL into 
other systems' types. This is where your proposal might make sense.



Regards,
Timo


On 24.06.21 14:19, 云华 wrote:


@Jark Wu thanks for the reply. However, there are several cases I want to cover:

1. Unknown type CITEXT:
Flink SQL cannot execute "CREATE TABLE string_table (pk SERIAL, vc VARCHAR(2), vcv 
CHARACTER VARYING(2), ch CHARACTER(4), c CHAR(3), t TEXT, b BYTEA, bnn BYTEA NOT NULL, ct 
CITEXT, PRIMARY KEY(pk));".
This is because 
org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType cannot 
support CITEXT.

2. TIMESTAMP_WITH_TIME_ZONE unsupported: 
org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal 
cannot support TIMESTAMP_WITH_TIME_ZONE.
3. Unsupported types (MySQL): 
org.apache.flink.connector.jdbc.dialect.MySQLDialect#unsupportedTypes provides 
the unsupported types.
4. Unsupported types (Postgres): 
org.apache.flink.connector.jdbc.dialect.PostgresDialect#unsupportedTypes 
provides the unsupported types.
5. (Postgres) Parts of the type implementations are referenced from Postgres: 
https://www.postgresql.org/docs/12/datatype.html
6. (MySQL) Parts of the type implementations are referenced from MySQL: 
https://dev.mysql.com/doc/refman/8.0/en/data-types.html


Please let me know if you have any suggestions.


--
From: Jark Wu 
Sent: Wednesday, June 23, 2021 23:13
To: dev ; 云华 
Subject: Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter

Hi,

`TIMESTAMP_WITH_TIME_ZONE` is not supported in the Flink SQL engine,
  even though it is listed in the type API.

I think what you are looking for is the RawValueType which can be used as
user-defined type. You can use `DataTypes.RAW(TypeInformation)` to define
  a Raw type with the given TypeInformation which includes the serializer
and deserializer.

Best,
Jark
On Wed, 23 Jun 2021 at 21:09, 云华  wrote:

  Hi everyone,
  I want to rework the type conversion system in the connector and flink table module 
to be reusable and scalable.
  In the Postgres system, the type '_citext' is not supported in 
org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType.  What's 
more, 
org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal 
cannot support TIMESTAMP_WITH_TIME_ZONE.
  For more background and API design: 
https://issues.apache.org/jira/browse/FLINK-23122.
  Please let me know if this matches your thoughts.



  Regards, Jack





Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter

2021-06-24 Thread 云华

@Jark Wu thanks for the reply. However, there are several cases I want to cover: 

1. Unknown type CITEXT: 
Flink SQL cannot execute "CREATE TABLE string_table (pk SERIAL, vc VARCHAR(2), 
vcv CHARACTER VARYING(2), ch CHARACTER(4), c CHAR(3), t TEXT, b BYTEA, bnn 
BYTEA NOT NULL, ct CITEXT, PRIMARY KEY(pk));". 
This is because 
org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType cannot 
support CITEXT.

2. TIMESTAMP_WITH_TIME_ZONE unsupported: 
org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal 
cannot support TIMESTAMP_WITH_TIME_ZONE.
3. Unsupported types (MySQL): 
org.apache.flink.connector.jdbc.dialect.MySQLDialect#unsupportedTypes provides 
the unsupported types.
4. Unsupported types (Postgres): 
org.apache.flink.connector.jdbc.dialect.PostgresDialect#unsupportedTypes 
provides the unsupported types.
5. (Postgres) Parts of the type implementations are referenced from Postgres: 
https://www.postgresql.org/docs/12/datatype.html
6. (MySQL) Parts of the type implementations are referenced from MySQL: 
https://dev.mysql.com/doc/refman/8.0/en/data-types.html


Please let me know if you have any suggestions. 


--
From: Jark Wu 
Sent: Wednesday, June 23, 2021 23:13
To: dev ; 云华 
Subject: Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter

Hi,

`TIMESTAMP_WITH_TIME_ZONE` is not supported in the Flink SQL engine,
 even though it is listed in the type API.

I think what you are looking for is the RawValueType which can be used as 
user-defined type. You can use `DataTypes.RAW(TypeInformation)` to define
 a Raw type with the given TypeInformation which includes the serializer 
and deserializer.

Best,
Jark
On Wed, 23 Jun 2021 at 21:09, 云华  wrote:

 Hi everyone,
 I want to rework the type conversion system in the connector and flink table module to 
be reusable and scalable. 
 In the Postgres system, the type '_citext' is not supported in 
org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType.  What's 
more, 
org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal 
cannot support TIMESTAMP_WITH_TIME_ZONE. 
 For more background and API design: 
https://issues.apache.org/jira/browse/FLINK-23122.
 Please let me know if this matches your thoughts.



 Regards, Jack 

[RESULT][VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-24 Thread Ingo Bürk
Hi everyone,

The vote for FLIP-129 has now been closed. We have five approving votes,
four of which are binding:

* Timo Walther (binding)
* Jark Wu (binding)
* Jingsong Li (binding)
* Jing Zhang (binding)
* Leonard Xu

There are no disapproving votes. Thanks everyone for voting!


Regards
Ingo


Re: [DISCUSS] [FLINK-23122] Provide the Dynamic register converter

2021-06-23 Thread Jark Wu
Hi,

`TIMESTAMP_WITH_TIME_ZONE` is not supported in the Flink SQL engine,
 even though it is listed in the type API.

I think what you are looking for is the RawValueType which can be used as
user-defined type. You can use `DataTypes.RAW(TypeInformation)` to define
 a Raw type with the given TypeInformation which includes the serializer
and deserializer.

Best,
Jark

On Wed, 23 Jun 2021 at 21:09, 云华  wrote:

>
> Hi everyone,
> I want to rework the type conversion system in the connector and flink table
> modules to be reusable and scalable.
> In the Postgres system, the type '_citext' is not supported in
> org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType.
> What's more,
> org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal
> cannot support TIMESTAMP_WITH_TIME_ZONE.
> For more background and API design:
> https://issues.apache.org/jira/browse/FLINK-23122.
> Please let me know if this matches your thoughts.
>
>
>
> Regards, Jack
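
For reference, a minimal sketch of the RAW-type suggestion above, assuming a hypothetical user class `MyCustomType` with no dedicated Flink SQL type:

```
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.types.DataType;

public class RawTypeExample {

    // Hypothetical user class that Flink SQL has no built-in type for.
    public static class MyCustomType {}

    public static void main(String[] args) {
        // Wrap the class as a RAW type; serialization falls back to the given TypeInformation.
        DataType rawType = DataTypes.RAW(TypeInformation.of(MyCustomType.class));
        System.out.println(rawType);
    }
}
```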


[DISCUSS] [FLINK-23122] Provide the Dynamic register converter

2021-06-23 Thread 云华

Hi everyone,
I want to rework the type conversion system in the connector and flink table module to 
be reusable and scalable. 
In the Postgres system, the type '_citext' is not supported in 
org.apache.flink.connector.jdbc.catalog.PostgresCatalog#fromJDBCType.  What's 
more, 
org.apache.flink.table.runtime.typeutils.InternalSerializers#createInternal 
cannot support TIMESTAMP_WITH_TIME_ZONE. 
For more background and API design: 
https://issues.apache.org/jira/browse/FLINK-23122.
Please let me know if this matches your thoughts.



Regards, Jack

[jira] [Created] (FLINK-23122) Provide the Dynamic register converter

2021-06-23 Thread lqjacklee (Jira)
lqjacklee created FLINK-23122:
-

 Summary: Provide the Dynamic register converter 
 Key: FLINK-23122
 URL: https://issues.apache.org/jira/browse/FLINK-23122
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Common, Connectors / HBase, Connectors / 
Hive, Connectors / JDBC, Connectors / ORC, Table SQL / API
Affects Versions: 1.14.0
Reporter: lqjacklee


Background:

Type conversion is at the core of data exchange between Flink and a data 
source. By default, Flink provides type conversion for the different connectors, but 
the transformation logic is scattered across the specific implementations of 
multiple connectors, which makes it hard to reuse within the Flink system. 
Secondly, due to the diversity of data source types, the original 
transformation needs to be extended, yet it does not support dynamic extension. 
Finally, the core transformation logic needs to be reused across multiple projects, 
so we hope to abstract it into a unified processing layer: application programs 
depend directly on the same type transformation system, and different 
sub-components can dynamically extend the supported transformations.


1. ConvertServiceRegister: provides the register and search functions. 

{code:java}
public interface ConvertServiceRegister {

    void register(ConversionService conversionService);

    void register(ConversionServiceFactory conversionServiceFactory);

    void register(ConversionServiceSet conversionServiceSet);

    Collection<ConversionService> convertServices();

    Collection<ConversionService> convertServices(String group);

    Collection<ConversionServiceSet> convertServiceSets();

    Collection<ConversionServiceSet> convertServiceSets(String group);
}
{code}

2. ConversionService: provides the conversion implementation.

{code:java}
public interface ConversionService extends Order {

    Set<String> tags();

    boolean canConvert(TypeInformationHolder source, TypeInformationHolder target)
            throws ConvertException;

Object convert(
TypeInformationHolder sourceType,
Object source,
TypeInformationHolder targetType,
Object defaultValue,
boolean nullable)
throws ConvertException;
}
{code}


3. ConversionServiceFactory: provides the conversion service factory function.

{code:java}
public interface ConversionServiceFactory<T> extends Order {

    Set<String> tags();

    ConversionService getConversionService(T target) throws ConvertException;
}
{code}


4. ConversionServiceSet: provides group management.

{code:java}
public interface ConversionServiceSet extends Loadable {

    Set<String> tags();

    Collection<ConversionService> conversionServices();

boolean support(TypeInformationHolder source, TypeInformationHolder target)
throws ConvertException;

Object convert(
String name,
TypeInformationHolder typeInformationHolder,
Object value,
TypeInformationHolder type,
Object defaultValue,
boolean nullable)
throws ConvertException;
}
{code}
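
A rough usage sketch of how the proposed pieces could fit together; every implementation class below (DefaultConvertServiceRegister, VarcharToStringConversionService, TypeInformationHolder.of) is hypothetical and only follows the interfaces proposed above:

{code:java}
// Hypothetical usage of the proposed registry; none of these implementations exist in Flink.
public class ConversionExample {

    public static void main(String[] args) throws ConvertException {
        ConvertServiceRegister register = new DefaultConvertServiceRegister();
        register.register(new VarcharToStringConversionService());

        TypeInformationHolder source = TypeInformationHolder.of("VARCHAR"); // source system type
        TypeInformationHolder target = TypeInformationHolder.of("STRING");  // Flink data type

        for (ConversionService service : register.convertServices()) {
            if (service.canConvert(source, target)) {
                // convert(sourceType, value, targetType, defaultValue, nullable)
                Object converted = service.convert(source, "hello", target, null, false);
                System.out.println(converted);
                break;
            }
        }
    }
}
{code}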






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Leonard Xu
+1

Thanks Ingo for picking up this FLIP.

Best,
Leonard


> On Jun 22, 2021, at 10:16, JING ZHANG  wrote:
> 
> +1 (binding)
> 
> Best regards,
> JING ZHANG
> 
> On Tue, Jun 22, 2021 at 10:11 AM, Jingsong Li  wrote:
> 
>> +1 (binding)
>> 
>> Thanks for driving.
>> 
>> Best,
>> Jingsong
>> 
>> On Tue, Jun 22, 2021 at 12:07 AM Jark Wu  wrote:
>> 
>>> +1 (binding)
>>> 
>>> Best,
>>> Jark
>>> 
>>> On Mon, 21 Jun 2021 at 22:51, Timo Walther  wrote:
>>> 
>>>> +1 (binding)
>>>> 
>>>> Thanks for driving this.
>>>> 
>>>> Regards,
>>>> Timo
>>>> 
>>>> On 21.06.21 13:24, Ingo Bürk wrote:
>>>>> Hi everyone,
>>>>> 
>>>>> thanks for all the feedback so far. Based on the discussion[1] we
>> seem
>>> to
>>>>> have consensus, so I would like to start a vote on FLIP-129 for which
>>> the
>>>>> FLIP has now also been updated[2].
>>>>> 
>>>>> The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT)
>>> unless
>>>>> there is an objection or insufficient votes.
>>>>> 
>>>>> 
>>>>> Thanks
>>>>> Ingo
>>>>> 
>>>>> [1]
>>>>> 
>>>> 
>>> 
>> https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
>>>>> [2]
>>>>> 
>>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
>>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> --
>> Best, Jingsong Lee
>> 



Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread JING ZHANG
+1 (binding)

Best regards,
JING ZHANG

On Tue, Jun 22, 2021 at 10:11 AM, Jingsong Li  wrote:

> +1 (binding)
>
> Thanks for driving.
>
> Best,
> Jingsong
>
> On Tue, Jun 22, 2021 at 12:07 AM Jark Wu  wrote:
>
> > +1 (binding)
> >
> > Best,
> > Jark
> >
> > On Mon, 21 Jun 2021 at 22:51, Timo Walther  wrote:
> >
> > > +1 (binding)
> > >
> > > Thanks for driving this.
> > >
> > > Regards,
> > > Timo
> > >
> > > On 21.06.21 13:24, Ingo Bürk wrote:
> > > > Hi everyone,
> > > >
> > > > thanks for all the feedback so far. Based on the discussion[1] we
> seem
> > to
> > > > have consensus, so I would like to start a vote on FLIP-129 for which
> > the
> > > > FLIP has now also been updated[2].
> > > >
> > > > The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT)
> > unless
> > > > there is an objection or insufficient votes.
> > > >
> > > >
> > > > Thanks
> > > > Ingo
> > > >
> > > > [1]
> > > >
> > >
> >
> https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
> > > > [2]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> > > >
> > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>


Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Jingsong Li
+1 (binding)

Thanks for driving.

Best,
Jingsong

On Tue, Jun 22, 2021 at 12:07 AM Jark Wu  wrote:

> +1 (binding)
>
> Best,
> Jark
>
> On Mon, 21 Jun 2021 at 22:51, Timo Walther  wrote:
>
> > +1 (binding)
> >
> > Thanks for driving this.
> >
> > Regards,
> > Timo
> >
> > On 21.06.21 13:24, Ingo Bürk wrote:
> > > Hi everyone,
> > >
> > > thanks for all the feedback so far. Based on the discussion[1] we seem
> to
> > > have consensus, so I would like to start a vote on FLIP-129 for which
> the
> > > FLIP has now also been updated[2].
> > >
> > > The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT)
> unless
> > > there is an objection or insufficient votes.
> > >
> > >
> > > Thanks
> > > Ingo
> > >
> > > [1]
> > >
> >
> https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
> > > [2]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> > >
> >
> >
>


-- 
Best, Jingsong Lee


Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Jark Wu
+1 (binding)

Best,
Jark

On Mon, 21 Jun 2021 at 22:51, Timo Walther  wrote:

> +1 (binding)
>
> Thanks for driving this.
>
> Regards,
> Timo
>
> On 21.06.21 13:24, Ingo Bürk wrote:
> > Hi everyone,
> >
> > thanks for all the feedback so far. Based on the discussion[1] we seem to
> > have consensus, so I would like to start a vote on FLIP-129 for which the
> > FLIP has now also been updated[2].
> >
> > The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT) unless
> > there is an objection or insufficient votes.
> >
> >
> > Thanks
> > Ingo
> >
> > [1]
> >
> https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
> > [2]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> >
>
>


Re: [VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Timo Walther

+1 (binding)

Thanks for driving this.

Regards,
Timo

On 21.06.21 13:24, Ingo Bürk wrote:

Hi everyone,

thanks for all the feedback so far. Based on the discussion[1] we seem to
have consensus, so I would like to start a vote on FLIP-129 for which the
FLIP has now also been updated[2].

The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT) unless
there is an objection or insufficient votes.


Thanks
Ingo

[1]
https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API





[VOTE] FLIP-129 Register sources/sinks in Table API

2021-06-21 Thread Ingo Bürk
Hi everyone,

thanks for all the feedback so far. Based on the discussion[1] we seem to
have consensus, so I would like to start a vote on FLIP-129 for which the
FLIP has now also been updated[2].

The vote will last for at least 72 hours (Thu, Jun 24th 12:00 GMT) unless
there is an objection or insufficient votes.


Thanks
Ingo

[1]
https://lists.apache.org/thread.html/rc75d64e889bf35592e9843dde86e82bdfea8fd4eb4c3df150112b305%40%3Cdev.flink.apache.org%3E
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API


[jira] [Created] (FLINK-23062) FLIP-129: Register sources/sinks in Table API

2021-06-21 Thread Jira
Ingo Bürk created FLINK-23062:
-

 Summary: FLIP-129: Register sources/sinks in Table API
 Key: FLINK-23062
 URL: https://issues.apache.org/jira/browse/FLINK-23062
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Ingo Bürk
Assignee: Ingo Bürk


(!) FLIP-129 is awaiting another voting. These issues are preliminary. (!)

https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-21806) ContinuousEventTimeTrigger should register its first timer base on current watermark

2021-03-15 Thread Kezhu Wang (Jira)
Kezhu Wang created FLINK-21806:
--

 Summary: ContinuousEventTimeTrigger should register its first 
timer base on current watermark
 Key: FLINK-21806
 URL: https://issues.apache.org/jira/browse/FLINK-21806
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Affects Versions: 1.13.0
Reporter: Kezhu Wang


Currently, the first timer is registered based on the timestamp of the first element, 
but that timestamp could be relatively large within the window. In the case of 
{{maxTimestamp}}, {{ContinuousEventTimeTrigger}} will never fire early.

Due to compatibility, we might not be able to change the current behavior. How about 
an option in the constructor?
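
A hedged sketch of the proposed behaviour (this is not the actual ContinuousEventTimeTrigger code; falling back to the element timestamp when no watermark has arrived yet is an assumption):

{code:java}
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

/** Sketch: register the first early-firing timer based on the current watermark. */
public class WatermarkBasedContinuousTrigger extends Trigger<Object, TimeWindow> {

    private final long interval;

    public WatermarkBasedContinuousTrigger(long interval) {
        this.interval = interval;
    }

    @Override
    public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) {
        // Base the first timer on the current watermark instead of the element timestamp,
        // so the trigger can still fire early when the element timestamp is close to maxTimestamp().
        long watermark = ctx.getCurrentWatermark();
        long base = watermark > Long.MIN_VALUE ? watermark : timestamp; // assumption: fall back to timestamp
        long nextFire = base - (base % interval) + interval;
        ctx.registerEventTimeTimer(Math.min(nextFire, window.maxTimestamp()));
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
        if (time < window.maxTimestamp()) {
            ctx.registerEventTimeTimer(Math.min(time + interval, window.maxTimestamp()));
        }
        return TriggerResult.FIRE; // fire early on every continuous timer and at the window end
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(TimeWindow window, TriggerContext ctx) {}
}
{code}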



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Failed to register Protobuf Kryo serialization

2021-02-14 Thread Svend Vanderveken
Sorry, I realize I posted this to the wrong list, please ignore and I'll
post it to the flink-user one.

On Sun, Feb 14, 2021 at 11:41 AM Svend Vanderveken 
wrote:

> Hi all,
>
> I'm failing to setup an example of wire serialization with Protobuf, could
> you help me figure out what I'm doing wrong?
>
> I'm using a simple protobuf schema:
>
> ```
>
> syntax = "proto3";
>
> import "google/protobuf/wrappers.proto";
> option java_multiple_files = true;
>
> message DemoUserEvent {
>   Metadata metadata = 1;
>   oneof payload {
> Created created = 10;
> Updated updated = 11;
>   }
>
>   message Created {...}
>
>   message Updated {...}
>
>   ...
>
> }
>
> ```
>
>
> From which I'm generating java from this Gradle plugin:
>
>
> ```
>
> plugins {
> id "com.google.protobuf" version "0.8.15"
> }
>
> ```
>
>
> And I'm generating DemoUserEvent instances with Java Iterator looking like 
> this:
>
>
> ```
>
> public class UserEventGenerator implements Iterator, 
> Serializable {
> transient public final static Faker faker = new Faker();
> ...
> @Override public DemoUserEvent next() {
> return randomCreatedEvent();
>
>  }
>
>  ...
>
> ```
>
>
> I read those two pieces of documentation:
> *
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/types_serialization.html
> *
> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/custom_serializers.html
>
> And tried the demo app below:
>
> ```
>
> import com.twitter.chill.protobuf.ProtobufSerializer;
>
> ...
>
> public static void main(String[] args) {
> final StreamExecutionEnvironment flinkEnv = 
> StreamExecutionEnvironment.getExecutionEnvironment();
> flinkEnv.getConfig().registerTypeWithKryoSerializer(DemoUserEvent.class, 
> ProtobufSerializer.class);
> flinkEnv.fromCollection(new UserEventGenerator(), 
> DemoUserEvent.class).print();
> }
>
> ```
>
> But the serialization mechanism still fails to handle my protobuf class:
>
> 11:22:45,822 INFO  org.apache.flink.api.java.typeutils.TypeExtractor  
>   [] - class live.schema.event.user.v1.DemoUserEvent does not contain a 
> getter for field payloadCase_
> 11:22:45,822 INFO  org.apache.flink.api.java.typeutils.TypeExtractor  
>   [] - class live.schema.event.user.v1.DemoUserEvent does not contain a 
> setter for field payloadCase_
> 11:22:45,822 INFO  org.apache.flink.api.java.typeutils.TypeExtractor  
>   [] - Class class live.schema.event.user.v1.DemoUserEvent cannot be used as 
> a POJO type because not all fields are valid POJO fields, and must be 
> processed as GenericType. Please read the Flink documentation on "Data Types 
> & Serialization" for details of the effect on performance.
>
> I've also tried this, without success:
>
> ```
>
> flinkEnv.getConfig().addDefaultKryoSerializer(DemoUserEvent.class, 
> ProtobufSerializer.class);
>
> ```
>
>
> I'm using those versions:
>
> ```
>
> ext {
> javaVersion = '11'
> flinkVersion = '1.12.1'
> scalaBinaryVersion = '2.12'
> }
>
> dependencies {
> compileOnly 
> "org.apache.flink:flink-streaming-java_${scalaBinaryVersion}:${flinkVersion}"
> implementation ("com.twitter:chill-protobuf:0.9.5") {
> exclude group: 'com.esotericsoftware.kryo', module: 'kryo'
> }
> implementation "com.google.protobuf:protobuf-java:3.14.0"
> implementation 'com.github.javafaker:javafaker:1.0.2'
> }
>
> ```
>
>
> Any idea what I should try next?
>
> Thanks in advance!
>
>
>
>

-- 
Svend Vanderveken
Kelesia SPRL - BE 0839 049 010
blog: https://svend.kelesia.com 
Twitter: @sv3ndk 


Failed to register Protobuf Kryo serialization

2021-02-14 Thread Svend Vanderveken
Hi all,

I'm failing to setup an example of wire serialization with Protobuf, could
you help me figure out what I'm doing wrong?

I'm using a simple protobuf schema:

```

syntax = "proto3";

import "google/protobuf/wrappers.proto";
option java_multiple_files = true;

message DemoUserEvent {
  Metadata metadata = 1;
  oneof payload {
Created created = 10;
Updated updated = 11;
  }

  message Created {...}

  message Updated {...}

  ...

}

```


From which I'm generating Java using this Gradle plugin:


```

plugins {
id "com.google.protobuf" version "0.8.15"
}

```


And I'm generating DemoUserEvent instances with a Java Iterator that looks like this:


```

public class UserEventGenerator implements Iterator,
Serializable {
transient public final static Faker faker = new Faker();
...
@Override public DemoUserEvent next() {
return randomCreatedEvent();

 }

 ...

```


I read those two pieces of documentation:
*
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/types_serialization.html
*
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/custom_serializers.html

And tried the demo app below:

```

import com.twitter.chill.protobuf.ProtobufSerializer;

...

public static void main(String[] args) {
final StreamExecutionEnvironment flinkEnv =
StreamExecutionEnvironment.getExecutionEnvironment();
flinkEnv.getConfig().registerTypeWithKryoSerializer(DemoUserEvent.class,
ProtobufSerializer.class);
flinkEnv.fromCollection(new UserEventGenerator(),
DemoUserEvent.class).print();
}

```

But the serialization mechanism still fails to handle my protobuf class:

11:22:45,822 INFO  org.apache.flink.api.java.typeutils.TypeExtractor
 [] - class live.schema.event.user.v1.DemoUserEvent does not
contain a getter for field payloadCase_
11:22:45,822 INFO  org.apache.flink.api.java.typeutils.TypeExtractor
 [] - class live.schema.event.user.v1.DemoUserEvent does not
contain a setter for field payloadCase_
11:22:45,822 INFO  org.apache.flink.api.java.typeutils.TypeExtractor
 [] - Class class live.schema.event.user.v1.DemoUserEvent
cannot be used as a POJO type because not all fields are valid POJO
fields, and must be processed as GenericType. Please read the Flink
documentation on "Data Types & Serialization" for details of the
effect on performance.

I've also tried this, without success:

```

flinkEnv.getConfig().addDefaultKryoSerializer(DemoUserEvent.class,
ProtobufSerializer.class);

```


I'm using those versions:

```

ext {
javaVersion = '11'
flinkVersion = '1.12.1'
scalaBinaryVersion = '2.12'
}

dependencies {
compileOnly
"org.apache.flink:flink-streaming-java_${scalaBinaryVersion}:${flinkVersion}"
implementation ("com.twitter:chill-protobuf:0.9.5") {
exclude group: 'com.esotericsoftware.kryo', module: 'kryo'
}
implementation "com.google.protobuf:protobuf-java:3.14.0"
implementation 'com.github.javafaker:javafaker:1.0.2'
}

```


Any idea what I should try next?

Thanks in advance!


[jira] [Created] (FLINK-20836) Register TaskManager with total and default slot resource profile in SlotManager

2021-01-03 Thread Yangze Guo (Jira)
Yangze Guo created FLINK-20836:
--

 Summary: Register TaskManager with total and default slot resource 
profile in SlotManager
 Key: FLINK-20836
 URL: https://issues.apache.org/jira/browse/FLINK-20836
 Project: Flink
  Issue Type: Sub-task
Reporter: Yangze Guo
 Fix For: 1.13.0


In FLINK-14188 and FLINK-14189, the TaskExecutor derives the total/default slot 
resource profiles and registers them with the ResourceManager. We need to pass 
that information to the SlotManager as it is required for fine-grained resource 
management.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-20749) Allow to register new slots listener at DeclarativeSlotPool

2020-12-23 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-20749:
-

 Summary: Allow to register new slots listener at 
DeclarativeSlotPool
 Key: FLINK-20749
 URL: https://issues.apache.org/jira/browse/FLINK-20749
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.13.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.13.0


At the moment it is not possible to register a new slots listener at the 
{{DeclarativeSlotPool}} after its creation. This is problematic because the 
listener might not be known at the time of instantiating the 
{{DeclarativeSlotPool}}. I propose to add a {{registerNewSlotListener}} method 
to the {{DeclarativeSlotPool}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Register processing time timers when Operator.close() is called

2020-11-11 Thread Boyuan Zhang
Thanks, Aljoscha!

Manually draining processing time timers during operator.close() is my
current workaround as well. It's just not efficient for me, since I may set
a processing time timer whose callback is due 5 minutes later, but now I need to
fire it immediately.

https://issues.apache.org/jira/browse/FLINK-18647 is really helpful and
looking forward to the solution.

Thanks for your help!


On Wed, Nov 11, 2020 at 8:13 AM Aljoscha Krettek 
wrote:

> Hi!
>
> This is an interesting topic and we recently created a Jira issue about
> this: https://issues.apache.org/jira/browse/FLINK-18647.
>
> In Beam we even have a workaround for this:
>
> https://github.com/apache/beam/blob/0c01636fc8610414859d946cb93eabc904123fc8/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java#L581
>
> Maybe it's time to finally address this in Flink as well.
>
> Best,
> Aljoscha
>
>
> On 11.11.20 01:02, Boyuan Zhang wrote:
> > Hi team,
> >
> > I'm writing my custom Operator as a high fan-out operation and I use
> > processing time timers to defer processing some inputs. When the timers
> > fire, the operator continues to process the deferred elements. One
> > typical use case for my Operator is like:
> > ImpulseOperator -> my Operator -> downstream, where the watermark of
> > ImpulseOperator advances to MAX_TIMESTAMP immediately.
> >
> > One problem I have is that after my operator.close() is called, it's still
> > possible for me to set processing time timers and wait for these timers to
> > be fired. But it seems like Flink pauses invoking processing time timers once
> > operator.close() is called in the new version. I'm curious why Flink
> > decides to do so and whether there is any workaround for my operator?
> >
> > Thanks for your help!
> >
>
>


Re: Register processing time timers when Operator.close() is called

2020-11-11 Thread Aljoscha Krettek

Hi!

This is an interesting topic and we recently created a Jira issue about 
this: https://issues.apache.org/jira/browse/FLINK-18647.


In Beam we even have a workaround for this: 
https://github.com/apache/beam/blob/0c01636fc8610414859d946cb93eabc904123fc8/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java#L581


Maybe it's time to finally address this in Flink as well.

Best,
Aljoscha


On 11.11.20 01:02, Boyuan Zhang wrote:

Hi team,

I'm writing my custom Operator as a high fan-out operation and I use
processing time timers to defer processing some inputs. When the timers
fire, the operator continues to process the deferred elements. One
typical use case for my Operator is like:
ImpulseOperator -> my Operator -> downstream, where the watermark of
ImpulseOperator advances to MAX_TIMESTAMP immediately.

One problem I have is that after my operator.close() is called, it's still
possible for me to set processing time timers and wait for these timers to
be fired. But it seems like Flink pauses invoking processing time timers once
operator.close() is called in the new version. I'm curious why Flink
decides to do so and whether there is any workaround for my operator?

Thanks for your help!





Register processing time timers when Operator.close() is called

2020-11-10 Thread Boyuan Zhang
Hi team,

I'm writing my custom Operator as a high fan-out operation and I use
processing time timers to defer processing some inputs. When the timers
fire, the operator continues to process the deferred elements. One
typical use case for my Operator is like:
ImpulseOperator -> my Operator -> downstream, where the watermark of
ImpulseOperator advances to MAX_TIMESTAMP immediately.

One problem I have is that after my operator.close() is called, it's still
possible for me to set processing time timers and wait for these timers to
be fired. But it seems like Flink pauses invoking processing time timers once
operator.close() is called in the new version. I'm curious why Flink
decides to do so and whether there is any workaround for my operator?

Thanks for your help!


[jira] [Created] (FLINK-20077) Cannot register a view with MATCH_RECOGNIZE clause

2020-11-10 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-20077:


 Summary: Cannot register a view with MATCH_RECOGNIZE clause
 Key: FLINK-20077
 URL: https://issues.apache.org/jira/browse/FLINK-20077
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.11.0, 1.12.0
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz
 Fix For: 1.12.0


{code}
TableEnvironment env = 
TableEnvironment.create(EnvironmentSettings.newInstance()
.useBlinkPlanner()
.build());

env.executeSql("" +
"CREATE TEMPORARY TABLE data (\n" +
"id INT,\n" +
"ts AS PROCTIME()\n" +
") WITH (\n" +
"'connector' = 'datagen',\n" +
"'rows-per-second' = '3',\n" +
"'fields.id.kind' = 'sequence',\n" +
"'fields.id.start' = '100',\n" +
"'fields.id.end' = '200'\n" +
")");

env.executeSql("" +
"CREATE TEMPORARY VIEW events AS \n" +
"SELECT 1 AS key, id, MOD(id, 10) AS measurement, ts 
\n" +
"FROM data");

env.executeSql("" +
"CREATE TEMPORARY VIEW foo AS \n" +
"SELECT * \n" +
"FROM events MATCH_RECOGNIZE (\n" +
"PARTITION BY key \n" +
"ORDER BY ts ASC \n" +
"MEASURES \n" +
"  this_step.id as startId,\n" +
"  next_step.id as nextId,\n" +
"  this_step.ts AS ts1,\n" +
"  next_step.ts AS ts2,\n" +
"  next_step.measurement - this_step.measurement AS 
diff \n" +
"AFTER MATCH SKIP TO NEXT ROW \n" +
"PATTERN (this_step next_step)\n" +
"DEFINE this_step AS TRUE,\n" +
"next_step AS TRUE\n" +
")");

env.executeSql("SELECT * FROM foo");
{code}

fails with: 
{code}
java.lang.AssertionError
at 
org.apache.calcite.sql.SqlMatchRecognize$SqlMatchRecognizeOperator.createCall(SqlMatchRecognize.java:274)
at 
org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.result(SqlShuttle.java:117)
at 
org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.result(SqlShuttle.java:101)
at org.apache.calcite.sql.util.SqlShuttle.visit(SqlShuttle.java:67)
at 
org.apache.flink.table.planner.utils.Expander$Expanded$1.visit(Expander.java:153)
at 
org.apache.flink.table.planner.utils.Expander$Expanded$1.visit(Expander.java:130)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
at 
org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.visitChild(SqlShuttle.java:134)
at 
org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.visitChild(SqlShuttle.java:101)
at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:879)
at 
org.apache.calcite.sql.SqlSelectOperator.acceptCall(SqlSelectOperator.java:133)
at org.apache.calcite.sql.util.SqlShuttle.visit(SqlShuttle.java:66)
at 
org.apache.flink.table.planner.utils.Expander$Expanded$1.visit(Expander.java:153)
at 
org.apache.flink.table.planner.utils.Expander$Expanded$1.visit(Expander.java:130)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
at 
org.apache.flink.table.planner.utils.Expander$Expanded.substitute(Expander.java:168)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertViewQuery(SqlToOperationConverter.java:728)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convertCreateView(SqlToOperationConverter.java:699)
at 
org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:226)
at 
org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:78)
at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:6

[jira] [Created] (FLINK-19677) TaskManager takes abnormally long time to register with JobManager on Kubernetes

2020-10-16 Thread Weike Dong (Jira)
Weike Dong created FLINK-19677:
--

 Summary: TaskManager takes abnormally long time to register with 
JobManager on Kubernetes
 Key: FLINK-19677
 URL: https://issues.apache.org/jira/browse/FLINK-19677
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Affects Versions: 1.11.2, 1.11.1, 1.11.0
Reporter: Weike Dong


During the registration process of a TaskManager, the JobManager creates a 
_TaskManagerLocation_ instance, which tries to get the hostname of the TaskManager 
via a reverse DNS lookup.

However, this always fails in a Kubernetes environment, because for pods that are 
not exposed by Services, the IPs cannot be resolved to domains by coredns, and 
_InetAddress#getCanonicalHostName()_ takes ~5 seconds to return, blocking the 
whole registration process.

Therefore, Flink should provide a configuration parameter to turn off the reverse 
DNS lookup. Also, even when the hostname is actually needed, the lookup could be done 
lazily to avoid blocking the registration of other TaskManagers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19422) Avro Confluent Schema Registry nightly end-to-end test failed with "Register operation timed out; error code: 50002"

2020-09-25 Thread Dian Fu (Jira)
Dian Fu created FLINK-19422:
---

 Summary: Avro Confluent Schema Registry nightly end-to-end test 
failed with "Register operation timed out; error code: 50002"
 Key: FLINK-19422
 URL: https://issues.apache.org/jira/browse/FLINK-19422
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.12.0
Reporter: Dian Fu


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=6955&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=ff888d9b-cd34-53cc-d90f-3e446d355529

{code}
2020-09-25T14:09:18.2560779Z Caused by: java.io.IOException: Could not register 
schema in registry
2020-09-25T14:09:18.2561395Zat 
org.apache.flink.formats.avro.registry.confluent.ConfluentSchemaRegistryCoder.writeSchema(ConfluentSchemaRegistryCoder.java:91)
 ~[?:?]
2020-09-25T14:09:18.2562127Zat 
org.apache.flink.formats.avro.RegistryAvroSerializationSchema.serialize(RegistryAvroSerializationSchema.java:82)
 ~[?:?]
2020-09-25T14:09:18.2562883Zat 
org.apache.flink.streaming.connectors.kafka.internals.KafkaSerializationSchemaWrapper.serialize(KafkaSerializationSchemaWrapper.java:71)
 ~[?:?]
2020-09-25T14:09:18.2563622Zat 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.invoke(FlinkKafkaProducer.java:866)
 ~[?:?]
2020-09-25T14:09:18.2564255Zat 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.invoke(FlinkKafkaProducer.java:99)
 ~[?:?]
2020-09-25T14:09:18.2565375Zat 
org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.invoke(TwoPhaseCommitSinkFunction.java:235)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2566540Zat 
org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2567692Zat 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2568852Zat 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2570022Zat 
org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2571315Zat 
org.apache.flink.streaming.runtime.tasks.BroadcastingOutputCollector.collect(BroadcastingOutputCollector.java:76)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2572586Zat 
org.apache.flink.streaming.runtime.tasks.BroadcastingOutputCollector.collect(BroadcastingOutputCollector.java:32)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2573736Zat 
org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:52)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2574824Zat 
org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:30)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2576019Zat 
org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2577309Zat 
org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collectWithTimestamp(StreamSourceContexts.java:111)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2578256Zat 
org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecordsWithTimestamps(AbstractFetcher.java:352)
 ~[?:?]
2020-09-25T14:09:18.2579003Zat 
org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.partitionConsumerRecordsHandler(KafkaFetcher.java:185)
 ~[?:?]
2020-09-25T14:09:18.2579687Zat 
org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.runFetchLoop(KafkaFetcher.java:141)
 ~[?:?]
2020-09-25T14:09:18.2580733Zat 
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:755)
 ~[?:?]
2020-09-25T14:09:18.2582441Zat 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2583534Zat 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63) 
~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2584695Zat 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:213)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-09-25T14:09:18.2585556Z Caused by: 
io.confluent.kafka.schemaregistry.client.rest.exceptio

[jira] [Created] (FLINK-19166) StreamingFileWriter should register Listener before the initialization of buckets

2020-09-08 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-19166:


 Summary: StreamingFileWriter should register Listener before the 
initialization of buckets
 Key: FLINK-19166
 URL: https://issues.apache.org/jira/browse/FLINK-19166
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Runtime
Affects Versions: 1.11.1
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: 1.11.2


In 
[http://apache-flink.147419.n8.nabble.com/StreamingFileSink-hive-metadata-td6898.html]

User feedback indicates that some partitions have not been committed since the 
job failed.

This may be due to FLINK-18110: it fixed Buckets but forgot to fix 
{{StreamingFileWriter}}, which should register the Listener before the 
initialization of the buckets; otherwise the listening is lost as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-19012) E2E test fails with "Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. Closing argument."

2020-08-21 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-19012:
--

 Summary: E2E test fails with "Cannot register Closeable, this 
subtaskCheckpointCoordinator is already closed. Closing argument."
 Key: FLINK-19012
 URL: https://issues.apache.org/jira/browse/FLINK-19012
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing, Runtime / Task, Tests
Affects Versions: 1.12.0
Reporter: Robert Metzger


Note: This error occurred in a custom branch with unreviewed changes. I don't 
believe my changes affect this error, but I would keep this in mind when 
investigating the error: 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8307&view=logs&j=1f3ed471-1849-5d3c-a34c-19792af4ad16&t=0d2e35fc-a330-5cf2-a012-7267e2667b1d
 
{code}
2020-08-20T20:55:30.2400645Z 2020-08-20 20:55:22,373 INFO  
org.apache.flink.runtime.taskmanager.Task[] - Registering 
task at network: Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
(cbc357ccb763df2852fee8c4fc7d55f2_0_0) [DEPLOYING].
2020-08-20T20:55:30.2402392Z 2020-08-20 20:55:22,401 INFO  
org.apache.flink.streaming.runtime.tasks.StreamTask  [] - No state 
backend has been configured, using default (Memory / JobManager) 
MemoryStateBackend (data in heap memory / checkpoints to JobManager) 
(checkpoints: 'null', savepoints: 'null', asynchronous: TRUE, maxStateSize: 
5242880)
2020-08-20T20:55:30.2404297Z 2020-08-20 20:55:22,413 INFO  
org.apache.flink.runtime.taskmanager.Task[] - Source: 
Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
(cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from DEPLOYING to RUNNING.
2020-08-20T20:55:30.2405805Z 2020-08-20 20:55:22,786 INFO  
org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge
 [] - Pinging Elasticsearch cluster via hosts [http://127.0.0.1:9200] ...
2020-08-20T20:55:30.2407027Z 2020-08-20 20:55:22,848 INFO  
org.apache.flink.streaming.connectors.elasticsearch6.Elasticsearch6ApiCallBridge
 [] - Elasticsearch RestHighLevelClient is connected to [http://127.0.0.1:9200]
2020-08-20T20:55:30.2409277Z 2020-08-20 20:55:29,205 INFO  
org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl
 [] - Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) discarding 0 
drained requests
2020-08-20T20:55:30.2410690Z 2020-08-20 20:55:29,218 INFO  
org.apache.flink.runtime.taskmanager.Task[] - Source: 
Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
(cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from RUNNING to FINISHED.
2020-08-20T20:55:30.2412187Z 2020-08-20 20:55:29,218 INFO  
org.apache.flink.runtime.taskmanager.Task[] - Freeing task 
resources for Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
(cbc357ccb763df2852fee8c4fc7d55f2_0_0).
2020-08-20T20:55:30.2414203Z 2020-08-20 20:55:29,224 INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor   [] - 
Un-registering task and sending final execution state FINISHED to JobManager 
for task Source: Sequence Source -> Flat Map -> Sink: Unnamed (1/1) 
cbc357ccb763df2852fee8c4fc7d55f2_0_0.
2020-08-20T20:55:30.2415602Z 2020-08-20 20:55:29,219 INFO  
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable [] - Source: 
Sequence Source -> Flat Map -> Sink: Unnamed (1/1) - asynchronous part of 
checkpoint 1 could not be completed.
2020-08-20T20:55:30.2416411Z java.io.UncheckedIOException: java.io.IOException: 
Cannot register Closeable, this subtaskCheckpointCoordinator is already closed. 
Closing argument.
2020-08-20T20:55:30.2418956Zat 
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.lambda$registerConsumer$2(SubtaskCheckpointCoordinatorImpl.java:468)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-08-20T20:55:30.2420100Zat 
org.apache.flink.streaming.runtime.tasks.AsyncCheckpointRunnable.run(AsyncCheckpointRunnable.java:91)
 [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-08-20T20:55:30.2420927Zat 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_265]
2020-08-20T20:55:30.2421455Zat 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_265]
2020-08-20T20:55:30.2421879Zat java.lang.Thread.run(Thread.java:748) 
[?:1.8.0_265]
2020-08-20T20:55:30.2422348Z Caused by: java.io.IOException: Cannot register 
Closeable, this subtaskCheckpointCoordinator is already closed. Closing 
argument.
2020-08-20T20:55:30.2423416Zat 
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.registerAsyncCheckpointRunnable(SubtaskCheckpointCoordinatorImpl.java:378)
 ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-08-20T20:55:30.2424635Zat 
org.apache
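
The failure above boils down to a register-after-close race: the task had already switched
to FINISHED and closed its subtask checkpoint coordinator while the asynchronous part of
checkpoint 1 was still running, so the late registration is rejected and the argument is
closed instead. A generic, self-contained sketch of that pattern (illustrative only, not
the actual Flink implementation):

import java.io.Closeable;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Generic illustration of a registry that rejects registrations after close();
// the late caller's resource is closed instead ("Closing argument.").
class ClosedAwareRegistry implements Closeable {

    private final Set<Closeable> registered = new HashSet<>();
    private boolean closed;

    synchronized void register(Closeable closeable) throws IOException {
        if (closed) {
            closeable.close();
            throw new IOException(
                    "Cannot register Closeable, this registry is already closed. Closing argument.");
        }
        registered.add(closeable);
    }

    @Override
    public synchronized void close() throws IOException {
        closed = true;
        for (Closeable closeable : registered) {
            closeable.close();
        }
        registered.clear();
    }
}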

Re: [VOTE] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-29 Thread Jark Wu
Hi all,

The voting time for FLIP-129 has passed. I'm closing the vote now.

There were 5 +1 votes, 4 of which are binding:
- Timo (binding)
- Jark (binding)
- jincheng sun (binding)
- Leonard Xu
- Jingsong (binding)

There were no disapproving votes.

Thus, FLIP-129 has been accepted.

Thanks everyone for joining the discussion and giving feedback!

Best,
Jark

On Wed, 29 Jul 2020 at 13:12, Jingsong Li  wrote:

> +1
>
> Thanks Jark for driving this.
>
> On Mon, Jul 27, 2020 at 4:14 PM Leonard Xu  wrote:
>
> > Thanks Jark
> >
> > +1(non-binding)
> >
> > Best
> > Leonard
> > > 在 2020年7月27日,15:32,jincheng sun  写道:
> > >
> > > +1(binding)
> > >
> > > Best,
> > > Jincheng
> > >
> > >
> > > Jark Wu  于2020年7月27日周一 上午10:01写道:
> > >
> > >> +1 (binding)
> > >>
> > >> On Fri, 24 Jul 2020 at 22:22, Timo Walther 
> wrote:
> > >>
> > >>> +1
> > >>>
> > >>> Thanks for driving this Jark.
> > >>>
> > >>> Regards,
> > >>> Timo
> > >>>
> > >>> On 24.07.20 12:42, Jark Wu wrote:
> > >>>> Hi all,
> > >>>>
> > >>>> I would like to start the vote for FLIP-129 [1], which is discussed
> > and
> > >>>> reached consensus in the discussion thread [2].
> > >>>>
> > >>>> The vote will be open until 27th July (72h), unless there is an
> > >> objection
> > >>>> or not enough votes.
> > >>>>
> > >>>> Best,
> > >>>> Jark
> > >>>>
> > >>>> [1]:
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> > >>>> [2]:
> > >>>>
> > >>>
> > >>
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-129-Refactor-Descriptor-API-to-register-connector-in-Table-API-td42995.html
> > >>>>
> > >>>
> > >>>
> > >>
> >
> >
>
> --
> Best, Jingsong Lee
>


Re: [VOTE] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-29 Thread Dawid Wysakowicz
+1

On 29/07/2020 07:11, Jingsong Li wrote:
> +1
>
> Thanks Jark for driving this.
>
> On Mon, Jul 27, 2020 at 4:14 PM Leonard Xu  wrote:
>
>> Thanks Jark
>>
>> +1(non-binding)
>>
>> Best
>> Leonard
>>> 在 2020年7月27日,15:32,jincheng sun  写道:
>>>
>>> +1(binding)
>>>
>>> Best,
>>> Jincheng
>>>
>>>
>>> Jark Wu  于2020年7月27日周一 上午10:01写道:
>>>
>>>> +1 (binding)
>>>>
>>>> On Fri, 24 Jul 2020 at 22:22, Timo Walther  wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Thanks for driving this Jark.
>>>>>
>>>>> Regards,
>>>>> Timo
>>>>>
>>>>> On 24.07.20 12:42, Jark Wu wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I would like to start the vote for FLIP-129 [1], which is discussed
>> and
>>>>>> reached consensus in the discussion thread [2].
>>>>>>
>>>>>> The vote will be open until 27th July (72h), unless there is an
>>>> objection
>>>>>> or not enough votes.
>>>>>>
>>>>>> Best,
>>>>>> Jark
>>>>>>
>>>>>> [1]:
>>>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
>>>>>> [2]:
>>>>>>
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-129-Refactor-Descriptor-API-to-register-connector-in-Table-API-td42995.html
>>>>>
>>





Re: [VOTE] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-28 Thread Jingsong Li
+1

Thanks Jark for driving this.

On Mon, Jul 27, 2020 at 4:14 PM Leonard Xu  wrote:

> Thanks Jark
>
> +1(non-binding)
>
> Best
> Leonard
> > 在 2020年7月27日,15:32,jincheng sun  写道:
> >
> > +1(binding)
> >
> > Best,
> > Jincheng
> >
> >
> > Jark Wu  于2020年7月27日周一 上午10:01写道:
> >
> >> +1 (binding)
> >>
> >> On Fri, 24 Jul 2020 at 22:22, Timo Walther  wrote:
> >>
> >>> +1
> >>>
> >>> Thanks for driving this Jark.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>> On 24.07.20 12:42, Jark Wu wrote:
> >>>> Hi all,
> >>>>
> >>>> I would like to start the vote for FLIP-129 [1], which is discussed
> and
> >>>> reached consensus in the discussion thread [2].
> >>>>
> >>>> The vote will be open until 27th July (72h), unless there is an
> >> objection
> >>>> or not enough votes.
> >>>>
> >>>> Best,
> >>>> Jark
> >>>>
> >>>> [1]:
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> >>>> [2]:
> >>>>
> >>>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-129-Refactor-Descriptor-API-to-register-connector-in-Table-API-td42995.html
> >>>>
> >>>
> >>>
> >>
>
>

-- 
Best, Jingsong Lee


Re: [VOTE] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-27 Thread Leonard Xu
Thanks Jark
 
+1(non-binding)

Best
Leonard
> 在 2020年7月27日,15:32,jincheng sun  写道:
> 
> +1(binding)
> 
> Best,
> Jincheng
> 
> 
> Jark Wu  于2020年7月27日周一 上午10:01写道:
> 
>> +1 (binding)
>> 
>> On Fri, 24 Jul 2020 at 22:22, Timo Walther  wrote:
>> 
>>> +1
>>> 
>>> Thanks for driving this Jark.
>>> 
>>> Regards,
>>> Timo
>>> 
>>> On 24.07.20 12:42, Jark Wu wrote:
>>>> Hi all,
>>>> 
>>>> I would like to start the vote for FLIP-129 [1], which is discussed and
>>>> reached consensus in the discussion thread [2].
>>>> 
>>>> The vote will be open until 27th July (72h), unless there is an
>> objection
>>>> or not enough votes.
>>>> 
>>>> Best,
>>>> Jark
>>>> 
>>>> [1]:
>>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
>>>> [2]:
>>>> 
>>> 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-129-Refactor-Descriptor-API-to-register-connector-in-Table-API-td42995.html
>>>> 
>>> 
>>> 
>> 



Re: [VOTE] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-27 Thread jincheng sun
+1(binding)

Best,
Jincheng


Jark Wu  于2020年7月27日周一 上午10:01写道:

> +1 (binding)
>
> On Fri, 24 Jul 2020 at 22:22, Timo Walther  wrote:
>
> > +1
> >
> > Thanks for driving this Jark.
> >
> > Regards,
> > Timo
> >
> > On 24.07.20 12:42, Jark Wu wrote:
> > > Hi all,
> > >
> > > I would like to start the vote for FLIP-129 [1], which is discussed and
> > > reached consensus in the discussion thread [2].
> > >
> > > The vote will be open until 27th July (72h), unless there is an
> objection
> > > or not enough votes.
> > >
> > > Best,
> > > Jark
> > >
> > > [1]:
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> > > [2]:
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-129-Refactor-Descriptor-API-to-register-connector-in-Table-API-td42995.html
> > >
> >
> >
>


Re: [VOTE] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-26 Thread Jark Wu
+1 (binding)

On Fri, 24 Jul 2020 at 22:22, Timo Walther  wrote:

> +1
>
> Thanks for driving this Jark.
>
> Regards,
> Timo
>
> On 24.07.20 12:42, Jark Wu wrote:
> > Hi all,
> >
> > I would like to start the vote for FLIP-129 [1], which is discussed and
> > reached consensus in the discussion thread [2].
> >
> > The vote will be open until 27th July (72h), unless there is an objection
> > or not enough votes.
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
> > [2]:
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-129-Refactor-Descriptor-API-to-register-connector-in-Table-API-td42995.html
> >
>
>


Re: [VOTE] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-24 Thread Timo Walther

+1

Thanks for driving this Jark.

Regards,
Timo

On 24.07.20 12:42, Jark Wu wrote:

Hi all,

I would like to start the vote for FLIP-129 [1], which is discussed and
reached consensus in the discussion thread [2].

The vote will be open until 27th July (72h), unless there is an objection
or not enough votes.

Best,
Jark

[1]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
[2]:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-129-Refactor-Descriptor-API-to-register-connector-in-Table-API-td42995.html





[VOTE] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-24 Thread Jark Wu
Hi all,

I would like to start the vote for FLIP-129 [1], which is discussed and
reached consensus in the discussion thread [2].

The vote will be open until 27th July (72h), unless there is an objection
or not enough votes.

Best,
Jark

[1]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API
[2]:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-129-Refactor-Descriptor-API-to-register-connector-in-Table-API-td42995.html


Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-24 Thread Aljoscha Krettek
I'm jumping in quite late but I think overall this is a very good effort 
and it's in very good shape now.


Best,
Aljoscha

On 24.07.20 10:24, Jark Wu wrote:

Thanks Dawid,

Regarding (3), I think you mentioned it will affect other behavior, e.g.
listing tables, is a strong reason.
I will at least consider implementing it in the `TableSourceQueryOperation`
way to avoid affecting other behaviors. I have updated the FLIP.
Anyway, this is implementation detail, we can continue this discussion in
JIRA and code review.

Thank you all for the response.
It seems that everyone participated in the discussion has reached a
consensus.
I will start a vote later today.

Best,
Jark

On Fri, 24 Jul 2020 at 16:03, Leonard Xu  wrote:


Thanks Jark for the update.

The latest FLIP looks well.

I like Dawid’s proposal of TableDescriptor.

Best
Leonard Xu

在 2020年7月23日,22:56,Jark Wu  写道:

Hi Timo,

That's a good point I missed in the design. I have updated the FLIP and
added a note under the `KafkaConnector` to mention this.
I will not list all the method names in the FLIP as the design doc is
super long now.


Hi Dawid,

1) KafkaConnector not extends TableDescriptor
The reason why KafkaConnector extends TableDescriptor is that, a builder
pattern "KafkaConnector.newBuilder()...build()" should return
"KafkaConnector" in theory.
So users can write something like the following code which might be
more intuitive.

KafkaConnector kafka = KafkaConnector.newBuilder()...build();
tEnv.createTemporaryTable("MyTable", kafka);

But I agree connector implementation will be simpler if this is not
strongly needed, e.g. we don't need the generic type for descriptor,
we don't need to pass the descriptor class in the builder. So I'm also
fine to not extend it if others don't against it. What's your opinion here @Timo
Walther  ?

2) LikeOptions
I am not very satisfied with the new design. Because the API is not very
fluent. Users will be interrupted to consider what the `overwrite()`
parameter to be.
And the API design doesn't protect users from using the wrong options
before running the code.
What about to list all possible options in one level? This will be more
aligned with SQL DDL and easy to understand and use for users.

public enum LikeOption {
   INCLUDING_ALL,
   INCLUDING_CONSTRAINTS,
   INCLUDING_GENERATED,
   INCLUDING_OPTIONS,
   INCLUDING_PARTITIONS,
   INCLUDING_WATERMARKS,

   EXCLUDING_ALL,
   EXCLUDING_CONSTRAINTS,
   EXCLUDING_GENERATED,
   EXCLUDING_OPTIONS,
   EXCLUDING_PARTITIONS,
   EXCLUDING_WATERMARKS,

   OVERWRITING_GENERATED,
   OVERWRITING_OPTIONS
}

3) register the table under a generated table path
I'm afraid we have to do that. The generated table path is still needed
for `TableSourceTable#tableIdentifier` which is used to calculate the
digest.
This requires that the registered table must have an unique identifier.
The old `TableSourceQueryOperation` will also generate the identifier
according
  to the hashcode of the TableSource object. However, the generated
identifier "Unregistered_TableSource_1234" is still possible to be in
conflict with
the user's table path. Therefore, I prefer to register the generated name
in the (temporary) catalog to throw explicit exceptions, rather than
generating a wrong plan.


Hi @Leonard Xu  and @Jingsong Li
 ,

Do you have other concerns on the latest FLIP and the above discussion?

Best,
Jark

On Thu, 23 Jul 2020 at 18:26, Dawid Wysakowicz 
wrote:


Hi Jark,

Thanks for the update. I think the FLIP looks really well on the high
level.

I have a few comments to the code structure in the FLIP:

1) I really don't like how the TableDescriptor exposes protected fields.
Moreover why do we need to extend from it? I don't think we need
KafkaConnector extends TableDescriptor and alike. We only need the builders
e.g. the KafkaConnectorBuilder.

If I understand it correctly this is the interface needed from the
TableEnvironment perspective and it is the contract that the
TableEnvironment expects. I would suggest making it an interface:
@PublicEvolving
public interface TableDescriptor {
 List getPartitionedFields();
   Schema getSchema();
 Map getOptions();
 LikeOption[] getLikeOptions();
 String getLikePath();
}

Then the TableDescriptorBuilder would work with an internal
implementation of this interface
@PublicEvolving
public abstract class TableDescriptorBuilder<BUILDER extends TableDescriptorBuilder<BUILDER>>
{

 private final InternalTableDescriptor descriptor = new
InternalTableDescriptor();

 /**
  * Returns the this builder instance in the type of subclass.
  */
 protected abstract BUILDER self();

 /**
  * Specifies the table schema.
  */
 public BUILDER schema(Schema schema) {
 descriptor.schema = schema;
 retu

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-24 Thread Jark Wu
Thanks Dawid,

Regarding (3), I think you mentioned it will affect other behavior, e.g.
listing tables, is a strong reason.
I will at least consider implementing it in the `TableSourceQueryOperation`
way to avoid affecting other behaviors. I have updated the FLIP.
Anyway, this is implementation detail, we can continue this discussion in
JIRA and code review.

Thank you all for the response.
It seems that everyone participated in the discussion has reached a
consensus.
I will start a vote later today.

Best,
Jark

On Fri, 24 Jul 2020 at 16:03, Leonard Xu  wrote:

> Thanks Jark for the update.
>
> The latest FLIP looks well.
>
> I like Dawid’s proposal of TableDescriptor.
>
> Best
> Leonard Xu
>
> 在 2020年7月23日,22:56,Jark Wu  写道:
>
> Hi Timo,
>
> That's a good point I missed in the design. I have updated the FLIP and
> added a note under the `KafkaConnector` to mention this.
> I will not list all the method names in the FLIP as the design doc is
> super long now.
>
> 
> Hi Dawid,
>
> 1) KafkaConnector not extends TableDescriptor
> The reason why KafkaConnector extends TableDescriptor is that, a builder
> pattern "KafkaConnector.newBuilder()...build()" should return
> "KafkaConnector" in theory.
> So users can write something like the following code which might be
> more intuitive.
>
> KafkaConnector kafka = KafkaConnector.newBuilder()...build();
> tEnv.createTemporaryTable("MyTable", kafka);
>
> But I agree connector implementation will be simpler if this is not
> strongly needed, e.g. we don't need the generic type for descriptor,
> we don't need to pass the descriptor class in the builder. So I'm also
> fine to not extend it if others don't against it. What's your opinion here 
> @Timo
> Walther  ?
>
> 2) LikeOptions
> I am not very satisfied with the new design. Because the API is not very
> fluent. Users will be interrupted to consider what the `overwrite()`
> parameter to be.
> And the API design doesn't protect users from using the wrong options
> before running the code.
> What about to list all possible options in one level? This will be more
> aligned with SQL DDL and easy to understand and use for users.
>
> public enum LikeOption {
>   INCLUDING_ALL,
>   INCLUDING_CONSTRAINTS,
>   INCLUDING_GENERATED,
>   INCLUDING_OPTIONS,
>   INCLUDING_PARTITIONS,
>   INCLUDING_WATERMARKS,
>
>   EXCLUDING_ALL,
>   EXCLUDING_CONSTRAINTS,
>   EXCLUDING_GENERATED,
>   EXCLUDING_OPTIONS,
>   EXCLUDING_PARTITIONS,
>   EXCLUDING_WATERMARKS,
>
>   OVERWRITING_GENERATED,
>   OVERWRITING_OPTIONS
> }
>
> 3) register the table under a generated table path
> I'm afraid we have to do that. The generated table path is still needed
> for `TableSourceTable#tableIdentifier` which is used to calculate the
> digest.
> This requires that the registered table must have an unique identifier.
> The old `TableSourceQueryOperation` will also generate the identifier
> according
>  to the hashcode of the TableSource object. However, the generated
> identifier "Unregistered_TableSource_1234" is still possible to be in
> conflict with
> the user's table path. Therefore, I prefer to register the generated name
> in the (temporary) catalog to throw explicit exceptions, rather than
> generating a wrong plan.
>
> 
> Hi @Leonard Xu  and @Jingsong Li
>  ,
>
> Do you have other concerns on the latest FLIP and the above discussion?
>
> Best,
> Jark
>
> On Thu, 23 Jul 2020 at 18:26, Dawid Wysakowicz 
> wrote:
>
>> Hi Jark,
>>
>> Thanks for the update. I think the FLIP looks really well on the high
>> level.
>>
>> I have a few comments to the code structure in the FLIP:
>>
>> 1) I really don't like how the TableDescriptor exposes protected fields.
>> Moreover why do we need to extend from it? I don't think we need
>> KafkaConnector extends TableDescriptor and alike. We only need the builders
>> e.g. the KafkaConnectorBuilder.
>>
>> If I understand it correctly this is the interface needed from the
>> TableEnvironment perspective and it is the contract that the
>> TableEnvironment expects. I would suggest making it an interface:
>> @PublicEvolving
>> public interface TableDescriptor {
>> List getPartitionedFields();
>>   Schema getSchema();
>> Map getOptions();
>> LikeOption[] getLikeOptions();
>> String getLikePath();
>> }
>>
>> Then the TableDescriptorBuilder would work with a

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-24 Thread Leonard Xu
Thanks Jark for the update.

The latest FLIP looks well.

I like Dawid’s proposal of TableDescriptor.

Best
Leonard Xu

> 在 2020年7月23日,22:56,Jark Wu  写道:
> 
> Hi Timo,
> 
> That's a good point I missed in the design. I have updated the FLIP and added 
> a note under the `KafkaConnector` to mention this. 
> I will not list all the method names in the FLIP as the design doc is super 
> long now. 
> 
> 
> Hi Dawid,
> 
> 1) KafkaConnector not extends TableDescriptor
> The reason why KafkaConnector extends TableDescriptor is that, a builder 
> pattern "KafkaConnector.newBuilder()...build()" should return 
> "KafkaConnector" in theory. 
> So users can write something like the following code which might be more 
> intuitive.
> 
> KafkaConnector kafka = KafkaConnector.newBuilder()...build();
> tEnv.createTemporaryTable("MyTable", kafka);
> 
> But I agree connector implementation will be simpler if this is not strongly 
> needed, e.g. we don't need the generic type for descriptor, 
> we don't need to pass the descriptor class in the builder. So I'm also fine 
> to not extend it if others don't against it. What's your opinion here @Timo 
> Walther <mailto:twal...@apache.org> ?
> 
> 2) LikeOptions
> I am not very satisfied with the new design. Because the API is not very 
> fluent. Users will be interrupted to consider what the `overwrite()` 
> parameter to be.
> And the API design doesn't protect users from using the wrong options before 
> running the code. 
> What about to list all possible options in one level? This will be more 
> aligned with SQL DDL and easy to understand and use for users.
> 
> public enum LikeOption {
>   INCLUDING_ALL,
>   INCLUDING_CONSTRAINTS,
>   INCLUDING_GENERATED,
>   INCLUDING_OPTIONS,
>   INCLUDING_PARTITIONS,
>   INCLUDING_WATERMARKS,
> 
>   EXCLUDING_ALL,
>   EXCLUDING_CONSTRAINTS,
>   EXCLUDING_GENERATED,
>   EXCLUDING_OPTIONS,
>   EXCLUDING_PARTITIONS,
>   EXCLUDING_WATERMARKS,
> 
>   OVERWRITING_GENERATED,
>   OVERWRITING_OPTIONS
> }
> 
> 3) register the table under a generated table path
> I'm afraid we have to do that. The generated table path is still needed for 
> `TableSourceTable#tableIdentifier` which is used to calculate the digest. 
> This requires that the registered table must have an unique identifier. The 
> old `TableSourceQueryOperation` will also generate the identifier according
>  to the hashcode of the TableSource object. However, the generated identifier 
> "Unregistered_TableSource_1234" is still possible to be in conflict with 
> the user's table path. Therefore, I prefer to register the generated name in 
> the (temporary) catalog to throw explicit exceptions, rather than generating 
> a wrong plan.
> 
> 
> Hi @Leonard Xu <mailto:xbjt...@gmail.com> and @Jingsong Li 
> <mailto:jingsongl...@gmail.com> ,
> 
> Do you have other concerns on the latest FLIP and the above discussion?
> 
> Best,
> Jark
> 
> On Thu, 23 Jul 2020 at 18:26, Dawid Wysakowicz  <mailto:dwysakow...@apache.org>> wrote:
> Hi Jark,
> 
> Thanks for the update. I think the FLIP looks really well on the high level.
> 
> I have a few comments to the code structure in the FLIP:
> 
> 1) I really don't like how the TableDescriptor exposes protected fields. 
> Moreover why do we need to extend from it? I don't think we need 
> KafkaConnector extends TableDescriptor and alike. We only need the builders 
> e.g. the KafkaConnectorBuilder.
> 
> If I understand it correctly this is the interface needed from the 
> TableEnvironment perspective and it is the contract that the TableEnvironment 
> expects. I would suggest making it an interface: 
> 
> @PublicEvolving
> public interface TableDescriptor {
> List getPartitionedFields();
>   Schema getSchema();
> Map getOptions();
> LikeOption[] getLikeOptions();
> String getLikePath();
> }
> 
> Then the TableDescriptorBuilder would work with an internal implementation of 
> this interface
> 
> @PublicEvolving
> public abstract class TableDescriptorBuilder<BUILDER extends TableDescriptorBuilder<BUILDER>> {
>  
> private final InternalTableDescriptor descriptor = new 
> InternalTableDescriptor();
>  
> /**
>  * Returns the this builder instance in the type of subclass.
>  */
> protected abstract BUILDER self();
>  
> /**
>  * Specifies the table schema.
>  */
> public BUILDER schema(Schema schema) {
> descriptor.sche

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-24 Thread Jingsong Li
Thanks for the update.

The FLIP looks good, I think it is time to vote.

Best,
Jingsong

On Fri, Jul 24, 2020 at 3:21 PM Dawid Wysakowicz 
wrote:

> Hi Jark,
>
> Ad. 1
>
> Personally I don't see any benefit of having the specific descriptors
> (KafkaDescriptor/ElasticDescriptor/...). E.g. the KafkaDescriptor would
> not differ at all from other descriptors. I don't think returning always
> a TableDescriptor would be confusing for users. Therefore I am very much
> in favour of simplifying it.
>
> Ad. 2
>
> Honestly, here we are discussing personal preferences. Let's go with
> your original suggestion. I am not strong on this.
>
> Ad. 3
>
> My personal take is that in this case the digest should not include a
> table identifier as we don't have one. I could see the digest being
> calculated from the properties/CatalogTable. That way you would have the
> same digest for all inline tables with the same properties. In your
> approach two scans of the same descriptor will have different digests.
> (I am not saying this is particularly wrong, but different from
> scan("test") ). Similarly, we do not register filters/joins/groupings
> under generated names just to compute the digest. One additional problem
> you will have is you will need to add handling of that special tables
> when e.g. listing tables.
>
> Nevertheless I don't want to block the progress on this. I'd appreciate
> though if you consider that again during the implementation.
>
> Best,
>
> Dawid
>
> On 23/07/2020 16:56, Jark Wu wrote:
> > Hi Timo,
> >
> > That's a good point I missed in the design. I have updated the FLIP and
> > added a note under the `KafkaConnector` to mention this.
> > I will not list all the method names in the FLIP as the design doc is
> super
> > long now.
> >
> > 
> > Hi Dawid,
> >
> > 1) KafkaConnector not extends TableDescriptor
> > The reason why KafkaConnector extends TableDescriptor is that, a builder
> > pattern "KafkaConnector.newBuilder()...build()" should return
> > "KafkaConnector" in theory.
> > So users can write something like the following code which might be
> > more intuitive.
> >
> > KafkaConnector kafka = KafkaConnector.newBuilder()...build();
> > tEnv.createTemporaryTable("MyTable", kafka);
> >
> > But I agree connector implementation will be simpler if this is not
> > strongly needed, e.g. we don't need the generic type for descriptor,
> > we don't need to pass the descriptor class in the builder. So I'm also
> fine
> > to not extend it if others don't against it. What's your opinion here
> @Timo
> > Walther  ?
> >
> > 2) LikeOptions
> > I am not very satisfied with the new design. Because the API is not very
> > fluent. Users will be interrupted to consider what the `overwrite()`
> > parameter to be.
> > And the API design doesn't protect users from using the wrong options
> > before running the code.
> > What about to list all possible options in one level? This will be more
> > aligned with SQL DDL and easy to understand and use for users.
> >
> > public enum LikeOption {
> >   INCLUDING_ALL,
> >   INCLUDING_CONSTRAINTS,
> >   INCLUDING_GENERATED,
> >   INCLUDING_OPTIONS,
> >   INCLUDING_PARTITIONS,
> >   INCLUDING_WATERMARKS,
> >
> >   EXCLUDING_ALL,
> >   EXCLUDING_CONSTRAINTS,
> >   EXCLUDING_GENERATED,
> >   EXCLUDING_OPTIONS,
> >   EXCLUDING_PARTITIONS,
> >   EXCLUDING_WATERMARKS,
> >
> >   OVERWRITING_GENERATED,
> >   OVERWRITING_OPTIONS
> > }
> >
> > 3) register the table under a generated table path
> > I'm afraid we have to do that. The generated table path is still needed
> for
> > `TableSourceTable#tableIdentifier` which is used to calculate the digest.
> > This requires that the registered table must have an unique identifier.
> The
> > old `TableSourceQueryOperation` will also generate the identifier
> according
> >  to the hashcode of the TableSource object. However, the generated
> > identifier "Unregistered_TableSource_1234" is still possible to be in
> > conflict with
> > the user's table path. Therefore, I prefer to register the generated name
> > in the (temporary) catalog to throw explicit exceptions, rather than
> > generating a wrong plan.
> >
> > 
> > Hi @Leonard Xu  and @Jingsong Li <
> jingsongl..

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-24 Thread Dawid Wysakowicz
Hi Jark,

Ad. 1

Personally I don't see any benefit of having the specific descriptors
(KafkaDescriptor/ElasticDescriptor/...). E.g. the KafkaDescriptor would
not differ at all from other descriptors. I don't think returning always
a TableDescriptor would be confusing for users. Therefore I am very much
in favour of simplifying it.

Ad. 2

Honestly, here we are discussing personal preferences. Let's go with
your original suggestion. I am not strong on this.

Ad. 3

My personal take is that in this case the digest should not include a
table identifier as we don't have one. I could see the digest being
calculated from the properties/CatalogTable. That way you would have the
same digest for all inline tables with the same properties. In your
approach two scans of the same descriptor will have different digests.
(I am not saying this is particularly wrong, but different from
scan("test") ). Similarly, we do not register filters/joins/groupings
under generated names just to compute the digest. One additional problem
you will have is you will need to add handling of that special tables
when e.g. listing tables.

Nevertheless I don't want to block the progress on this. I'd appreciate
though if you consider that again during the implementation.

Best,

Dawid

On 23/07/2020 16:56, Jark Wu wrote:
> Hi Timo,
>
> That's a good point I missed in the design. I have updated the FLIP and
> added a note under the `KafkaConnector` to mention this.
> I will not list all the method names in the FLIP as the design doc is super
> long now.
>
> 
> Hi Dawid,
>
> 1) KafkaConnector not extends TableDescriptor
> The reason why KafkaConnector extends TableDescriptor is that, a builder
> pattern "KafkaConnector.newBuilder()...build()" should return
> "KafkaConnector" in theory.
> So users can write something like the following code which might be
> more intuitive.
>
> KafkaConnector kafka = KafkaConnector.newBuilder()...build();
> tEnv.createTemporaryTable("MyTable", kafka);
>
> But I agree connector implementation will be simpler if this is not
> strongly needed, e.g. we don't need the generic type for descriptor,
> we don't need to pass the descriptor class in the builder. So I'm also fine
> to not extend it if others don't against it. What's your opinion here @Timo
> Walther  ?
>
> 2) LikeOptions
> I am not very satisfied with the new design. Because the API is not very
> fluent. Users will be interrupted to consider what the `overwrite()`
> parameter to be.
> And the API design doesn't protect users from using the wrong options
> before running the code.
> What about to list all possible options in one level? This will be more
> aligned with SQL DDL and easy to understand and use for users.
>
> public enum LikeOption {
>   INCLUDING_ALL,
>   INCLUDING_CONSTRAINTS,
>   INCLUDING_GENERATED,
>   INCLUDING_OPTIONS,
>   INCLUDING_PARTITIONS,
>   INCLUDING_WATERMARKS,
>
>   EXCLUDING_ALL,
>   EXCLUDING_CONSTRAINTS,
>   EXCLUDING_GENERATED,
>   EXCLUDING_OPTIONS,
>   EXCLUDING_PARTITIONS,
>   EXCLUDING_WATERMARKS,
>
>   OVERWRITING_GENERATED,
>   OVERWRITING_OPTIONS
> }
>
> 3) register the table under a generated table path
> I'm afraid we have to do that. The generated table path is still needed for
> `TableSourceTable#tableIdentifier` which is used to calculate the digest.
> This requires that the registered table must have an unique identifier. The
> old `TableSourceQueryOperation` will also generate the identifier according
>  to the hashcode of the TableSource object. However, the generated
> identifier "Unregistered_TableSource_1234" is still possible to be in
> conflict with
> the user's table path. Therefore, I prefer to register the generated name
> in the (temporary) catalog to throw explicit exceptions, rather than
> generating a wrong plan.
>
> 
> Hi @Leonard Xu  and @Jingsong Li 
>  ,
>
> Do you have other concerns on the latest FLIP and the above discussion?
>
> Best,
> Jark
>
> On Thu, 23 Jul 2020 at 18:26, Dawid Wysakowicz 
> wrote:
>
>> Hi Jark,
>>
>> Thanks for the update. I think the FLIP looks really well on the high
>> level.
>>
>> I have a few comments to the code structure in the FLIP:
>>
>> 1) I really don't like how the TableDescriptor exposes protected fields.
>> Moreover why do we need to extend from it? I don't think we need
>> KafkaConnector extends TableDescriptor and alike. We only need the builders
>> e.g. the KafkaConnectorBuilder.
>

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-23 Thread Jark Wu
Hi Timo,

That's a good point I missed in the design. I have updated the FLIP and
added a note under the `KafkaConnector` to mention this.
I will not list all the method names in the FLIP as the design doc is super
long now.


Hi Dawid,

1) KafkaConnector not extends TableDescriptor
The reason why KafkaConnector extends TableDescriptor is that, a builder
pattern "KafkaConnector.newBuilder()...build()" should return
"KafkaConnector" in theory.
So users can write something like the following code which might be
more intuitive.

KafkaConnector kafka = KafkaConnector.newBuilder()...build();
tEnv.createTemporaryTable("MyTable", kafka);

But I agree connector implementation will be simpler if this is not
strongly needed, e.g. we don't need the generic type for descriptor,
we don't need to pass the descriptor class in the builder. So I'm also fine
to not extend it if others don't against it. What's your opinion here @Timo
Walther  ?

2) LikeOptions
I am not very satisfied with the new design. Because the API is not very
fluent. Users will be interrupted to consider what the `overwrite()`
parameter to be.
And the API design doesn't protect users from using the wrong options
before running the code.
What about to list all possible options in one level? This will be more
aligned with SQL DDL and easy to understand and use for users.

public enum LikeOption {
  INCLUDING_ALL,
  INCLUDING_CONSTRAINTS,
  INCLUDING_GENERATED,
  INCLUDING_OPTIONS,
  INCLUDING_PARTITIONS,
  INCLUDING_WATERMARKS,

  EXCLUDING_ALL,
  EXCLUDING_CONSTRAINTS,
  EXCLUDING_GENERATED,
  EXCLUDING_OPTIONS,
  EXCLUDING_PARTITIONS,
  EXCLUDING_WATERMARKS,

  OVERWRITING_GENERATED,
  OVERWRITING_OPTIONS
}

3) register the table under a generated table path
I'm afraid we have to do that. The generated table path is still needed for
`TableSourceTable#tableIdentifier` which is used to calculate the digest.
This requires that the registered table must have an unique identifier. The
old `TableSourceQueryOperation` will also generate the identifier according
 to the hashcode of the TableSource object. However, the generated
identifier "Unregistered_TableSource_1234" is still possible to be in
conflict with
the user's table path. Therefore, I prefer to register the generated name
in the (temporary) catalog to throw explicit exceptions, rather than
generating a wrong plan.


Hi @Leonard Xu  and @Jingsong Li 
 ,

Do you have other concerns on the latest FLIP and the above discussion?

Best,
Jark

On Thu, 23 Jul 2020 at 18:26, Dawid Wysakowicz 
wrote:

> Hi Jark,
>
> Thanks for the update. I think the FLIP looks really well on the high
> level.
>
> I have a few comments to the code structure in the FLIP:
>
> 1) I really don't like how the TableDescriptor exposes protected fields.
> Moreover why do we need to extend from it? I don't think we need
> KafkaConnector extends TableDescriptor and alike. We only need the builders
> e.g. the KafkaConnectorBuilder.
>
> If I understand it correctly this is the interface needed from the
> TableEnvironment perspective and it is the contract that the
> TableEnvironment expects. I would suggest making it an interface:
> @PublicEvolving
> public interface TableDescriptor {
> List getPartitionedFields();
>   Schema getSchema();
> Map getOptions();
> LikeOption[] getLikeOptions();
> String getLikePath();
> }
>
> Then the TableDescriptorBuilder would work with an internal implementation
> of this interface
> @PublicEvolving
> public abstract class TableDescriptorBuilder<BUILDER extends TableDescriptorBuilder<BUILDER>>
> {
>
> private final InternalTableDescriptor descriptor = new
> InternalTableDescriptor();
>
> /**
>  * Returns the this builder instance in the type of subclass.
>  */
> protected abstract BUILDER self();
>
> /**
>  * Specifies the table schema.
>  */
> public BUILDER schema(Schema schema) {
> descriptor.schema = schema;
> return self();
> }
>
> /**
>  * Specifies the partition keys of this table.
>  */
> public BUILDER partitionedBy(String... fieldNames) {
> checkArgument(descriptor.partitionedFields.isEmpty(), 
> "partitionedBy(...)
> shouldn't be called more than once.");
> descriptor.partitionedFields.addAll(Arrays.asList(fieldNames));
> return self();
> }
>
> /**
>  * Extends some parts from the original registered table path.
>  */
> public BUILDER like(String tablePath, LikeOption... likeOptions) {
> descriptor.likePath = tablePath;
> descript

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-23 Thread Dawid Wysakowicz
Hi Jark,

Thanks for the update. I think the FLIP looks really well on the high level.

I have a few comments to the code structure in the FLIP:

1) I really don't like how the TableDescriptor exposes protected fields.
Moreover why do we need to extend from it? I don't think we need
KafkaConnector extends TableDescriptor and alike. We only need the
builders e.g. the KafkaConnectorBuilder.

If I understand it correctly this is the interface needed from the
TableEnvironment perspective and it is the contract that the
TableEnvironment expects. I would suggest making it an interface:

@PublicEvolving
public interface TableDescriptor {
    List getPartitionedFields();
    Schema getSchema();
    Map getOptions();
    LikeOption[] getLikeOptions();
    String getLikePath();
}

Then the TableDescriptorBuilder would work with an internal
implementation of this interface

@PublicEvolving
public abstract class TableDescriptorBuilder<BUILDER extends TableDescriptorBuilder<BUILDER>> {

    private final InternalTableDescriptor descriptor = new InternalTableDescriptor();

    /**
     * Returns the this builder instance in the type of subclass.
     */
    protected abstract BUILDER self();

    /**
     * Specifies the table schema.
     */
    public BUILDER schema(Schema schema) {
        descriptor.schema = schema;
        return self();
    }

    /**
     * Specifies the partition keys of this table.
     */
    public BUILDER partitionedBy(String... fieldNames) {
        checkArgument(descriptor.partitionedFields.isEmpty(), "partitionedBy(...) shouldn't be called more than once.");
        descriptor.partitionedFields.addAll(Arrays.asList(fieldNames));
        return self();
    }

    /**
     * Extends some parts from the original registered table path.
     */
    public BUILDER like(String tablePath, LikeOption... likeOptions) {
        descriptor.likePath = tablePath;
        descriptor.likeOptions = likeOptions;
        return self();
    }

    protected BUILDER option(String key, String value) {
        descriptor.options.put(key, value);
        return self();
    }

    /**
     * Returns created table descriptor.
     */
    public TableDescriptor build() {
        return descriptor;
    }
}


2) I'm also not the biggest fan of how the LikeOptions are suggested in
the doc. Can't we have something more like

class LikeOption {

    public enum MergingStrategy {
        INCLUDING,
        EXCLUDING,
        OVERWRITING
    }

    public enum FeatureOption {
        ALL,
        CONSTRAINTS,
        GENERATED,
        OPTIONS,
        PARTITIONS,
        WATERMARKS
    }

    private final MergingStrategy mergingStrategy;
    private final FeatureOption featureOption;

    public static final LikeOption including(FeatureOption option) {
        return new LikeOption(MergingStrategy.INCLUDING, option);
    }

    public static final LikeOption overwriting(FeatureOption option) {
        Preconditions.checkArgument(option != FeatureOption.ALL && ...);
        return new LikeOption(MergingStrategy.OVERWRITING, option);
    }
}
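
Purely for illustration, a hypothetical call site that combines these factory methods with
the builder's like(tablePath, likeOptions) method from the FLIP could look as follows
(table and topic names are made up, and this targets the proposed API, not current Flink):

tEnv.createTemporaryTable(
    "OrdersCopy",
    KafkaConnector.newBuilder()
        .topic("orders_copy")
        // take everything over from "Orders", but let newly specified options win
        .like("Orders",
            LikeOption.including(LikeOption.FeatureOption.ALL),
            LikeOption.overwriting(LikeOption.FeatureOption.OPTIONS))
        .build());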


3) TableEnvironment#from(descriptor) will register descriptor under a
system generated table path (just like TableImpl#toString) first, and
scan from the table path to derive the Table. Table#executeInsert() does
it in the similar way.

I would try not to register the table under a generated table path. Do
we really need that? I am pretty sure we can use the tables without
registering them in a catalog. Similarly to the old
TableSourceQueryOperation.

Otherwise looks good.

Best,

Dawid

On 23/07/2020 10:35, Timo Walther wrote:
> Hi Jark,
>
> thanks for the update. I think the FLIP is in a really good shape now
> and ready to be voted. If others have no further comments?
>
> I have one last comment around the methods of the descriptor builders.
> When refactoring classes such as `KafkaConnector` or
> `ElasticsearchConnector`. We should align the method names with the
> new property names introduced in FLIP-122:
>
> KafkaConnector.newBuilder()
>   // similar to scan.startup.mode=earliest-offset
>   .scanStartupModeEarliest()
>   // similar to sink.partitioner=round-robin
>   .sinkPartitionerRoundRobin()
>
> What do you think?
>
> Thanks for driving this,
> Timo
>
>
> On 22.07.20 17:26, Jark Wu wrote:
>> Hi all,
>>
>> After some offline discussion with other people, I'm also fine with
>> using
>> the builder pattern now,
>>   even though I still think the `.build()` method is a little verbose
>> in the
>> user code.
>>
>> I have upd

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-23 Thread Timo Walther

Hi Jark,

thanks for the update. I think the FLIP is in a really good shape now 
and ready to be voted. If others have no further comments?


I have one last comment around the methods of the descriptor builders. 
When refactoring classes such as `KafkaConnector` or 
`ElasticsearchConnector`, we should align the method names with the new 
property names introduced in FLIP-122:


KafkaConnector.newBuilder()
  // similar to scan.startup.mode=earliest-offset
  .scanStartupModeEarliest()
  // similar to sink.partitioner=round-robin
  .sinkPartitionerRoundRobin()

What do you think?

Thanks for driving this,
Timo


On 22.07.20 17:26, Jark Wu wrote:

Hi all,

After some offline discussion with other people, I'm also fine with using
the builder pattern now,
  even though I still think the `.build()` method is a little verbose in the
user code.

I have updated the FLIP with following changes:

1) use builder pattern instead of "new" keyword. In order to avoid
duplicate code and reduce development burden for connector developers,
  I introduced abstract classes `TableDescriptorBuilder` and
`FormatDescriptorBuilder`.
 All the common methods are pre-defined in the base builder class, all
the custom descriptor builder should extend from the base builder classes.
 And we can add more methods into the base builder class in the future
without changes in the connectors.
2) use Expression instead of SQL expression string for computed column and
watermark strategy
3) use `watermark(rowtime, expr)` as the watermark method.
4) `Schema.column()` accepts `AbstractDataType` instead of `DataType`
5) drop Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps

A full example will look like this:

tEnv.createTemporaryTable(
 "MyTable",
 KafkaConnector.newBuilder()
 .version("0.11")
 .topic("user_logs")
 .property("bootstrap.servers", "localhost:9092")
 .property("group.id", "test-group")
 .startFromEarliest()
 .sinkPartitionerRoundRobin()
 .format(JsonFormat.newBuilder().ignoreParseErrors(false).build())
 .schema(
 Schema.newBuilder()
 .column("user_id", DataTypes.BIGINT())
 .column("user_name", DataTypes.STRING())
 .column("score", DataTypes.DECIMAL(10, 2))
 .column("log_ts", DataTypes.STRING())
 .column("part_field_0", DataTypes.STRING())
 .column("part_field_1", DataTypes.INT())
 .column("proc", proctime()) // define a processing-time
attribute with column name "proc"
 .column("ts", toTimestamp($("log_ts")))
 .watermark("ts", $("ts").minus(lit(3).seconds()))
 .primaryKey("user_id")
 .build())
 .partitionedBy("part_field_0", "part_field_1")  // Kafka doesn't
support partitioned table yet, this is just an example for the API
 .build()
);

I hope this resolves all your concerns. Welcome for further feedback!

Updated FLIP:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API#FLIP129:RefactorDescriptorAPItoregisterconnectorsinTableAPI-FormatDescriptor&FormatDescriptorBuilder

POC:
https://github.com/wuchong/flink/tree/descriptor-POC/flink-table/flink-table-common/src/main/java/org/apache/flink/table/descriptor3

Best,
Jark

On Thu, 16 Jul 2020 at 20:18, Jark Wu  wrote:


Thank you all for the discussion!

Here are my comments:

2) I agree we should support Expression as a computed column. But I'm in
favor of Leonard's point that maybe we can also support SQL string
expression as a computed column.
Because it also keeps aligned with DDL. The concern for Expression is that
converting Expression to SQL string, or (de)serializing Expression is
another topic not clear and may involve lots of work.
Maybe we can support Expression later if time permits.

6,7) I still prefer the "new" keyword over builder. I don't think
immutable is a strong reason. I care more about usability and experience
from users and devs perspective.
   - Users need to type more words if using builder:
`KafkaConnector.newBuilder()...build()`  vs `new KafkaConnector()...`
   - It's more difficult for developers to write a descriptor.  2 classes
(KafkaConnector and KafkaConnectorBuilder) + 5 more methods (builders,
schema, partitionedBy, like, etc..).
 With the "new" keyword all the common methods are defined by the
framework.
   - It's hard to have the same API style for different connectors, because
the common methods are defined by users. For example, some may have
`withSchema`, `partitionKey`, `withLike`, etc...

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-22 Thread Jark Wu
Hi all,

After some offline discussion with other people, I'm also fine with using
the builder pattern now,
 even though I still think the `.build()` method is a little verbose in the
user code.

I have updated the FLIP with following changes:

1) use builder pattern instead of "new" keyword. In order to avoid
duplicate code and reduce development burden for connector developers,
 I introduced abstract classes `TableDescriptorBuilder` and
`FormatDescriptorBuilder`.
All the common methods are pre-defined in the base builder class, all
the custom descriptor builder should extend from the base builder classes.
And we can add more methods into the base builder class in the future
without changes in the connectors.
2) use Expression instead of SQL expression string for computed column and
watermark strategy
3) use `watermark(rowtime, expr)` as the watermark method.
4) `Schema.column()` accepts `AbstractDataType` instead of `DataType`
5) drop Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps

A full example will look like this:

tEnv.createTemporaryTable(
"MyTable",
KafkaConnector.newBuilder()
.version("0.11")
.topic("user_logs")
.property("bootstrap.servers", "localhost:9092")
.property("group.id", "test-group")
.startFromEarliest()
.sinkPartitionerRoundRobin()
.format(JsonFormat.newBuilder().ignoreParseErrors(false).build())
.schema(
Schema.newBuilder()
.column("user_id", DataTypes.BIGINT())
.column("user_name", DataTypes.STRING())
.column("score", DataTypes.DECIMAL(10, 2))
.column("log_ts", DataTypes.STRING())
.column("part_field_0", DataTypes.STRING())
.column("part_field_1", DataTypes.INT())
.column("proc", proctime()) // define a processing-time
attribute with column name "proc"
.column("ts", toTimestamp($("log_ts")))
.watermark("ts", $("ts").minus(lit(3).seconds()))
.primaryKey("user_id")
.build())
.partitionedBy("part_field_0", "part_field_1")  // Kafka doesn't
support partitioned table yet, this is just an example for the API
.build()
);

I hope this resolves all your concerns. Welcome for further feedback!

Updated FLIP:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connectors+in+Table+API#FLIP129:RefactorDescriptorAPItoregisterconnectorsinTableAPI-FormatDescriptor&FormatDescriptorBuilder

POC:
https://github.com/wuchong/flink/tree/descriptor-POC/flink-table/flink-table-common/src/main/java/org/apache/flink/table/descriptor3

Best,
Jark

On Thu, 16 Jul 2020 at 20:18, Jark Wu  wrote:

> Thank you all for the discussion!
>
> Here are my comments:
>
> 2) I agree we should support Expression as a computed column. But I'm in
> favor of Leonard's point that maybe we can also support SQL string
> expression as a computed column.
> Because it also keeps aligned with DDL. The concern for Expression is that
> converting Expression to SQL string, or (de)serializing Expression is
> another topic not clear and may involve lots of work.
> Maybe we can support Expression later if time permits.
>
> 6,7) I still prefer the "new" keyword over builder. I don't think
> immutable is a strong reason. I care more about usability and experience
> from users and devs perspective.
>   - Users need to type more words if using builder:
> `KafkaConnector.newBuilder()...build()`  vs `new KafkaConnector()...`
>   - It's more difficult for developers to write a descriptor.  2 classes
> (KafkaConnector and KafkaConnectorBuilder) + 5 more methods (builders,
> schema, partitionedBy, like, etc..).
> With the "new" keyword all the common methods are defined by the
> framework.
>   - It's hard to have the same API style for different connectors, because
> the common methods are defined by users. For example, some may have
> `withSchema`, `partitionKey`, `withLike`, etc...
>
> 8) I'm -1 to `ConfigOption`. The ConfigOption is not used on `JsonFormat`,
> but the generic `Connector#option`. This doesn't work when using format
> options.
>
> new Connector("kafka")
>  .option(JsonOptions.IGNORE_PARSE_ERRORS, true);   // this is wrong,
> because "kafka" requires "json.ignore-parse-errors" as the option key, not
> the "ignore-parse-errors".
>
>
> 
> Hi Timo, regarding having a complete new stack, I hav

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-16 Thread Jark Wu
atermark(“colName”, Expression watermarkStrategy), I
>> think the later one has can express the meaning of “ WATERMARK FOR
>> column_name AS watermark_strategy_expression“ well.
>>
>> 5)6)7) The new keyword vs the static method vs builder pattern
>>
>> I have not strong tendency,  the new keyword and the static method on
>> descriptor can nearly treated as a builder  and do same things like
>> builder.
>> For the builder pattern, we will introduce six
>> methods(connector.Builder()、connector.Builder.build(), format.Builder(),
>> format.Builder.build(), Schema.Builder(),Schema.Builder.build() ),I think
>> we could reduce these unnecessary methods.  I ‘m slightly +1 for new
>> keyword if we need a choice.
>>
>> 8) `Connector.option(...)` class should also accept `ConfigOption`
>> I’m slightly -1 for this, ConfigOption may not work because the key for
>> format configOption has not format prefix eg: FAIL_ON_MISSING_FIELD of
>> json, we need “json.fail-on-missing-field” rather than
>> “fail-on-missing-field”.
>>
>> public static final ConfigOption FAIL_ON_MISSING_FIELD =
>> ConfigOptions
>> .key("fail-on-missing-field")
>> .booleanType()
>> .defaultValue(false)
>>
>> WDYT?
>>
>> Best,
>> Leonard Xu
>>
>>
>> > 在 2020年7月15日,16:37,Timo Walther  写道:
>> >
>> > Hi Jark,
>> >
>> > thanks for working on this issue. It is time to fix this last part of
>> inconsistency in the API. I also like the core parts of the FLIP, esp. that
>> TableDescriptor is one entity that can be passed to different methods. Here
>> is some feedback from my side:
>> >
>> > 1) +1 for just `column(...)`
>> >
>> > 2) Expression DSL vs pure SQL string for computed columns
>> > I agree with Dawid. Using the Expression DSL is desireable for a
>> consistent API. Furthermore, otherwise people need to register functions if
>> they want to use them in an expression. Refactoring TableSchema is
>> definitely on the list for 1.12. Maybe we can come up with some
>> intermediate solution where we transform the expression to a SQL expression
>> for the catalog. Until the discussions around FLIP-80 and
>> CatalogTableSchema have been finalized.
>> >
>> > 3) Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps
>> > We should design the descriptor very close to the SQL syntax. The more
>> similar the syntax the more likely it is too keep the new descriptor API
>> stable.
>> >
>> > 6) static method vs new keyword
>> > Actually, the `new` keyword was one of the things that bothered me most
>> in the old design. Fluent APIs avoid this nowadays.
>> >
>> > 7) make the descriptors immutable with builders
>> > The descriptors are some kind of builders already. But they are not
>> called "builder". Instead of coming up with the new concept of a
>> "descriptor", we should use terminology that people esp. Java/Scala users
>> are familiar with already.
>> >
>> > We could make the descriptors immutable to pass them around easily.
>> >
>> > Btw "Connector" and "Format" should always be in the classname. This
>> was also a mistake in the past. Instead of calling the descriptor just
>> `Kafka` we could call it `KafkaConnector`. An entire example could look
>> like:
>> >
>> > tEnv.createTemporaryTable(
>> >   "OrdersInKafka",
>> >   KafkaConnector.newBuilder() // builder pattern supported by IDE
>> >  .topic("user_logs")
>> >  .property("bootstrap.servers", "localhost:9092")
>> >  .property("group.id", "test-group")
>> >  .format(JsonFormat.newInstance()) // shortcut for no parameters
>> >  .schema(
>> > Schema.newBuilder()
>> >.column("user_id", DataTypes.BIGINT())
>> >.column("score", DataTypes.DECIMAL(10, 2))
>> >.column("log_ts", DataTypes.TIMESTAMP(3))
>> >.column("my_ts", toTimestamp($("log_ts"))
>> >.build()
>> >  )
>> >  .build()
>> > );
>> >
>> > Instead of refacoring the existing classes, we could also think about a
>> completly new stack. I think this would avoid confusion for the old users.
>> We could deprecate the entire `Kafka` class instead of dealing with
>> backwards compatibili

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-16 Thread Jingsong Li
Thanks for the discussion.

The Descriptor has lacked watermark and computed column support for too long.

1) +1 for just `column(...)`

2) +1 for being consistent with Table API, the Java Table API should be
Expression DSL. We don't need pure string support, users should just use
DDL instead. I think this is just a schema descriptor? The schema
descriptor should be consistent with DDL, so, definitely, it should
contain computed columns information.

3) +1 for not containing Schema#proctime and
Schema#watermarkFor#boundedOutOfOrderTimestamps, we can just leave them in
legacy apis.

6,7) +1 for removing "new" and the builder, and making it immutable. For Jark:
only the starting method is a static method, the others are not.

8) +1 for `ConfigOption`, this can be consistent with `WritableConfig`. For
Leonard, I don't think the user needs “json.fail-on-missing-field” rather than
“fail-on-missing-field”; the user should use “fail-on-missing-field” rather
than “json.fail-on-missing-field”, because the recommended way is
"JsonFormat.newInstance().option(...)", which configures options in the
format scope.

Best,
Jingsong

On Wed, Jul 15, 2020 at 9:30 PM Leonard Xu  wrote:

> Thanks Jark bring this discussion and organize the FLIP document.
>
> Thanks Dawid and Timo for the feedback. Here are my thoughts.
>
> 1)  I’m +1 with using column() for both cases.
>
> 2) Expression DSL vs pure SQL string for computed columns
>
> I think we can support them both and implement the pure SQL String first,
> I agree that Expression DSL brings more possibility and flexibility, but
> using SQL string is a more unified way which can reuse most logic with DDL
> like validation and persist in Catalog,
> and Converting Expression DSL to SQL Expression is another big topic and I
> did not figure out a feasible idea until now.
> So, maybe we can postpone the Expression DSL support considered the
> reality.
>
> 3) Methods Schema#proctime and
> Schema#watermarkFor#boundedOutOfOrderTimestamps
>
>  +1 with Dawid’s proposal to offer SQL like methods.
>  Schema()
> .column("proctime", proctime());
> .watermarkFor("rowtime", $("rowtime").minus(lit(3).seconds()))
> And we can simplify watermarkFor(“colName”, Expression
> watermarkStrategy)to watermark(“colName”, Expression watermarkStrategy), I
> think the later one has can express the meaning of “ WATERMARK FOR
> column_name AS watermark_strategy_expression“ well.
>
> 5)6)7) The new keyword vs the static method vs builder pattern
>
> I have not strong tendency,  the new keyword and the static method on
> descriptor can nearly treated as a builder  and do same things like
> builder.
> For the builder pattern, we will introduce six
> methods(connector.Builder()、connector.Builder.build(), format.Builder(),
> format.Builder.build(), Schema.Builder(),Schema.Builder.build() ),I think
> we could reduce these unnecessary methods.  I ‘m slightly +1 for new
> keyword if we need a choice.
>
> 8) `Connector.option(...)` class should also accept `ConfigOption`
> I’m slightly -1 for this, ConfigOption may not work because the key for
> format configOption has not format prefix eg: FAIL_ON_MISSING_FIELD of
> json, we need “json.fail-on-missing-field” rather than
> “fail-on-missing-field”.
>
> public static final ConfigOption FAIL_ON_MISSING_FIELD =
> ConfigOptions
> .key("fail-on-missing-field")
> .booleanType()
> .defaultValue(false)
>
> WDYT?
>
> Best,
> Leonard Xu
>
>
> > 在 2020年7月15日,16:37,Timo Walther  写道:
> >
> > Hi Jark,
> >
> > thanks for working on this issue. It is time to fix this last part of
> inconsistency in the API. I also like the core parts of the FLIP, esp. that
> TableDescriptor is one entity that can be passed to different methods. Here
> is some feedback from my side:
> >
> > 1) +1 for just `column(...)`
> >
> > 2) Expression DSL vs pure SQL string for computed columns
> > I agree with Dawid. Using the Expression DSL is desireable for a
> consistent API. Furthermore, otherwise people need to register functions if
> they want to use them in an expression. Refactoring TableSchema is
> definitely on the list for 1.12. Maybe we can come up with some
> intermediate solution where we transform the expression to a SQL expression
> for the catalog. Until the discussions around FLIP-80 and
> CatalogTableSchema have been finalized.
> >
> > 3) Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps
> > We should design the descriptor very close to the SQL syntax. The more
> similar the syntax the more likely it is too keep the new descriptor API
> stable.
> >
> > 6) static method vs new keyword
> > Actually, th

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-15 Thread Leonard Xu
Thanks Jark bring this discussion and organize the FLIP document.

Thanks Dawid and Timo for the feedback. Here are my thoughts.

1)  I’m +1 with using column() for both cases.

2) Expression DSL vs pure SQL string for computed columns

I think we can support both and implement the pure SQL string first.
I agree that the Expression DSL brings more possibility and flexibility, but using
a SQL string is a more unified way which can reuse most logic with DDL, like
validation and persisting in the Catalog.
Converting the Expression DSL to a SQL expression is another big topic, and I have
not figured out a feasible approach so far.
So maybe we can postpone the Expression DSL support, considering the reality.

3) Methods Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps

 +1 with Dawid’s proposal to offer SQL-like methods:
 Schema()
.column("proctime", proctime())
.watermarkFor("rowtime", $("rowtime").minus(lit(3).seconds()))
And we can simplify watermarkFor(“colName”, Expression watermarkStrategy) to
watermark(“colName”, Expression watermarkStrategy); I think the latter expresses
the meaning of “WATERMARK FOR column_name AS watermark_strategy_expression” well.

5)6)7) The new keyword vs the static method vs builder pattern

I have no strong tendency; the new keyword and the static method on a
descriptor can nearly be treated as a builder and do the same things as a builder.
For the builder pattern, we will introduce six methods (Connector.Builder(),
Connector.Builder.build(), Format.Builder(), Format.Builder.build(),
Schema.Builder(), Schema.Builder.build()); I think we could avoid these
unnecessary methods. I’m slightly +1 for the new keyword if we need a choice.

8) `Connector.option(...)` class should also accept `ConfigOption`
I’m slightly -1 for this; ConfigOption may not work because the key of a format
ConfigOption has no format prefix, e.g. FAIL_ON_MISSING_FIELD of JSON: we need
“json.fail-on-missing-field” rather than “fail-on-missing-field”.

public static final ConfigOption<Boolean> FAIL_ON_MISSING_FIELD = ConfigOptions
.key("fail-on-missing-field")
.booleanType()
.defaultValue(false);
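
For context, a minimal sketch of how this surfaces to users, assuming the Flink 1.11+
SQL option keys and an existing `tableEnv`: the format factory declares the key without
a prefix, while the framework prepends the format name in the WITH clause.

// Sketch only; 'json.fail-on-missing-field' is the user-facing key,
// while the JSON format factory itself declares plain 'fail-on-missing-field'.
tableEnv.executeSql(
    "CREATE TABLE user_logs_json (" +
    "  user_id BIGINT," +
    "  score DECIMAL(10, 2)" +
    ") WITH (" +
    "  'connector' = 'kafka'," +
    "  'topic' = 'user_logs'," +
    "  'properties.bootstrap.servers' = 'localhost:9092'," +
    "  'format' = 'json'," +
    "  'json.fail-on-missing-field' = 'false'" +
    ")");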

WDYT?

Best,
Leonard Xu


> 在 2020年7月15日,16:37,Timo Walther  写道:
> 
> Hi Jark,
> 
> thanks for working on this issue. It is time to fix this last part of 
> inconsistency in the API. I also like the core parts of the FLIP, esp. that 
> TableDescriptor is one entity that can be passed to different methods. Here 
> is some feedback from my side:
> 
> 1) +1 for just `column(...)`
> 
> 2) Expression DSL vs pure SQL string for computed columns
> I agree with Dawid. Using the Expression DSL is desireable for a consistent 
> API. Furthermore, otherwise people need to register functions if they want to 
> use them in an expression. Refactoring TableSchema is definitely on the list 
> for 1.12. Maybe we can come up with some intermediate solution where we 
> transform the expression to a SQL expression for the catalog. Until the 
> discussions around FLIP-80 and CatalogTableSchema have been finalized.
> 
> 3) Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps
> We should design the descriptor very close to the SQL syntax. The more 
> similar the syntax the more likely it is too keep the new descriptor API 
> stable.
> 
> 6) static method vs new keyword
> Actually, the `new` keyword was one of the things that bothered me most in 
> the old design. Fluent APIs avoid this nowadays.
> 
> 7) make the descriptors immutable with builders
> The descriptors are some kind of builders already. But they are not called 
> "builder". Instead of coming up with the new concept of a "descriptor", we 
> should use terminology that people esp. Java/Scala users are familiar with 
> already.
> 
> We could make the descriptors immutable to pass them around easily.
> 
> Btw "Connector" and "Format" should always be in the classname. This was also 
> a mistake in the past. Instead of calling the descriptor just `Kafka` we 
> could call it `KafkaConnector`. An entire example could look like:
> 
> tEnv.createTemporaryTable(
>   "OrdersInKafka",
>   KafkaConnector.newBuilder() // builder pattern supported by IDE
>  .topic("user_logs")
>  .property("bootstrap.servers", "localhost:9092")
>  .property("group.id", "test-group")
>  .format(JsonFormat.newInstance()) // shortcut for no parameters
>  .schema(
> Schema.newBuilder()
>.column("user_id", DataTypes.BIGINT())
>.column("score", DataTypes.DECIMAL(10, 2))
>.column("log_ts", DataTypes.TIMESTAMP(3))
>.column("my_ts", toTimestamp($("

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-15 Thread Timo Walther

Hi Jark,

thanks for working on this issue. It is time to fix this last part of 
inconsistency in the API. I also like the core parts of the FLIP, esp. 
that TableDescriptor is one entity that can be passed to different 
methods. Here is some feedback from my side:


1) +1 for just `column(...)`

2) Expression DSL vs pure SQL string for computed columns
I agree with Dawid. Using the Expression DSL is desirable for a 
consistent API. Furthermore, otherwise people need to register functions 
if they want to use them in an expression. Refactoring TableSchema is 
definitely on the list for 1.12. Maybe we can come up with some 
intermediate solution where we transform the expression to a SQL 
expression for the catalog. Until the discussions around FLIP-80 and 
CatalogTableSchema have been finalized.


3) Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps
We should design the descriptor very close to the SQL syntax. The more 
similar the syntax, the more likely it is to keep the new descriptor API 
stable.


6) static method vs new keyword
Actually, the `new` keyword was one of the things that bothered me most 
in the old design. Fluent APIs avoid this nowadays.


7) make the descriptors immutable with builders
The descriptors are some kind of builders already. But they are not 
called "builder". Instead of coming up with the new concept of a 
"descriptor", we should use terminology that people esp. Java/Scala 
users are familiar with already.


We could make the descriptors immutable to pass them around easily.

Btw "Connector" and "Format" should always be in the classname. This was 
also a mistake in the past. Instead of calling the descriptor just 
`Kafka` we could call it `KafkaConnector`. An entire example could look 
like:


tEnv.createTemporaryTable(
   "OrdersInKafka",
   KafkaConnector.newBuilder() // builder pattern supported by IDE
  .topic("user_logs")
  .property("bootstrap.servers", "localhost:9092")
  .property("group.id", "test-group")
  .format(JsonFormat.newInstance()) // shortcut for no parameters
  .schema(
 Schema.newBuilder()
.column("user_id", DataTypes.BIGINT())
.column("score", DataTypes.DECIMAL(10, 2))
.column("log_ts", DataTypes.TIMESTAMP(3))
.column("my_ts", toTimestamp($("log_ts"))
.build()
  )
  .build()
);
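
For comparison, a hedged sketch of the SQL DDL such a builder would mirror, assuming
the Flink 1.11+ Kafka/JSON option keys; `log_ts` is declared as STRING here so that
TO_TIMESTAMP applies, and `tEnv` is assumed. This is not part of the FLIP itself.

tEnv.executeSql(
    "CREATE TEMPORARY TABLE OrdersInKafka (" +
    "  user_id BIGINT," +
    "  score DECIMAL(10, 2)," +
    "  log_ts STRING," +
    "  my_ts AS TO_TIMESTAMP(log_ts)" +   // computed column, as in the builder example
    ") WITH (" +
    "  'connector' = 'kafka'," +
    "  'topic' = 'user_logs'," +
    "  'properties.bootstrap.servers' = 'localhost:9092'," +
    "  'properties.group.id' = 'test-group'," +
    "  'format' = 'json'" +
    ")");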

Instead of refactoring the existing classes, we could also think about a 
completely new stack. I think this would avoid confusion for the old 
users. We could deprecate the entire `Kafka` class instead of dealing 
with backwards compatibility.


8) minor extensions
A general `Connector.option(...)` class should also accept 
`ConfigOption` instead of only strings.
A `Schema.column()` should accept `AbstractDataType` that can be 
resolved to a `DataType` by access to a `DataTypeFactory`.


What do you think?

Thanks,
Timo


On 09.07.20 18:51, Jark Wu wrote:

Hi Dawid,

Thanks for the great feedback! Here are my responses:

1) computedColumn(..) vs column(..)
I'm fine to use `column(..)` in both cases.

2) Expression DSL vs pure SQL string for computed columns
This is a good point. Actually, I also prefer to use Expression DSL because
this is more Table API style.
However, this requires to modify TableSchema again to accept & expose
Expression as computed columns.
I'm not convinced about this, because AFAIK, we want to have a
CatalogTableSchema to hold this information
and don't want to extend TableSchema. Maybe Timo can give some points here.
Besides, this will make the descriptor API can't be persisted in Catalog
unless FLIP-80 is done.

3) Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps
The original intention behind these APIs are providing shortcut APIs for
Table API users.
But I'm also fine to only provide the DDL-like methods if you have
concerns. We can discuss shortcuts in the future if users request.

4) LikeOption
LikeOption.INCLUDING.ALL is a constant (enum values). I have added more
description about this in the FLIP.

5) implementation?
I don't want to mention too much about implementation details in the FLIP
at the beginning, because the API is already very long.
But I also added an "Implementation" section to explain them.

6) static method vs new keyword
Personally I prefer the new keyword because it makes the API cleaner. If we
want remove new keyword and use static methods, we have to:
Either adding a `Schema.builder()/create()` method as the starting method,
Or duplicating all the methods as static methods, e.g. we have 12 methods
in `Kafka`, any of them can be a starting method, then we will have 24
methods in `Kafka`.
Both are not good, and it's hard to keep all the descriptors having the
same starting method name, but all the des

Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-09 Thread Jark Wu
Hi Dawid,

Thanks for the great feedback! Here are my responses:

1) computedColumn(..) vs column(..)
I'm fine to use `column(..)` in both cases.

2) Expression DSL vs pure SQL string for computed columns
This is a good point. Actually, I also prefer to use Expression DSL because
this is more Table API style.
However, this requires modifying TableSchema again to accept & expose
Expression as computed columns.
I'm not convinced about this, because AFAIK, we want to have a
CatalogTableSchema to hold this information
and don't want to extend TableSchema. Maybe Timo can give some points here.
Besides, this will make the descriptor API can't be persisted in Catalog
unless FLIP-80 is done.

3) Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps
The original intention behind these APIs is to provide shortcut APIs for
Table API users.
But I'm also fine to only provide the DDL-like methods if you have
concerns. We can discuss shortcuts in the future if users request.

4) LikeOption
LikeOption.INCLUDING.ALL is a constant (enum values). I have added more
description about this in the FLIP.

5) implementation?
I don't want to mention too much about implementation details in the FLIP
at the beginning, because the API is already very long.
But I also added an "Implementation" section to explain them.

6) static method vs new keyword
Personally I prefer the new keyword because it makes the API cleaner. If we
want to remove the new keyword and use static methods, we have to either add a
`Schema.builder()/create()` method as the starting method, or duplicate all the
methods as static methods, e.g. we have 12 methods in `Kafka`, any of which
could be a starting method, so we would end up with 24 methods in `Kafka`.
Neither is good, and it's hard to keep all the descriptors having the
same starting method name, but all the descriptors can start from the same
new keyword.

Best,
Jark

On Thu, 9 Jul 2020 at 15:48, Dawid Wysakowicz 
wrote:

> Correction to my point 4. The example is correct. I did not read it
> carefully enough. Sorry for the confusion. Nevertheless I'd still like
> to see a bit more explanation on the LikeOptions.
>
> On 07/07/2020 04:32, Jark Wu wrote:
> > Hi everyone,
> >
> > Leonard and I prepared a FLIP about refactoring current Descriptor API,
> > i.e. TableEnvironment#connect(). We would like to propose a new
> descriptor
> > API to register connectors in Table API.
> >
> > Since Flink 1.9, the community focused more on the new SQL DDL feature.
> > After a series of releases, the SQL DDL is powerful and has many rich
> > features now. However, Descriptor API (the `TableEnvironment#connect()`)
> > has been stagnant for a long time and missing lots of core features, such
> > as computed columns and primary keys. That's frustrating for Table API
> > users who want to register tables programmatically. Besides, currently, a
> > connector must implement a corresponding Descriptor (e.g. `new Kafka()`)
> > before using the "connect" API. Therefore, we hope to reduce this effort
> > for connector developers, that custom source/sinks can be registered via
> > the descriptor API without implementing a Descriptor.
> >
> > These are the problems we want to resolve in this FLIP. I'm looking
> forward
> > to your comments.
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connector+for+Table+API
> >
> > Best,
> > Jark
> >
>
>


Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-09 Thread Dawid Wysakowicz
Correction to my point 4. The example is correct. I did not read it
carefully enough. Sorry for the confusion. Nevertheless I'd still like
to see a bit more explanation on the LikeOptions.

On 07/07/2020 04:32, Jark Wu wrote:
> Hi everyone,
>
> Leonard and I prepared a FLIP about refactoring current Descriptor API,
> i.e. TableEnvironment#connect(). We would like to propose a new descriptor
> API to register connectors in Table API.
>
> Since Flink 1.9, the community focused more on the new SQL DDL feature.
> After a series of releases, the SQL DDL is powerful and has many rich
> features now. However, Descriptor API (the `TableEnvironment#connect()`)
> has been stagnant for a long time and missing lots of core features, such
> as computed columns and primary keys. That's frustrating for Table API
> users who want to register tables programmatically. Besides, currently, a
> connector must implement a corresponding Descriptor (e.g. `new Kafka()`)
> before using the "connect" API. Therefore, we hope to reduce this effort
> for connector developers, that custom source/sinks can be registered via
> the descriptor API without implementing a Descriptor.
>
> These are the problems we want to resolve in this FLIP. I'm looking forward
> to your comments.
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connector+for+Table+API
>
> Best,
> Jark
>



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-09 Thread Dawid Wysakowicz
Hi Jark,

Thanks for starting the discussion. I think this is an important effort.
I really like the general concept, but I have a few comments to the
details of the FLIP.

1) I don't see any benefit in differentiating the computedColumn vs
column in the method name. It does add cognitive burden. Can we simply
have column in both cases?

2) I think we should use the Expression DSL for defining the expressions
of computed columns instead of just pure strings.

3) I am not convinced of having the Schema#proctime and
Schema#watermarkFor#boundedOutOfOrderTimestamps methods. That would
again make it different from the SQL DDL where we do:

    proctime AS proctime()
    WATERMARK FOR rowtime AS rowtime - INTERVAL '3' SECONDS

respectively. Even if we do provide those helper methods, I think the SQL
way should be the recommended approach. I would rather see it as:

    Schema()
      .column("proctime", proctime())
      .watermarkFor("rowtime", $("rowtime").minus(lit(3).seconds()))
      // or .watermarkFor("rowtime", $("rowtime").minus(Duration.ofSeconds(3)))
      // once we properly support interval types

4) I think the section about LIKE clause requires a second pass through.
The example is wrong. Moreover I am not sure what is the
LikeOption.INCLUDING.ALL? Is this a constant? Is this some kind of a
builder?

5) The classes like TableDescriptor/Schema described in the FLIP cover
the user facing helper methods. They do not define though the contract
which the planner/TableEnvironment expects. How will the planner use
those classes to create actual tables? From the point of view of the
TableEnvironment#createTable or TableEnvironment#from none of the
methods listed in the FLIP are necessary. At the same time there are no
methods to retrieve the Schema/Options etc. which are required to
actually create the Table.

6) Lastly, how about we try to remove the new keyword from the syntax?
Personally I'd very much prefer to have it like:

    Schema
  .column()
  .column()

or

    Connector
   .type("kafka") // I am not strong on the "type" keyword here. I
can also see for/of/...

Best,

Dawid

On 07/07/2020 04:32, Jark Wu wrote:
> Hi everyone,
>
> Leonard and I prepared a FLIP about refactoring current Descriptor API,
> i.e. TableEnvironment#connect(). We would like to propose a new descriptor
> API to register connectors in Table API.
>
> Since Flink 1.9, the community focused more on the new SQL DDL feature.
> After a series of releases, the SQL DDL is powerful and has many rich
> features now. However, Descriptor API (the `TableEnvironment#connect()`)
> has been stagnant for a long time and missing lots of core features, such
> as computed columns and primary keys. That's frustrating for Table API
> users who want to register tables programmatically. Besides, currently, a
> connector must implement a corresponding Descriptor (e.g. `new Kafka()`)
> before using the "connect" API. Therefore, we hope to reduce this effort
> for connector developers, that custom source/sinks can be registered via
> the descriptor API without implementing a Descriptor.
>
> These are the problems we want to resolve in this FLIP. I'm looking forward
> to your comments.
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connector+for+Table+API
>
> Best,
> Jark
>


signature.asc
Description: OpenPGP digital signature


[DISCUSS] FLIP-129: Refactor Descriptor API to register connector in Table API

2020-07-06 Thread Jark Wu
Hi everyone,

Leonard and I prepared a FLIP about refactoring current Descriptor API,
i.e. TableEnvironment#connect(). We would like to propose a new descriptor
API to register connectors in Table API.

Since Flink 1.9, the community focused more on the new SQL DDL feature.
After a series of releases, the SQL DDL is powerful and has many rich
features now. However, Descriptor API (the `TableEnvironment#connect()`)
has been stagnant for a long time and missing lots of core features, such
as computed columns and primary keys. That's frustrating for Table API
users who want to register tables programmatically. Besides, currently, a
connector must implement a corresponding Descriptor (e.g. `new Kafka()`)
before using the "connect" API. Therefore, we hope to reduce this effort
for connector developers, that custom source/sinks can be registered via
the descriptor API without implementing a Descriptor.

These are the problems we want to resolve in this FLIP. I'm looking forward
to your comments.

https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connector+for+Table+API

Best,
Jark


[jira] [Created] (FLINK-18275) Register the DataStream as table how to write multiple fields

2020-06-12 Thread robert (Jira)
robert created FLINK-18275:
--

 Summary: Register the DataStream as table how to write multiple 
fields
 Key: FLINK-18275
 URL: https://issues.apache.org/jira/browse/FLINK-18275
 Project: Flink
  Issue Type: Improvement
Reporter: robert


 How should one register a DataStream as a table with multiple fields?

 

 

Official wording:

{{tableEnv.registerDataStream("myTable2", stream, 'myLong, 'myString)}}


This works when the fields are known; how should it be written when the fields are unknown?
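
A hedged sketch using the pre-1.11 Java Table API, with `tableEnv` and `stream` assumed:
if no field expressions are passed, the field names are derived from the DataStream's
type, so the fields do not have to be known up front.

// Fields listed explicitly (Java-API equivalent of the official wording above):
tableEnv.registerDataStream("myTable2", stream, "myLong, myString");

// Fields derived automatically from the DataStream's type (tuple/POJO/Row with type info),
// useful when the individual fields are not known in advance:
tableEnv.registerDataStream("myTable3", stream);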




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-17499) LazyTimerService used to register timers via State Processing API incorectly mixes event time timers with processing time timers

2020-05-03 Thread Adam Laczynski (Jira)
Adam Laczynski created FLINK-17499:
--

 Summary: LazyTimerService used to register timers via State 
Processing API incorectly mixes event time timers with processing time timers
 Key: FLINK-17499
 URL: https://issues.apache.org/jira/browse/FLINK-17499
 Project: Flink
  Issue Type: Bug
  Components: API / State Processor
Affects Versions: 1.10.0
Reporter: Adam Laczynski


@Override
public void registerProcessingTimeTimer(long time) {
  ensureInitialized();
  internalTimerService.registerEventTimeTimer(VoidNamespace.INSTANCE, time);
}

Same issue for both registerEventTimeTimer and registerProcessingTimeTimer.

https://github.com/apache/flink/blob/master/flink-libraries/flink-state-processing-api/src/main/java/org/apache/flink/state/api/output/operators/LazyTimerService.java#L62
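
A sketch of what the corrected methods would presumably look like (not the committed
fix); `internalTimerService`, `ensureInitialized()` and `VoidNamespace` as in the class above.

@Override
public void registerProcessingTimeTimer(long time) {
  ensureInitialized();
  // delegate to the processing-time variant, not the event-time one
  internalTimerService.registerProcessingTimeTimer(VoidNamespace.INSTANCE, time);
}

@Override
public void registerEventTimeTimer(long time) {
  ensureInitialized();
  internalTimerService.registerEventTimeTimer(VoidNamespace.INSTANCE, time);
}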



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16692) flink joblistener can register from config

2020-03-20 Thread jackylau (Jira)
jackylau created FLINK-16692:


 Summary: flink joblistener can register from config
 Key: FLINK-16692
 URL: https://issues.apache.org/jira/browse/FLINK-16692
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Affects Versions: 1.10.0
Reporter: jackylau
 Fix For: 1.11.0


We should do as Spark does, which can register listeners from configuration, such as 
"spark.extraListeners". This will be convenient for users who just want to set a hook.
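
For context, a hedged sketch of today's programmatic registration via the Flink 1.10+
JobListener API; the ticket proposes an equivalent driven purely by configuration.

import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.core.execution.JobListener;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Programmatic hook registration; the improvement asks to pick such listeners up from config.
env.registerJobListener(new JobListener() {
    @Override
    public void onJobSubmitted(JobClient jobClient, Throwable throwable) {
        // invoked after the job has been submitted (or submission failed)
    }

    @Override
    public void onJobExecuted(JobExecutionResult result, Throwable throwable) {
        // invoked after the job finished (blocking execution only)
    }
});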



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16117) Avoid register source in TableTestBase#addTableSource

2020-02-17 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-16117:


 Summary: Avoid register source in TableTestBase#addTableSource
 Key: FLINK-16117
 URL: https://issues.apache.org/jira/browse/FLINK-16117
 Project: Flink
  Issue Type: Sub-task
Reporter: Zhenghua Gao


This affects thousands of unit tests:

1) explainSourceAsString of CatalogSourceTable changes

2)JoinTest#testUDFInJoinCondition: SQL keywords must be escaped

3) GroupWindowTest#testTimestampEventTimeTumblingGroupWindowWithProperties: 
Reference to a rowtime or proctime window required

4) SetOperatorsTest#testInWithProject: legacy type vs new type
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16029) Remove register source and sink in test cases of planner

2020-02-12 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-16029:


 Summary: Remove register source and sink in test cases of planner
 Key: FLINK-16029
 URL: https://issues.apache.org/jira/browse/FLINK-16029
 Project: Flink
  Issue Type: Sub-task
Reporter: Zhenghua Gao


Many test cases of the planner use TableEnvironment.registerTableSource() and 
registerTableSink(), which should be avoided. We want to refactor these cases via 
TableEnvironment.connect().



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15990) Remove register source and sink in ConnectTableDescriptor

2020-02-11 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-15990:


 Summary: Remove register source and sink in ConnectTableDescriptor
 Key: FLINK-15990
 URL: https://issues.apache.org/jira/browse/FLINK-15990
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Jingsong Lee
 Fix For: 1.11.0


We should always use {{ConnectTableDescriptor.createTemporaryTable}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Register user jar files in {Stream}ExecutionEnvironment

2019-12-11 Thread Jark Wu
I think configuration "pipeline.jars" [1] works for you, because SQL Client
supports --jars to load user jars and it also uses this option internally.

But I'm not an expert on this, maybe Kostas and Aljoscha can give
a definitive answer.

Best,
Jark

[1]:
https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#pipeline-jars
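
A minimal sketch of setting that option programmatically, assuming a Flink version
where StreamExecutionEnvironment.getExecutionEnvironment(Configuration) is available;
the jar path is a placeholder.

import java.util.Collections;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.PipelineOptions;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

Configuration conf = new Configuration();
// "pipeline.jars": extra JARs shipped with the job (placeholder path for illustration)
conf.set(PipelineOptions.JARS, Collections.singletonList("file:///path/to/my-udfs.jar"));
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);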

On Wed, 11 Dec 2019 at 15:41, Jingsong Li  wrote:

> Hi Leo,
>
> I think run job with external jars is important too.
> Have you took a look to PipelineOptions.JARS in configuration?
> I think this is a way to set external jars. And SQL-CLI need it too.
>
> Best,
> Jingsong Lee
>
> On Wed, Dec 11, 2019 at 9:18 AM 50man  wrote:
>
> > Hi everyone,
> >
> >
> > I propose an important API feature to register user jar files in
> > {Stream}ExecutionEnvironment.
> >
> > The proposal is here,
> >
> > https://issues.apache.org/jira/browse/FLINK-14319
> >
> > Let's discuss it and I hope we can acquire a consent ASAP cause this
> > feature is potentially helpful for some release issues about version
> > 1.10 on sub-task FLINK-14055
> > <https://issues.apache.org/jira/browse/FLINK-14055> under task
> > FLINK-10232 <https://issues.apache.org/jira/browse/FLINK-10232> ;-).
> >
> >
> > Best,
> >
> > Leo
> >
> >
>
> --
> Best, Jingsong Lee
>


Re: [DISCUSS] Register user jar files in {Stream}ExecutionEnvironment

2019-12-10 Thread Jingsong Li
Hi Leo,

I think running a job with external jars is important too.
Have you taken a look at PipelineOptions.JARS in the configuration?
I think this is a way to set external jars. And the SQL CLI needs it too.

Best,
Jingsong Lee

On Wed, Dec 11, 2019 at 9:18 AM 50man  wrote:

> Hi everyone,
>
>
> I propose an important API feature to register user jar files in
> {Stream}ExecutionEnvironment.
>
> The proposal is here,
>
> https://issues.apache.org/jira/browse/FLINK-14319
>
> Let's discuss it and I hope we can acquire a consent ASAP cause this
> feature is potentially helpful for some release issues about version
> 1.10 on sub-task FLINK-14055
> <https://issues.apache.org/jira/browse/FLINK-14055> under task
> FLINK-10232 <https://issues.apache.org/jira/browse/FLINK-10232> ;-).
>
>
> Best,
>
> Leo
>
>

-- 
Best, Jingsong Lee


[DISCUSS] Register user jar files in {Stream}ExecutionEnvironment

2019-12-10 Thread 50man

Hi everyone,


I propose an important API feature to register user jar files in 
{Stream}ExecutionEnvironment.


The proposal is here,

https://issues.apache.org/jira/browse/FLINK-14319

Let's discuss it. I hope we can reach a consensus ASAP because this 
feature is potentially helpful for some release issues about version 
1.10 on sub-task FLINK-14055 
<https://issues.apache.org/jira/browse/FLINK-14055> under task 
FLINK-10232 <https://issues.apache.org/jira/browse/FLINK-10232> ;-).



Best,

Leo



[jira] [Created] (FLINK-14912) register,drop, and alter catalog functions from DDL to FunctionCatalog

2019-11-21 Thread Bowen Li (Jira)
Bowen Li created FLINK-14912:


 Summary: register,drop, and alter catalog functions from DDL to 
FunctionCatalog
 Key: FLINK-14912
 URL: https://issues.apache.org/jira/browse/FLINK-14912
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Bowen Li
Assignee: Zhenqiu Huang
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14911) register temp catalog functions created by DDL to FunctionCatalog

2019-11-21 Thread Bowen Li (Jira)
Bowen Li created FLINK-14911:


 Summary: register temp catalog functions created by DDL to 
FunctionCatalog
 Key: FLINK-14911
 URL: https://issues.apache.org/jira/browse/FLINK-14911
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Bowen Li
Assignee: Zhenqiu Huang
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14550) can't use proctime attribute when register datastream for table and exist nested fields

2019-10-28 Thread hehuiyuan (Jira)
hehuiyuan created FLINK-14550:
-

 Summary: can't use proctime attribute when register datastream for 
table and exist nested fields
 Key: FLINK-14550
 URL: https://issues.apache.org/jira/browse/FLINK-14550
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: hehuiyuan


*_The data schema:_*

 

final String schemaString =
 
"{\"type\":\"record\",\"name\":\"GenericUser\",\"namespace\":\"org.apache.flink.formats.avro.generated\","
 +
 "\"fields\": 
[\{\"name\":\"name\",\"type\":\"string\"},\{\"name\":\"favorite_number\",\"type\":[\"int\",\"null\"]},"
 +
 
"\{\"name\":\"favorite_color\",\"type\":[\"string\",\"null\"]},\{\"name\":\"type_long_test\",\"type\":[\"long\",\"null\"]}"
 +
 
",\{\"name\":\"type_double_test\",\"type\":\"double\"},\{\"name\":\"type_null_test\",\"type\":[\"null\",\"string\"]},"
 +
 
"\{\"name\":\"type_bool_test\",\"type\":[\"boolean\"]},{\"name\":\"type_array_string\",\"type\":"
 +
 
"\{\"type\":\"array\",\"items\":\"string\"}},{\"name\":\"type_array_boolean\",\"type\":{\"type\":\"array\","
 +
 
"\"items\":\"boolean\"}},{\"name\":\"type_nullable_array\",\"type\":[\"null\",{\"type\":\"array\","
 +
 
"\"items\":\"string\"}],\"default\":null},{\"name\":\"type_enum\",\"type\":{\"type\":\"enum\","
 +
 
"\"name\":\"Colors\",\"symbols\":[\"RED\",\"GREEN\",\"BLUE\"]}},{\"name\":\"type_map\",\"type\":{\"type\":\"map\","
 +
 
"\"values\":\"long\"}},{\"name\":\"type_fixed\",\"type\":[\"null\",{\"type\":\"fixed\",\"name\":\"Fixed16\","
 +
 
"\"size\":16}],\"size\":16},\{\"name\":\"type_union\",\"type\":[\"null\",\"boolean\",\"long\",\"double\"]},"
 +
 
*"{\"name\":\"type_nested\",\"type\":[\"null\",{\"type\":\"record\",\"name\":\"Address\",\"fields\":[{\"name\":\"num\","
 +*
 
*"\"type\":\"int\"},\{\"name\":\"street\",\"type\":\"string\"},\{\"name\":\"city\",\"type\":\"string\"},"
 +*
 
*"\{\"name\":\"state\",\"type\":\"string\"},\{\"name\":\"zip\",\"type\":\"string\"}]}]},*{\"name\":\"type_bytes\","
 +
 
"\"type\":\"bytes\"},\{\"name\":\"type_date\",\"type\":{\"type\":\"int\",\"logicalType\":\"date\"}},"
 +
 
"\{\"name\":\"type_time_millis\",\"type\":{\"type\":\"int\",\"logicalType\":\"time-millis\"}},{\"name\":\"type_time_micros\","
 +
 
"\"type\":\{\"type\":\"int\",\"logicalType\":\"time-micros\"}},{\"name\":\"type_timestamp_millis\",\"type\":{\"type\":\"long\","
 +
 
"\"logicalType\":\"timestamp-millis\"}},{\"name\":\"type_timestamp_micros\",\"type\":{\"type\":\"long\","
 +
 
"\"logicalType\":\"timestamp-micros\"}},{\"name\":\"type_decimal_bytes\",\"type\":{\"type\":\"bytes\","
 +
 
"\"logicalType\":\"decimal\",\"precision\":4,\"scale\":2}},{\"name\":\"type_decimal_fixed\",\"type\":{\"type\":\"fixed\","
 +
 
"\"name\":\"Fixed2\",\"size\":2,\"logicalType\":\"decimal\",\"precision\":4,\"scale\":2}}]}";

 

*_The code :_*

tableEnv.registerDataStream("source_table",deserilizeDataStreamRows,"type_nested$Address$street,userActionTime.proctime");

 

_*The error is as follows:*_

Exception in thread "main" org.apache.flink.table.api.TableException: The proctime attribute can only be appended to the table schema and not replace an existing field. Please move 'userActionTime' to the end of the schema.
 at org.apache.flink.table.api.StreamTableEnvironment.org$apache$flink$table$api$StreamTableEnvironment$$extractProctime$1(StreamTableEnvironment.scala:649)
 at org.apache.flink.table.api.StreamTableEnvironment$$anonfun$validateAndExtractTimeAttributes$1.apply(StreamTableEnvironment.scala:676)
 at org.apache.flink.table.api.StreamTableEnvironment$$anonfun$validateAndExtractTimeAttributes$1.apply(StreamTableEnvironment.scala:668)
 at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
 at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
 at org.apache.flink.table.api.StreamTableEnvironment.validateAndExtractTimeAttributes(StreamTableEnvironment.scala:668)
 at org.apache.flink.table.api.StreamTableEnvironment.registerDataStreamInternal(StreamTableEnvironment.scala:549)
 at org.apache.flink.table.api.java.StreamTableEnvironment.registerDataStream(StreamTableEnvironment.scala:136)
 at com.jd.flink.sql.demo.validate.schema.avro.AvroQuickStartMain.main(AvroQuickStartMain.java:145)

 

 

The code is ok.

tableEnv.registerDataStream("source_table",deserilizeDataStreamRows,"type_nested$Address$street");

 

The code is ok.

tableEnv.registerDataStream("source_table",deserilizeDataStreamRows,"type_nested,userActionTime.proctime");

 

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14319) Register user jar files in {Stream}ExecutionEnvironment

2019-10-04 Thread Leo Zhang (Jira)
Leo Zhang created FLINK-14319:
-

 Summary: Register user jar files in {Stream}ExecutionEnvironment 
 Key: FLINK-14319
 URL: https://issues.apache.org/jira/browse/FLINK-14319
 Project: Flink
  Issue Type: New Feature
  Components: API / DataSet, API / DataStream
Reporter: Leo Zhang
 Fix For: 1.10.0


 I see that there are some use cases in which people want to implement their 
own SQL application based on loading external jars for now. And the related API 
proposals have been issued in the task FLINK-10232 Add a SQL DDL .

And the related sub-task FLINK-14055 is unresolved and its status is still 
open. I feel that it's better to split FLINK-14055 into two goals: 
one for the DDL, and a new task for the 
{Stream}ExecutionEnvironment::registerUserJarFile() interface. 

 

 I have implemented the interfaces for both the Java and Scala APIs, and they are 
well tested. My implementation exactly follows the design doc of 
FLINK-10232 and chooses the first option from the design alternatives. So I want to 
share my code if that's OK.
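
A purely hypothetical usage sketch of the proposed, unmerged interface; only the
method name follows the proposal, everything else is illustrative.

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Hypothetical API from the proposal -- not part of any released Flink version.
// The jar would be shipped with the job and added to the user-code classloader.
env.registerUserJarFile("file:///path/to/external-udfs.jar");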

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14188) TaskExecutor derive and register with default slot resource profile

2019-09-24 Thread Xintong Song (Jira)
Xintong Song created FLINK-14188:


 Summary: TaskExecutor derive and register with default slot 
resource profile
 Key: FLINK-14188
 URL: https://issues.apache.org/jira/browse/FLINK-14188
 Project: Flink
  Issue Type: Sub-task
Reporter: Xintong Song


* Introduce config option for defaultSlotFraction
 * Derive default slot resource profile from the new config option, or the 
legacy config option "taskmanager.numberOfTaskSlots".
 * Register task executor with the default slot resource profile.

This step should not introduce any behavior changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14094) Fix OperatorIOMetricGroup repeat register problem

2019-09-17 Thread xymaqingxiang (Jira)
xymaqingxiang created FLINK-14094:
-

 Summary: Fix OperatorIOMetricGroup repeat register problem
 Key: FLINK-14094
 URL: https://issues.apache.org/jira/browse/FLINK-14094
 Project: Flink
  Issue Type: Bug
Reporter: xymaqingxiang


There will be OperatorIOMetricGroup duplicate registration in the 
TaskMetricGroup's getOrAddOperator() method.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-12626) Client should not register table-source-sink twice in TableEnvironment

2019-05-25 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12626:


 Summary: Client should not register table-source-sink twice in 
TableEnvironment
 Key: FLINK-12626
 URL: https://issues.apache.org/jira/browse/FLINK-12626
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Client
Reporter: Bowen Li
 Fix For: 1.9.0


Currently, for a table specified in the SQL CLI YAML file with type: 
source-sink-table, it will be registered twice (as source and as sink) in the 
TableEnv.

As we've moved table management in TableEnv to Catalogs, which don't allow 
registering duplicate-named tables, we need to come up with a solution to fix this 
problem.

cc [~xuefuz] [~tiwalter] [~dawidwys]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12604) Register TableSource/Sink as CatalogTable

2019-05-23 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-12604:


 Summary: Register TableSource/Sink as CatalogTable
 Key: FLINK-12604
 URL: https://issues.apache.org/jira/browse/FLINK-12604
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API, Table SQL / Legacy Planner
Affects Versions: 1.9.0
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz
 Fix For: 1.9.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12601) Register DataStream/DataSet as DataStream/SetTableOperations in Catalog

2019-05-23 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-12601:


 Summary: Register DataStream/DataSet as 
DataStream/SetTableOperations in Catalog
 Key: FLINK-12601
 URL: https://issues.apache.org/jira/browse/FLINK-12601
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API, Table SQL / Legacy Planner
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12404) Translate the "Register a custom serializer for your Flink program" page into Chinese

2019-05-05 Thread YangFei (JIRA)
YangFei created FLINK-12404:
---

 Summary: Translate the "Register a custom serializer for your 
Flink program" page into Chinese
 Key: FLINK-12404
 URL: https://issues.apache.org/jira/browse/FLINK-12404
 Project: Flink
  Issue Type: Sub-task
  Components: chinese-translation, Documentation
Reporter: YangFei
Assignee: YangFei


file is docs/dev/custom_serializers.zh.md

the url is 
[https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/custom_serializers.html]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12202) Consider introducing batch metric register in NetworkEnviroment

2019-04-15 Thread Andrey Zagrebin (JIRA)
Andrey Zagrebin created FLINK-12202:
---

 Summary: Consider introducing batch metric register in 
NetworkEnviroment
 Key: FLINK-12202
 URL: https://issues.apache.org/jira/browse/FLINK-12202
 Project: Flink
  Issue Type: Sub-task
Reporter: Andrey Zagrebin
Assignee: zhijiang


As we have some network specific metrics registered in TaskIOMetricGroup 
(In/OutputBuffersGauge, In/OutputBufferPoolUsageGauge), we can introduce batch 
metric registering in NetworkEnviroment.registerMetrics(ProxyMetricGroup, 
partitions, gates), where task passes its TaskIOMetricGroup into 
ProxyMetricGroup. This way we could break a tie between task and 
NetworkEnviroment. TaskIOMetricGroup.initializeBufferMetrics, 
In/OutputBuffersGauge, In/OutputBufferPoolUsageGauge could be moved into 
NetworkEnviroment.registerMetrics and network code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10542) Register Hive metastore as an external catalog in TableEnvironment

2018-10-12 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created FLINK-10542:
---

 Summary: Register Hive metastore as an external catalog in 
TableEnvironment
 Key: FLINK-10542
 URL: https://issues.apache.org/jira/browse/FLINK-10542
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Affects Versions: 1.6.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang


Similar to FLINK-2167, but registers the Hive metastore as an external catalog 
in the {{TableEnvironment}}. After registration, Table API and SQL queries 
should be able to access all Hive tables.

This might supersede the need of FLINK-2167 because Hive metastore stores a 
superset of tables available via hCat without an indirection.
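
As this roughly landed in Flink 1.9+, registering the Hive metastore looks like the
sketch below; catalog name, default database, and conf dir are placeholders.

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

TableEnvironment tableEnv = TableEnvironment.create(EnvironmentSettings.newInstance().build());

// placeholders: catalog name, default database, directory containing hive-site.xml
HiveCatalog hive = new HiveCatalog("myhive", "default", "/opt/hive-conf");
tableEnv.registerCatalog("myhive", hive);
tableEnv.useCatalog("myhive"); // Table API and SQL queries can now see all Hive tables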



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10239) Register eventtime timer only once in eventtrigger

2018-08-28 Thread buptljy (JIRA)
buptljy created FLINK-10239:
---

 Summary: Register eventtime timer only once in eventtrigger
 Key: FLINK-10239
 URL: https://issues.apache.org/jira/browse/FLINK-10239
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.5.3
Reporter: buptljy


I find that we call ctx.registerEventTimeTimer(window.maxTimestamp()) every 
time an element is received in the window. Even though it doesn't affect 
the result, because a Set is used internally, I think it would still be an 
improvement to call it only once.
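
One hedged way to sketch the idea is a custom trigger that keeps a per-window flag in
state; this is not the actual EventTimeTrigger change, and the extra state may or may
not be cheaper than the existing Set-based de-duplication.

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

public class RegisterOnceEventTimeTrigger extends Trigger<Object, TimeWindow> {

    private final ValueStateDescriptor<Boolean> registeredDesc =
            new ValueStateDescriptor<>("timer-registered", Boolean.class);

    @Override
    public TriggerResult onElement(Object element, long timestamp, TimeWindow window,
                                   TriggerContext ctx) throws Exception {
        if (window.maxTimestamp() <= ctx.getCurrentWatermark()) {
            return TriggerResult.FIRE; // window end already passed, fire immediately
        }
        ValueState<Boolean> registered = ctx.getPartitionedState(registeredDesc);
        if (registered.value() == null) {
            // register the end-of-window timer only for the first element
            ctx.registerEventTimeTimer(window.maxTimestamp());
            registered.update(true);
        }
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
        return time == window.maxTimestamp() ? TriggerResult.FIRE : TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
        ctx.deleteEventTimeTimer(window.maxTimestamp());
        ctx.getPartitionedState(registeredDesc).clear();
    }
}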



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10079) Automatically resolve and register sink table name from external catalogs

2018-08-07 Thread Jun Zhang (JIRA)
Jun Zhang created FLINK-10079:
-

 Summary: Automatically resolve and register sink table name from 
external catalogs
 Key: FLINK-10079
 URL: https://issues.apache.org/jira/browse/FLINK-10079
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Affects Versions: 1.6.0
Reporter: Jun Zhang
Assignee: Jun Zhang
 Fix For: 1.6.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Register

2018-07-11 Thread Chesnay Schepler
To subscribe to the dev list you have to send a mail to 
dev-subscr...@flink.apache.org


On 11.07.2018 09:17, 陈梓立 wrote:

Register





Register

2018-07-11 Thread 陈梓立
Register


Register now for ApacheCon and save $250

2018-07-09 Thread Rich Bowen

Greetings, Apache software enthusiasts!

(You’re getting this because you’re on one or more dev@ or users@ lists 
for some Apache Software Foundation project.)


ApacheCon North America, in Montreal, is now just 80 days away, and 
early bird prices end in just two weeks - on July 21. Prices will be 
going up from $550 to $800 so register NOW to save $250, at 
http://apachecon.com/acna18


And don’t forget to reserve your hotel room. We have negotiated a 
special rate and the room block closes August 24. 
http://www.apachecon.com/acna18/venue.html


Our schedule includes over 100 talks and we’ll be featuring talks from 
dozens of ASF projects.,  We have inspiring keynotes from some of the 
brilliant members of our community and the wider tech space, including:


 * Myrle Krantz, PMC chair for Apache Fineract, and leader in the open 
source financing space
 * Cliff Schmidt, founder of Literacy Bridge (now Amplio) and creator 
of the Talking Book project

 * Bridget Kromhout, principal cloud developer advocate at Microsoft
 * Euan McLeod, Comcast engineer, and pioneer in streaming video

We’ll also be featuring tracks for Geospatial science, Tomcat, 
Cloudstack, and Big Data, as well as numerous other fields where Apache 
software is leading the way. See the full schedule at 
http://apachecon.com/acna18/schedule.html


As usual we’ll be running our Apache BarCamp, the traditional ApacheCon 
Hackathon, and the Wednesday evening Lighting Talks, too, so you’ll want 
to be there.


Register today at http://apachecon.com/acna18 and we’ll see you in Montreal!

--
Rich Bowen
VP, Conferences, The Apache Software Foundation
h...@apachecon.com
@ApacheCon


[jira] [Created] (FLINK-9741) Register JVM metrics for each JM separately

2018-07-04 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9741:
---

 Summary: Register JVM metrics for each JM separately
 Key: FLINK-9741
 URL: https://issues.apache.org/jira/browse/FLINK-9741
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination, Metrics
Affects Versions: 1.5.0, 1.6.0
Reporter: Chesnay Schepler
 Fix For: 1.6.0


Currently, the {{Dispatcher}} contains a {{JobManagerMetricGroup}} on which the 
JVM metrics are registered. The JobManagers only receive a 
{{JobManagerJobMetricGroup}} and don't register the JVM metrics.

As the dispatcher and jobmanagers currently run in the same JVM, neither 
exposing their IDs to the metric system, this doesn't cause problems _right now_ 
as we can't differentiate between them anyway, but it will bite us down the 
line if either of the above assumptions is broken.

For example, with the proposed exposure of JM/Dispatcher IDs in FLINK-9543 we 
would not expose JVM metrics tied to a JM, but only the Dispatcher.

I propose to register all JVM metrics for each jobmanager separately to future 
proof the whole thing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9362) Document that Flink doesn't guarantee transaction of modifying state and register timers in processElement()

2018-05-14 Thread Bowen Li (JIRA)
Bowen Li created FLINK-9362:
---

 Summary: Document that Flink doesn't guarantee transaction of 
modifying state and register timers in processElement()
 Key: FLINK-9362
 URL: https://issues.apache.org/jira/browse/FLINK-9362
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.6.0
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.6.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9328) RocksDBStateBackend might use PlaceholderStreamStateHandle to restor due to StateBackendTestBase class not register snapshots in some UTs

2018-05-10 Thread Yun Tang (JIRA)
Yun Tang created FLINK-9328:
---

 Summary: RocksDBStateBackend might use 
PlaceholderStreamStateHandle to restor due to StateBackendTestBase class not 
register snapshots in some UTs
 Key: FLINK-9328
 URL: https://issues.apache.org/jira/browse/FLINK-9328
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.4.2
Reporter: Yun Tang
 Fix For: 1.5.0


Currently, StateBackendTestBase class does not register snapshots to 
SharedStateRegistry in testValueState, testListState, testReducingState, 
testFoldingState and testMapState UTs, which may cause RocksDBStateBackend to 
restore from PlaceholderStreamStateHandle during the 2nd restore procedure if 
a specific sst file exists in both the 1st snapshot and the 2nd snapshot 
handle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9305) Register flink-s3-fs-* for the s3a:// scheme as well

2018-05-07 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-9305:
--

 Summary: Register flink-s3-fs-* for the s3a:// scheme as well
 Key: FLINK-9305
 URL: https://issues.apache.org/jira/browse/FLINK-9305
 Project: Flink
  Issue Type: Improvement
  Components: filesystem-connector
Affects Versions: 1.4.2, 1.4.1, 1.4.0, 1.5.0, 1.6.0
Reporter: Nico Kruber
Assignee: Nico Kruber


For enhanced user experience, we should also register our shaded S3 file system 
implementations for the {{s3a://}} file system scheme, not just {{s3://}}. This 
way, the user can easily switch from the manual S3 integration to the shaded 
ones.
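
A small usage sketch once the shaded filesystems also answer for {{s3a://}}; the
bucket name is a placeholder, and FsStateBackend is used as in the 1.4/1.5 line.

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Previously only s3:// resolved to the shaded flink-s3-fs-* implementations;
// with this change the same s3a:// URI would resolve to the shaded implementation as well.
env.setStateBackend(new FsStateBackend("s3a://my-bucket/checkpoints"));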



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Subject: Last chance to register for Flink Forward SF (April 10). Get 25% discount

2018-03-29 Thread Stephan Ewen
Hi all!

There are still some spots left to attend Flink Forward San Francisco, so
sign up soon before registration closes.

Use this promo code to get 25% off: MailingListFFSF

The 1-day conference takes place on Tuesday, April 10 in downtown SF. We
have a great lineup of speakers from companies working with Flink,
including Alibaba, American Express, Capital One, Comcast, eBay, Google,
Lyft, MediaMath, Netflix, Uber, Walmart Labs, and many others. See the full
program of sessions and speakers:
https://sf-2018.flink-forward.org/conference/

Also on Monday, April 9 we'll be holding training sessions for Flink
(Standard and Advance classes) - it's good opportunity to learn from some
Apache Flink experts.

Hope to see you there!

Best,
Stephan


[jira] [Created] (FLINK-8885) The DispatcherThreadFactory should register uncaught exception handlers

2018-03-06 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-8885:
---

 Summary: The DispatcherThreadFactory should register uncaught 
exception handlers
 Key: FLINK-8885
 URL: https://issues.apache.org/jira/browse/FLINK-8885
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.5.0, 1.6.0


The {{DispatcherThreadFactory}} is responsible for spawning the thread pool 
threads for the TaskManager's async dispatcher and for the CheckpointCoordinator's 
timed trigger.

In case of uncaught exceptions in these threads, the system is not healthy any 
more, hence they should register the {{FatalExitUcaughtExceptionsHandler}}.
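
A generic sketch of the pattern, not Flink's actual DispatcherThreadFactory; the
handler below simply halts the JVM, standing in for the fatal-exit handler mentioned above.

import java.util.concurrent.ThreadFactory;

public final class ExitOnUncaughtThreadFactory implements ThreadFactory {

    private final String namePrefix;
    private int counter;

    public ExitOnUncaughtThreadFactory(String namePrefix) {
        this.namePrefix = namePrefix;
    }

    @Override
    public synchronized Thread newThread(Runnable runnable) {
        Thread t = new Thread(runnable, namePrefix + '-' + counter++);
        t.setDaemon(true);
        // make sure uncaught exceptions in pool threads never die silently
        t.setUncaughtExceptionHandler((thread, error) -> {
            error.printStackTrace();
            Runtime.getRuntime().halt(-17); // stand-in for a fatal-exit handler
        });
        return t;
    }
}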



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8884) The DispatcherThreadFactory should register uncaught exception handlers

2018-03-06 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-8884:
---

 Summary: The DispatcherThreadFactory should register uncaught 
exception handlers
 Key: FLINK-8884
 URL: https://issues.apache.org/jira/browse/FLINK-8884
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.5.0, 1.6.0


The {{DispatcherThreadFactory}} is responsible for spawning the thread pool 
threads for the TaskManager's async dispatcher and for the CheckpointCoordinator's 
timed trigger.

In case of uncaught exceptions in these threads, the system is not healthy any 
more, hence they should register the {{FatalExitUcaughtExceptionsHandler}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Sessions online for #flinkforward SF. Register with 25% discount code.

2018-02-16 Thread Fabian Hueske
Hi Flink Community,

On behalf of the data Artisans team, I’d like to announce that the sessions
for Flink Forward San Francisco are now online!

Check out the great lineup of speakers from companies such as American
Express, Comcast, Capital One, eBay, Google, Lyft, Netflix, Uber, Yelp, and
others. https://sf-2018.flink-forward.org/conference/

The conference on Monday April 10 will kick off with keynotes from leaders
of the Flink community followed by technical sessions with topics range
from production Flink use cases, to Apache Flink® internals, to the growth
of the Flink ecosystem. Also, on April 9 a full day of Standard and
Advanced Flink Training sessions will take place - more information is
coming soon on the website .

Registration is open so sign up to claim your spot. We’re offering you 25%
off which is a special discount only for members of the Flink mailing lists
- please don’t tweet the code or share outside the Flink mailing lists. :-)
Use this limited promotion code - MailingListFFSF - when registering for
the conference
.


Hope to see you in San Francisco!


Fabian


[jira] [Created] (FLINK-8635) Register asynchronous rescaling handler at WebMonitorEndpoint

2018-02-10 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8635:


 Summary: Register asynchronous rescaling handler at 
WebMonitorEndpoint
 Key: FLINK-8635
 URL: https://issues.apache.org/jira/browse/FLINK-8635
 Project: Flink
  Issue Type: Sub-task
  Components: REST
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.5.0


In order to use the REST asynchronous rescaling handlers, we have to register 
them at the {{WebMonitorEndpoint}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8288) Register the web interface url to yarn for yarn job mode

2017-12-18 Thread shuai.xu (JIRA)
shuai.xu created FLINK-8288:
---

 Summary: Register the web interface url to yarn for yarn job mode
 Key: FLINK-8288
 URL: https://issues.apache.org/jira/browse/FLINK-8288
 Project: Flink
  Issue Type: Bug
  Components: Cluster Management
Reporter: shuai.xu
Assignee: shuai.xu


For the FLIP-6 job mode, the resource manager is created before the web monitor, so 
the web interface URL is not set on the resource manager, and the resource manager 
cannot register the URL with YARN.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

