Hi all,

I have redefined the rules of the configuration file as follows:

flink:
  option:
    ...
  property: # @see https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/
    ...
  table: # @see https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/config/
    ...

app: # user's parameter
  ...
sql:
  ...
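
As an illustration only, the yarn-per-job example from the original proposal might look like this under the rules above. This is a sketch, not a finalized format: the placement of each key is an assumption, `table.exec.mini-batch.enabled` is borrowed from Rui Fan's suggestion, and the `app` and `sql` entries (including the `my_flinksql` name) are carried over from the original proposal as placeholders.

```yaml
flink:
  option:
    target: yarn-per-job
  property: # standard Flink keys, unchanged
    $internal.application.main: org.apache.streampark.FlinkJob
    pipeline.name: test-job
    taskmanager.numberOfTaskSlots: 1
    parallelism.default: 2
  table: # standard Flink table keys, unchanged (assumed placement)
    table.exec.mini-batch.enabled: true

app: # user-defined parameters (placeholder value from the proposal)
  kafka.topic: test

sql: # entry name taken from the proposal; body elided as in the original
  my_flinksql: |
    CREATE TABLE datagen (
      f_sequence INT,
      ts AS localtimestamp,
      WATERMARK FOR ts AS ts
    ) WITH (
      ....
    );
```

The intent is that everything under `flink.property` and `flink.table` keeps the key names from the Flink documentation, while `app` and `sql` stay outside the `flink` root.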



Looking forward to your opinion.



Best,
Huajie Wang



On Thu, Oct 20, 2022 at 11:18 PM Huajie Wang <[email protected]> wrote:

> > 2. For the table config, could we use `env.table-property` as the prefix?
> If the prefix of flink table config isn't table, what can we do?
> StreamPark should not be affected by flink parameter naming.
>
> AFAIK, all Flink table property keys start with "table". sql-client is a
> special case: it is just a program that ships with Flink for executing SQL.
> In other words, I don't need sql-client to execute SQL, so I don't need
> those parameters. There is an essential difference between the parameters
> defined by sql-client and the table properties.
>
> Best,
> Huajie Wang
>
>
>
> On Thu, Oct 20, 2022 at 10:10 PM Rui Fan <[email protected]> wrote:
>
>> Hi huajie,
>>
>> Thanks for your great proposal.
>>
>> I have 2 questions:
>> 1. Why do you write the sql content in the config file?
>> 2. For the table config, could we use `env.table-property` as the prefix?
>> If the prefix of flink table config isn't table, what can we do?
>> StreamPark should not be affected by flink parameter naming.
>>
>> The prefix of some table configs is sql-client. For example:
>> sql-client.display.max-column-width [1]
>>
>> My suggested format:
>>
>> ```
>> env:
>>    option: #cli option args
>>      target: yarn-application # yarn-application, yarn-per-job
>>      shutdownOnAttachedExit:
>>      jobmanager:
>>      ...
>>    property:
>>      ${StreamExecutionEnvironment.key} : $value
>>      ...
>>    table-property:
>>      table.exec.mini-batch.enabled : true
>> ```
>>
>>
>> [1] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/table/config/#sql-client-display-max-column-width
>>
>> Best
>> Rui Fan
>>
>> On Thu, Oct 20, 2022 at 4:48 PM Huajie Wang <[email protected]> wrote:
>>
>> > Hello everyone, this discussion is about developing a unified
>> > specification for Flink job configuration files in StreamPark. Welcome to
>> > join the discussion.
>> >
>> >
>> >
>> >
>> > *Background:*
>> > StreamPark is positioned as a rapid development framework for Flink and
>> > Spark. An important part of it is standardized configuration: all the
>> > configuration hardcoded in the code is moved into a configuration file.
>> > When the project starts, you only need to pass in the agreed configuration
>> > file, which completes the initialization of the environment and the
>> > setting of parameters.
>> >
>> > The parameter specification in the current version is somewhat
>> > unreasonable: the format of the parameters was redefined, so it differs
>> > slightly from the official Flink configuration. A PR has already done
>> > related work on this [1]. Its approach is to put the Flink env parameter
>> > settings under `property`, where each key is the key of the standard
>> > Flink parameter. However, this only regulates the parameters under
>> > `property` and does not regulate the global parameter settings.
>> >
>> >
>> > The current configuration rules are as follows:
>> >
>> > flink:
>> >   deployment:
>> >     option:
>> >         ...
>> >     property:
>> >         ...
>> >
>> >
>> > For example, suppose the Flink job is deployed in yarn-per-job mode, the
>> > job name is test-job, the parallelism is 2, and the entry class is
>> > org.apache.streampark.FlinkJob. The configuration is then as follows:
>> >
>> > flink:
>> >   deployment:
>> >     option:
>> >         target: yarn-per-job
>> >     property:
>> >         $internal.application.main: org.apache.streampark.FlinkJob
>> >         pipeline.name: test-job
>> >         taskmanager.numberOfTaskSlots: 1
>> >         parallelism.default: 2
>> >
>> >
>> > As we can see, the root prefix is `flink`. `option` defines the
>> > parameters related to deploying the job, and `property` defines the Flink
>> > parameter configuration. The configurable parameters are fully consistent
>> > with the standard parameters in Flink [2]. This design specification has
>> > the following deficiencies:
>> >
>> > 1. The format of table-related parameter settings is not defined.
>> > 2. The user's business parameters are not defined.
>> > 3. The content of the Flink SQL is not defined.
>> >
>> > Therefore, the purpose of this discussion is to solve these problems and
>> > further standardize the parameters. The design of this part of the
>> > specification is important because it directly affects users who develop
>> > with the StreamPark API, so it needs in-depth discussion.
>> >
>> >
>> >
>> > *Proposal:*
>> > The improved format I initially proposed [3] divides the parameters into
>> > three parts: env, app, and sql. "env" defines the deployment parameters,
>> > the environment-related parameters, and the table parameters; "app"
>> > defines the user-defined parameters; "sql" defines the content of the
>> > Flink SQL. For example:
>> >
>> > env:
>> >   option: # cli option args
>> >     target: yarn-application # yarn-application, yarn-per-job
>> >     shutdownOnAttachedExit:
>> >     jobmanager:
>> >     ...
>> >   property:
>> >     ${StreamExecutionEnvironment.key} : $value
>> >     ...
>> >     table:
>> >       ${TableEnvironment.key} : $value
>> >       ...
>> > sql: # flinksql
>> >    my_flinksql: |
>> >     CREATE TABLE datagen (
>> >       f_sequence INT,
>> >       ts AS localtimestamp,
>> >       WATERMARK FOR ts AS ts
>> >     ) WITH (
>> >       ....
>> >     );
>> >     ...
>> >
>> > app:
>> >     kafka.bootstrap:
>> >     kafka.topic: test
>> >     ...
>> >
>> >
>> > Looking forward to your opinion.
>> >
>> >
>> >
>> > [1] : https://github.com/apache/incubator-streampark/issues/1762
>> > [2] : https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/
>> > [3] : https://github.com/apache/incubator-streampark/issues/1857
>> >
>> > Best,
>> > Huajie Wang
>> >
>> >
>>
>
