[ 
https://issues.apache.org/jira/browse/FLINK-27625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee closed FLINK-27625.
-------------------------------
    Resolution: Fixed

> Add query hint for async lookup join
> ------------------------------------
>
>                 Key: FLINK-27625
>                 URL: https://issues.apache.org/jira/browse/FLINK-27625
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: lincoln lee
>            Assignee: lincoln lee
>            Priority: Major
>             Fix For: 1.16.0
>
>
> The hint name discuss thread: 
> https://lists.apache.org/thread/jm9kg33wk9z2bvo2b0g5bp3n5kfj6qv8
> FLINK-27623 adds a global parameter 'table.exec.async-lookup.output-mode' for 
> table users so that all three control parameters related to async I/O can be 
> configured at the same job level.
> As planned in the issue, we‘d like to go a step further to offer more precise 
> control for async join operation more than job level config, to introduce a 
> new join hint: ‘ASYNC_LOOKUP’.
> For the hint option, for intuitive and user-friendly reasons, we want to 
> support both simple and kv forms, with all options except table name being 
> optional (use job level configuration if not set)
> 1. simple form: (ordered hint option list)
> ```
> ASYNC_LOOKUP('tableName'[, 'output-mode', 'buffer-capacity', 'timeout'])
> optional:
> output-mode
> buffer-capacity
> timeout
> ```
> Note: since Calcite currently does not support the mixed type hint options,
> the table name here needs to be a string instead of an identifier. (For
> `SqlHint`: The option format can not be mixed in, they should either be all
> simple identifiers or all literals or all key value pairs.) We can improve
> this after Calcite support.
> 2. kv form: (support unordered hint option list)
> ```
> ASYNC_LOOKUP('table'='tableName'[, 'output-mode'='ordered|allow-unordered',
> 'capacity'='int', 'timeout'='duration'])
> optional kvs:
> 'output-mode'='ordered|allow-unordered'
> 'capacity'='int'
> 'timeout'='duration'
> ```
> e.g., if the job level configuration is:
> ```
> table.exec.async-lookup.output-mode: ORDERED
> table.exec.async-lookup.buffer-capacity: 100
> table.exec.async-lookup.timeout: 180s
> ```
> then the following hints:
> ```
> 1. ASYNC_LOOKUP('dim1', 'allow-unordered', '200', '300s')
> 2. ASYNC_LOOKUP('dim1', 'allow-unordered', '200')
> 3. ASYNC_LOOKUP('table'='dim1', 'output-mode'='allow-unordered')
> 4. ASYNC_LOOKUP('table'='dim1', 'timeout'='300s')
> 5. ASYNC_LOOKUP('table'='dim1', 'capacity'='300')
> ```
> are equivalent to:
> ```
> 1. ASYNC_LOOKUP('dim1', 'allow-unordered', '200', '300s')
> 2. ASYNC_LOOKUP('dim1', 'allow-unordered', '200', '180s')
> 3. ASYNC_LOOKUP('table'='dim1', 'output-mode'='allow-unordered',
> 'capacity'='100', 'timeout'='180s')
> 4. ASYNC_LOOKUP('table'='dim1', 'output-mode'='ordered', 'capacity'='100',
> 'timeout'='300s')
> 5. ASYNC_LOOKUP('table'='dim1', 'output-mode'='ordered', 'capacity'='300',
> 'timeout'='180s')
> ```
> In addition, if the lookup source implements both sync and async table
> function, the planner prefers to choose the async function when the
> 'ASYNC_LOOKUP' hint is specified.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to