featzhang created FLINK-38822:
---------------------------------

             Summary: Extend URL_DECODE with a recursive parameter for 
multi-level decoding
                 Key: FLINK-38822
                 URL: https://issues.apache.org/jira/browse/FLINK-38822
             Project: Flink
          Issue Type: Improvement
          Components: Table SQL / API
    Affects Versions: 2.0-preview
            Reporter: featzhang


h3. Background

Apache Flink SQL currently provides the built-in function 
{{URL_DECODE(str)}}, which decodes a string in 
{{application/x-www-form-urlencoded}} format. This function was introduced in 
FLINK-34108 and is aligned with the corresponding functionality in Apache 
Calcite.

However, in real-world data processing, especially log processing, event 
tracking, and integrations with external systems, it is common to encounter 
*multi-level URL-encoded strings*, for example:
 * Values that are URL-encoded multiple times by upstream systems

 * Nested encoding caused by redirects, proxies, or intermediate transformations

In such cases, calling {{URL_DECODE}} only once is insufficient.

h3. Problem

The current {{URL_DECODE(str)}} function performs only *a single decoding 
pass*. Users who need to fully decode multi-level encoded values must nest 
calls manually, e.g. {{URL_DECODE(URL_DECODE(str))}}, which:
 * Reduces readability of SQL queries

 * Makes the decoding intent less explicit

 * Is inconvenient and error-prone in complex SQL pipelines

h3. Proposal

Extend the existing {{URL_DECODE}} function with an *optional boolean 
parameter* {{recursive}}, which controls whether decoding is applied 
repeatedly until the result no longer changes.

Proposed function signatures:

{{URL_DECODE(str)
URL_DECODE(str, recursive)}}

Where:
 * {{recursive = false}} (default): preserves the current behavior (single-pass 
decoding)

 * {{recursive = true}}: repeatedly applies URL decoding until the result 
no longer changes
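
The two modes above can be sketched in plain Java. This is only an illustration of the proposed semantics, not the actual Flink implementation: the class name {{RecursiveUrlDecodeSketch}}, the method {{urlDecode}}, and the {{MAX_PASSES}} cap are hypothetical, and returning null on a failed first pass as well as keeping the last good value when a later pass fails are assumptions that the proposal does not specify.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed semantics; not Flink's actual code.
final class RecursiveUrlDecodeSketch {

    // Assumed safety cap on the number of passes; not part of the proposal.
    private static final int MAX_PASSES = 16;

    static String urlDecode(String value, boolean recursive) {
        if (value == null) {
            return null;
        }
        String current;
        try {
            current = URLDecoder.decode(value, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            // Assumption: malformed input on the first pass yields null.
            return null;
        }
        if (!recursive) {
            return current; // single pass, the current behavior
        }
        String previous = value;
        // Keep decoding until a pass leaves the string unchanged.
        for (int pass = 1; pass < MAX_PASSES && !current.equals(previous); pass++) {
            previous = current;
            try {
                current = URLDecoder.decode(previous, StandardCharsets.UTF_8);
            } catch (IllegalArgumentException e) {
                // Assumption: if a later pass fails, keep the last good value.
                break;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        String input = "%252Fpath%252Fto%252Fresource";
        System.out.println(urlDecode(input, false));
        System.out.println(urlDecode(input, true));
    }
}
```

Two edge cases show up in such a sketch and may be worth settling in the final design. First, {{java.net.URLDecoder}} maps {{+}} to a space, so a literal plus encoded once as {{%2B}} becomes a space after a second pass. Second, a value such as {{100%2525}} decodes to {{100%}}, which is itself not a valid input for a further pass.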

h3. Examples

{{-- Single-pass decoding (current behavior)
SELECT URL_DECODE('%252Fpath%252Fto%252Fresource');
-- Result: '%2Fpath%2Fto%2Fresource'

-- Recursive decoding
SELECT URL_DECODE('%252Fpath%252Fto%252Fresource', true);
-- Result: '/path/to/resource'}}

h3. Compatibility
 * This change is *fully backward-compatible*

 * Existing queries using {{URL_DECODE(str)}} will continue to work without any 
behavior changes

 * The new parameter is optional and only extends functionality

h3. Additional Notes
 * The recursive decoding should stop when the decoded result is identical to 
the previous value

 * This proposal builds directly on the existing implementation introduced in 
FLINK-34108

 * Similar behavior is commonly required in data cleansing and normalization 
use cases



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
