truffle-dev commented on issue #22506:
URL: https://github.com/apache/datafusion/issues/22506#issuecomment-4582934307

   Did some empirical poking at Postgres 16.13 and DuckDB v1.4.1 since the 
question was what other systems do.
   
   ### Postgres: first-occurrence anchors, implicit cast everywhere else
   
   Postgres walks the AST, picks the type of `$N`'s first concrete-context 
occurrence, then requires every other occurrence to find an implicit cast or 
operator that lets the anchored type interact with the other side. If no such 
operator exists, PREPARE fails at planning time.
   
   **In-family widening (string types):**
   ```sql
   PREPARE qA AS SELECT $1 = s /* varchar */, $1 = x /* text */ FROM t, t_text;
   PREPARE qB AS SELECT $1 = x /* text */, $1 = s /* varchar */ FROM t_text, t;
   SELECT parameter_types FROM pg_prepared_statements WHERE name IN ('qA','qB');
   -- both: {text}
   ```
   varchar and text are binary-coercible, so both orderings resolve to 
`{text}`. Looks like "widening" but it's anchor + cast where the cast is no-op.
   
   **Numeric widening, order-dependent:**
   ```sql
   PREPARE qC AS SELECT $1 = a.n /* int */, $1 = b.n /* bigint */;     -- 
{integer}
   PREPARE qD AS SELECT $1 = b.n /* bigint */, $1 = a.n /* int */;     -- 
{bigint}
   ```
   First occurrence anchors; second occurrence gets the implicit int/bigint 
cast.
   
   **Cross-family conflict (RatulDawar's example):**
   ```sql
   PREPARE q AS SELECT $1 = s FROM t JOIN t2 ON t2.id = $1;
   -- ERROR:  operator does not exist: integer = character varying
   ```
   Anchors to varchar from `$1 = s`, then `t2.id = $1` needs `integer = 
character varying` operator. None exists. PREPARE fails at parse time, not 
EXECUTE.
   
   **Pure UNKNOWN defaults to text:**
   ```sql
   PREPARE q AS SELECT $1;                  -- {text}
   PREPARE q AS SELECT $1 = NULL FROM t;    -- {text}
   ```
   
   ### DuckDB: defer the conflict to EXECUTE
   
   ```sql
   PREPARE q AS SELECT $1 = s /* varchar */ FROM t JOIN t2 ON t2.id = $1;
   EXECUTE q(1);
   -- Conversion Error: Could not convert string 'a' to INT32 when casting from 
source column s
   ```
   DuckDB picks int at PREPARE time and only errors when casting varchar row 
data to int. Less predictable for the user, since PREPARE looks fine and the 
failure mode depends on the row contents.
   
   ### How this maps onto the issue
   
   @cetra3's failing case (`$1 = 'x'` Utf8 and `$1 = s` Utf8View) parallels 
postgres's varchar+text test. Postgres resolves to the wider string type for 
both orderings, which is the `comparison_coercion`-equivalent result. 
@RatulDawar's instinct is correct for the in-family case.
   
   @RatulDawar's conflict example (`$1 = s` varchar joined with `t2.id = $1` 
int) is the cross-family case. Postgres errors at PREPARE. DuckDB defers it to 
EXECUTE. Either is defensible. Which one datafusion picks depends on whether 
you want the same coercion rules at the placeholder site as at every other 
binary-comparison site (see the recommendation below).
   
   ### Subtlety on the explicit-cast escape hatch
   
   One thing worth knowing when documenting the user-side workaround: in 
postgres, casting only the parameter is not sufficient when the column type 
itself is incompatible.
   
   ```sql
   PREPARE q AS SELECT $1::int = s FROM t JOIN t2 ON t2.id = $1::int;
   -- ERROR:  operator does not exist: integer = character varying
   ```
   
   `s` is still varchar and `int = varchar` has no implicit operator, so the 
cast on `$1` alone does not save the comparison. Users have to either cast the 
column side too (`s::int`) or cast `t2.id` to varchar so the parameter can stay 
in the varchar family. Worth surfacing in whatever error hint datafusion emits 
if it adopts the error-at-plan model.
   
   ### Suggested shape for the fix
   
   For the immediate case in this issue, replacing the strict-equality merge in 
[`get_parameter_fields`](https://github.com/apache/datafusion/blob/2e7b8e1b2d3cd800823b3813f2f7efdde0d21af8/datafusion/expr/src/logical_plan/plan.rs#L1626-L1658)
 with 
[`comparison_coercion`](https://github.com/apache/datafusion/blob/2e7b8e1b2d3cd800823b3813f2f7efdde0d21af8/datafusion/expr-common/src/type_coercion/binary.rs#L924-L941)
 looks like the smallest change that fixes cetra3's example and stays 
consistent with how datafusion treats binary-comparison contexts elsewhere.
   
   That gets the Utf8/Utf8View widening for free (via `string_coercion`), the 
decimal/integer widening for free (via `binary_numeric_coercion`), and degrades 
to `None` only when no coercion path exists in either direction.
   
   The RatulDawar concern about Int + Utf8/Utf8View is worth pulling apart from 
the placeholder issue. Today `comparison_coercion` is permissive about 
string-to-numeric pairs (the [`string_numeric_coercion` 
arm](https://github.com/apache/datafusion/blob/2e7b8e1b2d3cd800823b3813f2f7efdde0d21af8/datafusion/expr-common/src/type_coercion/binary.rs#L936)),
 so `comparison_coercion(Int32, Utf8View)` returns `Some(Int32)` rather than 
`None`. Using it for placeholder merge therefore gives DuckDB-shaped behavior 
in that case: PREPARE succeeds, then EXECUTE either succeeds or fails depending 
on whether row data parses as Int32. If the team wants postgres-shaped errors 
at PREPARE for cross-family conflicts, that is a separate question about 
`comparison_coercion`'s string-to-numeric policy across the whole project, not 
just at the placeholder site.
   
   So as a layered call: ship the `comparison_coercion` swap to unblock the 
in-family case in this issue, and treat the strictness question as a follow-up 
that probably wants a wider RFC since it touches every binary-comparison site, 
not just placeholders.
   
   For the "$N has zero concrete contexts" tail case (e.g. `SELECT $1;`), 
postgres defaults to text. Defaulting to `Utf8` would match.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to