jonahgao commented on issue #7344: URL: https://github.com/apache/arrow-datafusion/issues/7344#issuecomment-1690099314
> The docs would suggest that what is happening here is it is coercing the arguments to a string, I'll try to confirm this I have just checked the [documentation](https://www.postgresql.org/docs/current/functions-binarystring.html) and found that the length function in PostgresSQL has two different forms: ``` length(bytea) → integer Returns the number of bytes in the binary string. length('\x1234567890'::bytea) → 5 length(bytes bytea, encoding name) → integer Returns the number of characters in the binary string, assuming that it is text in the given encoding. length('jose'::bytea, 'UTF8') → 4 ``` It seems that there will be no type coercion happening. ----- I think we might need three different types of length functions. - `character_length` only operates on text strings. - `octet_length` can be applied to both text strings and binaries, and it returns the length of the underlying byte array. - `length` can be applied to both text strings and binaries. - For text strings, it has the same effect as `character_length`. - For binaries, it has the same effect as `octet_length`. ----- The following are the execution results of PostgreSQL for reference only. ``` postgres=# create table text_table(c text); CREATE TABLE postgres=# insert into text_table values('abc'), ('efg你好'); INSERT 0 2 postgres=# select length(c), octet_length(c), character_length(c) from text_table; length | octet_length | character_length --------+--------------+------------------ 3 | 3 | 3 5 | 9 | 5 (2 rows) postgres=# postgres=# postgres=# create table binary_table(c bytea); CREATE TABLE postgres=# postgres=# insert into binary_table values('abc'), ('\x00010203'::bytea); INSERT 0 2 postgres=# select length(c), octet_length(c) from binary_table; length | octet_length --------+-------------- 3 | 3 4 | 4 (2 rows) postgres=# select character_length(c) from binary_table; ERROR: function character_length(bytea) does not exist LINE 1: select character_length(c) from binary_table; ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
