[
https://issues.apache.org/jira/browse/CALCITE-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yu Xu updated CALCITE-6988:
---------------------------
Description:
DuckDB is a very popular high-performance analytical database
system([https://github.com/duckdb/duckdb]), many users are using DuckDB in
their work but Calcite had not supported with it yet.
I had relevant materials about mainly syntax difference between default dialect
and DuckDB (from document : [https://duckdb.org/docs/stable/] ):
*1. NullCollation is LAST mode, Nulls last for DESC and ASC, test as follow:*
D select * from t3 order by data asc;
┌───────┬─────────┐
│ id │ data │
│ int32 │ varchar │
├───────┼─────────┤
│ 1 │ aa │
│ 1 │ bb │
│ 1 │ │
└───────┴─────────┘
D select * from t3 order by data desc;
┌───────┬─────────┐
│ id │ data │
│ int32 │ varchar │
├───────┼─────────┤
│ 1 │ bb │
│ 1 │ aa │
│ 1 │ │
└───────┴─────────┘
{*}2. max precision of decimal is 38{*}(we can refer to
[https://duckdb.org/docs/stable/sql/data_types/numeric#fixed-point-decimals)|https://duckdb.org/docs/stable/sql/data_types/numeric#fixed-point-decimals)*]
*3. IdentifierQuoteString is ", test for it:*
test is OK:
D SELECT DATETRUNC('second', "d") from t5;
┌────────────────────────┐
│ datetrunc('second', d) │
│ timestamp │
├────────────────────────┤
│ 2025-01-22 12:21:23 │
│ 2025-01-22 12:21:25 │
└────────────────────────┘
ERROR result:
D SELECT DATETRUNC('second', 'd') from t5;
Binder Error: Could not choose a best candidate function for the function call
"datetrunc(STRING_LITERAL, STRING_LITERAL)". In order to select one, please add
explicit type casts.
Candidate functions:
datetrunc(VARCHAR, DATE) -> TIMESTAMP
datetrunc(VARCHAR, INTERVAL) -> INTERVAL
datetrunc(VARCHAR, TIMESTAMP WITH TIME ZONE) -> TIMESTAMP WITH TIME ZONE
datetrunc(VARCHAR, TIMESTAMP) -> TIMESTAMP
LINE 1: SELECT DATETRUNC('second', 'd') from t5;
^
D SELECT DATETRUNC('second', `d`) from t5;
Catalog Error: Scalar Function with name `__postfix does not exist!
Did you mean "!__postfix"?
LINE 1: SELECT DATETRUNC('second', `d`) from t5;
*4. support approx_count_distinct function:*
D select approx_count_distinct("id") from t3 ;
┌───────────────────────────┐
│ approx_count_distinct(id) │
│ int64 │
├───────────────────────────┤
│ 1 │
└───────────────────────────┘
*5. Not support character_length/char_length, need use length function to
support it:*
D select length('aa');
┌──────────────┐
│ length('aa') │
│ int64 │
├──────────────┤
│ 2 │
└──────────────┘
D select character_length('aa');
Catalog Error: Scalar Function with name character_length does not exist!
Did you mean "array_length"?
LINE 1: select character_length('aa');
^
D select char_length('aa');
Catalog Error: Scalar Function with name char_length does not exist!
Did you mean "array_length"?
LINE 1: select char_length('aa');
^
*6. Map construncor method need convert () to {}, test for it:*
result is OK:
D select MAP \{'a':1, 'b':2};
┌────────────────────────────────────────────────────────────┐
│ main.map(main.list_value('a', 'b'), main.list_value(1, 2)) │
│ map(varchar, integer) │
├────────────────────────────────────────────────────────────┤
│ \{a=1, b=2} │
└────────────────────────────────────────────────────────────┘
Error result:
D select MAP ('a':1, 'b':2);
Parser Error: syntax error at or near ":"
LINE 1: select MAP ('a':1, 'b':2);
*7. floor date should convert datetrunc function(can refer to:
[https://duckdb.org/docs/stable/sql/functions/date#date_truncpart-date)*]
D SELECT DATETRUNC('year', "d") from t5;
┌──────────────────────┐
│ datetrunc('year', d) │
│ date │
├──────────────────────┤
│ 2025-01-01 │
│ 2025-01-01 │
└──────────────────────┘
D SELECT DATETRUNC('month', "d") from t5;
┌───────────────────────┐
│ datetrunc('month', d) │
│ date │
├───────────────────────┤
│ 2025-01-01 │
│ 2025-01-01 │
└───────────────────────┘
D SELECT DATETRUNC('day', "d") from t5;
┌─────────────────────┐
│ datetrunc('day', d) │
│ date │
├─────────────────────┤
│ 2025-01-22 │
│ 2025-01-22 │
└─────────────────────┘
D SELECT DATETRUNC('week', "d") from t5;
┌──────────────────────┐
│ datetrunc('week', d) │
│ date │
├──────────────────────┤
│ 2025-01-20 │
│ 2025-01-20 │
└──────────────────────┘
D SELECT DATETRUNC('millisecond', "d") from t5;
┌─────────────────────────────┐
│ datetrunc('millisecond', d) │
│ timestamp │
├─────────────────────────────┤
│ 2025-01-22 12:21:23.123 │
│ 2025-01-22 12:21:25.123 │
└─────────────────────────────┘
D SELECT DATETRUNC('microseconds', "d") from t5;
┌──────────────────────────────┐
│ datetrunc('microseconds', d) │
│ timestamp │
├──────────────────────────────┤
│ 2025-01-22 12:21:23.123 │
│ 2025-01-22 12:21:25.123223 │
└──────────────────────────────┘
for these differences we should make a new DuckDB dialect.
was:
DuckDB is a very popular high-performance analytical database
system(https://github.com/duckdb/duckdb), many users are using DuckDB in their
work but Calcite had not supported with it yet.
I had relevant materials about mainly syntax difference between default dialect
and DuckDB (from document : https://duckdb.org/docs/stable/ ):
*1. NullCollation is LAST mode, Nulls last for DESC and ASC, test as follow:*
D select * from t3 order by data asc;
┌───────┬─────────┐
│ id │ data │
│ int32 │ varchar │
├───────┼─────────┤
│ 1 │ aa │
│ 1 │ bb │
│ 1 │ │
└───────┴─────────┘
D select * from t3 order by data desc;
┌───────┬─────────┐
│ id │ data │
│ int32 │ varchar │
├───────┼─────────┤
│ 1 │ bb │
│ 1 │ aa │
│ 1 │ │
└───────┴─────────┘
*2. max precision of decimal is 38(we can refer to
https://duckdb.org/docs/stable/sql/data_types/numeric#fixed-point-decimals)*
*3. IdentifierQuoteString is ", test for it:*
test is OK:
D SELECT DATETRUNC('second', "d") from t5;
┌────────────────────────┐
│ datetrunc('second', d) │
│ timestamp │
├────────────────────────┤
│ 2025-01-22 12:21:23 │
│ 2025-01-22 12:21:25 │
└────────────────────────┘
ERROR result:
D SELECT DATETRUNC('second', 'd') from t5;
Binder Error: Could not choose a best candidate function for the function call
"datetrunc(STRING_LITERAL, STRING_LITERAL)". In order to select one, please add
explicit type casts.
Candidate functions:
datetrunc(VARCHAR, DATE) -> TIMESTAMP
datetrunc(VARCHAR, INTERVAL) -> INTERVAL
datetrunc(VARCHAR, TIMESTAMP WITH TIME ZONE) -> TIMESTAMP WITH TIME ZONE
datetrunc(VARCHAR, TIMESTAMP) -> TIMESTAMP
LINE 1: SELECT DATETRUNC('second', 'd') from t5;
^
D SELECT DATETRUNC('second', `d`) from t5;
Catalog Error: Scalar Function with name `__postfix does not exist!
Did you mean "!__postfix"?
LINE 1: SELECT DATETRUNC('second', `d`) from t5;
*4. support approx_count_distinct function:*
D select approx_count_distinct("id") from t3 ;
┌───────────────────────────┐
│ approx_count_distinct(id) │
│ int64 │
├───────────────────────────┤
│ 1 │
└───────────────────────────┘
*5. Not support character_length/char_length, need use length function to
support it:*
D select length('aa');
┌──────────────┐
│ length('aa') │
│ int64 │
├──────────────┤
│ 2 │
└──────────────┘
D select character_length('aa');
Catalog Error: Scalar Function with name character_length does not exist!
Did you mean "array_length"?
LINE 1: select character_length('aa');
^
D select char_length('aa');
Catalog Error: Scalar Function with name char_length does not exist!
Did you mean "array_length"?
LINE 1: select char_length('aa');
^
*6. Map construncor method need convert () to {}, test for it:*
result is OK:
D select MAP \{'a':1, 'b':2};
┌────────────────────────────────────────────────────────────┐
│ main.map(main.list_value('a', 'b'), main.list_value(1, 2)) │
│ map(varchar, integer) │
├────────────────────────────────────────────────────────────┤
│ \{a=1, b=2} │
└────────────────────────────────────────────────────────────┘
Error result:
D select MAP ('a':1, 'b':2);
Parser Error: syntax error at or near ":"
LINE 1: select MAP ('a':1, 'b':2);
*7. floor date should convert datetrunc function(can refer to:
https://duckdb.org/docs/stable/sql/functions/date#date_truncpart-date)*
D SELECT DATETRUNC('year', "d") from t5;
┌──────────────────────┐
│ datetrunc('year', d) │
│ date │
├──────────────────────┤
│ 2025-01-01 │
│ 2025-01-01 │
└──────────────────────┘
D SELECT DATETRUNC('month', "d") from t5;
┌───────────────────────┐
│ datetrunc('month', d) │
│ date │
├───────────────────────┤
│ 2025-01-01 │
│ 2025-01-01 │
└───────────────────────┘
D SELECT DATETRUNC('day', "d") from t5;
┌─────────────────────┐
│ datetrunc('day', d) │
│ date │
├─────────────────────┤
│ 2025-01-22 │
│ 2025-01-22 │
└─────────────────────┘
D SELECT DATETRUNC('week', "d") from t5;
┌──────────────────────┐
│ datetrunc('week', d) │
│ date │
├──────────────────────┤
│ 2025-01-20 │
│ 2025-01-20 │
└──────────────────────┘
D SELECT DATETRUNC('millisecond', "d") from t5;
┌─────────────────────────────┐
│ datetrunc('millisecond', d) │
│ timestamp │
├─────────────────────────────┤
│ 2025-01-22 12:21:23.123 │
│ 2025-01-22 12:21:25.123 │
└─────────────────────────────┘
D SELECT DATETRUNC('microseconds', "d") from t5;
┌──────────────────────────────┐
│ datetrunc('microseconds', d) │
│ timestamp │
├──────────────────────────────┤
│ 2025-01-22 12:21:23.123 │
│ 2025-01-22 12:21:25.123223 │
└──────────────────────────────┘
for these differences we should make a new DuckDB dialect.
> DuckDB dialect implementation
> -----------------------------
>
> Key: CALCITE-6988
> URL: https://issues.apache.org/jira/browse/CALCITE-6988
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.39.0
> Reporter: Yu Xu
> Assignee: Yu Xu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.40.0
>
>
> DuckDB is a very popular high-performance analytical database
> system([https://github.com/duckdb/duckdb]), many users are using DuckDB in
> their work but Calcite had not supported with it yet.
> I had relevant materials about mainly syntax difference between default
> dialect and DuckDB (from document : [https://duckdb.org/docs/stable/] ):
> *1. NullCollation is LAST mode, Nulls last for DESC and ASC, test as follow:*
> D select * from t3 order by data asc;
> ┌───────┬─────────┐
> │ id │ data │
> │ int32 │ varchar │
> ├───────┼─────────┤
> │ 1 │ aa │
> │ 1 │ bb │
> │ 1 │ │
> └───────┴─────────┘
> D select * from t3 order by data desc;
> ┌───────┬─────────┐
> │ id │ data │
> │ int32 │ varchar │
> ├───────┼─────────┤
> │ 1 │ bb │
> │ 1 │ aa │
> │ 1 │ │
> └───────┴─────────┘
>
> {*}2. max precision of decimal is 38{*}(we can refer to
> [https://duckdb.org/docs/stable/sql/data_types/numeric#fixed-point-decimals)|https://duckdb.org/docs/stable/sql/data_types/numeric#fixed-point-decimals)*]
>
> *3. IdentifierQuoteString is ", test for it:*
> test is OK:
> D SELECT DATETRUNC('second', "d") from t5;
> ┌────────────────────────┐
> │ datetrunc('second', d) │
> │ timestamp │
> ├────────────────────────┤
> │ 2025-01-22 12:21:23 │
> │ 2025-01-22 12:21:25 │
> └────────────────────────┘
> ERROR result:
> D SELECT DATETRUNC('second', 'd') from t5;
> Binder Error: Could not choose a best candidate function for the function
> call "datetrunc(STRING_LITERAL, STRING_LITERAL)". In order to select one,
> please add explicit type casts.
> Candidate functions:
> datetrunc(VARCHAR, DATE) -> TIMESTAMP
> datetrunc(VARCHAR, INTERVAL) -> INTERVAL
> datetrunc(VARCHAR, TIMESTAMP WITH TIME ZONE) -> TIMESTAMP WITH TIME
> ZONE
> datetrunc(VARCHAR, TIMESTAMP) -> TIMESTAMP
> LINE 1: SELECT DATETRUNC('second', 'd') from t5;
> ^
> D SELECT DATETRUNC('second', `d`) from t5;
> Catalog Error: Scalar Function with name `__postfix does not exist!
> Did you mean "!__postfix"?
> LINE 1: SELECT DATETRUNC('second', `d`) from t5;
>
> *4. support approx_count_distinct function:*
> D select approx_count_distinct("id") from t3 ;
> ┌───────────────────────────┐
> │ approx_count_distinct(id) │
> │ int64 │
> ├───────────────────────────┤
> │ 1 │
> └───────────────────────────┘
> *5. Not support character_length/char_length, need use length function to
> support it:*
> D select length('aa');
> ┌──────────────┐
> │ length('aa') │
> │ int64 │
> ├──────────────┤
> │ 2 │
> └──────────────┘
> D select character_length('aa');
> Catalog Error: Scalar Function with name character_length does not exist!
> Did you mean "array_length"?
> LINE 1: select character_length('aa');
> ^
> D select char_length('aa');
> Catalog Error: Scalar Function with name char_length does not exist!
> Did you mean "array_length"?
> LINE 1: select char_length('aa');
> ^
> *6. Map construncor method need convert () to {}, test for it:*
> result is OK:
> D select MAP \{'a':1, 'b':2};
> ┌────────────────────────────────────────────────────────────┐
> │ main.map(main.list_value('a', 'b'), main.list_value(1, 2)) │
> │ map(varchar, integer) │
> ├────────────────────────────────────────────────────────────┤
> │ \{a=1, b=2} │
> └────────────────────────────────────────────────────────────┘
> Error result:
> D select MAP ('a':1, 'b':2);
> Parser Error: syntax error at or near ":"
> LINE 1: select MAP ('a':1, 'b':2);
>
> *7. floor date should convert datetrunc function(can refer to:
> [https://duckdb.org/docs/stable/sql/functions/date#date_truncpart-date)*]
> D SELECT DATETRUNC('year', "d") from t5;
> ┌──────────────────────┐
> │ datetrunc('year', d) │
> │ date │
> ├──────────────────────┤
> │ 2025-01-01 │
> │ 2025-01-01 │
> └──────────────────────┘
> D SELECT DATETRUNC('month', "d") from t5;
> ┌───────────────────────┐
> │ datetrunc('month', d) │
> │ date │
> ├───────────────────────┤
> │ 2025-01-01 │
> │ 2025-01-01 │
> └───────────────────────┘
> D SELECT DATETRUNC('day', "d") from t5;
> ┌─────────────────────┐
> │ datetrunc('day', d) │
> │ date │
> ├─────────────────────┤
> │ 2025-01-22 │
> │ 2025-01-22 │
> └─────────────────────┘
> D SELECT DATETRUNC('week', "d") from t5;
> ┌──────────────────────┐
> │ datetrunc('week', d) │
> │ date │
> ├──────────────────────┤
> │ 2025-01-20 │
> │ 2025-01-20 │
> └──────────────────────┘
> D SELECT DATETRUNC('millisecond', "d") from t5;
> ┌─────────────────────────────┐
> │ datetrunc('millisecond', d) │
> │ timestamp │
> ├─────────────────────────────┤
> │ 2025-01-22 12:21:23.123 │
> │ 2025-01-22 12:21:25.123 │
> └─────────────────────────────┘
> D SELECT DATETRUNC('microseconds', "d") from t5;
> ┌──────────────────────────────┐
> │ datetrunc('microseconds', d) │
> │ timestamp │
> ├──────────────────────────────┤
> │ 2025-01-22 12:21:23.123 │
> │ 2025-01-22 12:21:25.123223 │
> └──────────────────────────────┘
> for these differences we should make a new DuckDB dialect.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)