[
https://issues.apache.org/jira/browse/CASSANDRA-18078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640884#comment-17640884
]
Yifan Cai edited comment on CASSANDRA-18078 at 11/29/22 6:26 PM:
-----------------------------------------------------------------
Generally, I agree that we should remove the redundant function.
A nice property of 'maxwritetime' is that it works on both single-cell and
multi-cell columns, so the client does not need to check the column type before
applying the function.
I will check whether 'collection_max' does the same.
--- Update ---
'collection_max' currently requires its input to be a set or list. See the
example below. If we can make 'collection_max' work with a single item (which
can be considered a singleton collection), then both functions are
equivalent and we can remove 'maxwritetime' in favor of the generic
'collection_max'. WDYT?
{code:sql}
cqlsh> CREATE TABLE test.max_ts ( a int PRIMARY KEY, b int, c list<int> );
cqlsh> INSERT INTO test.max_ts ( a, b, c ) VALUES ( 1, 1, [1, 2, 3] );
cqlsh> SELECT a, b, c, maxwritetime(b), maxwritetime(c), collection_max(writetime(b)), collection_max(writetime(c)) FROM test.max_ts WHERE a = 1;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Function system.collection_max requires a set or list argument, but found argument writetime(b) of type bigint"
# Meanwhile, 'maxwritetime' works with both types.
cqlsh> SELECT a, b, c, maxwritetime(b), maxwritetime(c), collection_max(writetime(c)) FROM test.max_ts WHERE a = 1;
 a | b | c         | maxwritetime(b)  | maxwritetime(c)  | system.collection_max(writetime(c))
---+---+-----------+------------------+------------------+-------------------------------------
 1 | 1 | [1, 2, 3] | 1669745997088534 | 1669745997088534 |                    1669745997088534
(1 rows)
{code}
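To make the "singleton collection" idea concrete, here is a minimal sketch of the proposed semantics in Python. It is only a model of the behavior discussed above, not Cassandra code: the name `collection_max_lenient` is hypothetical, and returning `None` for a null or empty input is an assumption.

```python
from typing import Iterable, Optional, Union

Timestamp = int


def collection_max_lenient(
    value: Union[None, Timestamp, Iterable[Timestamp]]
) -> Optional[Timestamp]:
    """Hypothetical model of a 'collection_max' that also accepts a scalar.

    A scalar is treated as a singleton collection, so its max is itself.
    Assumption: a null or empty input yields None.
    """
    if value is None:
        return None
    if isinstance(value, (list, set, frozenset, tuple)):
        # Multi-cell case: writetime(c) yields one timestamp per cell.
        return max(value) if value else None
    # Single-cell case: writetime(b) yields a single bigint.
    return value


# Single-cell column, as with writetime(b) in the transcript above.
single = collection_max_lenient(1669745997088534)

# Multi-cell column with per-cell timestamps.
multi = collection_max_lenient([100, 100, 200])
```

With these semantics, the single-cell and multi-cell cases both go through one function, which is the property that 'maxwritetime' has today.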
> Consider removing MAXWRITETIME function
> ---------------------------------------
>
> Key: CASSANDRA-18078
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18078
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL/Syntax
> Reporter: Andres de la Peña
> Priority: Normal
> Fix For: 4.2
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> CASSANDRA-17425 added a new {{MAXWRITETIME}} function that retrieves
> the maximum timestamp of a multi-cell column. For example:
> {code:java}
> > CREATE TABLE t (k int PRIMARY KEY, v set<int>);
> > INSERT INTO t (k, v) VALUES (1, {1, 2}) USING TIMESTAMP 100;
> > UPDATE t USING TIMESTAMP 200 SET v += {3} WHERE k=1;
> > SELECT maxwritetime(v) FROM t;
> maxwritetime(v)
> -----------------
> 200
> {code}
> Later, CASSANDRA-8877 added the means to retrieve the write times and TTLs of
> each of the cells in a multi-cell column:
> {code:java}
> > SELECT writetime(v) FROM t;
> writetime(v)
> -----------------
> [100, 100, 200]
> > SELECT writetime(v[1]) FROM t;
> writetime(v[1])
> -----------------
> 100
> {code}
> Quite recently, CASSANDRA-18060 has added generic CQL functions to get the
> min and max items in a collection. Those functions can be used in combination
> with the classic {{writetime}} function to get the same results as the new
> {{maxwritetime}} function:
> {code:java}
> > SELECT collection_max(writetime(v)) FROM t;
> system.collection_max(writetime(v))
> -------------------------------------
> 200
> {code}
> Those new functions can also be used to get the min or average timestamp, or
> the min, max or average TTL, for which there isn't a specific function:
> {code:java}
> SELECT collection_min(writetime(v)) FROM t;
> SELECT collection_max(writetime(v)) FROM t;
> SELECT collection_avg(writetime(v)) FROM t;
> SELECT collection_min(ttl(v)) FROM t;
> SELECT collection_max(ttl(v)) FROM t;
> SELECT collection_avg(ttl(v)) FROM t;
> {code}
> I think this makes the new {{maxwritetime}} mostly redundant, since the new
> functions can achieve the same results in a more generic way. Since the new
> {{maxwritetime}} function is only on trunk, we should consider whether we
> want to remove it in favour of the generic functions.