[
https://issues.apache.org/jira/browse/CASSANDRA-20425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nadav Har'El updated CASSANDRA-20425:
-------------------------------------
Description:
Cassandra's documentation
[https://cassandra.apache.org/doc/stable/cassandra/cql/ddl.html] claims that:
_Both keyspace and table name ... are limited in size to 48 characters (that
limit exists mostly to avoid filenames (which may include the keyspace and
table name) to go over the limits of certain file systems)._
I checked, and although this limitation was true in Cassandra 3, it has not
been enforced since Cassandra 4.
It seems this change was not intentional, and happened eight years ago in
[this commit|https://github.com/apache/cassandra/commit/af3fe39dcabd9ef77a00309ce6741268423206df].
Before this commit, we had
{code:java}
public static boolean isNameValid(String name) {
    return name != null && !name.isEmpty()
        && name.length() <= SchemaConstants.NAME_LENGTH
        && PATTERN_WORD_CHARS.matcher(name).matches();
}
{code}
After it, this definition was dropped and IndexMetadata.isNameValid() is used
instead, and that method never had a name length check.
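To make the difference concrete, here is a minimal, self-contained sketch of the two behaviors. It is not the actual Cassandra code: the class name and the second method name are hypothetical, the constant simply mirrors the value 48 mentioned for SchemaConstants.NAME_LENGTH, and I am assuming PATTERN_WORD_CHARS matches one or more word characters.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone class, for illustration only.
final class NameValidationSketch {
    // Mirrors the value 48 that the report gives for SchemaConstants.NAME_LENGTH.
    static final int NAME_LENGTH = 48;

    // Assumption: the real PATTERN_WORD_CHARS accepts one or more word characters.
    static final Pattern PATTERN_WORD_CHARS = Pattern.compile("\\w+");

    // Shape of the check before the commit: the name length is part of validity.
    static boolean isNameValid(String name) {
        return name != null && !name.isEmpty()
            && name.length() <= NAME_LENGTH
            && PATTERN_WORD_CHARS.matcher(name).matches();
    }

    // Shape of a check with no length test (hypothetical method name):
    // any non-empty name made of word characters passes, however long.
    static boolean isNameValidWithoutLengthCheck(String name) {
        return name != null && !name.isEmpty()
            && PATTERN_WORD_CHARS.matcher(name).matches();
    }
}
```

With these definitions a 49-character name is rejected by the first method and accepted by the second, which is the behavior change described above.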
We can see in several ways that dropping the limit check was unintentional.
First, the documentation still mentions this no-longer-existing limitation.
Second, the code still defines SchemaConstants.NAME_LENGTH, set to 48, but no
longer uses it for validation. Or rather, it is used, but only in error
messages, which no longer match what is actually being tested!
I think Cassandra should either restore this name length limit that was
accidentally dropped 8 years ago, or we could decide officially that this name
length limit is lifted - and drop it from the documentation and also all
mentions of NAME_LENGTH in error messages (and remove this constant entirely).
But before deciding what to do, I want to make another point. The documentation
rightly explained the original raison d'être for this 48-character limitation:
"that limit exists mostly to avoid filenames to go over the limits of certain
file systems". This reason is still relevant. I checked what happens when I try
to create a table with a 300-character name on a standard Linux filesystem.
On Cassandra 4, the behavior was reasonable - I got a bizarre error, but
afterwards the database was still functional. On Cassandra 5, the result was a
disaster - Cassandra hung in some strange loop of I/O errors and never
recovered. So if the decision is to allow table names longer than 48
characters, as has been allowed for the last 8 years, we should consider
whether we should have a higher limit instead, or perhaps try to catch the
error of filenames too long for this filesystem and fail more gracefully.
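As a rough illustration of the "higher limit" option, the sketch below rejects a table name up front if the directory name derived from it could not fit in a single path component. Everything here is an assumption for illustration rather than Cassandra code: the class and method names are hypothetical, I am assuming a 255-byte per-component limit (typical of common Linux filesystems), and I am assuming the table directory is named as the table name followed by a hyphen and a 32-character hex table id. SSTable file names inside that directory are not considered.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class FilenameLimitSketch {
    // Assumption: the filesystem allows at most 255 bytes per path component.
    static final int MAX_COMPONENT_BYTES = 255;

    // Assumption: directory name is "<table>-<32 hex chars>", so 33 extra bytes.
    static final int DIRECTORY_SUFFIX_BYTES = 1 + 32;

    // True if the derived directory name fits in one path component.
    static boolean fitsInDirectoryName(String tableName) {
        // Filesystem limits are in bytes, so measure the UTF-8 encoding.
        int nameBytes = tableName.getBytes(StandardCharsets.UTF_8).length;
        return nameBytes + DIRECTORY_SUFFIX_BYTES <= MAX_COMPONENT_BYTES;
    }

    // Fails early with a clear message instead of hitting an I/O error later.
    static void requireFits(String tableName) {
        if (!fitsInDirectoryName(tableName)) {
            throw new IllegalArgumentException(
                "Table name is too long for the data directory: "
                + tableName.length() + " characters");
        }
    }
}
```

Under these assumptions a 300-character name like the one in my test would be rejected at schema-validation time, while a 48-character name passes easily. The other option, catching the "file name too long" error where the directory is created, would adapt to the actual filesystem but needs care to leave the schema in a consistent state.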
> Table name length validation accidentally lost since Cassandra 4
> ----------------------------------------------------------------
>
> Key: CASSANDRA-20425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20425
> Project: Apache Cassandra
> Issue Type: Bug
> Reporter: Nadav Har'El
> Priority: Normal
--
This message was sent by Atlassian Jira
(v8.20.10#820010)