[
https://issues.apache.org/jira/browse/CASSANDRA-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brandon Williams updated CASSANDRA-17617:
-----------------------------------------
Reviewers: Brandon Williams
Description:
It appears that the list of escaped Unicode control characters
[here|https://github.com/apache/cassandra/blob/53a67ff2c36d90d337aba1409498de29931d4279/pylib/cqlshlib/formatting.py#L32]
is a bit too liberal. It seems to include characters such as '1' (0x31) and
'0' (0x30) which do not need to be escaped. It seems that the actual range
should be 0x00 - 0x1F and 0x7F+ as corroborated [by this
page|[https://en.wikipedia.org/wiki/Unicode_control_characters].]
This causes unnecessary escaping and regex substitutions on the CQLSH end
whenever common characters such as any punctuation or a 0 or a 1 appear in the
text column of a table. One might notice that a table with a text column filled
with 2's will take much less time to print than one with all 0's for this
reason.
was:
It appears that the list of escaped unicode control characters
[here|https://github.com/apache/cassandra/blob/53a67ff2c36d90d337aba1409498de29931d4279/pylib/cqlshlib/formatting.py#L32]
is a bit too liberal. It seems to include characters such as '1' (0x31) and
'0' (0x30) which do not need to be escaped. It seems that the actual range
should be 0x00 - 0x1F and 0x7F+ as corroborated [by this
page|https://en.wikipedia.org/wiki/Unicode_control_characters].
This causes unnecessary escaping and regex substitutions on the CQLSH end
whenever common characters such as any punctuation or a 0 or a 1 appear in the
text column of a table. One might notice that a table with a text column filled
with 2's will take much less time to print than one with all 0's for this
reason.
> CQLSH unicode control character list is too liberal
> ---------------------------------------------------
>
> Key: CASSANDRA-17617
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17617
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL/Interpreter
> Reporter: Tanuj Nayak
> Assignee: Tanuj Nayak
> Priority: Normal
> Fix For: 3.11.x, 4.0.x, 4.1.x
>
>
> It appears that the list of escaped Unicode control characters
> [here|https://github.com/apache/cassandra/blob/53a67ff2c36d90d337aba1409498de29931d4279/pylib/cqlshlib/formatting.py#L32]
> is a bit too liberal. It seems to include characters such as '1' (0x31) and
> '0' (0x30) which do not need to be escaped. It seems that the actual range
> should be 0x00 - 0x1F and 0x7F+ as corroborated [by this
> page|[https://en.wikipedia.org/wiki/Unicode_control_characters].]
>
> This causes unnecessary escaping and regex substitutions on the CQLSH end
> whenever common characters such as any punctuation or a 0 or a 1 appear in
> the text column of a table. One might notice that a table with a text column
> filled with 2's will take much less time to print than one with all 0's for
> this reason.
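
The behaviour described above can be illustrated with a minimal sketch. It is not the code from formatting.py: BROAD_RE is a hypothetical pattern chosen only because an upper bound of 0x31 would produce the reported symptoms ('0' and '1' escaped, '2' untouched), and NARROW_RE assumes the intended set is the C0 controls (0x00 - 0x1F), DEL (0x7F) and the C1 controls (0x80 - 0x9F), which is narrower than the "0x7F+" wording in the report. The helper names escape_controls and count_matches are also made up for illustration.

```python
import re

# Hypothetical over-broad pattern (assumption, not copied from cqlshlib):
# the first range ends at 0x31, so space, '!' through '/', '0' and '1' match too.
BROAD_RE = re.compile(r'[\x00-\x31\x7f-\x9f]')

# Assumed narrower pattern: C0 controls, DEL, and C1 controls only.
NARROW_RE = re.compile(r'[\x00-\x1f\x7f-\x9f]')


def escape_controls(text, pattern):
    """Replace every character matched by `pattern` with a \\xNN escape."""
    return pattern.sub(lambda m: '\\x%02x' % ord(m.group(0)), text)


def count_matches(text, pattern):
    """Return how many characters in `text` would be escaped by `pattern`."""
    return len(pattern.findall(text))


if __name__ == '__main__':
    for sample in ('0000', '2222', 'a, b!', 'bell\x07'):
        print(repr(sample),
              count_matches(sample, BROAD_RE),
              count_matches(sample, NARROW_RE))
```

With the broad pattern every '0' or '1' in a text value triggers a substitution, while a value made only of '2' characters triggers none, which is consistent with the printing-time difference mentioned in the description. With the narrow pattern only genuine control characters are substituted.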