[jira] [Commented] (CASSANDRA-6060) Remove internal use of Strings for ks/cf names
[ https://issues.apache.org/jira/browse/CASSANDRA-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345250#comment-14345250 ]

Brian Hess commented on CASSANDRA-6060:
---------------------------------------

I know this ticket is closed, but there is another use case that might make this more useful. With the advent of CTAS (CASSANDRA-8234), you might want to change the primary key of a table. To do that, you could create a new table with the new primary key and select the old data into it. The last step, for cleanliness, might be to drop the original table and rename the new table to the original table name, thereby completing the change of the primary key.

Remove internal use of Strings for ks/cf names
----------------------------------------------

Key: CASSANDRA-6060
URL: https://issues.apache.org/jira/browse/CASSANDRA-6060
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
Labels: performance

We toss a lot of Strings around internally, including across the network. Once a request has been Prepared, we ought to be able to encode these as int ids.

Unfortunately, we moved from int to uuid in CASSANDRA-3794, which was a reasonable move at the time, but a uuid is a lot bigger than an int. Now that we have CAS we can allow concurrent schema updates while still using sequential int IDs.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6060) Remove internal use of Strings for ks/cf names
[ https://issues.apache.org/jira/browse/CASSANDRA-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241364#comment-14241364 ]

Ariel Weisberg commented on CASSANDRA-6060:
-------------------------------------------

I am still digging, but I am not sure there is much value here. For prepared statements between client and server there are no ks/cf names.

Here is the breakdown for a minimum-size mutation inside the cluster:

Size of Ethernet frame - 24 bytes
Size of IPv4 header (without any options) - 20 bytes
Size of TCP header (without any options) - 20 bytes
4-byte protocol magic
4-byte version
4-byte timestamp
4-byte verb
4-byte parameter count
4-byte payload length prefix
No keyspace name in current versions
2-byte key length
key, say 10 bytes
4-byte mutation count
1-byte boolean
16-byte cf id
4-byte count of columns
Per column:
  2-byte column name length prefix
  column name, say 8 bytes
  1-byte serialization flags
  8-byte timestamp
  4-byte length prefix
  column value, say 8 bytes

Total is 158 bytes. Saving 12 bytes on the CF uuid would be 7.5%. For single-CF mutations this is not a win. Loading data points 16 bytes at a time isn't going to work so hot anyway, so people might look into batching at that point.

The UUID is not repeated for each cell, so it is a one-time cost for workloads that modify multiple cells per CF. The one case where the 12 bytes become significant is single-cell updates to multiple CFs in one mutation. There the 12-byte overhead converges on 23%.

I am going to look at the read path next, but I kind of expect to find something similar. A read is going to have key overhead, and possibly overhead for all the other query parameters, that should match the simple single-cell mutation case.
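The arithmetic in the breakdown above can be re-run with a short Python sketch. The byte counts are copied from the comment; the dictionary keys are illustrative labels rather than Cassandra identifiers, and the split into fixed, per-CF and per-column groups is an assumption about how the listed fields repeat. Summed as listed, the items come to 156 bytes, two fewer than the 158 quoted, so one item may be missing from the listing. The per-CF limit of 12/52 is consistent with the 23% figure.

```python
# Byte counts copied from the breakdown in the comment. Keys are illustrative
# labels (not Cassandra identifiers); the grouping into fixed / per-CF /
# per-column costs is an assumption about which fields repeat.
FIXED = {
    "ethernet_frame": 24,
    "ipv4_header": 20,
    "tcp_header": 20,
    "protocol_magic": 4,
    "version": 4,
    "timestamp": 4,
    "verb": 4,
    "parameter_count": 4,
    "payload_length_prefix": 4,
    "key_length": 2,
    "key": 10,  # "say 10 bytes"
    "mutation_count": 4,
}
PER_CF = {"boolean": 1, "cf_id": 16, "column_count": 4}
PER_COLUMN = {
    "name_length_prefix": 2,
    "name": 8,  # "say 8 bytes"
    "serialization_flags": 1,
    "timestamp": 8,
    "value_length_prefix": 4,
    "value": 8,  # "say 8 bytes"
}
SAVING_PER_CF = 16 - 4  # 16-byte UUID replaced by a 4-byte int id


def mutation_size(cfs=1, columns_per_cf=1):
    """Total bytes on the wire for one mutation touching `cfs` column families."""
    per_cf = sum(PER_CF.values()) + columns_per_cf * sum(PER_COLUMN.values())
    return sum(FIXED.values()) + cfs * per_cf


def saving_fraction(cfs=1, columns_per_cf=1):
    """Fraction of the mutation size saved by switching cf ids from UUID to int."""
    return cfs * SAVING_PER_CF / mutation_size(cfs, columns_per_cf)


if __name__ == "__main__":
    for cfs in (1, 10, 1000):
        print(cfs, mutation_size(cfs), f"{saving_fraction(cfs):.1%}")
```

Raising `columns_per_cf` shows the other point made in the comment: the id is paid once per CF, so the relative saving shrinks as more cells are written to the same CF.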
Remove internal use of Strings for ks/cf names
----------------------------------------------

Key: CASSANDRA-6060
URL: https://issues.apache.org/jira/browse/CASSANDRA-6060
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
Labels: performance
Fix For: 3.0
[jira] [Commented] (CASSANDRA-6060) Remove internal use of Strings for ks/cf names
[ https://issues.apache.org/jira/browse/CASSANDRA-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968216#comment-13968216 ]

Benedict commented on CASSANDRA-6060:
-------------------------------------

Bumping to 3.0, as this won't make it into 2.1 now.

Remove internal use of Strings for ks/cf names
----------------------------------------------

Key: CASSANDRA-6060
URL: https://issues.apache.org/jira/browse/CASSANDRA-6060
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Ellis
Assignee: Vijay
Labels: performance
Fix For: 3.0