[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242529#comment-14242529 ] Sylvain Lebresne commented on CASSANDRA-4139: - To offer some kind of counter-point, I don't think this ticket would require lots of effort since we already have code to do the vint encoding/decoding. I might be missing something, but from what I can tell, it should be enough to pass the {{TypeSizes}} in {{IVersionedSerializer.serializedSize}} plus make sure both sides agree on whether vint is enabled or not, none of which is terribly involved (nor would add much complexity to the code). And since the investment is not that big, I do think it's not completely worthless to evaluate it. It will probably not help in all cases or even with the default configuration, but I suspect it's faster than generic compression and so it could be interesting when you want a middle-ground between no compression at all and full message compression. Anyway, not trying to convince anyone to prioritize this in any way, but just to say that unless someone beats me to it, I do intend to give this a shot at some point in the future (especially because some parts I made in CASSANDRA-8099 would benefit more from vint than the current format probably does). Add varint encoding to Messaging service Key: CASSANDRA-4139 URL: https://issues.apache.org/jira/browse/CASSANDRA-4139 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Vijay Assignee: Ariel Weisberg Fix For: 3.0 Attachments: 0001-CASSANDRA-4139-v1.patch, 0001-CASSANDRA-4139-v2.patch, 0001-CASSANDRA-4139-v4.patch, 0002-add-bytes-written-metric.patch, 4139-Test.rtf, ASF.LICENSE.NOT.GRANTED--0001-CASSANDRA-4139-v3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
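As a rough illustration of what a vint encoder and its matching size calculation involve, here is a minimal Java sketch. It assumes a base-128 (LEB128-style) unsigned encoding; the class and method names are hypothetical, and this is not the format or the code that Cassandra ships. The point it tries to show is that the size calculation must mirror the writer byte for byte, which is why a serializedSize-style method has to know whether vint is enabled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: base-128 unsigned varint, NOT Cassandra's actual vint format.
final class VarintSketch {
    // Bytes the writer below emits: 1 for 0..127, up to 10 for a value using all 64 bits.
    static int sizeOfUnsignedVarLong(long value) {
        int size = 1;
        while ((value & ~0x7FL) != 0) {
            value >>>= 7;
            size++;
        }
        return size;
    }

    // Emits 7 payload bits per byte, low bits first; a set high bit means "more bytes follow".
    static void writeUnsignedVarLong(long value, DataOutputStream out) throws IOException {
        while ((value & ~0x7FL) != 0) {
            out.writeByte((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.writeByte((int) value);
    }

    // Reads bytes until one arrives without the continuation bit (at most 10 bytes).
    static long readUnsignedVarLong(DataInputStream in) throws IOException {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = in.readByte();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
        }
        throw new IOException("varint longer than 10 bytes");
    }

    // Encodes, checks that the size calculation agrees with the writer, then decodes.
    static long roundTrip(long value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeUnsignedVarLong(value, new DataOutputStream(bytes));
            byte[] encoded = bytes.toByteArray();
            if (encoded.length != sizeOfUnsignedVarLong(value))
                throw new IllegalStateException("size calculation disagrees with writer");
            return readUnsignedVarLong(new DataInputStream(new ByteArrayInputStream(encoded)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (long v : new long[] { 0L, 127L, 128L, 300L, Long.MAX_VALUE })
            System.out.println(v + " -> " + sizeOfUnsignedVarLong(v) + " byte(s), round trip " + roundTrip(v));
    }
}
```

A fixed-width long always costs 8 bytes, so under this assumed encoding small values save space while values that use the top bits cost more (up to 10 bytes).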
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241388#comment-14241388 ] Ariel Weisberg commented on CASSANDRA-4139: --- I think variable length integer encoding could be a big space saving in several contexts, but there is an argument against varints. If you want to do zero deserialization/copy, varints will fight you because you can't random access fields by offset. What you can do instead is use generic compression. Counter-intuitive, but think of the two use cases: either I care about bandwidth and therefore need compression anyway for non-integer fields, or I don't care about bandwidth, so why not maximize performance? Where this becomes important is in handling large messages where you don't want to parse all of it because you are forwarding or may not consume the entire contents. If you have varints and want to be lazy it gets tricky. I am up for trying it out and measuring.
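The random-access argument can be made concrete with a small Java sketch. Everything here is hypothetical (the record layout, the class name, and the base-128 encoding are assumptions chosen for illustration, not Cassandra code): with fixed-width fields the reader jumps straight to an offset, while with varints it has to decode every preceding field to find out where the wanted one starts.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: offset-based access versus a sequential varint scan.
final class FieldAccessSketch {
    // Fixed-width layout: every field occupies exactly 4 bytes.
    static ByteBuffer encodeFixed(int... fields) {
        ByteBuffer buf = ByteBuffer.allocate(fields.length * Integer.BYTES);
        for (int f : fields)
            buf.putInt(f);
        buf.flip();
        return buf;
    }

    // Direct jump to the field: no other field is touched.
    static int readFixed(ByteBuffer buf, int fieldIndex) {
        return buf.getInt(fieldIndex * Integer.BYTES);
    }

    // Varint layout: each field takes 1 to 5 bytes depending on its value.
    static ByteBuffer encodeVarint(int... fields) {
        ByteBuffer buf = ByteBuffer.allocate(fields.length * 5);
        for (int f : fields) {
            int v = f;
            while ((v & ~0x7F) != 0) {
                buf.put((byte) ((v & 0x7F) | 0x80));
                v >>>= 7;
            }
            buf.put((byte) v);
        }
        buf.flip();
        return buf;
    }

    // The start of field N is unknown until fields 0..N-1 have been decoded.
    static int readVarint(ByteBuffer buf, int fieldIndex) {
        int pos = 0;
        for (int i = 0; ; i++) {
            int value = 0;
            int shift = 0;
            byte b;
            do {
                b = buf.get(pos++);
                value |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            if (i == fieldIndex)
                return value;
        }
    }

    public static void main(String[] args) {
        ByteBuffer fixed = encodeFixed(1, 300, 70000);
        ByteBuffer varint = encodeVarint(1, 300, 70000);
        System.out.println("fixed bytes: " + fixed.remaining() + ", third field: " + readFixed(fixed, 2));
        System.out.println("varint bytes: " + varint.remaining() + ", third field: " + readVarint(varint, 2));
    }
}
```

The varint buffer is smaller, but reading its third field costs a scan over the first two, which is the cost a lazy or forwarding reader would pay.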
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241456#comment-14241456 ] Benedict commented on CASSANDRA-4139: - We aren't bandwidth constrained for any workloads I'm aware of, so what are we hoping to achieve here? We already apply compression to the stream, so this will likely only help bandwidth consumption for individual small payloads where compression cannot be expected to yield much. In such scenarios bandwidth is especially unlikely to be a constraint.
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241644#comment-14241644 ] Ariel Weisberg commented on CASSANDRA-4139: --- Is bandwidth a constraint for WAN replication? In practice is the default for messaging to have compression on? What are people doing in the wild? I could imagine varint encoding being a win for Cells where the names and values are integers and queries are bulk loading or selecting ranges. At the storage level it seems like the kind of thing that could beat general purpose compression if you know what data type you are dealing with and have a lot of 0 padded values. I have heard talk about using a column store and run length encoding approach for storage, which makes it seem like varint encoding would not be the tool of choice for storage either. The code changes don't look bad. It's mostly swapping types for streams and changes to calculating serialized size so that it is aware of the impact of variable length encoded integers. It could save bandwidth, but it could also be slower since you spend more cycles calculating serialized size and encoding/decoding integers. If you end up using compression in bandwidth sensitive scenarios you may not win much. Since the data going in/out of the database is not itself varint encoded, you only save a meaningful proportion of space when the operations going in/out are small. The flip side is that you can't do that many small ops anyway, so you aren't bandwidth constrained.
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241790#comment-14241790 ] Jonathan Ellis commented on CASSANDRA-4139: --- bq. Is bandwidth a constraint for WAN replication? In practice is the default for messaging to have compression on? Often, yes. internode_compression has defaulted to all for a while now. Most people probably leave it at that; the rest change it to dc.
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453136#comment-13453136 ] Jonathan Ellis commented on CASSANDRA-4139: --- (Vijay reports that he still has problems getting my branch to work over SSL.) Add varint encoding to Messaging service Key: CASSANDRA-4139 URL: https://issues.apache.org/jira/browse/CASSANDRA-4139 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Vijay Assignee: Vijay Fix For: 1.3 Attachments: 0001-CASSANDRA-4139-v1.patch, 0001-CASSANDRA-4139-v2.patch, 0001-CASSANDRA-4139-v4.patch, 0002-add-bytes-written-metric.patch, 4139-Test.rtf, ASF.LICENSE.NOT.GRANTED--0001-CASSANDRA-4139-v3.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410003#comment-13410003 ] Jonathan Ellis commented on CASSANDRA-4139: --- Rebased and added some modifications at https://github.com/jbellis/cassandra/tree/4139-5. Untested, but hopefully shows a reasonable approach. Add varint encoding to Messaging service Key: CASSANDRA-4139 URL: https://issues.apache.org/jira/browse/CASSANDRA-4139 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Vijay Assignee: Vijay Fix For: 1.2 Attachments: 0001-CASSANDRA-4139-v1.patch, 0001-CASSANDRA-4139-v2.patch, 0001-CASSANDRA-4139-v3.patch, 0001-CASSANDRA-4139-v4.patch, 0002-add-bytes-written-metric.patch, 4139-Test.rtf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280580#comment-13280580 ] Vijay commented on CASSANDRA-4139: --
* Tried the implementation. We cannot have it at the socket level because of backward compatibility; we should wrap the connection only if the version is 1.2 (which will change over the period of a connection).
* One thing we can do is cache the wrapper object, but the wrapper is lightweight so it shouldn't be a lot of overhead.
* In addition to this I noticed a weird problem (the connection disconnects while reading, which is not the case when we are not caching) which somehow occurs if I cache the objects while communicating via SSL. I pushed the code to https://github.com/Vijay2win/cassandra/commit/ebf54b3df0419d6a4305aa4b8813e351d4ba7188#L1R176; to reproduce, change FBUtilities.getDataOutput to getCachedDataOut.
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280588#comment-13280588 ] Jonathan Ellis commented on CASSANDRA-4139: --- bq. which will change over the period of a connection How's that?
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280607#comment-13280607 ] Vijay commented on CASSANDRA-4139: -- Trunk: we start with the latest version (OTCPC), and the lower versions in ITCPC will start ignoring messages. Once we detect the right version, we change the version of the messages sent.
OutboundTcpConnection
{code}
public void write(MessageOut<?> message, String id, DataOutputStream out) throws IOException
{
    write(message, id, out, Gossiper.instance.getVersion(poolReference.endPoint()));
}
{code}
IncomingTcpConnection
{code}
MessagingService.validateMagic(input.readInt());
header = input.readInt();
assert isStream == (MessagingService.getBits(header, 3, 1) == 1) : "Connections cannot change type: " + isStream;
version = MessagingService.getBits(header, 15, 8);
logger.trace("Version is now {}", version);
receiveMessage(input, version);
{code}
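For readers unfamiliar with the header word read above, the two getBits calls suggest a layout where bit 3 is the stream flag and bits 8 to 15 carry the version. The Java sketch below is a hypothetical reconstruction from those two calls only: the packing method and the exact getBits semantics (start is the highest bit of the field, count is its width) are assumptions, not a copy of MessagingService.

```java
// Hypothetical sketch of the header layout implied by getBits(header, 3, 1) and getBits(header, 15, 8).
final class HeaderBitsSketch {
    // Extracts `count` bits whose most significant bit sits at index `start` (bit 0 = least significant).
    static int getBits(int packed, int start, int count) {
        return (packed >>> (start + 1 - count)) & ~(-1 << count);
    }

    // Packs the stream flag into bit 3 and the version into bits 8..15; other bits are left at zero.
    static int packHeader(boolean isStream, int version) {
        int header = 0;
        if (isStream)
            header |= 1 << 3;
        header |= (version & 0xFF) << 8;
        return header;
    }

    public static void main(String[] args) {
        int header = packHeader(true, 6);
        System.out.println("stream flag: " + getBits(header, 3, 1));
        System.out.println("version: " + getBits(header, 15, 8));
    }
}
```

Because the version travels in this fixed-position field, a receiver can learn the sender's version from the header before deciding how to decode the rest of the message.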
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280671#comment-13280671 ] Jonathan Ellis commented on CASSANDRA-4139: --- Why not just reconnect on version change? (Only new-version nodes need to worry about this, since old-version nodes always send the same version.)
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280677#comment-13280677 ] Vijay commented on CASSANDRA-4139: -- Sure we can. (Even better, we can let the new version disconnect the old connection; that way we don't need to do a ConcurrentMap.get for every message.) But we still have the weird problem mentioned above :(
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276224#comment-13276224 ] Jonathan Ellis commented on CASSANDRA-4139: --- Sorry, didn't think of this the first time around... What if we do the EncodedDataOutput wrapping in OutboundTcpConnection? That way we wouldn't need to re-wrap for every object we serialize. And the encoding does feel more like a connection-level thing than object-level.
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276230#comment-13276230 ] Vijay commented on CASSANDRA-4139: -- Hi Jonathan, the problem is the backward compatibility part of it... We do out.writeInt(header), which IncomingTcpConnection reads back (header = input.readInt()); if the header were written through the encoding wrapper, older versions would read a corrupted header... The other option is to write a constant 4 bytes just for the header. If that sounds reasonable, I can rewrite it?
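The "constant 4 bytes just for the header" option can be sketched as follows. This is a hypothetical Java illustration, not the patch attached to the ticket: the class name, the message shape (a header followed by a few long values), and the base-128 varint encoding are all assumptions. The header is written with a plain writeInt so that any reader, old or new, can still recover it with readInt, and only the payload after it is variable-length encoded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: fixed-width header, varint-encoded payload.
final class FixedHeaderSketch {
    static byte[] writeMessage(int header, long... payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(header);                 // always exactly 4 bytes, readable by any version
            for (long value : payload)
                writeUnsignedVarLong(value, out); // only the body uses the variable-length form
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // What a reader that knows nothing about varints would do with the first 4 bytes.
    static int readHeader(byte[] message) {
        try {
            return new DataInputStream(new ByteArrayInputStream(message)).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumed base-128 encoding: 7 payload bits per byte, high bit set while more bytes follow.
    private static void writeUnsignedVarLong(long value, DataOutputStream out) throws IOException {
        while ((value & ~0x7FL) != 0) {
            out.writeByte((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.writeByte((int) value);
    }

    public static void main(String[] args) {
        byte[] message = writeMessage(0x0600, 1L, 300L);
        System.out.println("message bytes: " + message.length + ", header: " + readHeader(message));
    }
}
```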
[jira] [Commented] (CASSANDRA-4139) Add varint encoding to Messaging service
[ https://issues.apache.org/jira/browse/CASSANDRA-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270862#comment-13270862 ] Jonathan Ellis commented on CASSANDRA-4139: --- Can you rebase? Finally got CASSANDRA-3617 committed, which conflicts. (Probably useful: I added TypeSizes.sizeof(String) to supplement the raw encodedUTF8Length.)