[ https://issues.apache.org/jira/browse/CASSANDRA-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903200#comment-14903200 ]
Benedict commented on CASSANDRA-10378:
--------------------------------------

I would be in favour of that, as we could at the same time encode the _prior_ row's size as well, and this would permit us to scan both forwards and backwards.

> Make skipping more efficient
> ----------------------------
>
>                 Key: CASSANDRA-10378
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10378
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>             Fix For: 3.x
>
>
> Following on from the impact of CASSANDRA-10322, we can improve the
> efficiency of our calls to skipping methods. CASSANDRA-10326 is showing our
> performance to be in-and-around the same ballpark except for seeks into the
> middle of a large partition, which suggests (possibly) that the higher
> density of data we're storing may simply be resulting in a more significant
> CPU burden as we have more data to skip over (and since CASSANDRA-10322
> improves performance here really dramatically, further improvements are
> likely to be of similar benefit).
> I propose doing our best to flatten the skipping of macro data items into as
> few skip invocations as necessary. One way of doing this would be to
> introduce a special {{skipUnsignedVInts(int)}} method, that can efficiently
> skip a number of unsigned vints. Almost the entire body of a cell and row
> consists of vints now, each data component with its own special {{skipX}}
> method that invokes {{readUnsignedVint}}. This would permit more efficient
> despatch.
> We could also potentially avoid the construction of a new {{Columns}}
> instance for each row skip, since all we need is an iterator over the
> columns, and share the temporary space used for storing them, which should
> further reduce the GC burden for skipping many rows.
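
For illustration only, the {{skipUnsignedVInts(int)}} idea from the description could look roughly like the sketch below. It is not the patch for this ticket. It assumes a vint encoding in which the number of leading 1 bits in the first byte gives the number of bytes that follow, which is my understanding of the 3.x format but should be checked against the real coding class. The class name {{VIntSkipSketch}}, the helper {{extraBytes}}, and the use of a plain {{ByteBuffer}} in place of Cassandra's input streams are all hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not Cassandra code: skips several unsigned vints in one call.
final class VIntSkipSketch {

    // ASSUMPTION: the count of leading 1 bits in the first byte equals the
    // number of extra bytes that follow it (0 to 8).
    static int extraBytes(byte firstByte) {
        // Invert the low 8 bits, then count leading zeros within that byte.
        return Integer.numberOfLeadingZeros(~firstByte & 0xFF) - 24;
    }

    // Advances the buffer position past `count` unsigned vints without decoding
    // any values; only the first byte of each vint is inspected.
    static void skipUnsignedVInts(ByteBuffer in, int count) {
        int pos = in.position();
        for (int i = 0; i < count; i++) {
            pos += 1 + extraBytes(in.get(pos)); // absolute get, position untouched
        }
        in.position(pos); // single position update for the whole run
    }

    public static void main(String[] args) {
        // Three vints of 1, 2 and 3 bytes, followed by one trailing byte.
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {
            0x05, (byte) 0x80, 0x01, (byte) 0xC0, 0x00, 0x01, 0x07 });
        skipUnsignedVInts(buf, 3);
        System.out.println(buf.position()); // prints 6
    }
}
```

The point of the sketch is that a run of vints is skipped with one method call and one position update, rather than one {{skipX}} call per component, each of which decodes a value it then discards.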