[ 
https://issues.apache.org/jira/browse/CASSANDRA-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629939#comment-14629939
 ] 

Sylvain Lebresne commented on CASSANDRA-6492:
---------------------------------------------

bq. If we translate a byte-based page size into a row-based one using internal 
metrics, we lose most of those advantages.

I don't understand, how is that different from your "Perhaps a good first step 
is to add support for automatic page size selection"? What did you have in mind 
for that? Because the only idea I had to do that from the internal metrics 
would be to use the metrics to get an estimated average row size, pick some 
presumably hard-coded byte-size target for a page, and compute the actual page 
size in rows from that. In which case, I'm saying that instead of hard-coding 
that target, and since we'll need a modification to the protocol anyway, let's 
allow the user to provide that target. It's more flexible than just having the 
options of "a page size in rows" or "some default".
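
To make that concrete, the computation could look roughly like the sketch 
below. The function name, the fallback value and the upper cap are all made up 
for illustration; this is not actual Cassandra code, and the real metric would 
come from the sstable stats.

```python
def page_size_in_rows(target_bytes, avg_row_bytes,
                      fallback_rows=100, max_rows=10_000):
    """Translate a page-size target in bytes into a page size in rows.

    target_bytes  -- desired page size in bytes (user-provided or a default)
    avg_row_bytes -- estimated average row size from internal metrics
    fallback_rows -- hypothetical row count used when no estimate is available
    max_rows      -- hypothetical cap protecting against tiny row estimates
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    if avg_row_bytes <= 0:
        # No usable estimate (e.g. nothing flushed yet): use the fallback.
        return fallback_rows
    rows = int(target_bytes // avg_row_bytes)
    # Always return at least one row, and never more than the cap.
    return max(1, min(rows, max_rows))


# Example: a 64 KiB target with rows estimated at 512 bytes each.
print(page_size_in_rows(64 * 1024, 512))
```

Since the row size is only an estimate, the resulting page can still come out 
bigger or smaller than the byte target.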

Or to put it another way, having the server pick a default is not the problem 
we're trying to fix. The problem we're trying to fix is that to pick a proper 
page size, you currently have to guess-estimate the average size of your rows, 
but we can do a better guess-estimation server side and that's what we should 
provide here. Of course it's still imperfect, but I think we're in agreement 
that the no-guess-estimate solution is a lot more involved.

And one of the bonuses of directly modifying the protocol to allow a page size 
target in bytes (rather than only providing a default mode with a hard-coded 
target server side) is that once we do implement the more involved 
change-the-internals solution, we'll have no additional user-visible change to 
make; things will just get auto-magically better and safer.

> Have server pick query page size by default
> -------------------------------------------
>
>                 Key: CASSANDRA-6492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6492
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API
>            Reporter: Jonathan Ellis
>            Assignee: Benjamin Lerer
>            Priority: Minor
>
> We're almost always going to do a better job picking a page size based on 
> sstable stats than users will by guesstimating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
