Given what was said, I propose rephrasing this functionality to limit the
memory used to execute a query. We will not expose the page size measured
in bytes to the client. Instead, an upper limit will be a guardrail so that
we won't fetch more data.
Aggregation query with grouping is a special
>
> So my other question - for aggregation with the "group by" clause, we
> return an aggregated row which is computed from a group of rows - with my
> current implementation, it is approximated by counting the size of the
> largest row in that group - I think it is the safest and simplest
>
Josh, that answers my question exactly; thank you.
I will not implement limiting the result set in CQL (that is, by LIMIT
clause) and stay with just paging. Whether the page size is defined in
bytes or rows can be determined by a flag - there are many unused bits for
that.
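
To illustrate the idea above, here is a minimal sketch of a pager whose page size is read as either rows or bytes depending on a flag bit. It is only an assumption of how this might look: the flag value `PAGE_SIZE_IN_BYTES`, the `Page` type, and the `next_page` helper are hypothetical and are not taken from the native protocol spec or from the Cassandra code base. Rows are modelled as plain `bytes` values.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical flag bit, chosen for illustration only (not from the protocol spec).
PAGE_SIZE_IN_BYTES = 0x01


@dataclass
class Page:
    rows: List[bytes]
    has_more_pages: bool


def next_page(rows: List[bytes], start: int, page_size: int, flags: int) -> Tuple[Page, int]:
    """Return the next page and the index at which the following page starts."""
    in_bytes = bool(flags & PAGE_SIZE_IN_BYTES)
    out: List[bytes] = []
    used = 0
    i = start
    while i < len(rows):
        # Each row costs its serialized size in byte mode, or 1 in row mode.
        cost = len(rows[i]) if in_bytes else 1
        # Always emit at least one row so that paging makes progress,
        # even when a single row is larger than the page size.
        if out and used + cost > page_size:
            break
        out.append(rows[i])
        used += cost
        i += 1
    return Page(out, i < len(rows)), i
```

In this sketch a page may come back short and still report `has_more_pages`, so the client is expected to keep paging until the flag is false.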
So my other question -
> As long as it is valid in the paging protocol to return a short page, but
> still say “there are more pages”, I think that is fine to do that.
Thankfully the v3-v5 specs all make it clear that clients need to respect what
the server has to say about there being more pages:
As long as it is valid in the paging protocol to return a short page, but
still say “there are more pages”, I think that is fine to do that. For an
actual LIMIT that is part of the user query, I think the server must always
have returned all data that fits into the LIMIT when all pages have been
Yeah, my bad. I have paging on the brain. Seriously.
I can't think of a use-case in which a LIMIT based on # bytes makes sense from
a user perspective.
On Mon, Jun 12, 2023, at 1:35 PM, Jeff Jirsa wrote:
>
>
> On Mon, Jun 12, 2023 at 9:50 AM Benjamin Lerer wrote:
>>> If you have rows that
On Mon, Jun 12, 2023 at 9:50 AM Benjamin Lerer wrote:
> If you have rows that vary significantly in their size, your latencies
>> could end up being pretty unpredictable using a LIMIT BY . Being
>> able to specify a limit by bytes at the driver / API level would allow app
>> devs to get more
>
> If you have rows that vary significantly in their size, your latencies
> could end up being pretty unpredictable using a LIMIT BY . Being
> able to specify a limit by bytes at the driver / API level would allow app
> devs to get more deterministic results out of their interaction w/the DB if
>
> I do not have in mind a scenario where it could be useful to specify a LIMIT
> in bytes. The LIMIT clause is usually used when you know how many rows you
> wish to display or use. Unless somebody has a useful scenario in mind I do
> not think that there is a need for that feature.
If you have
Yes, LIMIT BY provided by the user in CQL does not make much sense
to me either
Mon, 12 Jun 2023 at 11:20 Benedict wrote:
> I agree that this is more suitable as a paging option, and not as a CQL
> LIMIT option.
>
> If it were to be a CQL LIMIT option though, then it should be accurate
>
Limiting the amount of returned data in bytes in addition to the row limit
could be helpful when applied transparently by the server as a kind of
guardrail. The server could fail the query if it exceeds some
administratively imposed limit on the configuration level, WDYT?
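
A minimal sketch of that guardrail idea follows, assuming the limit is a single byte count set by an operator. The names `QuerySizeLimitExceeded`, `collect_with_guardrail`, and `max_result_bytes` are hypothetical and do not correspond to any existing Cassandra setting or API; rows are modelled as plain `bytes` values.

```python
from typing import Iterable, List


class QuerySizeLimitExceeded(Exception):
    """Raised when a query result grows past the configured byte limit."""


def collect_with_guardrail(rows: Iterable[bytes], max_result_bytes: int) -> List[bytes]:
    """Collect rows, failing the whole query once the byte limit is exceeded."""
    total = 0
    result: List[bytes] = []
    for row in rows:
        total += len(row)
        if total > max_result_bytes:
            # Fail the query instead of silently truncating the result.
            raise QuerySizeLimitExceeded(
                f"result exceeds {max_result_bytes} bytes"
            )
        result.append(row)
    return result
```

Failing the query, as opposed to truncating it, keeps the result accurate: the client either gets everything or gets an error.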
Mon, 12 Jun 2023 at
I agree that this is more suitable as a paging option, and not as a CQL LIMIT
option. If it were to be a CQL LIMIT option though, then it should be accurate
regarding result set IMO; there shouldn’t be any further results that could
have been returned within the LIMIT.
On 12 Jun 2023, at 10:16,
Thanks Jacek for raising that discussion.
I do not have in mind a scenario where it could be useful to specify a
LIMIT in bytes. The LIMIT clause is usually used when you know how many
rows you wish to display or use. Unless somebody has a useful scenario in
mind I do not think that there is a
Hi,
I was working on limiting query results by their size expressed in bytes,
and some questions arose that I'd like to bring to the mailing list.
The semantics of queries (without aggregation) - data limits are applied on
the raw data returned from replicas - while it works fine for the row