I would also say, if we add a feature like...

gfsh> set APP_FETCH_SIZE=1000

That this only applies to my session, i.e. no other tooling is affected nor
is the cluster affected globally.
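
For example, a sketch of how that could look (the variable name and syntax are only the proposal above, not existing gfsh behavior):

gfsh> set APP_FETCH_SIZE=1000                  // applies only to this gfsh session
gfsh> query --query="SELECT * FROM /Region"    // this session now fetches at most 1000 rows
// other gfsh sessions, Pulse, and cache client applications keep their default behavior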



On Wed, Jul 12, 2017 at 11:14 AM, John Blum <[email protected]> wrote:

> Agreed!
>
> What I explained and meant before was... if a user wants to "limit" the
> results of the query, then they should do so "explicitly" by using the
> "LIMIT" OQL keyword within the query itself.  This is not difficult to do...
>
> gfsh> query --query="SELECT * FROM /Region WHERE ... LIMIT 1000"
>
>
> There should not be some superficial *Gfsh* query command option, like '
> --limit' or a System property (ugh! We really need to get away from this
> System property nonsense).
>
> Imagine for a moment the user wants to run...
>
> SELECT count(*) FROM /Region
>
> And we impose a default LIMIT of 1000, or 100.  Then what?
>
> No.  It is simple/common enough in any query language (SQL and the like) to
> limit the results of a query, if that is what the user wants, without adding
> arbitrary options to the `query` command in *Gfsh*. Besides, there is
> nothing preventing a user from circumventing the arbitrary/default limit
> anyway, by saying...
>
> SELECT * FROM /Region WHERE ... LIMIT 2147483647
>
> or...
>
> gfsh> query --query=".." --limit=2147483647
>
> I believe our users are smart enough, and conscious of the fact, that
> 'SELECT * FROM /Region' (without a predicate) is not a smart query.
>
> However, I do think there is value in not streaming back the entire result
> set to a client "tool" (e.g. *Gfsh*, *Pulse*, etc).
>
> Actual GemFire cache client applications should not be affected by anything a
> tool does unless it is applied to the system itself as a whole (like
> Cluster Config).
>
> $0.02
>
> -j
>
>
>
> On Wed, Jul 12, 2017 at 10:46 AM, Michael Stolz <[email protected]> wrote:
>
>> I'm fine with imposing limits on queries from within our own tooling, but
>> we cannot impose arbitrary limits on queries that are performed by
>> application code.
>>
>> That would be a silent breaking change to existing behavior at any
>> customer who has large queries. There is no way to know by examining code
>> or queries if the query is supposed to return 10,000 rows, so only by
>> testing every query they have could they determine if the imposed limit
>> breaks the intent of the query.
>>
>> Silent breaking changes to public APIs are not acceptable.
>>
>> --
>> Mike Stolz
>> Principal Engineer, GemFire Product Manager
>> Mobile: +1-631-835-4771
>>
>> On Wed, Jul 12, 2017 at 1:29 PM, [email protected] <[email protected]>
>> wrote:
>>
>>> We would like to avoid letting a user accidentally issue a query that
>>> would yield a large result set, even if they are dumping the result into a
>>> file, for performance reasons. If they want a large result set sent back by
>>> gfsh, they have to do so consciously by adding a large limit to the query
>>> themselves.
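>>>
>>> For example (illustrative only, assuming the --file option from the proposal below):
>>>
>>> gfsh> query --query="select * from /A limit 50000" --file="dump.txt"  // a large result set, requested explicitly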
>>>
>>>
>>>
>>> -------- Original Message --------
>>> Subject: Re: refactor query command
>>> From: Swapnil Bawaskar
>>> To: [email protected]
>>> CC:
>>>
>>>
>>> +1
>>> One suggestion I would like to make is that if the user specifies that
>>> the query results should go to a file, we should not apply the limit clause
>>> on the server.
>>>
>>> On Tue, Jul 11, 2017 at 5:19 PM Jinmei Liao <[email protected]> wrote:
>>>
>>>> Basically, our reasoning is that client-side pagination is not as useful
>>>> as people might think: you can either get all the results dumped to the
>>>> console and use the scroll bar to move back and forth, or dump them into a
>>>> file and use whatever piping mechanism your environment supports. The
>>>> server side retrieves everything at once anyway and saves the entire result
>>>> set in the backend, so it's not like we are saving any server-side work here.
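>>>>
>>>> For example (illustrative, using the proposed --file option together with ordinary shell tools):
>>>>
>>>> gfsh> query --query="select * from /A" --file="results.txt"
>>>> $ less results.txt    // page, search, or grep the dumped file outside of gfsh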
>>>>
>>>> On Tue, Jul 11, 2017 at 4:22 PM, Jinmei Liao <[email protected]> wrote:
>>>>
>>>>> Currently, the way client-side pagination is implemented is convoluted
>>>>> and of doubtful usefulness. We are proposing to get rid of the
>>>>> client-side pagination and only have the server side impose a limit (and
>>>>> maybe implement pagination on the server side later on).
>>>>>
>>>>> The new behavior should look like this:
>>>>>
>>>>> gfsh> set APP_FETCH_SIZE 50;
>>>>> gfsh> query --query="select * from /A"  // suppose entry size is 3
>>>>>
>>>>> Result : true
>>>>> Limit  : 50
>>>>> Rows   : 3
>>>>>
>>>>> Result
>>>>> --------
>>>>> value1
>>>>> value2
>>>>> value3
>>>>>
>>>>>
>>>>> gfsh> query --query="select * from /A"  // suppose entry size is 1000
>>>>>
>>>>> Result : true
>>>>> Limit  : 50
>>>>> Rows   : 50
>>>>>
>>>>> Result
>>>>> --------
>>>>> value1
>>>>> ...
>>>>> value50
>>>>>
>>>>> gfsh> query --query="select * from /A limit 100"  // suppose entry size is 1000
>>>>> Result : true
>>>>> Rows   : 100
>>>>>
>>>>> Result
>>>>> --------
>>>>> value1
>>>>> ...
>>>>> value100
>>>>>
>>>>>
>>>>> gfsh> query --query="select * from /A limit 500" --file="output.txt"  // suppose entry size is 1000
>>>>> Result : true
>>>>> Rows   : 500
>>>>>
>>>>> Query results output to /var/tempFolder/output.txt
>>>>>
>>>>> (And the output.txt content to be:
>>>>> Result
>>>>> --------
>>>>> value1
>>>>> ...
>>>>> value500)
>>>>>
>>>>>
>>>>> Bear in mind that we are trying to get rid of client-side pagination,
>>>>> so the --page-size and --limit options would not apply anymore. Only the
>>>>> limit inside the query will be honored by the server side. If the query
>>>>> does not have a limit clause, the server side will impose a default limit
>>>>> of 100. The limit can only be overridden if the user explicitly chooses to
>>>>> do so, so that a user does not accidentally execute a query that returns
>>>>> a large result set.
>>>>>
>>>>> Would this be sufficient to replace the client-side pagination?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jul 11, 2017 at 2:26 PM, Anilkumar Gingade <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> To make it clear, gfsh could print the query it sent to the server in the
>>>>>> result summary (this shows whether it was executed with the limit):
>>>>>> Query     :
>>>>>> Result     : true
>>>>>> startCount : 0
>>>>>> endCount   : 20
>>>>>> Rows       : 1
>>>>>>
>>>>>> -Anil.
>>>>>>
>>>>>>
>>>>>> On Tue, Jul 11, 2017 at 12:48 PM, John Blum <[email protected]> wrote:
>>>>>>
>>>>>>> I think it might be worth differentiating the result "LIMIT" (as
>>>>>>> used in the OQL query statement like so... "SELECT * FROM /Region
>>>>>>> WHERE ... LIMIT 1000")  from what is actually "streamed" back to
>>>>>>> *Gfsh* as the default (e.g. 100).
>>>>>>>
>>>>>>> Clearly sending all the results back is quite expensive depending on
>>>>>>> the number of results/LIMIT specified.  Therefore, whatever "
>>>>>>> --option" is provided to the `query` command is a further reduction
>>>>>>> in what is actually streamed back to the client (e.g. *Gfsh*)
>>>>>>> initially, sort of like paging, therefore ... `gfsh> query
>>>>>>> --query="SELECT * FROM /Region WHERE ... LIMIT 1000" --page-size=25`...
>>>>>>> perhaps?
>>>>>>>
>>>>>>> Therefore, I think having two limits, the OQL LIMIT and a --limit
>>>>>>> option, would just be confusing to users.  LIMIT, like sorting (ORDER BY),
>>>>>>> can only be effectively applied in the OQL, since it determines which
>>>>>>> results the query actually returns.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 11, 2017 at 11:24 AM, Anilkumar Gingade <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> >> Actually a really nice thing would be to put the pagination
>>>>>>>> feature into the OQL engine where it belongs.
>>>>>>>> +1 on this.
>>>>>>>>
>>>>>>>> >> if the query mode is interactive, it sends the first 20
>>>>>>>> (page-size, not configurable) records, and the user uses "n" to go to the
>>>>>>>> next page,
>>>>>>>> >> once it hits the last page (showing all 1000 records or reaching
>>>>>>>> the end of the result set), the command finishes.
>>>>>>>>
>>>>>>>> We could provide one more option for the end user to quit paging and
>>>>>>>> go back to the gfsh prompt for new commands (if it's not there already).
>>>>>>>>
>>>>>>>> I think providing multiple options to view a large result set is a
>>>>>>>> nice feature from a tooling perspective (interactive result batching,
>>>>>>>> dumping into an external file, etc.)
>>>>>>>>
>>>>>>>> >> It’s fairly common in query tooling to be able to set a result
>>>>>>>> set limit.
>>>>>>>> Yes...many of the interactive query tools allow pagination/batching
>>>>>>>> as part of the result display.
>>>>>>>>
>>>>>>>> >> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>> We need to make sure that the user can differentiate the query itself
>>>>>>>> from the options provided by the tool.
>>>>>>>>
>>>>>>>> -Anil.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 11, 2017 at 9:56 AM, William Markito Oliveira <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> The way I read this is: one is limiting on the server side, the
>>>>>>>>> other is limiting on the client side.  IOW, the limit within the query
>>>>>>>>> string acts on the server side.
>>>>>>>>>
>>>>>>>>> On Tue, Jul 11, 2017 at 11:19 AM, Jinmei Liao <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> What if the user wants to do:
>>>>>>>>>> gfsh> query --query='select * from /A limit 10' --limit=100
>>>>>>>>>>
>>>>>>>>>> What's the difference between putting it inside the query string or
>>>>>>>>>> outside? I think eventually it's adding the limit clause to the query
>>>>>>>>>> either way.
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 11, 2017 at 8:41 AM, Anthony Baker <[email protected]
>>>>>>>>>> > wrote:
>>>>>>>>>>
>>>>>>>>>>> It’s fairly common in query tooling to be able to set a result
>>>>>>>>>>> set limit.  I would make this a first class option within gfsh 
>>>>>>>>>>> instead of
>>>>>>>>>>> an environment variable.
>>>>>>>>>>>
>>>>>>>>>>> gfsh> set query-limit=1000
>>>>>>>>>>>
>>>>>>>>>>> or
>>>>>>>>>>>
>>>>>>>>>>> gfsh> query --query='select * from /A' --limit=1000
>>>>>>>>>>>
>>>>>>>>>>> The result set limit is semantically different from specifying a
>>>>>>>>>>> LIMIT on the OQL query itself.
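>>>>>>>>>>>
>>>>>>>>>>> To illustrate the distinction (a sketch, assuming the proposed --limit option existed):
>>>>>>>>>>>
>>>>>>>>>>> gfsh> query --query='select * from /A limit 1000'    // the OQL engine itself returns at most 1000 rows
>>>>>>>>>>> gfsh> query --query='select * from /A' --limit=1000  // the query runs unrestricted; gfsh displays at most 1000 rows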
>>>>>>>>>>>
>>>>>>>>>>> Anthony
>>>>>>>>>>>
>>>>>>>>>>> On Jul 11, 2017, at 7:53 AM, William Markito Oliveira <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> +1 for the combination of 1 and 2 as well.  It would be interesting
>>>>>>>>>>> to explore at least a couple of output formats, CSV being one of the
>>>>>>>>>>> most common for people who want to import or analyze the data using
>>>>>>>>>>> other tools.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 11, 2017 at 8:31 AM, Michael Stolz <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Actually a really nice thing would be to put the pagination
>>>>>>>>>>>> feature into the OQL engine where it belongs. Clients shouldn't 
>>>>>>>>>>>> have to
>>>>>>>>>>>> implement pagination.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Mike Stolz
>>>>>>>>>>>> Principal Engineer, GemFire Product Manager
>>>>>>>>>>>> Mobile: +1-631-835-4771
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jul 11, 2017 at 12:00 AM, Michael William Dodge <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I prefer to redirect output to a file when there is any chance
>>>>>>>>>>>>> that the results might be huge. Thus I find the combination of #1 
>>>>>>>>>>>>> and #2 to
>>>>>>>>>>>>> be sufficient for me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sarge
>>>>>>>>>>>>>
>>>>>>>>>>>>> > On 10 Jul, 2017, at 17:13, Jinmei Liao <[email protected]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Hi, all gfsh-users,
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > In our refactor week, we are trying to refactor how the
>>>>>>>>>>>>> multi-step command is implemented. The current implementation is hard to
>>>>>>>>>>>>> understand to begin with, breaks OO design principles in multiple ways,
>>>>>>>>>>>>> and is not thread-safe. This is an internal command type, and only our
>>>>>>>>>>>>> "query" command uses it.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > This is how our current "query" command works:
>>>>>>>>>>>>> > 1) user issues a "query --query='select * from /A'" command,
>>>>>>>>>>>>> > 2) server retrieves the first 1000 (fetch-size, not
>>>>>>>>>>>>> configurable) rows,
>>>>>>>>>>>>> > 3) if the query mode is NOT interactive, it sends back all
>>>>>>>>>>>>> the results at once.
>>>>>>>>>>>>> > 4) if the query mode is interactive, it sends the first 20
>>>>>>>>>>>>> (page-size, not configurable) records, and the user uses "n" to go to
>>>>>>>>>>>>> the next page; once it hits the last page (showing all 1000 records or
>>>>>>>>>>>>> reaching the end of the result set), the command finishes.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > We would like to ask how useful this interactive feature is.
>>>>>>>>>>>>> Is it critical for you? Would the following simplification be 
>>>>>>>>>>>>> sufficient?
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > 1) The query command always returns the entire result at once, up
>>>>>>>>>>>>> to the fetch size. We can make the fetch size configurable through an
>>>>>>>>>>>>> environment variable, defaulting to 100, and you can also override it in
>>>>>>>>>>>>> an individual query by adding a limit clause, e.g. "query
>>>>>>>>>>>>> --query='select * from /A limit 10'".
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > 2) Provide an option for you to specify a file where we can
>>>>>>>>>>>>> dump all the query results, and you can use shell pagination to
>>>>>>>>>>>>> list the content of the file.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Please let us know your thoughts/comments. Thanks!
>>>>>>>>>>>>> >
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > --
>>>>>>>>>>>>> > Cheers
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Jinmei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> ~/William
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Cheers
>>>>>>>>>>
>>>>>>>>>> Jinmei
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ~/William
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -John
>>>>>>> john.blum10101 (skype)
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cheers
>>>>>
>>>>> Jinmei
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Cheers
>>>>
>>>> Jinmei
>>>>
>>>
>>
>
>
> --
> -John
> john.blum10101 (skype)
>



-- 
-John
john.blum10101 (skype)
