Re: [U2] Unidata index and short-circuit evaluation

Wjhonson Fri, 08 Mar 2013 14:48:43 -0800

I didn't miss it.
The point of the request, was from the beginning to the ending.
Of course the first *portion* will be quick and use few disk reads.
I was discussing the full example.


 

 

 

-----Original Message-----
From: Wols Lists <antli...@youngman.org.uk>
To: u2-users <u2-users@listserver.u2ug.org>
Sent: Fri, Mar 8, 2013 2:43 pm
Subject: Re: [U2] Unidata index and short-circuit evaluation


On 08/03/13 22:07, Wjhonson wrote:
> If your file is small enough, and your system idle enough that the file 
remains *in memory* for all possible scenarios below, than you may not notice 
speed issues.
> 
> However, the monster in the kitchen, is the number of DISK READS you are 
doing.  If your prior reads get cycled out before they are read again, then you 
should run a single combined select which will do all accesses at the same 
instant.
> 
You missed the fact that the first select is based on an index. That
should not go anywhere near the data anyway. So doing it before or at
the same time as the other selects is irrelevant.

But yes. I based my recommendations on minimizing the number of disk
accesses ...

Cheers,
Wol
>  
> 
>  
> 
>  
> 
> -----Original Message-----
> From: Wols Lists <antli...@youngman.org.uk>
> To: u2-users <u2-users@listserver.u2ug.org>
> Sent: Fri, Mar 8, 2013 1:17 pm
> Subject: Re: [U2] Unidata index and short-circuit evaluation
> 
> 
> On 08/03/13 21:03, Jeffrey Butera wrote:
>> While I'm on a roll...  I often look at how to make queries run faster. 
>> In short, we index all the commonly used data fields we can and (of
>> course) it makes world of difference.  However, I have some questions
>> about optimal ways to query data using a mix of indexed data,
>> non-indexed data and i-descriptors.
>>
>> Let's say that in a table FOO I want to do SELECT FOO WITH A='foo' AND
>> B='bar' AND C='bang'
>> where:
>>
>> A = indexed data field
>> B = non-index data field
>> C = I-descriptor (assume it's time-consuming: 2 seconds per record)
>>
>> Which is the optimal way to attack?
>>
>> 1) I could just go for it with:
>>
>> SELECT FOO WITH A='foo' AND B='bar' AND C='bang'
>>
>> 2) I could do the following:
>>
>> SELECT FOO WITH A='foo' AND B='bar'
>> SELECT FOO WITH A='foo' AND B='bar' AND C='bang' REQUIRE.SELECT
>>
>> 3) I could do the following:
>>
>> SELECT FOO WITH A='foo'
>> SELECT FOO WITH A='foo' AND B='bar' AND C='bang' REQUIRE.SELECT
>>
>>
>> I've done benchmarks, but really curious about the innards of Unidata
>> and how/when it does short-circuit evaluation of AND clauses, etc.
>>
>> My gut tells me that (3) should be good because it first weeds out bad
>> records based solely on an indexed data field, thereby reducing the
>> number of records that "C" needs to be evaluated.  Conversely, if it's
>> doing a good job with short-circuit evaluation then (3) and (1)
>> shouldn't be terribly different because failure of A='foo' would imply
>> that C never gets evaluated.
>>
> While I don't know UniData, I'm guessing that it's the same as UniVerse
> in this sense, and imho the second select of both (2) and (3) is broken
> (as in, the first select is a waste of time ...)
> 
> I notice you're using REQUIRE.SELECT. So...
> 
> SELECT FOO WITH A='foo'       ;* will use the index
> SELECT FOO WITH B='bar' REQUIRE.SELECT ;* will now find the records with
> A equal to foo and B equal to bar
> SELECT FOO WITH C='bang' REQUIRE.SELECT ;* now finishes off the select.
> 
> Whether you want to combine the second two selects as
> SELECT FOO WITH B='bar' AND C='bang' REQUIRE.SELECT
> depends on what C does.
> 
> If, in order to evaluate C, you need to read the contents of FOO, then
> you should combine the two. If UniData reads @RECORD regardless of
> whether it's required when evaluating an i-desc, then likewise.
> 
> If, however, evaluating C can be done without reading @RECORD, then you
> may be better doing two selects.
> 
> Whatever happens, there is no point (indeed, it could easily be
> positively harmful) in repeating an earlier select. The REQUIRE.SELECT
> keyword guarantees that if the previous select fails to find any
> records, the subsequent select will also fail rather than starting again
> from scratch.
> 
> Cheers,
> Wol
> _______________________________________________
> U2-Users mailing list
> U2-Users@listserver.u2ug.org
> http://listserver.u2ug.org/mailman/listinfo/u2-users
> 
>  
> _______________________________________________
> U2-Users mailing list
> U2-Users@listserver.u2ug.org
> http://listserver.u2ug.org/mailman/listinfo/u2-users
> 

_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users

 
_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users

Re: [U2] Unidata index and short-circuit evaluation

Reply via email to