John,

Make sure you include testing a single query, if you haven’t already, then see 
what 4D’s query analyzer does with it in the query plan. Your current use of 
two queries may very well be optimal for your data set, since most countries 
don’t span much of the world in terms of longitude. Your example of Russia is 
the broadest case. You may see more benefits for Luxembourg, or even Chile, or 
not. Only testing will tell.

Years ago I worked with Josh Fletcher to understand the different types of 4D 
indexes. The result was a recommendation to leave them as the default B-Tree. 
Any benefit to fiddling with each index was negligible. However, with a 
specialized data set, such as longitude, it might make a bigger difference.

HTH,

Tom Benedict

> On Apr 10, 2020, at 06:26, John J Foster via 4D_Tech <[email protected]> 
> wrote:
> 
> Hi Arnaud,
> 
>> what is the field type? 
> 
> The field type is a real number.
> 
> There are so many similar values in terms of the integer part but many many 
> variations of the decimal part. Currently I am using a B-Tree index.
> 
> I’ve thought about breaking it the two parts but that just adds an extra 
> field and storage and if I did that then maybe a Cluster index would be 
> obvious.
> 
> I'll play with the Query plan and see if it’s helpful. Having almost 5 
> million cities is a good testing ground and a good reference database when I 
> am matching smaller sets of GEO data (currently imported and massaged from 
> another database).
> 
> In this case I am searching by country (or country code) and longitude to 
> find (if found) an Olson TimeZone. The fastest search I've found so far 
> (B-Tree index):
> 
>  // [PE_Geo_Location]lon_dec = 28.6
>  // [PE_Geo_Location]Country_Code = RU
> 
> GEO_GetTimeZone([PE_Geo_Location]Country_Code;[PE_Geo_Location]lon_dec)
> 
> $cntryCode:=$1
> $lon_r:=Int($2)    // 28
> 
> QUERY([City];[City]country code=$cntryCode)                           
> QUERY SELECTION([City];[City]longitude>=$lon_r;*)
> QUERY SELECTION([City]; & ;[City]longitude<=$lon_r)
> ….
> 
> $0:=[City]timezone
> 
> It’s not a “live” database so I don't want to spend too much time with it. 
> But when I need to match data then, of course, I want it to search as fast as 
> is possible.
> 
> Appreciate,
> John…
> 
> 
> 
>>> Le 10 avr. 2020 à 02:03, John J Foster via 4D_Tech <[email protected] 
>>> <mailto:[email protected]>> a écrit :
>>> 
>>> Hi All,
>>> 
>>> I have a GEO database of almost 5 million cities. They include various data 
>>> and latitude and longitude.
>>> 
>>> I need to use the Longitude file in searches. To speed it up I need to 
>>> index. Many of the values are similar like 141.235 or 141 3.23 and so on. 
>>> So for those who might know would a Cluster index be better suited or a 
>>> B-Tree?
>> 
>> I'd try distinct values first, a cluster is supposed to be better when 
>> there's enough repeated values. And most probably I'd try… I didn't 
>> understand from "141.235 or 141 3.23": what is the field type? 
>> You may also consider using another field with "simplified value" from 
>> longitude, depending on what's expected from that search. Or, if the search 
>> is always combined with other fields than longitude, make some tests and 
>> analyse query plan/query path to see if that index is required. 
>> 
>> -- 
>> Arnaud de Montard 
> 
> 

**********************************************************************
4D Internet Users Group (4D iNUG)
Archive:  http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:[email protected]
**********************************************************************

Reply via email to