Re: [sqlite] Search query alternatives.

Michael Falconer Wed, 15 Oct 2014 16:55:23 -0700

Igor,

Nice one. A quick test using 10 lookup trees and the same search criteria:
*Showing rows 0 - 29 (30 total, Query took 0.4838 sec)*


Now with UNION ALL replacing UNION:
*Showing rows 0 - 29 (30 total, Query took 0.2050 sec)*

You weren't kidding about cheaper were you? LOL


On 16 October 2014 10:30, Michael Falconer <[email protected]>
wrote:

> Yes, I agree with the general sentiment. It is not exactly the design I
> would have chosen either, but it is what exists. Design change is occurring
> but for the moment we are stuck with the current schema.
>
> If you cannot change the schemata to be more suitable, then your
>> demonstrated queries are very near as good as it gets. I can think of other
>> ways to achieve the same, but nothing that would be more efficient to a
>> tangible level. Maybe someone else here can come up with something a bit
>> more tight.
>>
> You are preaching to the converted I'm afraid, but I appreciate the points
> you made. The quoted paragraph probably answers most of my questions
> really, except for this from Igor:
>
> Unless you do expect duplicates and need to eliminate them, use UNION ALL
>> - it's much cheaper (this is assuming you insist on keeping multiple
>> tables).
>>
> Cheaper because it drops the operation to remove the duplicate records
> from the result set? I'm assuming. I'll give that a try.
>
> On 16 October 2014 10:05, RSmith <[email protected]> wrote:
>
>>
>> On 2014/10/16 00:05, Michael Falconer wrote:
>>
>>> Hi all,
>>>
>>> first off I must start with an apology. I know I'm sort of doing the
>>> wrong//...
>>>
>>
>> No need to apologise, this flies quite close to the central theme.
>> Whether you are using SQLite or any other SQL RDBMS, this is horrible DB
>> design and it is so for precisely the reasons you are asking the question.
>> Tables should not be dynamic: The simple rule of thumb being - if you
>> cannot tell (or at least accurately predict) before-hand exactly how many
>> tables will be in the final system, then you are doing it wrong.  Why not
>> have one set of tables and in stead of preceding each of them with a
>> <prefix>, simply add a column that can host the <prefix> as a simple value,
>> which will immediately make your life very much easier and get the SQL
>> engine to do the work you are now trying to compensate for in a lop-sided
>> manual kind of way.
>>
>> I can only imagine the amount of code you invested into your system to
>> track and deal with these dynamic sets of tables, so I know changing it
>> will seem like a rather large undertaking, but really it will mostly
>> involve removing loads of code to end up with a small set of simple queries
>> that does all the work for you.
>>
>> The only reason I can imagine this sort of breakdown useful is if your
>> tree tables are all really really huge, like Gigabytes, and so there might
>> be some efficiency to be gained from splitting it up, but I doubt this is
>> the case.
>>
>> Just imagine the ease of those search queries you demonstrated... one
>> query, no UNIONs, a single set of joins and an extra where-clause check...
>> Add to that increased efficiency at finding results (run one query in stead
>> of 50).  Easy-mode. As a proof-of-concept, just make an SQLite DB, one set
>> of tables as per your list, add column(s) for prefixing as needed, import
>> all the tables from some large set of them (specifying additionally the
>> column for the prefix) and ten run search queries on them. I am willing to
>> bet on the fact it will be much quicker in addition to being much simpler.
>>
>> If you cannot change the schemata to be more suitable, then your
>> demonstrated queries are very near as good as it gets. I can think of other
>> ways to achieve the same, but nothing that would be more efficient to a
>> tangible level. Maybe someone else here can come up with something a bit
>> more tight.
>>
>>
>>  thing here as this question is NOT related to sqlite. It is a general SQL
>>> question but I ask it here because I have great respect for the answers
>>> and
>>> discussions I have seen on this forum over many years. I rarely post
>>> myself
>>> as there are always several contributors who beat me to the answer and
>>> often their response is far better than mine would have been. I'm not a
>>> code leper, I don't need the actual SQL just the method really, though a
>>> short code example would be well received for illustration.
>>>
>>> It's about a search performed on multiple tables. However the structure
>>> of
>>> this database is somewhat unorthodox. It contains genealogical data and
>>> this is clustered into trees. Each tree has 8 tables, there can be as
>>> many
>>> as 100 trees but most installations of the associated application
>>> software
>>> contain between 5 - 50 trees. These 8 tables contain a family trees data
>>> and are named:
>>> <prefix>addresses
>>> <prefix>connections
>>> <prefix>events
>>> <prefix>family
>>> <prefix>person
>>> <prefix>repositories
>>> <prefix>sources
>>> <prefix>texts
>>>
>>> The <prefix> changes and is unique for each tree.
>>>
>>> There are extensive search options offered to users as well as simple
>>> name
>>> searching and it is one of the best features of the app. It works pretty
>>> well, so it ain't broke in any sense, we just wonder if there is a better
>>> way to perform this search in SQL. Is there a general technique which is
>>> superior either in speed, efficiency or load bearing contexts?
>>>
>>> I am sure you can see one of the pitfalls here is the exponential growth
>>> of
>>> such a search query as both total number of trees and indeed user search
>>> criteria increase. For each criteria component, and there are quite a
>>> few,
>>> the appropriate tables must be queried and results joined.
>>>
>>> Searches return records of individuals meeting the entered search
>>> criteria
>>> so the query focuses on the person table as it's anchor, performs
>>> searches
>>> on required other tables in the tree and joins the results to the person
>>> data. The results from each tree search are then UNION'ed to provide the
>>> result set. Here is a contrived example of the SQL query code which
>>> should
>>> make things clearer.
>>>
>>> The user enters simple search criteria - any person with a last name
>>> containing 'mac' and a first name containing the character 'a'. This is
>>> the
>>> resulting query (generated by php code), which searches a small 4 family
>>> tree installation.
>>>
>>> (SELECT humo1_person.*, event_kind, event_event, address_place,
>>> address_zip
>>>      FROM humo1_person
>>>      LEFT JOIN humo1_events ON pers_gedcomnumber=event_person_id
>>>      LEFT JOIN humo1_addresses ON pers_gedcomnumber=address_person_id
>>>          WHERE pers_lastname LIKE '%mac%'
>>>            AND (pers_firstname LIKE '%a%'
>>>                OR (event_kind='name' AND event_event LIKE '%a%')
>>>            )
>>>      GROUP BY pers_gedcomnumber
>>> )
>>> UNION
>>>      (SELECT humo2_person.*, event_kind, event_event, address_place,
>>> address_zip
>>>          FROM humo2_person LEFT JOIN humo2_events ON
>>> pers_gedcomnumber=event_person_id
>>>          LEFT JOIN humo2_addresses ON pers_gedcomnumber=address_
>>> person_id
>>>              WHERE pers_lastname LIKE '%mac%'
>>>                AND (pers_firstname LIKE '%a%'
>>>                    OR (event_kind='name' AND event_event LIKE '%a%') )
>>>          GROUP BY pers_gedcomnumber
>>>      )
>>> UNION
>>>      (SELECT humo3_person.*, event_kind, event_event, address_place,
>>> address_zip
>>>          FROM humo3_person LEFT JOIN humo3_events ON
>>> pers_gedcomnumber=event_person_id
>>>          LEFT JOIN humo3_addresses ON pers_gedcomnumber=address_
>>> person_id
>>>              WHERE pers_lastname LIKE '%mac%'
>>>                AND (pers_firstname LIKE '%a%'
>>>                    OR (event_kind='name' AND event_event LIKE '%a%')
>>>                )
>>>          GROUP BY pers_gedcomnumber
>>>      )
>>> UNION
>>>      (SELECT humo4_person.*, event_kind, event_event, address_place,
>>> address_zip
>>>          FROM humo4_person
>>>          LEFT JOIN humo4_events ON pers_gedcomnumber=event_person_id
>>>          LEFT JOIN humo4_addresses ON pers_gedcomnumber=address_
>>> person_id
>>>              WHERE pers_lastname LIKE '%mac%'
>>>                AND (pers_firstname LIKE '%a%'
>>>                    OR (event_kind='name' AND event_event LIKE '%a%')
>>>                )
>>>          GROUP BY pers_gedcomnumber
>>>      )
>>>      ORDER BY pers_lastname ASC , pers_firstname ASC LIMIT 0,30
>>>
>>> Any thoughts? Suggestions? Missiles? Good approach, bad approach, or
>>> completely off the grid? I do use sqlite quite a bit, but not on this
>>> particular project.
>>>
>>>
>>>
>> _______________________________________________
>> sqlite-users mailing list
>> [email protected]
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
>
> --
> Regards,
>      Michael.j.Falconer.
>



-- 
Regards,
     Michael.j.Falconer.
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Search query alternatives.

Reply via email to