Re: [sqlite] Search query alternatives.

Michael Falconer Wed, 15 Oct 2014 16:30:35 -0700

Yes, I agree with the general sentiment. It is not exactly the design I
would have chosen either, but it is what exists. Design change is occurring
but for the moment we are stuck with the current schema.


If you cannot change the schemata to be more suitable, then your
> demonstrated queries are very near as good as it gets. I can think of other
> ways to achieve the same, but nothing that would be more efficient to a
> tangible level. Maybe someone else here can come up with something a bit
> more tight.
>
You are preaching to the converted I'm afraid, but I appreciate the points
you made. The quoted paragraph probably answers most of my questions
really, except for this from Igor:

Unless you do expect duplicates and need to eliminate them, use UNION ALL -
> it's much cheaper (this is assuming you insist on keeping multiple tables).
>
Cheaper because it drops the operation to remove the duplicate records from
the result set? I'm assuming. I'll give that a try.

On 16 October 2014 10:05, RSmith <rsm...@rsweb.co.za> wrote:

>
> On 2014/10/16 00:05, Michael Falconer wrote:
>
>> Hi all,
>>
>> first off I must start with an apology. I know I'm sort of doing the
>> wrong//...
>>
>
> No need to apologise, this flies quite close to the central theme. Whether
> you are using SQLite or any other SQL RDBMS, this is horrible DB design and
> it is so for precisely the reasons you are asking the question.  Tables
> should not be dynamic: The simple rule of thumb being - if you cannot tell
> (or at least accurately predict) before-hand exactly how many tables will
> be in the final system, then you are doing it wrong.  Why not have one set
> of tables and in stead of preceding each of them with a <prefix>, simply
> add a column that can host the <prefix> as a simple value, which will
> immediately make your life very much easier and get the SQL engine to do
> the work you are now trying to compensate for in a lop-sided manual kind of
> way.
>
> I can only imagine the amount of code you invested into your system to
> track and deal with these dynamic sets of tables, so I know changing it
> will seem like a rather large undertaking, but really it will mostly
> involve removing loads of code to end up with a small set of simple queries
> that does all the work for you.
>
> The only reason I can imagine this sort of breakdown useful is if your
> tree tables are all really really huge, like Gigabytes, and so there might
> be some efficiency to be gained from splitting it up, but I doubt this is
> the case.
>
> Just imagine the ease of those search queries you demonstrated... one
> query, no UNIONs, a single set of joins and an extra where-clause check...
> Add to that increased efficiency at finding results (run one query in stead
> of 50).  Easy-mode. As a proof-of-concept, just make an SQLite DB, one set
> of tables as per your list, add column(s) for prefixing as needed, import
> all the tables from some large set of them (specifying additionally the
> column for the prefix) and ten run search queries on them. I am willing to
> bet on the fact it will be much quicker in addition to being much simpler.
>
> If you cannot change the schemata to be more suitable, then your
> demonstrated queries are very near as good as it gets. I can think of other
> ways to achieve the same, but nothing that would be more efficient to a
> tangible level. Maybe someone else here can come up with something a bit
> more tight.
>
>
>  thing here as this question is NOT related to sqlite. It is a general SQL
>> question but I ask it here because I have great respect for the answers
>> and
>> discussions I have seen on this forum over many years. I rarely post
>> myself
>> as there are always several contributors who beat me to the answer and
>> often their response is far better than mine would have been. I'm not a
>> code leper, I don't need the actual SQL just the method really, though a
>> short code example would be well received for illustration.
>>
>> It's about a search performed on multiple tables. However the structure of
>> this database is somewhat unorthodox. It contains genealogical data and
>> this is clustered into trees. Each tree has 8 tables, there can be as many
>> as 100 trees but most installations of the associated application software
>> contain between 5 - 50 trees. These 8 tables contain a family trees data
>> and are named:
>> <prefix>addresses
>> <prefix>connections
>> <prefix>events
>> <prefix>family
>> <prefix>person
>> <prefix>repositories
>> <prefix>sources
>> <prefix>texts
>>
>> The <prefix> changes and is unique for each tree.
>>
>> There are extensive search options offered to users as well as simple name
>> searching and it is one of the best features of the app. It works pretty
>> well, so it ain't broke in any sense, we just wonder if there is a better
>> way to perform this search in SQL. Is there a general technique which is
>> superior either in speed, efficiency or load bearing contexts?
>>
>> I am sure you can see one of the pitfalls here is the exponential growth
>> of
>> such a search query as both total number of trees and indeed user search
>> criteria increase. For each criteria component, and there are quite a few,
>> the appropriate tables must be queried and results joined.
>>
>> Searches return records of individuals meeting the entered search criteria
>> so the query focuses on the person table as it's anchor, performs searches
>> on required other tables in the tree and joins the results to the person
>> data. The results from each tree search are then UNION'ed to provide the
>> result set. Here is a contrived example of the SQL query code which should
>> make things clearer.
>>
>> The user enters simple search criteria - any person with a last name
>> containing 'mac' and a first name containing the character 'a'. This is
>> the
>> resulting query (generated by php code), which searches a small 4 family
>> tree installation.
>>
>> (SELECT humo1_person.*, event_kind, event_event, address_place,
>> address_zip
>>      FROM humo1_person
>>      LEFT JOIN humo1_events ON pers_gedcomnumber=event_person_id
>>      LEFT JOIN humo1_addresses ON pers_gedcomnumber=address_person_id
>>          WHERE pers_lastname LIKE '%mac%'
>>            AND (pers_firstname LIKE '%a%'
>>                OR (event_kind='name' AND event_event LIKE '%a%')
>>            )
>>      GROUP BY pers_gedcomnumber
>> )
>> UNION
>>      (SELECT humo2_person.*, event_kind, event_event, address_place,
>> address_zip
>>          FROM humo2_person LEFT JOIN humo2_events ON
>> pers_gedcomnumber=event_person_id
>>          LEFT JOIN humo2_addresses ON pers_gedcomnumber=address_person_id
>>              WHERE pers_lastname LIKE '%mac%'
>>                AND (pers_firstname LIKE '%a%'
>>                    OR (event_kind='name' AND event_event LIKE '%a%') )
>>          GROUP BY pers_gedcomnumber
>>      )
>> UNION
>>      (SELECT humo3_person.*, event_kind, event_event, address_place,
>> address_zip
>>          FROM humo3_person LEFT JOIN humo3_events ON
>> pers_gedcomnumber=event_person_id
>>          LEFT JOIN humo3_addresses ON pers_gedcomnumber=address_person_id
>>              WHERE pers_lastname LIKE '%mac%'
>>                AND (pers_firstname LIKE '%a%'
>>                    OR (event_kind='name' AND event_event LIKE '%a%')
>>                )
>>          GROUP BY pers_gedcomnumber
>>      )
>> UNION
>>      (SELECT humo4_person.*, event_kind, event_event, address_place,
>> address_zip
>>          FROM humo4_person
>>          LEFT JOIN humo4_events ON pers_gedcomnumber=event_person_id
>>          LEFT JOIN humo4_addresses ON pers_gedcomnumber=address_person_id
>>              WHERE pers_lastname LIKE '%mac%'
>>                AND (pers_firstname LIKE '%a%'
>>                    OR (event_kind='name' AND event_event LIKE '%a%')
>>                )
>>          GROUP BY pers_gedcomnumber
>>      )
>>      ORDER BY pers_lastname ASC , pers_firstname ASC LIMIT 0,30
>>
>> Any thoughts? Suggestions? Missiles? Good approach, bad approach, or
>> completely off the grid? I do use sqlite quite a bit, but not on this
>> particular project.
>>
>>
>>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Regards,
     Michael.j.Falconer.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Search query alternatives.

Reply via email to