Re: [sqlite] Search query alternatives.

2014-10-17 Thread Michael Falconer
I am glad I posted the question.

Yes James, there is little I can disagree with in your excellent summary.
Even the critique of my perhaps poorly framed question is indeed valid. I
take your point regarding spec vs implementation, and in my experience
across different rdbms's I have frequently seen evidence which supports
your assertions regarding db/os system influence on implementation
approaches.

Application code too has it's part to play. How do we plan to access the
data? There's a strong case too, IMHO, to have flexibility in the design,
perhaps leading to differing approaches with the variable types of data we
may be storing.

The original design decisions were made by someone who was, and still is,
essentially a hobby programmer. I don't think he'd ever heard of of Chris
Date or Mr Codd at that time and like all novice application programmers,
he had little understanding about the effect db design could have on his
application and it's source code. His design choices were initially made on
the basis of what he could easily understand and what was (as it appeared
to him then) easy to program with. It is some time ago, and we who have
lived with rdbms's for years get to say, 'that is a horrible design!'. I
think R. Smith hit on a point above, regarding code overhead. Yep, plenty
of that. And so the lesson is learned the hard way for someone who until
recently had viewed normalisation as a way to make coding harder and to
slow down the execution of queries.

So with that perspective you can perhaps come some way to understanding the
why component. On analysis, I agree with suggested design changes at the
higher level. i.e. Dynamic tables are at the root of issues going forward.
They are requiring tedious application code gymnastics, more difficult
query analysis and poorer query performance.  I suppose the upside is that
it will be a challenge to see what improvements can be made, and that is
always fun and games. I kind of like Mr Smith's other suggestion about an
SQLITE testbed or prototype. So easy to work with SQLITE, and probably
perfect for this task. Thanks all for your contributions.

Just FYI James, the application is coded in php and connects to a mysql
database. It can be installed either as a browser based, stand alone or
client server app. It's common implementation is on low end shared hosts,
even free hosting services. So this limits us somewhat to what is commonly
allowed on such platforms. Things like Stored Procedures are unfortunately
outside our scope when it come to design considerations.

Thanks all.


On 18 October 2014 02:24, James K. Lowden  wrote:

> On Thu, 16 Oct 2014 09:05:51 +1100
> Michael Falconer  wrote:
>
> > we just wonder if there is a better way to perform this search in
> > SQL. Is there a general technique which is superior either in speed,
> > efficiency or load bearing contexts?
>
> The simple answer is No, because SQL is a specification, not an
> implementation.  Different systems implement it differently and
> therefore perform differently.  Any "general technique" affecting
> performance belongs to the implementation per se, not the SQL, which is
> a logical construction. SQLite itself has changed its performance
> characteristics over the course of its development.
>
> For that reason, any question of performance has to be answered in
> terms of a particular implementation, even its specific version, and
> the OS and hardware it's running on.
>
> That said, there is reason to suppose that a single-table design would
> be more efficient.  If the queries can be expressed with recursion and
> the indexes lead to efficient searches, the query optimizer has less
> work to do.  It has fewer permutations to consider, and the search is
> apt to touch fewer pages.  The analysis tools of the system you're
> using should be able to confirm or deny that supposition.
>
> I would remind your fellows, though, that efficiency is not all.  The
> utility of a model (that is, the database design) is measured by how
> well, to its purpose, it describes the real world.  Any model that must
> be changed as that reality changes in predictable ways isn't really
> much of a model; it turns the designer into a component of the model.
> By recognizing all trees as one, you generalize your model and make it
> do work you are now doing yourself (manually, or in application
> logic).  By any measure, that makes it a better model.
>
> HTH.
>
> --jkl
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Regards,
 Michael.j.Falconer.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Search query alternatives.

2014-10-17 Thread James K. Lowden
On Thu, 16 Oct 2014 09:05:51 +1100
Michael Falconer  wrote:

> we just wonder if there is a better way to perform this search in
> SQL. Is there a general technique which is superior either in speed,
> efficiency or load bearing contexts?

The simple answer is No, because SQL is a specification, not an
implementation.  Different systems implement it differently and
therefore perform differently.  Any "general technique" affecting
performance belongs to the implementation per se, not the SQL, which is
a logical construction. SQLite itself has changed its performance
characteristics over the course of its development.  

For that reason, any question of performance has to be answered in
terms of a particular implementation, even its specific version, and
the OS and hardware it's running on.  

That said, there is reason to suppose that a single-table design would
be more efficient.  If the queries can be expressed with recursion and
the indexes lead to efficient searches, the query optimizer has less
work to do.  It has fewer permutations to consider, and the search is
apt to touch fewer pages.  The analysis tools of the system you're
using should be able to confirm or deny that supposition.  

I would remind your fellows, though, that efficiency is not all.  The
utility of a model (that is, the database design) is measured by how
well, to its purpose, it describes the real world.  Any model that must
be changed as that reality changes in predictable ways isn't really
much of a model; it turns the designer into a component of the model.
By recognizing all trees as one, you generalize your model and make it
do work you are now doing yourself (manually, or in application
logic).  By any measure, that makes it a better model.  

HTH.  

--jkl
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Search query alternatives.

2014-10-16 Thread GB

Michael,

a Guy named Joe Celko elaborated about trees and hierarchies in SQL a 
few years ago. Have a look here: 
http://www.amazon.com/Hierarchies-Smarties-Edition-Kaufmann-Management/dp/0123877334/ref=dp_ob_title_bk



regards
gerd
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Search query alternatives.

2014-10-16 Thread Michael Falconer
Thanks Eduardo,

a most interesting link.

On 17 October 2014 05:41, Eduardo Morras  wrote:

> On Thu, 16 Oct 2014 09:05:51 +1100
> Michael Falconer  wrote:
>
> > Hi all,
> >
> > first off I must start with an apology. I know I'm sort of doing the
> > wrong thing here as this question is NOT related to sqlite. It is a
> > general SQL question but I ask it here because I have great respect
> > for the answers and discussions I have seen on this forum over many
> > years. I rarely post myself as there are always several contributors
> > who beat me to the answer and often their response is far better than
> > mine would have been. I'm not a code leper, I don't need the actual
> > SQL just the method really, though a short code example would be well
> > received for illustration.
> >
> > Any thoughts? Suggestions? Missiles? Good approach, bad approach, or
> > completely off the grid? I do use sqlite quite a bit, but not on this
> > particular project.
>
> I point you to sqlite closure extension. It may shows you some ideas for
> tree implementation and parent/child relations under sql/sqlite.
>
> http://www.sqlite.org/src/artifact/636024302cde41b2bf0c542f81c40c624cfb7012
> http://www.sqlite.org/src/finfo?name=ext/misc/closure.c
>
> HTH
>
> >
> > --
> > Regards,
> >  Michael.j.Falconer.
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
>
> ---   ---
> Eduardo Morras 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Regards,
 Michael.j.Falconer.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Search query alternatives.

2014-10-16 Thread Eduardo Morras
On Thu, 16 Oct 2014 09:05:51 +1100
Michael Falconer  wrote:

> Hi all,
> 
> first off I must start with an apology. I know I'm sort of doing the
> wrong thing here as this question is NOT related to sqlite. It is a
> general SQL question but I ask it here because I have great respect
> for the answers and discussions I have seen on this forum over many
> years. I rarely post myself as there are always several contributors
> who beat me to the answer and often their response is far better than
> mine would have been. I'm not a code leper, I don't need the actual
> SQL just the method really, though a short code example would be well
> received for illustration.
>
> Any thoughts? Suggestions? Missiles? Good approach, bad approach, or
> completely off the grid? I do use sqlite quite a bit, but not on this
> particular project.

I point you to sqlite closure extension. It may shows you some ideas for tree 
implementation and parent/child relations under sql/sqlite.

http://www.sqlite.org/src/artifact/636024302cde41b2bf0c542f81c40c624cfb7012
http://www.sqlite.org/src/finfo?name=ext/misc/closure.c

HTH
 
> 
> -- 
> Regards,
>  Michael.j.Falconer.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


---   ---
Eduardo Morras 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Search query alternatives.

2014-10-15 Thread Michael Falconer
Thanks Stephen,

good point, I was just after some general results, I do take your point
about caching etc. However it is logical to apply UNION ALL if appropriate
in preference to UNION which in this context is a bit lazy. I'm a bit
annoyed I didn't pick up on it myself, but thankful to Igor for reminding
me of the difference in the two statements.


On 16 October 2014 11:14, Stephen Chrzanowski  wrote:

> Careful with the timing.  You may be looking at OS memory caching the
> result set instead of pulling from the drive.  For best bets, either re-run
> both queries several times, ditch the longest and shortest times, then take
> the mean or average times and do the comparison that way.
>
> On Wed, Oct 15, 2014 at 7:54 PM, Michael Falconer <
> michael.j.falco...@gmail.com> wrote:
>
> > Igor,
> >
> > Nice one. A quick test using 10 lookup trees and the same search
> criteria:
> > *Showing rows 0 - 29 (30 total, Query took 0.4838 sec)*
> >
> > Now with UNION ALL replacing UNION:
> > *Showing rows 0 - 29 (30 total, Query took 0.2050 sec)*
> >
> > You weren't kidding about cheaper were you? LOL
> >
> >
> > On 16 October 2014 10:30, Michael Falconer  >
> > wrote:
> >
> > > Yes, I agree with the general sentiment. It is not exactly the design I
> > > would have chosen either, but it is what exists. Design change is
> > occurring
> > > but for the moment we are stuck with the current schema.
> > >
> > > If you cannot change the schemata to be more suitable, then your
> > >> demonstrated queries are very near as good as it gets. I can think of
> > other
> > >> ways to achieve the same, but nothing that would be more efficient to
> a
> > >> tangible level. Maybe someone else here can come up with something a
> bit
> > >> more tight.
> > >>
> > > You are preaching to the converted I'm afraid, but I appreciate the
> > points
> > > you made. The quoted paragraph probably answers most of my questions
> > > really, except for this from Igor:
> > >
> > > Unless you do expect duplicates and need to eliminate them, use UNION
> ALL
> > >> - it's much cheaper (this is assuming you insist on keeping multiple
> > >> tables).
> > >>
> > > Cheaper because it drops the operation to remove the duplicate records
> > > from the result set? I'm assuming. I'll give that a try.
> > >
> > > On 16 October 2014 10:05, RSmith  wrote:
> > >
> > >>
> > >> On 2014/10/16 00:05, Michael Falconer wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> first off I must start with an apology. I know I'm sort of doing the
> > >>> wrong//...
> > >>>
> > >>
> > >> No need to apologise, this flies quite close to the central theme.
> > >> Whether you are using SQLite or any other SQL RDBMS, this is horrible
> DB
> > >> design and it is so for precisely the reasons you are asking the
> > question.
> > >> Tables should not be dynamic: The simple rule of thumb being - if you
> > >> cannot tell (or at least accurately predict) before-hand exactly how
> > many
> > >> tables will be in the final system, then you are doing it wrong.  Why
> > not
> > >> have one set of tables and in stead of preceding each of them with a
> > >> , simply add a column that can host the  as a simple
> > value,
> > >> which will immediately make your life very much easier and get the SQL
> > >> engine to do the work you are now trying to compensate for in a
> > lop-sided
> > >> manual kind of way.
> > >>
> > >> I can only imagine the amount of code you invested into your system to
> > >> track and deal with these dynamic sets of tables, so I know changing
> it
> > >> will seem like a rather large undertaking, but really it will mostly
> > >> involve removing loads of code to end up with a small set of simple
> > queries
> > >> that does all the work for you.
> > >>
> > >> The only reason I can imagine this sort of breakdown useful is if your
> > >> tree tables are all really really huge, like Gigabytes, and so there
> > might
> > >> be some efficiency to be gained from splitting it up, but I doubt this
> > is
> > >> the case.
> > >>
> > >> Just imagine the ease of those search queries you demonstrated... one
> > >> query, no UNIONs, a single set of joins and an extra where-clause
> > check...
> > >> Add to that increased efficiency at finding results (run one query in
> > stead
> > >> of 50).  Easy-mode. As a proof-of-concept, just make an SQLite DB, one
> > set
> > >> of tables as per your list, add column(s) for prefixing as needed,
> > import
> > >> all the tables from some large set of them (specifying additionally
> the
> > >> column for the prefix) and ten run search queries on them. I am
> willing
> > to
> > >> bet on the fact it will be much quicker in addition to being much
> > simpler.
> > >>
> > >> If you cannot change the schemata to be more suitable, then your
> > >> demonstrated queries are very near as good as it gets. I can think of
> > other
> > >> ways to achieve the same, but nothing that would be 

Re: [sqlite] Search query alternatives.

2014-10-15 Thread Stephen Chrzanowski
Careful with the timing.  You may be looking at OS memory caching the
result set instead of pulling from the drive.  For best bets, either re-run
both queries several times, ditch the longest and shortest times, then take
the mean or average times and do the comparison that way.

On Wed, Oct 15, 2014 at 7:54 PM, Michael Falconer <
michael.j.falco...@gmail.com> wrote:

> Igor,
>
> Nice one. A quick test using 10 lookup trees and the same search criteria:
> *Showing rows 0 - 29 (30 total, Query took 0.4838 sec)*
>
> Now with UNION ALL replacing UNION:
> *Showing rows 0 - 29 (30 total, Query took 0.2050 sec)*
>
> You weren't kidding about cheaper were you? LOL
>
>
> On 16 October 2014 10:30, Michael Falconer 
> wrote:
>
> > Yes, I agree with the general sentiment. It is not exactly the design I
> > would have chosen either, but it is what exists. Design change is
> occurring
> > but for the moment we are stuck with the current schema.
> >
> > If you cannot change the schemata to be more suitable, then your
> >> demonstrated queries are very near as good as it gets. I can think of
> other
> >> ways to achieve the same, but nothing that would be more efficient to a
> >> tangible level. Maybe someone else here can come up with something a bit
> >> more tight.
> >>
> > You are preaching to the converted I'm afraid, but I appreciate the
> points
> > you made. The quoted paragraph probably answers most of my questions
> > really, except for this from Igor:
> >
> > Unless you do expect duplicates and need to eliminate them, use UNION ALL
> >> - it's much cheaper (this is assuming you insist on keeping multiple
> >> tables).
> >>
> > Cheaper because it drops the operation to remove the duplicate records
> > from the result set? I'm assuming. I'll give that a try.
> >
> > On 16 October 2014 10:05, RSmith  wrote:
> >
> >>
> >> On 2014/10/16 00:05, Michael Falconer wrote:
> >>
> >>> Hi all,
> >>>
> >>> first off I must start with an apology. I know I'm sort of doing the
> >>> wrong//...
> >>>
> >>
> >> No need to apologise, this flies quite close to the central theme.
> >> Whether you are using SQLite or any other SQL RDBMS, this is horrible DB
> >> design and it is so for precisely the reasons you are asking the
> question.
> >> Tables should not be dynamic: The simple rule of thumb being - if you
> >> cannot tell (or at least accurately predict) before-hand exactly how
> many
> >> tables will be in the final system, then you are doing it wrong.  Why
> not
> >> have one set of tables and in stead of preceding each of them with a
> >> , simply add a column that can host the  as a simple
> value,
> >> which will immediately make your life very much easier and get the SQL
> >> engine to do the work you are now trying to compensate for in a
> lop-sided
> >> manual kind of way.
> >>
> >> I can only imagine the amount of code you invested into your system to
> >> track and deal with these dynamic sets of tables, so I know changing it
> >> will seem like a rather large undertaking, but really it will mostly
> >> involve removing loads of code to end up with a small set of simple
> queries
> >> that does all the work for you.
> >>
> >> The only reason I can imagine this sort of breakdown useful is if your
> >> tree tables are all really really huge, like Gigabytes, and so there
> might
> >> be some efficiency to be gained from splitting it up, but I doubt this
> is
> >> the case.
> >>
> >> Just imagine the ease of those search queries you demonstrated... one
> >> query, no UNIONs, a single set of joins and an extra where-clause
> check...
> >> Add to that increased efficiency at finding results (run one query in
> stead
> >> of 50).  Easy-mode. As a proof-of-concept, just make an SQLite DB, one
> set
> >> of tables as per your list, add column(s) for prefixing as needed,
> import
> >> all the tables from some large set of them (specifying additionally the
> >> column for the prefix) and ten run search queries on them. I am willing
> to
> >> bet on the fact it will be much quicker in addition to being much
> simpler.
> >>
> >> If you cannot change the schemata to be more suitable, then your
> >> demonstrated queries are very near as good as it gets. I can think of
> other
> >> ways to achieve the same, but nothing that would be more efficient to a
> >> tangible level. Maybe someone else here can come up with something a bit
> >> more tight.
> >>
> >>
> >>  thing here as this question is NOT related to sqlite. It is a general
> SQL
> >>> question but I ask it here because I have great respect for the answers
> >>> and
> >>> discussions I have seen on this forum over many years. I rarely post
> >>> myself
> >>> as there are always several contributors who beat me to the answer and
> >>> often their response is far better than mine would have been. I'm not a
> >>> code leper, I don't need the actual SQL just the method really, though
> a
> >>> short code example would be well 

Re: [sqlite] Search query alternatives.

2014-10-15 Thread Michael Falconer
Igor,

Nice one. A quick test using 10 lookup trees and the same search criteria:
*Showing rows 0 - 29 (30 total, Query took 0.4838 sec)*

Now with UNION ALL replacing UNION:
*Showing rows 0 - 29 (30 total, Query took 0.2050 sec)*

You weren't kidding about cheaper were you? LOL


On 16 October 2014 10:30, Michael Falconer 
wrote:

> Yes, I agree with the general sentiment. It is not exactly the design I
> would have chosen either, but it is what exists. Design change is occurring
> but for the moment we are stuck with the current schema.
>
> If you cannot change the schemata to be more suitable, then your
>> demonstrated queries are very near as good as it gets. I can think of other
>> ways to achieve the same, but nothing that would be more efficient to a
>> tangible level. Maybe someone else here can come up with something a bit
>> more tight.
>>
> You are preaching to the converted I'm afraid, but I appreciate the points
> you made. The quoted paragraph probably answers most of my questions
> really, except for this from Igor:
>
> Unless you do expect duplicates and need to eliminate them, use UNION ALL
>> - it's much cheaper (this is assuming you insist on keeping multiple
>> tables).
>>
> Cheaper because it drops the operation to remove the duplicate records
> from the result set? I'm assuming. I'll give that a try.
>
> On 16 October 2014 10:05, RSmith  wrote:
>
>>
>> On 2014/10/16 00:05, Michael Falconer wrote:
>>
>>> Hi all,
>>>
>>> first off I must start with an apology. I know I'm sort of doing the
>>> wrong//...
>>>
>>
>> No need to apologise, this flies quite close to the central theme.
>> Whether you are using SQLite or any other SQL RDBMS, this is horrible DB
>> design and it is so for precisely the reasons you are asking the question.
>> Tables should not be dynamic: The simple rule of thumb being - if you
>> cannot tell (or at least accurately predict) before-hand exactly how many
>> tables will be in the final system, then you are doing it wrong.  Why not
>> have one set of tables and in stead of preceding each of them with a
>> , simply add a column that can host the  as a simple value,
>> which will immediately make your life very much easier and get the SQL
>> engine to do the work you are now trying to compensate for in a lop-sided
>> manual kind of way.
>>
>> I can only imagine the amount of code you invested into your system to
>> track and deal with these dynamic sets of tables, so I know changing it
>> will seem like a rather large undertaking, but really it will mostly
>> involve removing loads of code to end up with a small set of simple queries
>> that does all the work for you.
>>
>> The only reason I can imagine this sort of breakdown useful is if your
>> tree tables are all really really huge, like Gigabytes, and so there might
>> be some efficiency to be gained from splitting it up, but I doubt this is
>> the case.
>>
>> Just imagine the ease of those search queries you demonstrated... one
>> query, no UNIONs, a single set of joins and an extra where-clause check...
>> Add to that increased efficiency at finding results (run one query in stead
>> of 50).  Easy-mode. As a proof-of-concept, just make an SQLite DB, one set
>> of tables as per your list, add column(s) for prefixing as needed, import
>> all the tables from some large set of them (specifying additionally the
>> column for the prefix) and ten run search queries on them. I am willing to
>> bet on the fact it will be much quicker in addition to being much simpler.
>>
>> If you cannot change the schemata to be more suitable, then your
>> demonstrated queries are very near as good as it gets. I can think of other
>> ways to achieve the same, but nothing that would be more efficient to a
>> tangible level. Maybe someone else here can come up with something a bit
>> more tight.
>>
>>
>>  thing here as this question is NOT related to sqlite. It is a general SQL
>>> question but I ask it here because I have great respect for the answers
>>> and
>>> discussions I have seen on this forum over many years. I rarely post
>>> myself
>>> as there are always several contributors who beat me to the answer and
>>> often their response is far better than mine would have been. I'm not a
>>> code leper, I don't need the actual SQL just the method really, though a
>>> short code example would be well received for illustration.
>>>
>>> It's about a search performed on multiple tables. However the structure
>>> of
>>> this database is somewhat unorthodox. It contains genealogical data and
>>> this is clustered into trees. Each tree has 8 tables, there can be as
>>> many
>>> as 100 trees but most installations of the associated application
>>> software
>>> contain between 5 - 50 trees. These 8 tables contain a family trees data
>>> and are named:
>>> addresses
>>> connections
>>> events
>>> family
>>> person
>>> repositories
>>> sources
>>> texts
>>>
>>> The  changes and is unique for each 

Re: [sqlite] Search query alternatives.

2014-10-15 Thread Michael Falconer
Yes, I agree with the general sentiment. It is not exactly the design I
would have chosen either, but it is what exists. Design change is occurring
but for the moment we are stuck with the current schema.

If you cannot change the schemata to be more suitable, then your
> demonstrated queries are very near as good as it gets. I can think of other
> ways to achieve the same, but nothing that would be more efficient to a
> tangible level. Maybe someone else here can come up with something a bit
> more tight.
>
You are preaching to the converted I'm afraid, but I appreciate the points
you made. The quoted paragraph probably answers most of my questions
really, except for this from Igor:

Unless you do expect duplicates and need to eliminate them, use UNION ALL -
> it's much cheaper (this is assuming you insist on keeping multiple tables).
>
Cheaper because it drops the operation to remove the duplicate records from
the result set? I'm assuming. I'll give that a try.

On 16 October 2014 10:05, RSmith  wrote:

>
> On 2014/10/16 00:05, Michael Falconer wrote:
>
>> Hi all,
>>
>> first off I must start with an apology. I know I'm sort of doing the
>> wrong//...
>>
>
> No need to apologise, this flies quite close to the central theme. Whether
> you are using SQLite or any other SQL RDBMS, this is horrible DB design and
> it is so for precisely the reasons you are asking the question.  Tables
> should not be dynamic: The simple rule of thumb being - if you cannot tell
> (or at least accurately predict) before-hand exactly how many tables will
> be in the final system, then you are doing it wrong.  Why not have one set
> of tables and in stead of preceding each of them with a , simply
> add a column that can host the  as a simple value, which will
> immediately make your life very much easier and get the SQL engine to do
> the work you are now trying to compensate for in a lop-sided manual kind of
> way.
>
> I can only imagine the amount of code you invested into your system to
> track and deal with these dynamic sets of tables, so I know changing it
> will seem like a rather large undertaking, but really it will mostly
> involve removing loads of code to end up with a small set of simple queries
> that does all the work for you.
>
> The only reason I can imagine this sort of breakdown useful is if your
> tree tables are all really really huge, like Gigabytes, and so there might
> be some efficiency to be gained from splitting it up, but I doubt this is
> the case.
>
> Just imagine the ease of those search queries you demonstrated... one
> query, no UNIONs, a single set of joins and an extra where-clause check...
> Add to that increased efficiency at finding results (run one query in stead
> of 50).  Easy-mode. As a proof-of-concept, just make an SQLite DB, one set
> of tables as per your list, add column(s) for prefixing as needed, import
> all the tables from some large set of them (specifying additionally the
> column for the prefix) and ten run search queries on them. I am willing to
> bet on the fact it will be much quicker in addition to being much simpler.
>
> If you cannot change the schemata to be more suitable, then your
> demonstrated queries are very near as good as it gets. I can think of other
> ways to achieve the same, but nothing that would be more efficient to a
> tangible level. Maybe someone else here can come up with something a bit
> more tight.
>
>
>  thing here as this question is NOT related to sqlite. It is a general SQL
>> question but I ask it here because I have great respect for the answers
>> and
>> discussions I have seen on this forum over many years. I rarely post
>> myself
>> as there are always several contributors who beat me to the answer and
>> often their response is far better than mine would have been. I'm not a
>> code leper, I don't need the actual SQL just the method really, though a
>> short code example would be well received for illustration.
>>
>> It's about a search performed on multiple tables. However the structure of
>> this database is somewhat unorthodox. It contains genealogical data and
>> this is clustered into trees. Each tree has 8 tables, there can be as many
>> as 100 trees but most installations of the associated application software
>> contain between 5 - 50 trees. These 8 tables contain a family trees data
>> and are named:
>> addresses
>> connections
>> events
>> family
>> person
>> repositories
>> sources
>> texts
>>
>> The  changes and is unique for each tree.
>>
>> There are extensive search options offered to users as well as simple name
>> searching and it is one of the best features of the app. It works pretty
>> well, so it ain't broke in any sense, we just wonder if there is a better
>> way to perform this search in SQL. Is there a general technique which is
>> superior either in speed, efficiency or load bearing contexts?
>>
>> I am sure you can see one of the pitfalls here is the exponential growth
>> of
>> such a search 

Re: [sqlite] Search query alternatives.

2014-10-15 Thread RSmith


On 2014/10/16 00:05, Michael Falconer wrote:

Hi all,

first off I must start with an apology. I know I'm sort of doing the wrong//...


No need to apologise, this flies quite close to the central theme. Whether you are using SQLite or any other SQL RDBMS, this is 
horrible DB design and it is so for precisely the reasons you are asking the question.  Tables should not be dynamic: The simple 
rule of thumb being - if you cannot tell (or at least accurately predict) before-hand exactly how many tables will be in the final 
system, then you are doing it wrong.  Why not have one set of tables and in stead of preceding each of them with a , simply 
add a column that can host the  as a simple value, which will immediately make your life very much easier and get the SQL 
engine to do the work you are now trying to compensate for in a lop-sided manual kind of way.


I can only imagine the amount of code you invested into your system to track and deal with these dynamic sets of tables, so I know 
changing it will seem like a rather large undertaking, but really it will mostly involve removing loads of code to end up with a 
small set of simple queries that does all the work for you.


The only reason I can imagine this sort of breakdown useful is if your tree tables are all really really huge, like Gigabytes, and 
so there might be some efficiency to be gained from splitting it up, but I doubt this is the case.


Just imagine the ease of those search queries you demonstrated... one query, no UNIONs, a single set of joins and an extra 
where-clause check... Add to that increased efficiency at finding results (run one query in stead of 50).  Easy-mode. As a 
proof-of-concept, just make an SQLite DB, one set of tables as per your list, add column(s) for prefixing as needed, import all the 
tables from some large set of them (specifying additionally the column for the prefix) and ten run search queries on them. I am 
willing to bet on the fact it will be much quicker in addition to being much simpler.


If you cannot change the schemata to be more suitable, then your demonstrated queries are very near as good as it gets. I can think 
of other ways to achieve the same, but nothing that would be more efficient to a tangible level. Maybe someone else here can come up 
with something a bit more tight.



thing here as this question is NOT related to sqlite. It is a general SQL
question but I ask it here because I have great respect for the answers and
discussions I have seen on this forum over many years. I rarely post myself
as there are always several contributors who beat me to the answer and
often their response is far better than mine would have been. I'm not a
code leper, I don't need the actual SQL just the method really, though a
short code example would be well received for illustration.

It's about a search performed on multiple tables. However the structure of
this database is somewhat unorthodox. It contains genealogical data and
this is clustered into trees. Each tree has 8 tables, there can be as many
as 100 trees but most installations of the associated application software
contain between 5 - 50 trees. These 8 tables contain a family trees data
and are named:
addresses
connections
events
family
person
repositories
sources
texts

The  changes and is unique for each tree.

There are extensive search options offered to users as well as simple name
searching and it is one of the best features of the app. It works pretty
well, so it ain't broke in any sense, we just wonder if there is a better
way to perform this search in SQL. Is there a general technique which is
superior either in speed, efficiency or load bearing contexts?

I am sure you can see one of the pitfalls here is the exponential growth of
such a search query as both total number of trees and indeed user search
criteria increase. For each criteria component, and there are quite a few,
the appropriate tables must be queried and results joined.

Searches return records of individuals meeting the entered search criteria
so the query focuses on the person table as it's anchor, performs searches
on required other tables in the tree and joins the results to the person
data. The results from each tree search are then UNION'ed to provide the
result set. Here is a contrived example of the SQL query code which should
make things clearer.

The user enters simple search criteria - any person with a last name
containing 'mac' and a first name containing the character 'a'. This is the
resulting query (generated by php code), which searches a small 4 family
tree installation.

(SELECT humo1_person.*, event_kind, event_event, address_place, address_zip
 FROM humo1_person
 LEFT JOIN humo1_events ON pers_gedcomnumber=event_person_id
 LEFT JOIN humo1_addresses ON pers_gedcomnumber=address_person_id
 WHERE pers_lastname LIKE '%mac%'
   AND (pers_firstname LIKE '%a%'
   OR (event_kind='name' AND event_event LIKE '%a%')

Re: [sqlite] Search query alternatives.

2014-10-15 Thread Igor Tandetnik

On 10/15/2014 6:05 PM, Michael Falconer wrote:

addresses
connections
events
family
person
repositories
sources
texts


Personally, I'd have one set of tables, each with an extra column 
containing . Why do you need a separate set of tables, only to 
UNION them on every request? Instead of sharding by table, shard by row, 
using "prefix" column as a discriminator.



(SELECT humo1_person.*, event_kind, event_event, address_place, address_zip
UNION
 (SELECT humo2_person.*, event_kind, event_event, address_place,


Unless you do expect duplicates and need to eliminate them, use UNION 
ALL - it's much cheaper (this is assuming you insist on keeping multiple 
tables).

--
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Search query alternatives.

2014-10-15 Thread Michael Falconer
Hi all,

first off I must start with an apology. I know I'm sort of doing the wrong
thing here as this question is NOT related to sqlite. It is a general SQL
question but I ask it here because I have great respect for the answers and
discussions I have seen on this forum over many years. I rarely post myself
as there are always several contributors who beat me to the answer and
often their response is far better than mine would have been. I'm not a
code leper, I don't need the actual SQL just the method really, though a
short code example would be well received for illustration.

It's about a search performed on multiple tables. However the structure of
this database is somewhat unorthodox. It contains genealogical data and
this is clustered into trees. Each tree has 8 tables, there can be as many
as 100 trees but most installations of the associated application software
contain between 5 - 50 trees. These 8 tables contain a family trees data
and are named:
addresses
connections
events
family
person
repositories
sources
texts

The  changes and is unique for each tree.

There are extensive search options offered to users as well as simple name
searching and it is one of the best features of the app. It works pretty
well, so it ain't broke in any sense, we just wonder if there is a better
way to perform this search in SQL. Is there a general technique which is
superior either in speed, efficiency or load bearing contexts?

I am sure you can see one of the pitfalls here is the exponential growth of
such a search query as both total number of trees and indeed user search
criteria increase. For each criteria component, and there are quite a few,
the appropriate tables must be queried and results joined.

Searches return records of individuals meeting the entered search criteria
so the query focuses on the person table as it's anchor, performs searches
on required other tables in the tree and joins the results to the person
data. The results from each tree search are then UNION'ed to provide the
result set. Here is a contrived example of the SQL query code which should
make things clearer.

The user enters simple search criteria - any person with a last name
containing 'mac' and a first name containing the character 'a'. This is the
resulting query (generated by php code), which searches a small 4 family
tree installation.

(SELECT humo1_person.*, event_kind, event_event, address_place, address_zip
FROM humo1_person
LEFT JOIN humo1_events ON pers_gedcomnumber=event_person_id
LEFT JOIN humo1_addresses ON pers_gedcomnumber=address_person_id
WHERE pers_lastname LIKE '%mac%'
  AND (pers_firstname LIKE '%a%'
  OR (event_kind='name' AND event_event LIKE '%a%')
  )
GROUP BY pers_gedcomnumber
)
UNION
(SELECT humo2_person.*, event_kind, event_event, address_place,
address_zip
FROM humo2_person LEFT JOIN humo2_events ON
pers_gedcomnumber=event_person_id
LEFT JOIN humo2_addresses ON pers_gedcomnumber=address_person_id
WHERE pers_lastname LIKE '%mac%'
  AND (pers_firstname LIKE '%a%'
  OR (event_kind='name' AND event_event LIKE '%a%') )
GROUP BY pers_gedcomnumber
)
UNION
(SELECT humo3_person.*, event_kind, event_event, address_place,
address_zip
FROM humo3_person LEFT JOIN humo3_events ON
pers_gedcomnumber=event_person_id
LEFT JOIN humo3_addresses ON pers_gedcomnumber=address_person_id
WHERE pers_lastname LIKE '%mac%'
  AND (pers_firstname LIKE '%a%'
  OR (event_kind='name' AND event_event LIKE '%a%')
  )
GROUP BY pers_gedcomnumber
)
UNION
(SELECT humo4_person.*, event_kind, event_event, address_place,
address_zip
FROM humo4_person
LEFT JOIN humo4_events ON pers_gedcomnumber=event_person_id
LEFT JOIN humo4_addresses ON pers_gedcomnumber=address_person_id
WHERE pers_lastname LIKE '%mac%'
  AND (pers_firstname LIKE '%a%'
  OR (event_kind='name' AND event_event LIKE '%a%')
  )
GROUP BY pers_gedcomnumber
)
ORDER BY pers_lastname ASC , pers_firstname ASC LIMIT 0,30

Any thoughts? Suggestions? Missiles? Good approach, bad approach, or
completely off the grid? I do use sqlite quite a bit, but not on this
particular project.


-- 
Regards,
 Michael.j.Falconer.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users