On Apr 29, 2015, at 12:25 PM, Ladislav Lenart wrote:

> Could you please explain to me the error(s) in my reasoning?

Let me just flip your list in reverse... and add in some elements (marked with 
a *):

posting  ts  context
p60      60  friend
p55      55 friend*
p54      54 friend*
p50      50  group
p50      49  group*
p50      49  group*
p40      40  friend
p30      30  group
p20      20  friend
p10      10  group
p0        0  friend

With the 2 limited subqueries, the results would be:
        60F, 55F, 50G, 49G

But the "most recent" data is
        50F, 55F, 54F, 50G

So we end up showing 49 which is less relevant than 54.

In some situations this isn't much of an issue, but in others it is 
detrimental. 
For example, one of my "feeds" contains a distribution of events 
according-to-type that is very uneven.  While "friend" and "group" might be 
relatively close in time to one another, "system" or other events may be months 
old -- and that older content gets pulled in with this style of query.  

If you need to paginate the data and select the next 10 overall items, it gets 
even more complicated.

IIRC, the best mix of performance and "product" that I've found is do something 
like this:

        SELECT * FROM (
                SELECT a,b,c FROM table_a ORDER BY TIMESTAMP DESC LIMIT 10000;
                UNION
                SELECT a,b,c FROM table_b ORDER BY TIMESTAMP DESC LIMIT 10000;
        ) as unioned
        order by unioned TIMESTAMP DESC LIMIT 100 OFFSET 0; 

by creating an artificial limit on the inner queries, you can save postgres 
from doing a lot of intensive i/o work and memory usage (like a lot)
then, joining a few lists and sorting 20k (or even 100k) items is really cheap.
the downside is that you effectively limit the 'relevancy' of the query to 
whatever the inner limit is (ie, 10000 -- not the combined total of 20000), but 
that number can be arbitrarily high enough that it is irrelevant while still 
showing the right amount of content for people.







-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to