On Aug 4, 2014, at 7:12 PM, Jonathan Vanasco <[email protected]> wrote:


>     b - sequentially scanning the 'id' column.  this function won't use the 
> primary key, or any other index, for id.  it seq scans the whole table.

ah because ROW_NUMBER() needs...the row number!   that is, we need to count the 
rows between PK values.   Well if you have truly increment-by-one primary keys 
and that's what you're windowing on, you can make that the 
"rownum"...otherwise, you're looking for windows of N rows within the whole set 
of rows.  Counting is necessary, I'm not sure how to force it to use an 
index-only scan to figure this out.   I haven't looked into this that deeply.

> 
> 2. Window ranges are created for all ids in the table.  If I only want to 
> work on half the records, with a min/max, the 'iterable' query doesn't use 
> the min/max filter values; it generates 100% of potential ranges based on the 
> id, and the filters are only used for the 'where' clause.  

add a WHERE clause to the query that is selecting the windows, that is instead 
of:

SELECT anon_1.widget_data AS anon_1_widget_data 
FROM (SELECT widget.data AS widget_data, row_number() OVER (ORDER BY 
widget.data) AS rownum 
FROM widget) AS anon_1 
WHERE rownum %% 1000=1


we'd have:

SELECT anon_1.widget_data AS anon_1_widget_data 
FROM (SELECT widget.data AS widget_data, row_number() OVER (ORDER BY 
widget.data) AS rownum 
FROM widget WHERE widget.id BETWEEN (:low AND :high)) AS anon_1 
WHERE rownum %% 1000=1

> Has anyone worked out an improved version of this technique, or am I cobbling 
> something together myself?

you can also skip the window altogether and use the last row from the previous 
batch as the end of the window:

SELECT * FROM widget ORDER BY id LIMIT 1000    -- first batch
SELECT * FROM widget WHERE id > 4895 ORDER BY id LIMIT 1000  -- second batch
SELECT * FROM widget WHERE id > 6652 ORDER BY id LIMIT 1000  -- third batch

etc.

that is, we are selecting for the "beginning" of the range using what we just 
got from the previous batch.

This has to ORDER BY on every query, but might be better index-wise.



-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to