On 06/07/2016 03:34 AM, Damian Dimmich wrote:
Hi,

We've been staring at a problem for some time, where an sqlalchemy
generated query we run returns less objects when run with .all() than
the number given to us with a .count().

The same, generated query which we get in our logs/from print(query)
returns the same number of rows as the count() when run through
DBSession.execute("...").

This happens with both sqlalchemy 1.0.12 and 1.0.13



the Query.all() method, assuming you are querying for complete instances, will first off deduplicate a redundant instance, and secondly if a row does not contain a full primary key, no instance will be generated for that row. The difference in row count is probably due to one of these factors.




So - running the below by hand using execute in the DBSession gives us
the correct row count:

        SELECT content.id AS content_id, content.title AS content_title,
content.subtitle AS content_subtitle, content.teaser AS content_teaser,
content.published_date AS content_published_date, content.published AS
content_published, content.created_date AS content_created_date,
content.microscription AS content_microscription,
content.product_category_id AS content_product_category_id,
content.discipline_type_id AS content_discipline_type_id,
content.research_approach_id AS content_research_approach_id,
content.weighting_action_id AS content_weighting_action_id,
content.weighting_id AS content_weighting_id, content.channel_id AS
content_channel_id, content.provider_id AS content_provider_id,
content.thumb_url AS content_thumb_url, content.pdf AS content_pdf,
content.meta AS content_meta, content.guide_price AS
content_guide_price, content.currency_id AS content_currency_id,
content.fade_pricing AS content_fade_pricing, content.min_fade_price AS
content_min_fade_price, content.is_research AS content_is_research,
content.non_material AS content_non_material, content.subscription_only
AS content_subscription_only FROM content
            LEFT OUTER JOIN (content_sector_ass AS content_sector_ass_1
JOIN sectors AS sectors_1 ON sectors_1.id =
content_sector_ass_1.sector_id) ON content.id =
content_sector_ass_1.content_id
            LEFT OUTER JOIN (content_asset_class_ass AS
content_asset_class_ass_1 JOIN asset_classes AS asset_classes_1 ON
asset_classes_1.id = content_asset_class_ass_1.asset_class_id) ON
content.id = content_asset_class_ass_1.content_id
            LEFT OUTER JOIN (content_specialty_ass AS
content_specialty_ass_1 JOIN specialties AS specialties_1 ON
specialties_1.id = content_specialty_ass_1.specialty_id) ON content.id =
content_specialty_ass_1.content_id
        WHERE
            content.published = true
            AND content.published_date <= '2016-06-07'
            AND (
                asset_classes_1.name = 'Equity'
                OR
                specialties_1.name = 'FundamentalGeneralEquity'
            )
            AND sectors_1.name = 'Consumer Discretionary'
        ORDER BY content.published_date DESC, content.created_date DESC

return 1663 rows.

whereas running the sqlalchemy generated query that returns "Content"
objects:

2016-06-07 11:24:29,477 INFO  [sqlalchemy.engine.base.Engine][waitress]

SELECT content.id AS content_id, content.title AS content_title,
content.subtitle AS content_subtitle, content.teaser AS content_teaser,
content.published_date AS content_published_date, content.published AS
content_published, content.created_date AS content_created_date,
content.microscription AS content_microscription,
content.product_category_id AS content_product_category_id,
content.discipline_type_id AS content_discipline_type_id,
content.research_approach_id AS content_research_approach_id,
content.weighting_action_id AS content_weighting_action_id,
content.weighting_id AS content_weighting_id, content.channel_id AS
content_channel_id, content.provider_id AS content_provider_id,
content.thumb_url AS content_thumb_url, content.pdf AS content_pdf,
content.meta AS content_meta, content.guide_price AS
content_guide_price, content.currency_id AS content_currency_id,
content.fade_pricing AS content_fade_pricing, content.min_fade_price AS
content_min_fade_price, content.is_research AS content_is_research,
content.non_material AS content_non_material, content.subscription_only
AS content_subscription_only
FROM content

    LEFT OUTER JOIN (content_sector_ass AS content_sector_ass_1 JOIN
sectors AS sectors_1 ON sectors_1.id = content_sector_ass_1.sector_id)
ON content.id = content_sector_ass_1.content_id

    LEFT OUTER JOIN (content_asset_class_ass AS
content_asset_class_ass_1 JOIN asset_classes AS asset_classes_1 ON
asset_classes_1.id = content_asset_class_ass_1.asset_class_id) ON
content.id = content_asset_class_ass_1.content_id

    LEFT OUTER JOIN (content_specialty_ass AS content_specialty_ass_1
JOIN specialties AS specialties_1 ON specialties_1.id =
content_specialty_ass_1.specialty_id) ON content.id =
content_specialty_ass_1.content_id

WHERE

    content.published = true

    AND content.published_date <= %(published_date_1)s

    AND (

                 asset_classes_1.name = %(name_1)s

                OR specialties_1.name = %(name_2)s

    )

    AND sectors_1.name = %(name_3)s ORDER BY content.published_date
DESC, content.created_date DESC

2016-06-07 11:24:29,478 INFO  [sqlalchemy.engine.base.Engine][waitress]

{'name_2': 'FundamentalGeneralEquity',

'name_3': 'Consumer Discretionary',

'name_1': 'Equity',

'param_1': 24, 'published_date_1': datetime.date(2016, 6, 7)}

returns 619 Content objects.  This is computed by running len(result)
where result = query.all()

I'm a bit stuck - not sure where to begin debugging.  As far as I can
tell the query that is generated is correct, and the database returns
the correct set of results, however some of these seem to get 'lost'
when sqlalchemy assembles the result set.

Any ideas?

Damian



--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to