On May 24, 10:51 pm, Michael Bayer <[EMAIL PROTECTED]> wrote:
> On May 25, 2008, at 1:04 AM, gatto wrote:
> > been working all day long on this, and i don't feel like i've gotten
> > any closer than when i started..  read all the related posts on this
> > group and couldn't determine what to do based on those.  hoping
> > someone can shed some light.
>
> > i'm attempting to convert the codebase of an existing project i've
> > developed to use SA.  going ok so far, but i'm stuck on an important
> > query that contains a subquery and that subquery has a count
> > function.  the original query looks like this:
>
> >    SELECT id, name, count
> >    FROM (
> >        SELECT t.id, t.name, COUNT(*) AS count
> >        FROM tags t, object_tags ot
> >        WHERE t.id= ot.tag_id
> >        AND ot.object_type_id = 2
> >        GROUP BY t.id, t.name
> >        ORDER BY COUNT(*) DESC, t.name ASC
> >        LIMIT 35
> >    ) AS t
> >    ORDER BY name ASC
>
> > tho actually, i will be changing the LIMIT clause and the final ORDER
> > BY clause depending on what the user chooses.  this is a query whose
> > results are used to render a tag cloud..  i've got an intermediate
> > table called 'object_types' that has a lookup of different types of
> > entities that could be tagged.  e.g. articles or users.  so articles
> > has the object_type_id 2 in the above sql, and that's what i'm
> > generating the cloud for.  object_tags is the relation table that maps
> > tags and object types to objects..
>
> > i had to do the query this way so that i can first get the top [limit]
> > number of tags ordered by count and name, and then being able to order
> > that top tag result set further.  basically i'm just saying that i
> > don't see a way to do this without the subselect..
>
> > so, how would i go about doing this??  it seems like i need to use
> > session.query(Tag) with from_statement, func and add_column, but i
> > just can't figure it out.  i'm seriously just about to pull my hair
> > out trying to understand what to do. any help is greatly appreciated!
>
> if you just want the columns back, you can issue the text directly, or
> use a select() construct, which would look like
>
> s = select([tags.c.id, tags.c.name,
> func.count('*').label('count')]).where(tags.c.id==object_tags.c.tag_id).
> where(object_tags.c.type_id==2).group_by([t.c.id,
> t.c.name]).order_by([func.count('*').desc(), t.c.name]).
> limit(35)
>
> rows = s.execute().fetchall()
>
> if you want to map "id"/"name" to a Tag object, in 0.4 you'd keep a
> select() like the above and use it with the Query.  it would be along
> the lines of
> sess
> .query(Tag).add_column(s.c.count).select_from(s).order_by(Tag.name) .
>
> Overall the query is not the best approach as you're mixing aggregates
> with primary keys (assuming tags.id is the primary key).  you really
> want the count of maching "object_tags" rows, I think:
>
> select tag.id, tag.name, obj_count from
> tags join
> (select tag_id, count('*') AS obj_count from object_tags where
> object_type_id=2 group by tag_id order by obj_count limit 35) as
> obj_tag_count
> on tags.id=obj_tag_count.id order by tag.name
>
> that way you keep unneeded columns out of the GROUP BY.  the subquery
> looks like:
>
> s = select([object_tags.c.tag_id,
> func
> .count('*').label('obj_count')]).where(object_tags.c.object_type_id==2).
> group_by(object_tags.c.tag_id).order_by(func.count('*')).limit(35)
>
> Note that the "order_by()" hits the "func.count()" again.   On some
> databases this is required (i.e., you can't ORDER BY a label).  If you
> truly want to force the label in there, say
> order_by(literal_column("obj_count")).
>
> then join it into an ORM query using the join:
>
> sess.query(Tag).add_column(s.c.obj_count).select_from(tags.join(s,
> Tag.id==s.c.tag_id)).order_by(Tag.name)
>
> The difference between the two approaches is described (in a little
> bit of a rant) here:  
> http://weblogs.sqlteam.com/jeffs/archive/2005/12/14/8546.aspx

you are absolutely correct..  thanks for the education!  i will be
using that technique in the future on all group by clauses.

thanks *so much* for the advice.  i'm not really close to being done
now i think.  i didn't mention before, but i'm actually using Elixir
also.  i didn't think it would matter for the querying, but i may be
wrong..  i got the select() part working well, nice...  but i'm
getting an error when i try to run the session.query() part.  here is
my full code now:

s = select([ObjectTag.c.tag_id, func.count('*').label('obj_count')]).\
        where(ObjectTag.c.object_type_id == 2).\
        group_by(ObjectTag.c.tag_id).\
        order_by(desc(func.count('*'))).\
        limit(35)

tags = session.query(Tag).add_column(s.c.obj_count).\
        select_from(Tag.table.join(s, Tag.id ==
s.c.tag_id)).order_by(Tag.name)

for t in tags:
        t.name

now i'm getting the error: sqlalchemy.exceptions.OperationalError:
(OperationalError) (1248, 'Every derived table must have its own
alias').  is this due to my using elixir?  somehow i don't think so..
i maybe just need to label the subquery?
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to