I would definitely add a "word_count" column to the document table, no
matter what your final query ends up looking like. You will still have
some trouble writing an efficient query that follows the given formula
exactly, though (the part where a missing keyword contributes a "0/4"
factor).
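
One way around the "0/4" problem is to never compute it: a document that is
missing any keyword scores zero and is not wanted in the result anyway, so it
can be filtered out up front. Below is a minimal sketch of that idea, assuming
plain SQL through Python's standard sqlite3 module rather than SQLAlchemy
itself. Apart from "word_count", the table and column names (document,
histogram, document_id, word, count) and the search() helper are my own
guesses at your schema, not something taken from your code. SQL has no
portable PRODUCT aggregate, so the final multiplication is done in Python.

```python
import sqlite3

# Hypothetical schema: names other than word_count are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (
        id INTEGER PRIMARY KEY,
        word_count INTEGER NOT NULL      -- total words on the page
    );
    CREATE TABLE histogram (
        document_id INTEGER NOT NULL REFERENCES document(id),
        word TEXT NOT NULL,
        count INTEGER NOT NULL,
        PRIMARY KEY (document_id, word)
    );
""")

# Sample data: document 1 is "egg bacon spam spam" from the original post.
conn.executemany("INSERT INTO document VALUES (?, ?)",
                 [(1, 4), (2, 3), (3, 3)])
conn.executemany("INSERT INTO histogram VALUES (?, ?, ?)", [
    (1, "egg", 1), (1, "bacon", 1), (1, "spam", 2),
    (2, "egg", 2), (2, "sausage", 1),
    (3, "egg", 1), (3, "bacon", 1), (3, "spam", 1),
])


def search(conn, keywords):
    """Return (document_id, score) pairs, highest score first.

    Only documents containing every keyword are returned, so a zero
    factor never has to be represented in the query.
    """
    keywords = list(dict.fromkeys(keywords))  # drop duplicate keywords
    if not keywords:
        return []
    placeholders = ", ".join("?" for _ in keywords)
    sql = f"""
        SELECT d.id, d.word_count, h.count
        FROM document AS d
        JOIN histogram AS h ON h.document_id = d.id
        WHERE h.word IN ({placeholders})
          AND d.id IN (
              SELECT document_id
              FROM histogram
              WHERE word IN ({placeholders})
              GROUP BY document_id
              HAVING COUNT(*) = ?        -- every keyword is present
          )
    """
    params = keywords + keywords + [len(keywords)]
    scores = {}
    for doc_id, word_count, count in conn.execute(sql, params):
        # Multiply in one count/word_count factor per matched keyword.
        scores[doc_id] = scores.get(doc_id, 1.0) * count / word_count
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for doc_id, score in search(conn, ["egg", "bacon", "spam"]):
        print(doc_id, score)
```

The HAVING clause is what stands in for the zero case, and keeping
word_count on the document row means the denominator costs no extra
aggregation. The same SELECT should be expressible with SQLAlchemy's query
constructs, but I have not tried that here.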

Tvrtko

On 4/3/06, Jonathan Hayward http://JonathansCorner.com
<[EMAIL PROTECTED]> wrote:
> I owe a BIG thank you to those of you who have patiently helped me. My
> program is basically running now, and though it's definitely a prototype,
> it works. (I had started the move in the hopes that a sluggish program
> could be sped up, and it is DEFINITELY speeding up.)
>
> There is one immediate help request for a complex use of SQLAlchemy:
>
> A document_table represents a webpage (roughly); a histogram_table
> represents an entry into a histogram of words. If the webpage says
> "egg bacon spam spam" it will have:
>
>  * one histogram with its primary key, "egg", and a count of 1
>  * one histogram with its primary key, "bacon", and a count of 1
>  * one histogram with its primary key, "spam", and a count of 2
>
> In the kind of search I want to enable, a webpage has a score which is
> calculated by dividing its count for each search keyword by the total
> word count for the webpage and multiplying the results together. Thus if
> someone searches for "egg bacon spam", the score will be
> 1/4 * 1/4 * 2/4 = 0.03125. (This is actually a very high score, although
> it looks low.) If a keyword is missing, it contributes a factor of 0:
> "egg sausage spam" will have a score of 1/4 * 0/4 * 2/4 = 0.0.
>
> What I want to do is select all documents with a nonzero score based on
> the keyword search terms, and sort them by score descending. I know I
> could bludgeon it and eventually get it working, but this is complex
> enough (should I make a separate histogram_grand_total_table with the
> total word count for a webpage?) that I wanted to ask for help.
>
>
> --
> ++ Jonathan Hayward, [EMAIL PROTECTED]
> ** To see an award-winning website with stories, essays, artwork,
> ** games, and a four-dimensional maze, why not visit my home page?
> ** All of this is waiting for you at http://JonathansCorner.com
>
> ** If you'd like a Google Mail (gmail.com) account, please tell me!


_______________________________________________
Sqlalchemy-users mailing list
Sqlalchemy-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users
