Re: [sqlalchemy] How to SELECT computed data using SQLAlchemy?

Mike Bayer Tue, 14 Dec 2021 05:31:13 -0800


On Tue, Dec 14, 2021, at 5:40 AM, 'Daniel Grindrod' via sqlalchemy wrote:
> Hi all,
> 
> I'm working on a REST API which is built using Flask-SQLAlchemy and 
> Connexion. I'm fairly new to SQLAlchemy, but it's been brilliant so far :) 
> This API uses SQLAlchemy 1.3.16, and connects to an Oracle Database (12c 
> 12.1.0.1.0 64bit). 
 <https://stackoverflow.com/posts/70341129/timeline>
> I'm having an issue generating the correct SQL from a SQLAlchemy query. I 
> would really appreciate any help. The troublesome function is shown below.
> 
> def similar_matches(input_descriptor, threshold, lim=None, offset): 
> 
>   query = db.select([tbl, text(":q_descriptors as query_descriptors, 
> comparison(descriptors, :q_descriptors) as 
> similarity")bindparams(q_descriptor=input_descriptor).\
>   where( text("comparison(descriptors, :q_descriptors) >=  
> q_threshold").bindparams(q_descriptor=input_descriptor, q_threshold = 
> threshold)


heya -

it's early here but I almost want to be able to tell a story about that pattern 
above, which has select(text("cola, colb, colc, ...))  in it.   It's kind of an 
"anti-unicorn" for me, as I've done many refactorings to the result-processing 
part of SQLAlchemy's engine and each time I do so, there's some internal 
handwringing over, "what if someone is SELECTING from a text() that has 
multiple columns comma-separated in them?", which specifically is a problem 
because it means we can't positionally link the columns we get back from the 
cursor to the Python expressions that are in the select() object, and each time 
it's like, "nah, nobody would do that", or, "nah, nobody *should* do that", but 
yet, as there's not a straightforward way to detect/warn for that, there's a 
whole set of code / commentary at 
https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/engine/cursor.py#L325
 which wonders if we'd ever see this.   

and here it is!  :)   the dark unicorn.    So, it's also the source of the 
issue here, because the Oracle dialect has to restructure the query to simulate 
limit/offset.   Soooo.... back into the barn w/ the unicorn and what we do here 
is make sure the select() has enough structure so that SQLAlchemy knows what's 
going on and here that would look like (note I'm making some syntactical 
assumptions about the code above which seems to be incomplete ):

from sqlalchemy import literal, func

query = db.select(
    [
        tbl,
        literal(input_descriptor).label("query_descriptors"),
        func.comparison(tbl.c.descriptors, q_descriptors).label("similarity")
    ]).
  where(
    func.comparison(tbl.c.descriptors, q_descriptors) >= threshold

  )

that way your select() will have .selected_columns entries for every column in 
"tbl" plus columns "query_descriptors" and "similarity", and these will export 
on outwards to the subquery that the Oracle dialect creates.








> 
>   res = query.execute().fetchall() 
> 
>   if len(res)=0 return '', 204 
> 
>   return MatchLimaScheme(many = True).dump(res) 
> This SQLAlchemy code takes two inputs (descriptor and threshold), and 
> searches through each descriptor in an Oracle database, calculating a 
> similarity measure  between the queried descriptor and each stored 
> descriptor. All rows where similarity score >= threshold are returned in a 
> JSON.
> 
> The above code works fine, but returns all results - whereas I want to also 
> be able to include a .offset() and a .limit() (for lazy loading). The code 
> above generates SQL along these lines:
> 
> SELECT ID, last_modified, descriptors, :q_descriptors as query_descriptors, 
> comparison(descriptors, :q_descriptors) as similarity' FROM tbl WHERE 
> compare(descriptors, :q_descriptors) >= :q_threshold 
> which works well. However, when I add .limit() or .offset() on the end of my 
> query i.e.
> 
> query = db.select([tbl, text(":q_descriptors as query_descriptors, 
> comparison(descriptors, :q_descriptors) as similarity" 
> ).bindparams(q_descriptor=input_descriptor).where( 
> text("comparison(descriptors, :q_descriptors) >= :q_threshold") 
> .bindparams(q_descriptor=input_descriptor,q_threshold = 
> threshold).limit(limit) 
> The SQL generated changes to be along these lines: 
> 
> SELECT ID, last_modified, descriptors FROM (SELECT tbl.ID as ID, 
> tbl.last_modified as last_modified, tbl.descriptors as descriptors, 
> :q_descriptors as query_descriptors, comparison(descriptors, :q_descriptors) 
> as similarity) FROM tbl WHERE compare(descriptors, :q_descriptors) >= 
> :q_threshold WHERE ROWNUM <= :q_limit 
> As a raw SQL query this is fine, but I'm no longer including the 
> query_descriptors and similarity metrics in my SELECT clause. Thus I get a 
> columnNotFoundError. How do I adjust the .select() function above so that my 
> SQL looks more like:
> 
> SELECT ID, last_modified, descriptors, query_descriptors, similarity FROM 
> (SELECT tbl.ID as ID, tbl.last_modified as last_modified, tbl.descriptors as 
> descriptors,:q_descriptors as query_descriptors comparison(descriptors, 
> :q_descriptors) as similarity' FROM tbl WHERE compare(descriptors, 
> :q_descriptors) >= :q_threshold WHERE ROWNUM <= :q_limit OFFSET :q_offset 
> Basically, I'm looking to explicitly tell SQLAlchemy that I want to SELECT 
> tbl AND query_descriptors AND similarity.
> 
> I've also been informed that it's bad practice to not include a .order_by() 
> in these queries, but I don't think that is what's causing the issue here. It 
> is on my to do list though.
> 
> Please let me know if I need to provide more information.
> 
> Thanks for any help,
> 
> Dan
> 
> 
> 
> -- 
> SQLAlchemy - 
> The Python SQL Toolkit and Object Relational Mapper
>  
> http://www.sqlalchemy.org/
>  
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> --- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/adf3ee50-d198-4193-bcfb-e866cad62e52n%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/adf3ee50-d198-4193-bcfb-e866cad62e52n%40googlegroups.com?utm_medium=email&utm_source=footer>.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/a7c7da47-f8e8-4903-8778-3075c3c221d4%40www.fastmail.com.

Re: [sqlalchemy] How to SELECT computed data using SQLAlchemy?

Reply via email to