Re: [sqlalchemy] Feedback appreciated

Michael Bayer Thu, 22 Aug 2013 15:53:19 -0700

On Aug 22, 2013, at 3:37 PM, Konsta Vesterinen <[email protected]> 
wrote:


> Hi all,
> 
> First I want to say how much I appreciate SQLAlchemy. I think it is by far 
> the best ORM available for any language. Thank you Mike for this wonderful 
> piece of software. I can’t imagine how many countless hours you’ve put into 
> this.
> 
> From the day I started coding I’ve always been enthusiastic about databases. 
> 2007 I created Doctrine, which is now the most popular ORM for PHP.


wow that is quite a compliment, I of course have heard of Doctrine and met many 
PHP users who use it extensively.   This is really amazing that you're A. using 
Python now B. using SQLAlchemy and C. writing full blown software for it, wow !

>  Over the years I’ve switched to Python (I could’ve switched to Ruby, but 
> didn’t like Rails’ ORM at all). Now I’m the CEO of a company called Fast 
> Monkeys. We employ 15 people currently and develop solely with Python, 
> SQLAlchemy and Flask. One of the key principles of our company is to 
> contribute to our chosen open source technologies as much as possible. There 
> are couple of interesting projects I would like to hear some feedback:
> 
> 
> SQLAlchemy-Utils
> https://github.com/kvesteri/sqlalchemy-utils 
> 
> Provides number of things for SQLAlchemy. Some highlights:
> Batch fetching utilities (experimental at the moment)
> We had a real life scenario were join loading, lazy loading and subquery 
> loading were all too slow. 
> I used this https://groups.google.com/forum/#!topic/sqlalchemy/vHQlm0U5f2k as 
> an inspiration. I agree this is a bottomless hole but I’m nevertheless 
> willing to try to make something useful for SQLAlchemy users. :)
> Number of new datatypes
> 
> 
> SQLAlchemy-Continuum
> https://github.com/kvesteri/sqlalchemy-continuum
> 
> Hibernate Envers style versioning for SQLAlchemy declarative models.
> 
> 
> WTForms-Alchemy
> https://github.com/kvesteri/wtforms-alchemy
> 
> Easily create WTForms forms from SQLAlchemy declarative models.

These are all really interesting projects and I had a lot of thoughts looking 
at all of them just briefly.   I also wonder at what points within here 
should/can some of this be part of SQLA itself, or not.  Here's my notes:

wtforms:

1. in all cases, use inspect(cls) to get at a Mapper.  if on 0.7, use 
class_mapper(cls).   but it would be great if you could target 0.8 and up as a 
lot of functions were added for exactly these kinds of use cases (See 
http://docs.sqlalchemy.org/en/rel_0_8/faq.html#how-do-i-get-a-list-of-all-columns-relationships-mapped-attributes-etc-given-a-mapped-class)

2. ClassManager.values() is not terrible, but again isn't super "public".  you 
can use mapper.attrs as well as mapper.column_attrs and others as of 0.8.

versioning:

1. wow this is really ambitious.

2. unit_of_work.py: you can tell if a statement is an INSERT/DELETE usually by 
looking at context.isinsert/context.isdelete, and also the table name 
you can get from context.statement.table (something like that).  Similar things 
can be done where I see you're regexping the DELETE
later on.   Digging into the string is fine but once you're targeting the broad 
spectrum of scenarios, like users that are adding SQL comments and such to 
their SQL, backends that don't actually use SQL, you want to stick with 
inspecting the expression trees as much as possible.

3. make schema object names configurable, i.e. "transaction_id"

4. This code looks great but I'd still be scared to use it, because versioning 
is such a particular thing, not to mention
interactions with other schema complexities.    But I don't say that to be 
discouraging, just to state how non-trivial a problem 
this is.   When i do versioning for real, there's always weird quirks and 
things 
specific to the app, which are easier to hardcode in my versioning code rather 
than having to configure a 3rd party library to do it.
it's why i kept it as just an "example" in SQLA itself, it's a huge job...   
but if you can make this extension successful,
that'll be very impressive.   In the docs it would be nice if I could see 
immediately what happens to the SQL schema when I use this.

sqlalchemy_utils:

1. have coercion_listener configure itself?   coercion_listener.configure().  
since it's global usually,
and you could always pass a target base class to configure() as an option.

2. ScalarListType vs. Postgresql ARRAY ?   same/better?  should SLT use ARRAY 
on a PG backend ?

3. operators for types!   I see these are mostly string storage but you can 
start adding special operations as
needed using TypeEngine.Comparator: 
http://docs.sqlalchemy.org/en/rel_0_8/core/types.html#types-operators  e.g. 
look at all the operations that pg.ARRAY has (contains, indexed access, 
concatenation, etc.). then you can make all these types *really* slick.

4a. batch_fetch - hmmmmmmmmmmmm.....  I see the idea is avoid JOIN by just 
feeding the keys into an IN (caveat there, IN works well for small lists, but 
less so for large - Oracle at least limits their size to 1000, when I have to 
use batch IN I will actually batch within the IN itself in groups of 500 or 
so).   You know you could build this as a loader strategy.  an API overhaul of 
that system
is coming up but the LoaderStrategy API shouldn't change much.   Then you could 
just say query.options(batch_load_all("a.b.c")) like
any other option.   LoaderStrategy isn't an API that people use often but it is 
extensible, and 0.9 it's even nicer already, with more to come.   I recently 
posted about it on the development list, if you want to check out 
sqlalchemy-devel (it's a pretty dead list but I'd value your input).    I will 
note that the subquery loader strategy, which this is very similar to,  was 
*very* hard to get working in all cases, compared to how it was super easy to 
get working for simple cases.   because relationship() is *so* flexible, esp. 
with things like self-referential inheritance setups, loading gets very hard 
very quick.

4b. does that IN do tuple lookups for composite primary keys?   you can get 
that by saying tuple_(*keys).in_(list_of_tuples).

5. sort_query - I think you can avoid the private attribute access if you use 
query.column_descriptions: 
http://docs.sqlalchemy.org/en/rel_0_8/orm/query.html?highlight=query.column_descriptions#sqlalchemy.orm.query.Query.column_descriptions
 I added that for a similar purpose as the one here.

Overall: maybe you want to "brand" all these packages under a common package 
using namespace packages? (http://www.python.org/dev/peps/pep-0420/ is the 
future of it, you still need the explicit directives for now).   I do that for 
"dogpile.", e.g. "dogpile.cache", "dogpile.core", etc.  like monkeysql.wtforms, 
monkeysql.utils, monkeysql.versioning etc. (not sure of the role of the Monkey 
character in your canon...).  since your packages do have some dependencies 
between them.

thanks very much for supporting the project, can we get FastMonkey and such up 
on http://www.sqlalchemy.org/organizations.html ?     looks great !


- mike

signature.asc
Description: Message signed with OpenPGP using GPGMail

Re: [sqlalchemy] Feedback appreciated

Reply via email to