Re: [sqlalchemy] Why mapper = public_factory(Mapper, ".orm.mapper") pattern is needed?

Michael Bayer Wed, 24 Dec 2014 08:29:57 -0800

If it’s giving you problems, it may be because the __module__ attribute is not 
set in 0.9; it is set in 1.0.

SQLAlchemy since the beginning has had this kind of pattern:

def a_thing(*args):
    """Produce a Thing."""

    if condition:
        return Thing(something)
    else:
        return Thing(something_else)

That is, lots of factory functions separate from the classes. One reason is 
that this allows a function to sometimes return a composite of several objects 
at once, or objects with various settings; another is that we can change what 
the functions return without impacting compatibility. Being able to switch 
things around like this is key, so for SQLAlchemy’s “DSL”-ness, the majority of 
constructs we deal with are invoked from lowercase names.
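
As a rough illustration of that flexibility (Thing and CompositeThing here are 
made-up classes, not anything in SQLAlchemy), a factory function can hand back 
different objects depending on its arguments, and callers never need to know:

```python
class Thing:
    """Hypothetical plain construct."""

    def __init__(self, value):
        self.value = value


class CompositeThing(Thing):
    """Hypothetical construct that wraps several Things at once."""

    def __init__(self, *things):
        super().__init__([t.value for t in things])
        self.things = things


def a_thing(*args):
    """Produce a Thing, or a CompositeThing when given several values."""
    if len(args) == 1:
        return Thing(args[0])
    return CompositeThing(*(Thing(a) for a in args))


single = a_thing(1)        # a plain Thing
many = a_thing(1, 2, 3)    # a CompositeThing, same lowercase entry point
```

What a_thing() returns could change again later without any calling code 
having to change.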

But in practice, the majority of these function / class combos turned out to be 
a lot of extra verbiage; in most cases, it was just a function that called into 
the class constructor. 

So what do we do about that? One option is to name the classes themselves with 
all_lowercase() names; I see this in the Python stdlib a lot. But I *hate* when 
actual Python classes are named with all_lowercase(). Additionally, this 
seriously screws up documentation: the file location and/or __module__ of all 
these all-in-one pointers, which is what Sphinx autodoc uses for documentation, 
would now be deep within sqlalchemy.foo.bar.bat, and I don’t want the 
documentation of the public API to be exposed to all that depth, or to changes 
in location. All major functions are reported in coarse-grained buckets, like 
“sqlalchemy.sql.expression.<name>”; this is more the case on the Core side.

It also breaks code that does any kind of introspection, since functions and 
classes (via their constructors) are inspected entirely differently. If in 
Sphinx I have a reference like :class:`.mapper`, and then later mapper() 
becomes a function that calls Mapper, all the :class:`.mapper` links are now 
broken. We had this problem big time when I changed sessionmaker() to be a 
class. Establishing a rule that any direct-to-class function needs to just be 
the class constructor would be unmanageable.
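
To make the introspection point concrete, here is a small sketch using the 
stdlib inspect module. Mapper and mapper() below are hypothetical stand-ins 
with invented arguments, not the real SQLAlchemy objects:

```python
import inspect


class Mapper:
    """Hypothetical stand-in, not the real SQLAlchemy Mapper."""

    def __init__(self, class_, local_table=None):
        self.class_ = class_
        self.local_table = local_table


def mapper(class_, local_table=None):
    """Function-style entry point that just calls the class."""
    return Mapper(class_, local_table)


# A function and a class answer the basic introspection questions differently.
checks = {
    "mapper is a function": inspect.isfunction(mapper),
    "Mapper is a function": inspect.isfunction(Mapper),
    "Mapper is a class": inspect.isclass(Mapper),
}

# The arguments live on the function itself in one case, and on __init__
# in the other, where "self" also shows up.
function_args = inspect.getfullargspec(mapper).args
constructor_args = inspect.getfullargspec(Mapper.__init__).args
```

Any tool that was written against one of these shapes has to be adjusted when 
a name flips to the other.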

So that’s the end of part one - for the Core API, and a large chunk of ORM, we 
stick with all functions in the API, which call upon classes.

Now for part two.

So then we have this:

def text(…):
    """Produce a TextClause construct

    docs docs docs

    params

    """
    return TextClause()

text() is the front-facing API, it’s essential that it has those docs.  But 
then we have this:

class TextClause(ClauseElement):
    """A TextClause construct"""

    def __init__(…):
        """Produce a TextClause construct.

        docs docs docs (or no docs)

        params (or no params)
        """

Problem 1 - In order for me to document TextClause.__init__(), I have to *copy* 
all the documentation almost word for word from that of text().   Or leave 
__init__ totally without documentation.  Yuk.

Problem 2 - As far as Sphinx API documentation, I’m looking to keep these docs 
as generated as possible, that is, I want to use straight “.. autofunction::” 
and that’s it - I don’t want to type the documentation out a *third* time in an 
.rst file! (which is what python.org does, see 
https://docs.python.org/2/_sources/library/collections.txt, triple yuk!!) So 
the file location and/or __module__ of these objects is what Sphinx uses to 
report where you’d be importing this function from. So in order for me to have 
lots and lots of documented functions in sqlalchemy.sql.expression, 
sqlalchemy.sql.expression needs to be *huge*, because everything has to be 
there, and we have all those long docstrings, which were only getting longer.

Problem 3 - I have to maintain the *args and **kwargs of all the public API 
functions distinctly in two different places, the function and the constructor, 
as well as the list of :paramref:.  These can get out of sync.

Problem 4 - Now that our API documentation has become very rich, we have all of 
these classes in the documentation as well, so users can see these things. 
With the above, they will see documentation for “text()”, “TextClause”, and 
“TextClause.__init__() -> Produce a new TextClause() construct”. So then, 
wait, which API do I call: text(), or TextClause()? They are both documented. 
What’s the one obvious way to do it?
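
A minimal sketch of how problems 1 and 3 show up in practice. The classes and 
argument names are invented for the example (this is not the real text() 
signature); the point is only that the hand-written function and the 
constructor have to be kept in step manually:

```python
import inspect


class TextClause:
    """Hypothetical stand-in for the real construct."""

    def __init__(self, text, bind=None, new_option=False):
        """Produce a TextClause construct.

        :param text: the SQL string.
        :param bind: optional connectable.
        :param new_option: added later, and only documented here.
        """
        self.text = text
        self.bind = bind
        self.new_option = new_option


def text(text, bind=None):
    """Produce a TextClause construct.

    :param text: the SQL string.
    :param bind: optional connectable.
    """
    # Hand-written wrapper: nobody remembered to add new_option here.
    return TextClause(text, bind=bind)


def missing_arguments(function, cls):
    """Return constructor arguments that the wrapper function does not accept."""
    accepted = inspect.signature(function).parameters
    # Skip "self", which only exists on the constructor side.
    wanted = list(inspect.signature(cls.__init__).parameters)[1:]
    return [name for name in wanted if name not in accepted]


out_of_sync = missing_arguments(text, TextClause)  # ['new_option']
```

The two docstrings are near copies of each other, and the signatures have 
already drifted apart.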

public_factory() solves all of these problems amazingly in one fell swoop, and 
literally nobody has noticed (except for people dealing with the code or with 
the bug regarding __module__).

With public_factory(), I can:

1. write the docs for the function / class in just one place, in the __init__ 
method, and they are available immediately in the SQLAlchemy function 
interface.    

2. There’s no need to have huge docstrings filling up the one giant namespace 
where you want the actual public API functions to be imported from / documented 
as part of.

3. functions stay as functions, classes as classes, with no need to build out a 
separate “def foo(): """docs docs docs"""” elsewhere. In the case that the 
function just calls on the class, this is just the one-liners you see in 
sqlalchemy.sql.expression, for example.

4. support class-bound factory methods on a key class, document them, and have 
them automatically be part of the SQLAlchemy function interface, thereby 
allowing more of the use cases for a particular class to be in one place. See 
https://bitbucket.org/zzzeek/sqlalchemy/src/5659ecb2e8a4aac83a1eb9b2c5ea348f0077ca72/lib/sqlalchemy/sql/elements.py?at=master#cl-2394
for good examples of this; UnaryExpression provides for about five different 
constructors, so the code can stay with UnaryExpression and the public API is 
just a public_factory() declaration.

5. fix the “do I call the function or the __init__?” problem, as public_factory 
rewrites the docs for the __init__ as you see here: 
http://docs.sqlalchemy.org/en/latest/orm/mapping_api.html?highlight=mapper#sqlalchemy.orm.mapper.Mapper.__init__
The verbiage “This constructor is mirrored as a public API function; see 
mapper() 
<http://docs.sqlalchemy.org/en/latest/orm/mapping_api.html?highlight=mapper#sqlalchemy.orm.mapper>
for a full usage and argument description.” is automatically generated, so 
nobody is confused that this might be the function they’re supposed to be 
calling. In the source code, Mapper.__init__ is documented fully.

6. I can add new arguments to constructors, like a new Mapper argument, add the 
:paramref: right there below it in the docstring, and not have to worry at all 
about the mapper() function that is elsewhere; it is automatically exported 
with the correct signature and documentation.

7. I can now move all the classes in sqlalchemy.sql.expression to their own 
modules, without any concern that the public “import” space of the 
public-facing function will change; public_factory() allows me to just directly 
give each function the location that I want to display in the documentation.

public_factory() is basically a system to declaratively produce a fixed API 
with fully controllable documentation behavior and zero repetition of verbiage 
on top of a changing set of classes.    
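
For anyone who wants to see the shape of the idea without reading the real 
helper, here is a simplified sketch. It is not the SQLAlchemy implementation: 
it uses a closure plus inspect.signature() rather than exec-generated code, it 
only handles classes (not classmethods), and the "package" parameter and the 
Mapper class are assumptions made up for the example.

```python
import inspect


def public_factory_sketch(target, location, package="myproject"):
    """Return a plain function that constructs ``target``.

    ``package`` is a hypothetical parameter used only by this sketch.
    """
    init = target.__init__
    module_path, name = location.rsplit(".", 1)

    def factory(*args, **kwargs):
        return target(*args, **kwargs)

    # Report the constructor's arguments, minus "self".
    signature = inspect.signature(init)
    parameters = list(signature.parameters.values())[1:]
    factory.__signature__ = signature.replace(parameters=parameters)

    # The function takes over the full docs and the public location.
    factory.__name__ = factory.__qualname__ = name
    factory.__module__ = package + module_path
    factory.__doc__ = init.__doc__

    # The constructor keeps only a pointer to the function.
    init.__doc__ = (
        "Construct a new :class:`.%s` object.\n\n"
        "This constructor is mirrored as a public API function; see "
        ":func:`~%s` for a full usage and argument description."
        % (target.__name__, location)
    )
    return factory


class Mapper:
    """Hypothetical stand-in, not the real Mapper."""

    def __init__(self, class_, local_table=None):
        """Define the mapping of a class to a table.

        :param class_: the class to be mapped.
        :param local_table: the table it is mapped to.
        """
        self.class_ = class_
        self.local_table = local_table


mapper = public_factory_sketch(Mapper, ".orm.mapper")
```

The docs are written once, on Mapper.__init__, and mapper() reports the name, 
module, signature and docstring that the public API should show, wherever the 
class itself happens to live.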

Dmitry Mugtasimov wrote:

> As far as I can see from the code, public_factory() returns a function that 
> instantiates a given class. I cannot understand why this is useful. Why can't 
> we just instantiate the class directly as Mapper(...)? Could you please 
> explain? I need this in preparation for a pull request for Spyne (which 
> integrates with SQLAlchemy, but does not support custom mappers).
> 
> mapper = public_factory(Mapper, ".orm.mapper")
> 
> def public_factory(target, location):
>     """Produce a wrapping function for the given cls or classmethod.
> 
>     Rationale here is so that the __init__ method of the
>     class can serve as documentation for the function.
> 
>     """
>     if isinstance(target, type):
>         fn = target.__init__
>         callable_ = target
>         doc = "Construct a new :class:`.%s` object. \n\n"\
>             "This constructor is mirrored as a public API function; "\
>             "see :func:`~%s` "\
>             "for a full usage and argument description." % (
>                 target.__name__, location, )
>     else:
>         fn = callable_ = target
>         doc = "This function is mirrored; see :func:`~%s` "\
>             "for a description of arguments." % location
> 
>     location_name = location.split(".")[-1]
>     spec = compat.inspect_getfullargspec(fn)
>     del spec[0][0]
>     metadata = format_argspec_plus(spec, grouped=False)
>     metadata['name'] = location_name
>     code = """\
> def %(name)s(%(args)s):
>     return cls(%(apply_kw)s)
> """ % metadata
>     env = {'cls': callable_, 'symbol': symbol}
>     exec(code, env)
>     decorated = env[location_name]
>     decorated.__doc__ = fn.__doc__
>     if compat.py2k or hasattr(fn, '__func__'):
>         fn.__func__.__doc__ = doc
>     else:
>         fn.__doc__ = doc
>     return decorated
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected] 
> <mailto:[email protected]>.
> To post to this group, send email to [email protected] 
> <mailto:[email protected]>.
> Visit this group at http://groups.google.com/group/sqlalchemy 
> <http://groups.google.com/group/sqlalchemy>.
> For more options, visit https://groups.google.com/d/optout 
> <https://groups.google.com/d/optout>.

