Michael Bayer wrote:
Mati Skiva wrote:
I believe I took these issues into account.
But just to be sure, maybe you can highlight something for me.
I assume the followings about the process of session.add:
* after session.add is called, the objects are placed in a to-do pool
(maybe)
* for self-generating-id objects, a special process is used, one that
fetches the generated id after the insert
* for no self-generating-id objects, a regular process of insert is used
I came to this conclusion, because otherwise, after each insert all the
data of the row is retrieved and placed inside the object. Which covers
the newly generated id.
Since you're interested in hacking SQLAlchemy internals, it would helpful
if you could familiarize yourself fully with the SQL expression and
execution API, all of which is public and documented. "id generation" is
a standard feature of this system and occurs outside of the ORM. Fetching
the ID for a new row with the SQLA expression API is a matter of calling a
ResultProxy method which the ORM only calls if it doesn't already know the
full set of primary key columns for a specific object. The backend
implementation of "id fetching" varies wildly across all dialects as well
as versions of each database in use.
The flush procedure updates or expires those fields of the object which
were known to be generated by the database. Fields which are immediately
updated include some or all of the primary key columns. Fields which are
expired include any other columns which were generated via server-side SQL
expressions or known schema-level defaults - these are fetched when the
object's expired attributes are next accessed, if at all.
The SQL compiler procedure is aware of these columns when it generates an
individual insert() construct, given a list of parameters which are to be
bound literals and others which are to be embedded SQL expressions. When
executed, the ResultProxy API provides these details fully - you should
familiarize yourself with all of its methods.
I would also point you to http://www.sqlalchemy.org/trac/ticket/1518 ,
which suggests that we should open up Mapper to being subclassable (for
much more exotic purposes than what you have here). However, this
subclassing is possible now, which is where you can have your "insertmany"
functionality rolled in a rudimental way for your immediate use case
without burdening SQLAlchemy core with the need to have a fully
tested/supported/performant feature. See the attached example.
Thank you for the pointers, as well as the example.
I attempted using it, however quickly enough I was forced to dig into
the internals. I shall explain why.
from inspecting mapper.py's _save_obj method, I found that after every
insert the following items are required:
* value_params - used by _postfetch
* params - used by _postfetch
* ResultProxy.last_inserted_params() - used by _postfetch
* ResultProxy.postfetch_cols() - used within _postfetch
* ResultProxy.prefetch_cols() - used within _postfetch
* ResultProxy.last_inserted_ids() - used by by _save_obj
All of these items shared the same problem, they are all dictionaries or
lists. One for each insert, making them single-dimension data (in
respect to the insert).
But when executing many, they became either None or kept their
single-dimension property (i.e. for two inserts, only one list is
returned, when I expect a list of lists)
That alone prevented me from overloading _save_obj. Because I am
required to have these items for every inserted row.
So, I set upon transforming these items into multi-dimensional upon
execute-many.
It brought me to default.py's __process_defaults where I implemented the
followings:
def __process_defaults(self):
"""generate default values for compiled insert/update statements,
and generate last_inserted_ids() collection."""
if self.executemany:
if len(self.compiled.prefetch):
drunner = self.dialect.defaultrunner(self)
params = self.compiled_parameters
for param in params:
# assign each dict of params to
self.compiled_parameters;
# this allows user-defined default generators to
access the full
# set of bind params for the row
self.compiled_parameters = param
for c in self.compiled.prefetch:
if self.isinsert:
val = drunner.get_column_default(c)
else:
val = drunner.get_column_onupdate(c)
if val is not None:
param[c.key] = val
self.compiled_parameters = params
if self.isinsert:
self._last_inserted_ids = []
self._last_inserted_params = []
for compiled_parameters in self.compiled_parameters:
drunner = self.dialect.defaultrunner(self)
for c in self.compiled.prefetch:
if self.isinsert:
val = drunner.get_column_default(c)
if val is not None:
compiled_parameters[c.key] = val
if self.isinsert:
self._last_inserted_ids.append([compiled_parameters.get(c.key, None) for
c in self.compiled.statement.table.primary_key])
self._last_inserted_params.append(compiled_parameters)
self.postfetch_cols = self.compiled.postfetch
self.prefetch_cols =
self.compiled.prefetch
else: ... #rest of the code stays the same
I have one reservations:
* Is there any code that relies on some of the added attributes not to
exists upon execute-many?
I was forced to make one extra change, for sqlite.py's post_exec method.
Basically, populating the self._last_inserted_ids attribute. But for
some reason self.cursor.lastrowid is None for execute-many (while it can
actually be retrieved from the database server still)
So, these changes yielded...
def post_exec(self):
if ( self.cursor.lastrowid == None ):
self.cursor.execute("SELECT last_insert_rowid()")
lastrowid = self.cursor.fetchone()[0]
else:
lastrowid = self.cursor.lastrowid
if self.compiled.isinsert:
if ( not self.executemany ):
if not len(self._last_inserted_ids) or
self._last_inserted_ids[0] is None:
self._last_inserted_ids = [lastrowid] +
self._last_inserted_ids[1:]
else:
for i in range(len(self._last_inserted_ids)):
_last_inserted_ids = self._last_inserted_ids[i]
if not len(_last_inserted_ids) or
_last_inserted_ids[0] is None:
_last_inserted_ids = [lastrowid] +
_last_inserted_ids[1:]
self._last_inserted_ids[i] = _last_inserted_ids
And than, everything was working great :)
And I mean everything, including default values, sql expressions and
inserted ids.
Would you care to offer your opinion about my efforts?
Thanks in advance,
Mati
This mail was sent via Mobileye Mail-SeCure system.
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en.