The two instances don't have the same ID; uniqueness is tested on two other
columns:

> session.query(cls).filter(cls.the_unique_column==new_object.the_unique_column).first()

Is there a way to get the tuple (usually a pair) of columns involved in a
unique constraint? If there is, then the line of code above could be applied
generically to all models, which would be sweet :)

Something like:
for row in csv:
    new_object = make_an_object_from_csv_row(row)
    existing_object = get_existing_object(new_object)
    if existing_object is not None:
        new_object.id = existing_object.id
    new_object = session.merge(new_object)
session.commit()
def get_existing_object(new_object):
    # pseudocode: one query per unique constraint, ANDing its columns
    unique_constraints = [constraint for constraint in cls.get_constraints()
                          if constraint.get_type() == UniqueConstraint]
    for constraint in unique_constraints:
        query = session.query(cls)
        for column in constraint.columns:
            query = query.filter(cls.get_attr(column) ==
                                 new_object.get_attr(column))
        existing = query.first()
        if existing is not None:
            return existing
    return None
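Digging a bit, a concrete version of the above seems possible through
`Table.constraints` (a sketch against a modern SQLAlchemy; the City model is
made up for illustration):

```python
# Sketch: find the column tuples of each UniqueConstraint on a mapped class.
from sqlalchemy import Column, Integer, String, UniqueConstraint
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class City(Base):
    __tablename__ = "city"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    country = Column(String(50))
    __table_args__ = (UniqueConstraint("name", "country"),)

def unique_column_tuples(cls):
    """One tuple of column names per UniqueConstraint on the mapped table."""
    return [
        tuple(col.name for col in constraint.columns)
        for constraint in cls.__table__.constraints
        if isinstance(constraint, UniqueConstraint)
    ]

def get_existing_object(session, cls, new_object):
    """First persisted row colliding with new_object on any unique key."""
    for columns in unique_column_tuples(cls):
        query = session.query(cls)
        for name in columns:
            query = query.filter(getattr(cls, name) == getattr(new_object, name))
        existing = query.first()
        if existing is not None:
            return existing
    return None
```

`Table.constraints` also contains the PrimaryKeyConstraint, hence the
isinstance filter.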
--- On Thu, 9/9/10, Michael Bayer <[email protected]> wrote:
> From: Michael Bayer <[email protected]>
> Subject: Re: [sqlalchemy] Copying instances from old to new schema DB
> To: [email protected]
> Date: Thursday, September 9, 2010, 9:54 AM
>
> On Sep 9, 2010, at 11:31 AM, chaouche yacine wrote:
>
> > Thank you Michael, thanks for the catch-the-error part. How about the
> > set-relations-right part, if I decided to go with my fix-as-you-go
> > recipe? Do you have any idea? The dictionary approach seems good, but I
> > am still curious about how to set relations generically on models. I
> > think I'll use it somewhere else in my code.
>
> You don't need to set any relations. If your new B has a fully populated
> primary key, you should just be using session.merge() for the whole thing.
> This can work with your "integrity error" scheme, or more simply with the
> "select first" schemes below:
>
>
> for row in csv:
>     new_object = make_an_object_from_csv_row(row)
>     new_object = session.merge(new_object)
> session.commit()
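A self-contained sketch of what that merge() loop relies on, with a made-up
City model on an in-memory SQLite database (modern SQLAlchemy assumed):

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class City(Base):
    __tablename__ = "city"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))

engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# simulate a pre-existing row
session.add(City(id=1, name="Algiers"))
session.commit()

# an incoming object with a fully populated primary key: merge() turns it
# into an UPDATE of the existing row rather than a duplicate INSERT
incoming = City(id=1, name="Alger")
merged = session.merge(incoming)
session.commit()
```

merge() never adds the object you pass it; it returns the session-attached
copy, which is why the loop reassigns new_object.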
>
>
> if new_object does *not* have the correct primary key, then:
>
> for row in csv:
>     new_object = make_an_object_from_csv_row(row)
>     existing_object = session.query(cls).filter(
>         cls.the_unique_column == new_object.the_unique_column).first()
>     if existing_object is not None:
>         new_object.id = existing_object.id
>     new_object = session.merge(new_object)
> session.commit()
>
>
> if you don't like the many individual SELECT statements, then:
>
> # load a dictionary of (unique attr, primary key)
> lookup = dict(
>     session.query(cls.the_unique_column, cls.id)
> )
> for row in csv:
>     new_object = make_an_object_from_csv_row(row)
>     if new_object.the_unique_column in lookup:
>         new_object.id = lookup[new_object.the_unique_column]
>     new_object = session.merge(new_object)
> session.commit()
>
> or if most of your rows are replacement rows, quicker to
> preload everything:
>
>
> # load a dictionary of (unique attr, instance)
> lookup = dict(
>     session.query(cls.the_unique_column, cls)
> )
> for row in csv:
>     new_object = make_an_object_from_csv_row(row)
>     if new_object.the_unique_column in lookup:
>         new_object.id = lookup[new_object.the_unique_column].id
>     new_object = session.merge(new_object)
> session.commit()
>
>
>
>
>
> >
> > By the way, I think the pseudocode should have been:
> >
> > instance = next_instance_from_csv()
> > try:
> >     session.commit()
> > except IntegrityError, e:
> >     session.rollback()
> >     if e.orig == UniqueConstraintError:
> >         original_instance = ModelClass.get(instance.id)
> >         for relation_name in instance.get_relation_names():
> >             # Is this correct ?
> >             path_to_attribute = "mapper." + relation_name + ".inverse"  # I don't know where to find this ?
> >             instance.set_attribute(path_to_attribute, original_instance)
> >         session.commit()
> >
> > class BaseModel(DeclarativeBase):
> >     ...
> >     def set_attribute(self, path, value):
> >         nodes = path.split(".")
> >         current_node = self
> >         for next_node in nodes[:-1]:
> >             current_node = getattr(current_node, next_node, None)
> >             if current_node is None:
> >                 raise InvalidAttribute
> >         set_attribute(current_node, nodes[-1], value)  # sqla's set_attribute, for whatever reason...
> >
> >
> > Y.Chaouche
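For the get_relation_names() piece quoted above, modern SQLAlchemy can answer
it through the inspection API; the City/Citizen models here are made up for
illustration:

```python
# Sketch: list the relationship() attribute names of a mapped class.
from sqlalchemy import Column, ForeignKey, Integer, String, inspect
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class City(Base):
    __tablename__ = "city"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    citizens = relationship("Citizen", back_populates="city")

class Citizen(Base):
    __tablename__ = "citizen"
    id = Column(Integer, primary_key=True)
    city_id = Column(Integer, ForeignKey("city.id"))
    city = relationship("City", back_populates="citizens")

def get_relation_names(cls):
    """Names of all relationship() attributes on a mapped class."""
    return [rel.key for rel in inspect(cls).relationships]
```

inspect(cls) returns the Mapper, whose .relationships collection carries one
RelationshipProperty per relationship().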
> >
> >
> > --- On Thu, 9/9/10, Michael Bayer <[email protected]> wrote:
> >
> >> From: Michael Bayer <[email protected]>
> >> Subject: Re: [sqlalchemy] Copying instances from old to new schema DB
> >> To: [email protected]
> >> Date: Thursday, September 9, 2010, 5:59 AM
> >>
> >> On Sep 9, 2010, at 6:31 AM, chaouche yacine wrote:
> >>
> >>> Hello list,
> >>>
> >>> My schema has changed, and now I want to retrieve my old data (of the
> >>> old schema) into the new database (with the new schema) from csv files
> >>> (I export the tables of the old database to csv files and then load
> >>> those csv files into the new database, with some column mapping and
> >>> some gap filling for new columns).
> >>>
> >>> The problem is that my new schema has a UniqueConstraint for some
> >>> tables (e.g. the name column of the city table should be unique within
> >>> a country). This constraint was not present in the old schema and the
> >>> data is corrupt (two cities with the same name in the same country).
> >>> So when I try to insert them into the new database, I get
> >>> IntegrityErrors.
> >>>
> >>> The solution I thought of was:
> >>> * Catch the IntegrityError exception.
> >>> * If it's a problem on a UniqueConstraint, then the exception was
> >>>   raised because I tried to insert instance B that has the same "key"
> >>>   as instance A that was inserted before.
> >>> * So for all children of B (B's relations), set their parent to A.
> >>>   For example, for all citizens of B, set their city to A, because A
> >>>   and B ought to be the same.
> >>> * Then, safely ignore B and move on to the next instance.
> >>>
> >>> Here's what has been done so far (it works; I just use pseudocode for
> >>> "illustration" purposes. If necessary, you can look at the actual
> >>> attached source files):
> >>>
> >>> line = csvloader.next_row()
> >>> ModelClass = get_current_model()
> >>> instance = ModelClass.create_instance(**(to_dict(line)))
> >>> session.add(instance)
> >>>
> >>> I wish I could do something like this:
> >>> try:
> >>>     session.commit()
> >>> except IntegrityError, e:
> >>>     session.rollback()
> >>>     error = get_error()
> >>>     if type_of(error) == UniqueConstraintError:
> >>>         original_instance = ModelClass.get(instance.id)
> >>>         for relation in instance.get_relations():
> >>>             # Is this correct ?
> >>>             instance.relation.inverse = original_instance
> >>>         session.commit()
> >>>
> >>> My questions are: how to write get_error and type_of, where to get
> >>> UniqueConstraintError, how to write get_relations, how to set the
> >>> inverse of a relation (is instance.relation.inverse the right thing
> >>> to set?), and is this approach correct?
> >>
> >> There's no portable way to detect "unique constraint" errors across
> >> DBAPIs; you'd have to catch the specific error that you've observed
> >> your DBAPI emits, as well as the message/codes contained within it,
> >> and code against that. SQLAlchemy wraps DBAPI exceptions in its own
> >> same-named wrapper, and the object emitted by the DBAPI is available
> >> under the ".orig" member.
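A runnable sketch of that, with a made-up City model on in-memory SQLite;
the string matching against e.orig is backend-specific, which is exactly the
non-portability being described:

```python
from sqlalchemy import Column, Integer, String, UniqueConstraint, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class City(Base):
    __tablename__ = "city"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    country = Column(String(50))
    __table_args__ = (UniqueConstraint("name", "country"),)

engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

def is_unique_violation(exc):
    # exc.orig is the raw DBAPI exception; SQLite's message reads
    # "UNIQUE constraint failed: ...", but the message/code differs
    # per backend, so adapt this check to your DBAPI.
    return "unique" in str(exc.orig).lower()

session.add(City(name="Algiers", country="Algeria"))
session.commit()

session.add(City(name="Algiers", country="Algeria"))  # duplicate key
try:
    session.commit()
    duplicate_detected = False
except IntegrityError as e:
    session.rollback()
    duplicate_detected = is_unique_violation(e)
```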
> >>
> >> One improvement to the recipe would be to use savepoints, if supported
> >> by your database, so that when you issue rollback() inside a savepoint
> >> block, the whole transaction isn't rolled back and you can keep the
> >> whole operation in one transaction. The session can start a savepoint
> >> using begin_nested().
> >>
> >> When I do csv loads like these, however, I usually load the full list
> >> of identifiers to be checked into memory ahead of time. If the csv is
> >> less than 100K rows and represents the full table of data, I just load
> >> the whole thing into a dictionary, keyed according to whatever "these
> >> are the fields that are unique" I'm dealing with, and consult the
> >> dictionary as I go along. Otherwise, I load individual blocks of data
> >> as I scan through portions of the csv (like, give me all the records
> >> where id in (a, b, c, d, e, f, g, ...) for this block). There's
> >> nothing "wrong" with the "catch-the-error" approach, except that I
> >> like an entire operation to be in a single transaction, and it's
> >> usually simpler and faster just to work with a Python dictionary.
> >>
> >>
> >>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> <loadCSV.py><db.py>
> >>
> >
> >
> >
> >
>
--
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/sqlalchemy?hl=en.