Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine

Simon King Fri, 12 Jan 2018 01:54:51 -0800

If I understand your code correctly, scheduler.propagate() creates a
large number of Passage instances, and you only want a small subset of
them to be added to the database. Is that correct?


I would guess that the passages are getting added to the session
because you are setting their 'satellite' property to point to a
Satellite which is already in the database. This then causes the
passages to be added to the session due to the default cascade rules
on the relationship
(http://docs.sqlalchemy.org/en/latest/orm/cascades.html).

If that really is the case, you can change the cascade rules for that
relationship, and then you'll probably need to explicitly add the
passages you want to *keep* to the session instead.

Hope that helps,

Simon

On Fri, Jan 12, 2018 at 2:10 AM, Mike Bayer <[email protected]> wrote:
> I can't give you much detail except to say the unique object recipe is
> doing an .add() when it finds an identity that isn't taken, if you
> don't want those persisted then take out the part of the recipe doing
> add().  However, you'd need to alter the recipe further such that if
> the program asks for that same identity again which you didn't want to
> flush to the DB, and you'd like to use the same object, you need to
> pull that from some kind of local dictionary of "pending" objects with
> those identities, if that makes sense.
>
> the second email with the after_attach thing implies you are already
> adding an object to the Session.
>
> Neither of these code examples show example of use, where you are
> doing things that make objects and you'd like them to not be
> persisted.   If you need to create unique objects in memory without
> persisting, you just need to store them in some dictionary that sets
> up the in-memory uniqueness you are looking for.
>
>
>
> On Thu, Jan 11, 2018 at 11:37 AM, Ruben Di Battista
> <[email protected]> wrote:
>> Last copy paste went wrong.
>>
>> The uniqueness is ensured by:
>>
>> @event.listens_for(orm.session.Session, "after_attach")
>> def after_attach(session, instance):
>>     # when ConstrainedSatellite objects are attached to a Session,
>>     # figure out if in the database there's already the Constraint,
>>     # requested, if yes return that object, if not create a new one.
>>     # This is an adaptation of the UniqueObject pattern
>>     # suggested by SQLAlchemy documentation
>>     # https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/UniqueObject
>>     if isinstance(instance, UniqueAssociationProxy):
>>         instance = instance.ensure_unicity(session)
>>
>>
>>
>>
>> On Thursday, January 4, 2018 at 6:05:38 PM UTC+1, Ruben Di Battista wrote:
>>>
>>> Hello,
>>> I'm writing a satellite passage scheduler that has a database persistence
>>> layer to store the scheduled passages.
>>>
>>> The DB schema is organized as follows:
>>> - A table storing the satellites (using NORAD No as Primary Key)
>>> - A table storing the ground stations where to compute the passages of the
>>> satellites
>>> - A table storing the passages of these satellites, with two foreign keys
>>> linking each passage to a Ground Station and a Satellite
>>> - A table storing all the types of constraints a satellite can have
>>> - A table storing all the types of weights (used to perform the
>>> scheduling) that can be assigned to each satellite
>>>
>>> Than I configured some association proxies (with the related `Unique
>>> Object` pattern) in order to assign the weights and the satellites as a
>>> dictionary
>>>
>>> sat.constraints['min_elevation']= 10
>>>
>>> The details of the relationships here below:
>>>
>>> # Relationship definitions
>>> orm.mapper(Satellite, satellite, properties={
>>>     'passages': orm.relationship(Passage,
>>>                                  backref='satellite',
>>>                                  order_by=passage.c.aos,
>>>                                  cascade='all, delete-orphan'),
>>>
>>>
>>>     '_constraints': orm.relationship(
>>>         ConstrainedSatellite, backref='satellite',
>>>
>>> collection_class=orm.collections.attribute_mapped_collection('name'),
>>>         cascade='all, delete-orphan'),
>>>
>>>
>>>     '_weights': orm.relationship(
>>>         WeightedSatellite, backref='satellite',
>>>
>>> collection_class=orm.collections.attribute_mapped_collection('name'),
>>>         lazy='joined',
>>>         cascade='all, delete-orphan'),
>>>
>>>
>>>     '_tle': satellite.c.tle
>>>
>>>
>>> })
>>>
>>>
>>> orm.mapper(Constraint, constraint, properties={
>>>     'satellites': orm.relationship(ConstrainedSatellite,
>>> backref='constraint')
>>>
>>>
>>> })
>>>
>>>
>>> orm.mapper(Weight, weight, properties={
>>>     'satellites': orm.relationship(WeightedSatellite, backref='weight')
>>> })
>>>
>>>
>>>
>>>
>>> orm.mapper(ConstrainedSatellite, constraint_satellite)
>>>
>>>
>>> orm.mapper(WeightedSatellite, weight_satellite)
>>>
>>>
>>> orm.mapper(PassageData, passage_data)
>>>
>>>
>>> orm.mapper(Passage, passage, properties={
>>>     'angles': orm.relationship(PassageData, backref='passage',
>>>                                order_by=passage_data.c.time,
>>>                                cascade='all, delete-orphan')
>>>
>>>
>>> },
>>>     confirm_deleted_rows=False
>>> )
>>>
>>>
>>> orm.mapper(GroundStation, ground_station, properties={
>>>     'passages': orm.relationship(Passage, backref='ground_station',
>>>                                  order_by=passage.c.aos,
>>>                                  cascade='all, delete-orphan')
>>> })
>>>
>>>
>>>
>>>
>>> # Association Proxies
>>> Satellite.constraints = association_proxy(
>>>     '_constraints', 'value', creator=constrained_sat_creator
>>> )
>>>
>>>
>>> Satellite.weights = association_proxy(
>>>     '_weights', 'value', creator=weighted_sat_creator
>>> )
>>>
>>>
>>>
>>>
>>> ConstrainedSatellite.constraint_name = association_proxy('constraint',
>>> 'name')
>>> WeightedSatellite.weight_name = association_proxy('weight', 'name')
>>>
>>>
>>>
>>>
>>> From the tests everything is working as expected. The problem is that I'm
>>> getting some performance issues while performing the scheduling procedure:
>>> this procedure needs to read the `constraints` and `weights` associated to
>>> each satellite to select the best passages among all the ones possible on
>>> each ground station. While reading the values of the `constraints` and
>>> `weights`, SQLAlchemy needs to store the Passage object in the Database. So
>>> all the passages are stored in the database during the propagation of all
>>> the possible passages, and then I need to manually expunge or delete the
>>> passages that haven't been scheduled from the DB.
>>>
>>> What I would like to achieve is to "shut down", after selecting the ground
>>> stations and the satellites for which to perform the optimization, the
>>> SQLAlchemy persistence in order to perform the optimization procedure only
>>> on Python objects, without having SQLAlchemy to store them in the DB, and
>>> then just storing efficiently the optimized, smaller, list of them that are
>>> computed by the scheduling algorithm.
>>>
>>> Currently the scheduling script is like this (schematically):
>>>
>>>
>>>     # Create a DB session
>>>     session_factory = orm.sessionmaker(db)
>>>     session = orm.scoped_session(session_factory)
>>>
>>>
>>>     # Retrieve satellites from DB
>>>     sats = session.query(Satellite).all()
>>>
>>>
>>>     # Retrieve gss
>>>     ground_stations = session.query(GroundStation).all()
>>>
>>>
>>>     # Init Scheduler instance
>>>     scheduler = Scheduler(sats, ground_stations, start_day)
>>>
>>>
>>>     # This methods generates all the possible passages of all satellites
>>> on all ground stations
>>>     # it needs to read `constraints` and `weights` for each satellite to
>>> perform what it needs.
>>>     # currently all the passages get stored in the DB
>>>     all_passages = scheduler.propagate()
>>>
>>>
>>>     # This method selects from the totality of all_passages, a subset that
>>> fulfills all
>>>     # the constraints.
>>>     scheduled_passages = prop.schedule(all_passages, iterations, pool)
>>>
>>>
>>>     # Remove from session passages not scheduled
>>>     # === This is the thing I would like to avoid since it's slow ===
>>>     for passage in all_passages:
>>>         if passage not in scheduler_passages:
>>>             session.expunge(passage)
>>>             session.delete(passage)
>>>
>>>
>>>     session.commit()
>>>     session.close()
>>>
>>> As you can see, I need to manually expunge or delete the passages that are
>>> not scheduled by the algorithm. If the number of passages is huge, that
>>> means performing a DELETE query for each passage, that is slow. What I would
>>> like to achieve is to have `scheduler.propagate`, and `scheduler.schedule`
>>> method to perform operations only on Python objects (fully loaded with all
>>> the correct parameters from the DB), avoiding the need to store and then
>>> delete all the passages in the DB. Can I "shut down" temporarily the DB
>>> persistence?
>>>
>>> Thanks for you help,
>>> RdB
>>>
>>> PS: I'm on MySQL (MariaDB)
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>> --
>> SQLAlchemy -
>> The Python SQL Toolkit and Object Relational Mapper
>>
>> http://www.sqlalchemy.org/
>>
>> To post example code, please provide an MCVE: Minimal, Complete, and
>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>> description.
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example.  See  http://stackoverflow.com/help/mcve for a full 
> description.
> ---
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] Re: Temporarily disable DB persistence for optimization routine

Reply via email to