[sqlalchemy] Re: Questions about "session"

Az Mon, 07 Jun 2010 17:28:08 -0700

> By default, deepcopy will make one copy of everything in the object
> graph reachable by the object you feed it. The scary part is that,
> unless you also pass in a /memo/ argument to each call to deepcopy, it
> will copy the entire graph /every single call/. So if you deepcopy the
> students dictionary and then deepcopy the projects dictionary, each
> student's allocated_proj attribute will not match any instance in the
> projects dictionary. This is why a use-case-specific copy function is
> recommended: it is a lot easier to predict which objects will get copied
> and which objects will be shared.


Shouldn't it match? I mean the student can only get allocated a
project if it exists in the projects dictionary... or is that not the
point?
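
A small sketch of what the quoted paragraph describes, using toy Student and Project classes that are stand-ins for the real ones (only the attribute name allocated_proj comes from the thread). The copies hold equal data, but unless the two deepcopy calls share a memo dict they are different instances, so an identity check fails:

```python
import copy


class Project(object):
    def __init__(self, proj_id):
        self.proj_id = proj_id


class Student(object):
    def __init__(self, ee_id, allocated_proj):
        self.ee_id = ee_id
        self.allocated_proj = allocated_proj


projects = {1: Project(1)}
students = {100: Student(100, projects[1])}

# Two independent calls: each call builds its own copy of the Project.
students_a = copy.deepcopy(students)
projects_a = copy.deepcopy(projects)
separate = students_a[100].allocated_proj is projects_a[1]

# Passing the same memo dict to both calls lets the second call reuse
# the Project copy that the first call already made.
memo = {}
students_b = copy.deepcopy(students, memo)
projects_b = copy.deepcopy(projects, memo)
shared = students_b[100].allocated_proj is projects_b[1]

print(separate, shared)
```

So the ids still agree in both cases; what differs is whether the student's project and the dictionary's project are one object or two.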

By use-case-specific, you mean I'll have to redefine deepcopy inside
each class like this: def __deepcopy__(self): something, something?

The only two places where this is an issue are Supervisor's
"offered_proj" attribute (a set), where each project is an
object, and Project's "proj_sup", which is a supervisor
object :D
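
For reference, the hook that copy.deepcopy looks for takes a memo argument: __deepcopy__(self, memo). Below is a minimal sketch of one possible policy for the Supervisor/Project cycle, assuming it is acceptable for a copied project to share its supervisor instead of copying it. The constructors are simplified guesses, not the real class definitions:

```python
import copy


class Supervisor(object):
    def __init__(self, ee_id):
        self.ee_id = ee_id
        self.offered_proj = set()


class Project(object):
    def __init__(self, proj_id, proj_sup):
        self.proj_id = proj_id
        self.proj_sup = proj_sup
        proj_sup.offered_proj.add(self)

    def __deepcopy__(self, memo):
        # Build the copy without calling __init__, so the supervisor's
        # offered_proj set is not modified as a side effect.
        cls = self.__class__
        result = cls.__new__(cls)
        # Register the copy so later references to self reuse it.
        memo[id(self)] = result
        result.proj_id = self.proj_id
        # Assumption: share the supervisor rather than copying it.
        result.proj_sup = self.proj_sup
        return result


sup = Supervisor(7)
proj = Project(1, sup)
proj_copy = copy.deepcopy(proj)
```

With this policy the copy stops at the project: sup.offered_proj still contains only the original project, which may or may not be what the allocation code needs.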

The usefulness of my data structures comes back to bite me now...

> class Student(object):
>     [existing definitions]
>
>     def create_db_record(self):
>         result = StudentDBRecord()
>         result.ee_id = self.ee_id
>         [copy over other attributes]
>         return result
>
> class StudentDBRecord(object):
>     pass

The create_db_record function... does it have to be called explicitly
somewhere or does it run automatically?

If I now had to commit my Student data to the database... what would I
do?
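
As far as the quoted snippet goes, create_db_record is an ordinary method, so nothing runs it unless the code calls it. A minimal sketch of the explicit call, with a cut-down Student (only ee_id and name) and made-up sample data; the session lines at the end are left as comments because they assume StudentDBRecord has been mapped and a SQLAlchemy session is open:

```python
class StudentDBRecord(object):
    pass


class Student(object):
    def __init__(self, ee_id, name):
        self.ee_id = ee_id
        self.name = name

    def create_db_record(self):
        # Copy the plain attributes onto a separate record object.
        result = StudentDBRecord()
        result.ee_id = self.ee_id
        result.name = self.name
        return result


# Hypothetical sample data standing in for the real students dictionary.
students = {100: Student(100, "Ann"), 101: Student(101, "Bob")}

# Explicit call, once per student, after the algorithm has finished.
records = [s.create_db_record() for s in students.values()]

# Assuming StudentDBRecord is mapped and `session` is an open session:
#     session.add_all(records)
#     session.commit()
```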

> I think a primary key of
> (run_id, session_id/trial_id, stud_id) would be good

If I make them all primary keys I get a composite key right? Within an
entire M-C simulation the stud_id's would repeat in groups -- so if
there are 100 simulations, each stud_id appears 100 times in that
commit.

Run_id is a fantastic idea! I'd probably have it be the date and time?
Given that the simulation takes a while to run... the time will have
changed sufficiently for uniqueness. However, then querying becomes a
pain because of whatever format the date and time data will be in...
so in that case, what is a GUID and is that something we could give to
the Monte-Carlo ourselves before the run as some sort of argument? It
would be the same for an entire run but different from run to run (so
not unique from row to row, but unique from one run set to the other).
Any thoughts on this?
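
For context, a GUID (also called a UUID) is a 128-bit identifier that can be generated without coordination, and Python's standard uuid module produces them. A rough sketch of the run_id idea follows, using the stdlib sqlite3 module directly as a stand-in for the SQLAlchemy table; the table name, column names and the monte_carlo_basic signature are assumptions for illustration:

```python
import sqlite3
import uuid


def monte_carlo_basic(run_id, trials, student_ids):
    """Hypothetical stand-in: one row per (trial, student)."""
    return [(run_id, trial_id, stud_id)
            for trial_id in range(1, trials + 1)
            for stud_id in student_ids]


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sim_allocation (
        run_id   TEXT    NOT NULL,
        trial_id INTEGER NOT NULL,
        stud_id  INTEGER NOT NULL,
        PRIMARY KEY (run_id, trial_id, stud_id)
    )
""")

for _ in range(2):
    # One GUID per call: constant within a run, different between runs.
    run_id = str(uuid.uuid4())
    rows = monte_carlo_basic(run_id, trials=3, student_ids=[100, 101])
    conn.executemany("INSERT INTO sim_allocation VALUES (?, ?, ?)", rows)
conn.commit()

total = conn.execute(
    "SELECT COUNT(*) FROM sim_allocation").fetchone()[0]
runs = conn.execute(
    "SELECT COUNT(DISTINCT run_id) FROM sim_allocation").fetchone()[0]
```

The same stud_id repeats across trials and runs without a key clash because the three columns together form the composite key. A date/time could still be stored in its own column for display while the GUID does the grouping.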

> Never used it, sorry. In general, every UI toolkit has a message/event
> queue to which you can post messages from any thread. So you could do
> something like:
>
> result = monteCarloBasic(...)
>
> def runs_in_ui_thread():
>     update_database(result)
>
> ui_toolkit.post_callback(runs_in_ui_thread)

Thanks for that. Now I know what to search for (message, event queue,
callback) :)
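
For Tkinter, one commonly used pattern (a sketch, not something confirmed in this thread) is to let the worker thread put its result on a thread-safe queue and have the UI thread poll that queue with the widget method after(). The example uses Python 3 module names (queue, tkinter); on Python 2.6 the modules are Queue and Tkinter. monte_carlo_basic here is a dummy stand-in, and start_worker and poll_results are hypothetical helper names:

```python
import queue
import threading


def monte_carlo_basic(trials):
    """Dummy stand-in for the real simulation."""
    return [{"trial_id": t} for t in range(trials)]


def start_worker(results, trials):
    """Run the simulation in a background thread."""
    def work():
        # The worker never touches Tkinter, only the queue.
        results.put(monte_carlo_basic(trials))

    thread = threading.Thread(target=work)
    thread.start()
    return thread


def poll_results(root, results, on_result, interval_ms=100):
    """Runs in the UI thread; checks the queue without blocking."""
    try:
        result = results.get_nowait()
    except queue.Empty:
        # Nothing yet: ask the event loop to call us again later.
        root.after(interval_ms, poll_results,
                   root, results, on_result, interval_ms)
    else:
        on_result(result)


# Wiring it into a real Tkinter app (not executed here):
#     root = tkinter.Tk()
#     results = queue.Queue()
#     start_worker(results, trials=100)
#     root.after(100, poll_results, root, results, update_database)
#     root.mainloop()
```

The database update then happens inside on_result, which is only ever called from the UI thread.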


On Jun 7, 10:50 pm, Conor <[email protected]> wrote:
> On 06/07/2010 02:56 PM, Az wrote:
>
> >> Sounds good. Just beware that deepcopy will try to make copies of all
> >> the objects referenced by your StudentUnmapped objects (assuming you
> >> didn't define __deepcopy__), so you may end up copying projects,
> >> supervisors, etc.
>
> > Good point. I'm deepcopying my students, projects and supervisors
> > dictionaries. But yes you're right, all of them have a reference to
> > other objects.
>
> > [Q1:] How will deepcopying the objects referenced by my
> > StudentUnmapped object affect me?
>
> By default, deepcopy will make one copy of everything in the object
> graph reachable by the object you feed it. The scary part is that,
> unless you also pass in a /memo/ argument to each call to deepcopy, it
> will copy the entire graph /every single call/. So if you deepcopy the
> students dictionary and then deepcopy the projects dictionary, each
> student's allocated_proj attribute will not match any instance in the
> projects dictionary. This is why a use-case-specific copy function is
> recommended: it is a lot easier to predict which objects will get copied
> and which objects will be shared.
>
>
>
> > I also tried another structure for elegance...
>
> > class Student(object):
> >    def __init__(self, ee_id, name, stream_id, overall_proby):
> >            self.ee_id = ee_id
> >            self.name = name
> >            self.stream_id = stream_id
> >            self.preferences = collections.defaultdict(set)
> >            self.allocated_project = None
> >            self.allocated_proj_ref = None
> >            self.allocated_rank = None
> >            self.own_project_id = None
> >            self.own_project_sup = None
> >            self.overall_proby = overall_proby
>
> >    def __repr__(self):
> >            return str(self)
>
> >    def __str__(self):
> >            return "%s %s %s: %s (OP: %s)" %(self.ee_id, self.name,
> > self.allocated_rank, self.allocated_project, self.overall_proby)
>
> > class StudentDBRecord(Student):
> >    def __init__(self, student):
> >            super(StudentDBRecord, self).__init__(student.ee_id,
> >                                                  student.name,
> >                                                  student.stream_id,
> >                                                  student.preferences,
> >                                                  student.allocated_project,
> >                                                  student.allocated_proj_ref,
> >                                                  student.allocated_rank,
> >                                                  student.own_project_id,
> >                                                  student.own_project_sup,
> >                                                  student.overall_proby)
>
> > mapper(StudentDBRecord, students_table, properties={'proj_id' :
> > relation(Project)})
>
> > Basically, the theory was I'd do all my algorithm stuff on the Student
> > objects and then after I've found an optimal solution I'll push those
> > onto the StudentDBRecord table for persistence...
>
> I don't see any benefit to making StudentDBRecord inherit from Student.
> Try this:
>
> class Student(object):
>     [existing definitions]
>
>     def create_db_record(self):
>         result = StudentDBRecord()
>         result.ee_id = self.ee_id
>         [copy over other attributes]
>         return result
>
> class StudentDBRecord(object):
>     pass
>
>
>
> > However I ended up getting the following error:
>
> > #####
>
> > File "Main.py", line 25, in <module>
> >     prefsTableFile = 'Database/prefs-table.txt')
> >   File "/Users/Azfar/Dropbox/Final Year Project/SPAllocation/
> > DataReader.py", line 158, in readData
> >     readProjectsFile(projectsFile)
> >   File "/Users/Azfar/Dropbox/Final Year Project/SPAllocation/
> > DataReader.py", line 66, in readProjectsFile
> >     supervisors[ee_id] = Supervisor(ee_id, name, original_quota,
> > loading_limit)
> >   File "<string>", line 4, in __init__
> >   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
> > lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
> > state.py", line 71, in initialize_instance
> >     fn(self, instance, args, kwargs)
> >   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
> > lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
> > mapper.py", line 1829, in _event_on_init
> >     instrumenting_mapper.compile()
> >   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
> > lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
> > mapper.py", line 687, in compile
> >     mapper._post_configure_properties()
> >   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
> > lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
> > mapper.py", line 716, in _post_configure_properties
> >     prop.init()
> >   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
> > lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
> > interfaces.py", line 408, in init
> >     self.do_init()
> >   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
> > lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
> > properties.py", line 714, in do_init
> >     self._get_target()
> >   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
> > lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
> > properties.py", line 726, in _get_target
> >     self.mapper = mapper.class_mapper(self.argument, compile=False)
> >   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/
> > lib/python2.6/site-packages/SQLAlchemy-0.5.8-py2.6.egg/sqlalchemy/orm/
> > util.py", line 564, in class_mapper
> >     raise exc.UnmappedClassError(class_)
> > sqlalchemy.orm.exc.UnmappedClassError: Class 'ProjectParties.Student'
> > is not mapped
>
> > #####
>
> > [Q2:] What's that all about? Something wrong with the inheritance?
>
> I don't know if there is a way to get the inheritance to work the way
> you want, but not using inheritance like I did above sidesteps the issue.
>
> >> I would recommend using a database
> >> sequence or GUIDs to ensure that each call to monteCarloBasic gets a
> >> unique value for this column.
>
> > As another key sequence different from the simple "ident ==
> > row_number" I'm currently using right? I'll look into that.
>
> The problem is that your ident always starts at 1 for each call to
> monteCarloBasic. So, assuming your primary key for SimAllocation
> consists of some combination of (session_id, ident, stud_id), you will
> be reusing the same primary keys for each call to monteCarloBasic. If
> you want to overwrite the rows with those primary keys, then you should
> either DELETE the old rows first or maybe use session.merge(temp_alloc)
> to get the "find or create" behavior. If you do NOT want to overwrite
> the rows, then you need to ensure that some set of columns in
> SimAllocation is globally unique, regardless of how many times
> monteCarloBasic has been called. An easy way to do this is to change
> ident to use a database sequence or GUID, but there are many other
> solutions. You probably want to group together SimAllocations from a
> particular call to monteCarloBasic, in which case you would add
> a run_id column to SimAllocation, where rows with the same run_id were
> created in the same call to monteCarloBasic. I think a primary key of
> (run_id, session_id/trial_id, stud_id) would be good.
>
> > The thread business is indeed going over my head :S.
>
> >> In this way, monteCarloBasic returns its
> >>      results as a set of objects that are not attached to any session
> >>      (either because they are unmapped or are transient
> >>      
> >> <http://www.sqlalchemy.org/docs/reference/orm/sessions.html#sqlalchemy...>
> >>      instances), which the UI thread uses to update the database. How
> >>      you pass data from worker threads to the UI thread is dependent on
> >>      your GUI toolkit.
>
> > My GUI toolkit is Tkinter?
>
> Never used it, sorry. In general, every UI toolkit has a message/event
> queue to which you can post messages from any thread. So you could do
> something like:
>
> result = monteCarloBasic(...)
>
> def runs_in_ui_thread():
>     update_database(result)
>
> ui_toolkit.post_callback(runs_in_ui_thread)
>
> -Conor
