2:25 minutes, even for 1,267,000 rows, sounds like quite a long time
to me. Are you physically a long way from the database, querying
across the internet? Do the tables contain many columns, or perhaps
large BLOB or TEXT fields?

If there are large fields that you don't need very often, you could
consider marking them as "deferred", then they wouldn't be loaded by
default:

https://docs.sqlalchemy.org/en/13/orm/loading_columns.html#deferred-column-loading

I suppose if your database server was underpowered, loading over a
million rows might exhaust its memory and cause it to start swapping,
which would hurt performance, in which case querying for smaller
amounts of data might be better.

Simon

On Tue, Jul 7, 2020 at 12:53 PM Justvuur <justv...@gmail.com> wrote:
>
> I'm currently testing with 7000 students with 181 subjects.
> I first went over to the DB to run the query directly on there. I did an 
> innerjoin between the two tables.
> It took about 2:25 minutes. I then selected specific columns from the two 
> tables and the time dropped to about 2 minutes or so.
> I then tried a "select subjects where student ID in (...)" query and that 
> executed a lot quicker than the others.
>
> So then on the python side of things, I first fetched the students and only 
> selected id, created_on and updated_on columns using the with_entities option.
> That seemed to take less than 1 second. I then, for each student, executed a 
> DB query to fetch the subjects using the with_entities option.
> For some reason, this seems to work faster than using the  
> eager-loading/students.subjects/attribute_mapped_collection. They seem to 
> take 2 minutes longer than the above (what should be inefficient) approach.
>
> On Tuesday, 7 July 2020 at 12:50:13 UTC+2 Simon King wrote:
>>
>> How long is it taking? You mentioned 2000 students with 100 subjects
>> each, so there are something like 200,000 rows in the Subjects table,
>> and you need to load all of it. I wouldn't expect that to take longer
>> than a couple of seconds though.
>>
>> Simon
>>
>> On Mon, Jul 6, 2020 at 7:34 PM Justvuur <just...@gmail.com> wrote:
>> >
>> > I added the eager-loading but it seems to slow down the SQL query quite a 
>> > lot.
>> > It's as if now, the SQL query is taking longer but the generating of the 
>> > file is quicker hehe... I guess now the queries are being fired before to 
>> > populate subjects.
>> > It's still taking relatively the same amount of time though.
>> >
>> >
>> > On Friday, 3 July 2020 17:07:43 UTC+2, Simon King wrote:
>> >>
>> >> Are you eager-loading the "student.subjects" relationship? If not,
>> >> that will give you the biggest performance increase. Without that, you
>> >> will be issuing a separate DB query for each of the students, to load
>> >> that student's subjects. Eager-loading allows you to preload the
>> >> subjects for every student in a single query:
>> >>
>> >> https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#joined-eager-loading
>> >>
>> >> Simon
>> >>
>> >> On Fri, Jul 3, 2020 at 1:36 PM Justvuur <just...@gmail.com> wrote:
>> >> >
>> >> > Hi Simon, thanks for the help! I've never used that before, it's quite 
>> >> > handy.
>> >> >
>> >> > I'm looping through all the students and printing them and their 
>> >> > subject details to a CSV file.
>> >> > What makes things a tad complicated is the subjects must appear in a 
>> >> > specific order.
>> >> > There is a table that has the subject code and order number 
>> >> > (ordered_subjects used below is the resultset from it).
>> >> > I printed out the timing and found the problem to be with a nested for 
>> >> > loop.
>> >> >
>> >> > I was hoping to reduce that process time by using a map that 
>> >> > automatically gets populated instead of having to create it on the fly.
>> >> >
>> >> > Before - subjects_collection "attribute_mapped_collection":
>> >> > ********************************************************************************************
>> >> > for row in students:
>> >> > row_no += 1
>> >> >
>> >> > for subject in row.subjects:
>> >> > student_subjects[subject.code] = subject.value
>> >> >
>> >> > csv_row = [row_no]
>> >> > csv_row += [student_subjects.get(x.code, '') for x in ordered_subjects]
>> >> > csv_row += [row.created_on, row.updated_on]
>> >> >
>> >> > writer.writerow([x.encode('utf-8') if type(x) == unicode else x for x 
>> >> > in csv_row])
>> >> >
>> >> >
>> >> > After adding the subjects_collection "attribute_mapped_collection", I 
>> >> > unfortunately did not see a change in performance.
>> >> >
>> >> > After - subjects_collection "attribute_mapped_collection":
>> >> > ********************************************************************************************
>> >> > for row in students:
>> >> > row_no += 1
>> >> > csv_row = [row_no]
>> >> > csv_row += [row.subjects_collection.get(x.code, '').value for x in 
>> >> > ordered_subjects]
>> >> > csv_row += [row.created_on, row.updated_on]
>> >> >
>> >> > writer.writerow([x.encode('utf-8') if type(x) == unicode else x for x 
>> >> > in csv_row])
>> >> >
>> >> >
>> >> > class Subject(db.Model):
>> >> > __tablename__ = 'subjects'
>> >> >
>> >> > student_id = db.Column(db.Integer, db.ForeignKey('students.id'), 
>> >> > primary_key=True)
>> >> >
>> >> > code = db.Column(db.String(50), primary_key=True)
>> >> >
>> >> > value= db.Column(db.String)
>> >> >
>> >> > def __init__(self, code , value):
>> >> > self.code = code
>> >> > self.value = value
>> >> >
>> >> >
>> >> > class Student(ResourceMixin, db.Model):
>> >> > __tablename__ = 'students'
>> >> >
>> >> > subjects= db.relationship('Subject', backref='student')
>> >> >
>> >> > id = db.Column(db.Integer, primary_key=True)
>> >> >
>> >> > subjects_collection = relationship("Subject", 
>> >> > collection_class=attribute_mapped_collection('code'))
>> >> >
>> >> > Can you see a way I can optimize this? Any ideas?
>> >> >
>> >> >
>> >> > On Friday, 3 July 2020 12:31:03 UTC+2, Simon King wrote:
>> >> >>
>> >> >> Are you trying to optimise the database access (ie. minimize the
>> >> >> number of queries), or provide a nice dictionary-style API for your
>> >> >> Student objects? What do you mean when you say that looping over
>> >> >> student.subjects is quite heavy?
>> >> >>
>> >> >> An association proxy can be used to get dict-style access to a 
>> >> >> relationship:
>> >> >>
>> >> >> https://docs.sqlalchemy.org/en/13/orm/extensions/associationproxy.html#proxying-to-dictionary-based-collections
>> >> >>
>> >> >> There are also a couple of examples in the SQLAlchemy docs that
>> >> >> provide a dictionary-style API:
>> >> >>
>> >> >> https://docs.sqlalchemy.org/en/13/orm/examples.html#module-examples.dynamic_dict
>> >> >>
>> >> >> https://docs.sqlalchemy.org/en/13/orm/examples.html#module-examples.vertical
>> >> >>
>> >> >> Hope that helps,
>> >> >>
>> >> >> Simon
>> >> >>
>> >> >> On Thu, Jul 2, 2020 at 8:46 PM Justvuur <just...@gmail.com> wrote:
>> >> >> >
>> >> >> > Hi there,
>> >> >> >
>> >> >> > I'm struggling to find an efficient way to get a two columned subset 
>> >> >> > into dictionary form.
>> >> >> >
>> >> >> > I have an entity that has a subset of data. The subset is linked to 
>> >> >> > the entity via Id. The order of the subset of data is defined in 
>> >> >> > another table.
>> >> >> >
>> >> >> > Example:
>> >> >> > Student - Id, firstname, lastname
>> >> >> > Subjects - StudentId, SubjectCode, SubjectName
>> >> >> >
>> >> >> > At the moment I'm looping through the SqlAlchemy result of 
>> >> >> > "student.subjects" in python and creating a dictionary from that. 
>> >> >> > It's quite heavy, especially when there are 2000+ students with a 
>> >> >> > potential of 100+ subjects each.
>> >> >> >
>> >> >> > For each student, how do I get the subjects as a dictionary for a 
>> >> >> > student where the key is the SubjectCode and the value is the 
>> >> >> > SubjectName?
>> >> >> > Better yet, how can I get a result set: Id, firstname, lastname 
>> >> >> > SubjectCode x, SubjectCode y, etc etc (where the SubjectName becomes 
>> >> >> > the value and the SubjectCode becomes the column)?
>> >> >> >
>> >> >> > Regards,
>> >> >> > Justin
>> >> >> >
>> >> >> > --
>> >> >> > SQLAlchemy -
>> >> >> > The Python SQL Toolkit and Object Relational Mapper
>> >> >> >
>> >> >> > http://www.sqlalchemy.org/
>> >> >> >
>> >> >> > To post example code, please provide an MCVE: Minimal, Complete, and 
>> >> >> > Verifiable Example. See http://stackoverflow.com/help/mcve for a 
>> >> >> > full description.
>> >> >> > ---
>> >> >> > You received this message because you are subscribed to the Google 
>> >> >> > Groups "sqlalchemy" group.
>> >> >> > To unsubscribe from this group and stop receiving emails from it, 
>> >> >> > send an email to sqlal...@googlegroups.com.
>> >> >> > To view this discussion on the web visit 
>> >> >> > https://groups.google.com/d/msgid/sqlalchemy/57b74c9a-a6e5-494b-b468-d0bdcbcce60co%40googlegroups.com.
>> >> >
>> >> > --
>> >> > SQLAlchemy -
>> >> > The Python SQL Toolkit and Object Relational Mapper
>> >> >
>> >> > http://www.sqlalchemy.org/
>> >> >
>> >> > To post example code, please provide an MCVE: Minimal, Complete, and 
>> >> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>> >> > description.
>> >> > ---
>> >> > You received this message because you are subscribed to the Google 
>> >> > Groups "sqlalchemy" group.
>> >> > To unsubscribe from this group and stop receiving emails from it, send 
>> >> > an email to sqlal...@googlegroups.com.
>> >> > To view this discussion on the web visit 
>> >> > https://groups.google.com/d/msgid/sqlalchemy/67439e94-2666-402a-a642-7f4e7de8fbffo%40googlegroups.com.
>> >
>> > --
>> > SQLAlchemy -
>> > The Python SQL Toolkit and Object Relational Mapper
>> >
>> > http://www.sqlalchemy.org/
>> >
>> > To post example code, please provide an MCVE: Minimal, Complete, and 
>> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>> > description.
>> > ---
>> > You received this message because you are subscribed to the Google Groups 
>> > "sqlalchemy" group.
>> > To unsubscribe from this group and stop receiving emails from it, send an 
>> > email to sqlalchemy+...@googlegroups.com.
>> > To view this discussion on the web visit 
>> > https://groups.google.com/d/msgid/sqlalchemy/9c86dfc4-3666-454e-8b5e-2b9e2a32166ao%40googlegroups.com.
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> ---
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/29b184cc-9f5c-4fe2-a69d-4fd12651c88cn%40googlegroups.com.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/CAFHwexdP%2B%3DR7p8ATZG2VmUQguwt7cnAT2TimzTm7RxymsiBNRg%40mail.gmail.com.

Reply via email to