Thanks, yes, this will help a lot.

On Wednesday, 8 July 2020 at 11:33:31 UTC+2 Simon King wrote:
> You should start by enabling SQLAlchemy logging to see the actual
> queries that are being run. The easiest way is to pass "echo=True" to
> your create_engine call. (You could also try echo="debug", but since
> you've got hundreds of thousands of rows you'll be swamped.)
>
> Verify that the queries look correct (for example, that they have the
> right join conditions). If they look OK but they run slowly, use your
> database's tools (e.g. EXPLAIN or EXPLAIN ANALYZE) to understand why.
>
> Simon
>
> On Wed, Jul 8, 2020 at 8:22 AM Justvuur <just...@gmail.com> wrote:
> >
> > I'd like to redesign the DB, but that's not on the cards at the moment
> > (or ever, hehe).
> >
> > Thanks for the feedback, I appreciate all the help, I really do.
> > What puzzles me now is: why would looping through each student
> > (using with_entities) and getting the subjects (using with_entities)
> > for each student be faster (from 2:25 min down to 0:19 min) than using
> > the student.subjects approach?
> > I know I'm including 3 or 4 fewer columns, but surely it won't cost
> > 2 minutes to have them included?
> >
> > On Tuesday, 7 July 2020 at 19:31:56 UTC+2 Jonathan Vanasco wrote:
> >>
> >> Based on what you shared above:
> >>
> >> * The "Subject" table is: `StudentId, SubjectCode, SubjectName`
> >> * There are 181 subjects
> >>
> >> It looks like you don't have a "Subject" table, but a
> >> "StudentAndSubject" table.
> >>
> >> I think you'd have a bigger performance improvement by normalizing
> >> that data into two tables:
> >>
> >> Subject: SubjectId (primary key), SubjectCode, SubjectName
> >> Student2Subject: StudentId, SubjectId (primary key is both)
> >>
> >> Assuming this can be done with your data... the database performance
> >> should improve because:
> >>
> >> 1. The raw file storage will decrease
> >> 2. The in-memory dataset size will decrease
> >>
> >> You could then either:
> >>
> >> 1. use the Subject table as part of a joined query to keep things
> >> simple, or
> >> 2. just select off a join of Student+Student2Subject, and query all
> >> the Subjects separately. Even if there are 2000 subjects total, it
> >> should only take a few ms to get all that into a Python data
> >> structure that is used to generate your CSV.
> >>
> > --
> > SQLAlchemy -
> > The Python SQL Toolkit and Object Relational Mapper
> >
> > http://www.sqlalchemy.org/
> >
> > To post example code, please provide an MCVE: Minimal, Complete, and
> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> > description.
> > ---
> > You received this message because you are subscribed to the Google
> > Groups "sqlalchemy" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to sqlalchemy+...@googlegroups.com.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/sqlalchemy/3b701d26-2f3b-4407-bc69-2dc13df60caan%40googlegroups.com.

--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/bd8b5c27-2534-4c2d-8e2d-d3354c2179b0n%40googlegroups.com.
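
[Editor's note: Simon's logging suggestion can be sketched as follows. This is a minimal, self-contained example; the in-memory SQLite URL and the trivial query are placeholders for the poster's actual engine URL and queries.]

```python
from sqlalchemy import create_engine, text

# echo=True makes the Engine log every SQL statement it emits (to the
# "sqlalchemy.engine" logger / stdout). echo="debug" would additionally
# log result rows, which is far too noisy for hundreds of thousands of rows.
engine = create_engine("sqlite:///:memory:", echo=True)

with engine.connect() as conn:
    # The emitted SQL for this statement appears in the log output,
    # including the compiled query text and bound parameters.
    value = conn.execute(text("SELECT 1")).scalar()
```

Once the queries are visible in the log, they can be copied into `EXPLAIN` / `EXPLAIN ANALYZE` in the database's own shell to see the chosen plan.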
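
[Editor's note: Jonathan's proposed normalization might look roughly like this in the ORM. The table, column, and class names here are illustrative assumptions, not taken from the poster's schema.]

```python
from sqlalchemy import create_engine, Column, ForeignKey, Integer, String, Table
from sqlalchemy.orm import declarative_base, relationship, Session

Base = declarative_base()

# Association table: the primary key is the (student_id, subject_id) pair,
# mirroring "Student2Subject: StudentId, SubjectId (primary key is both)".
student2subject = Table(
    "student2subject",
    Base.metadata,
    Column("student_id", ForeignKey("student.id"), primary_key=True),
    Column("subject_id", ForeignKey("subject.id"), primary_key=True),
)

class Subject(Base):
    # Each subject is stored once, instead of once per student row.
    __tablename__ = "subject"
    id = Column(Integer, primary_key=True)
    code = Column(String)
    name = Column(String)

class Student(Base):
    __tablename__ = "student"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    # Many-to-many via the association table; this is what makes the
    # "joined query" option (Jonathan's option 1) a one-liner.
    subjects = relationship(Subject, secondary=student2subject)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
```

With this layout each `SubjectCode`/`SubjectName` pair is stored once, which is where the storage and in-memory savings come from.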
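
[Editor's note: Jonathan's option 2 — load all subjects into a Python dict with one small query, then stream a single join of Student+Student2Subject — could be sketched like this with SQLAlchemy Core. All names and the sample data are made up for illustration.]

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, String,
                        Table, create_engine, select)

metadata = MetaData()
subject = Table("subject", metadata,
                Column("id", Integer, primary_key=True),
                Column("code", String), Column("name", String))
student = Table("student", metadata,
                Column("id", Integer, primary_key=True),
                Column("name", String))
s2s = Table("student2subject", metadata,
            Column("student_id", ForeignKey("student.id"), primary_key=True),
            Column("subject_id", ForeignKey("subject.id"), primary_key=True))

engine = create_engine("sqlite://")
metadata.create_all(engine)

# Toy data standing in for the real tables.
with engine.begin() as conn:
    conn.execute(subject.insert(), [
        {"id": 1, "code": "MATH", "name": "Mathematics"},
        {"id": 2, "code": "HIST", "name": "History"}])
    conn.execute(student.insert(), [{"id": 1, "name": "Alice"}])
    conn.execute(s2s.insert(), [{"student_id": 1, "subject_id": 1},
                                {"student_id": 1, "subject_id": 2}])

with engine.connect() as conn:
    # One small query loads every subject into a dict keyed by id;
    # even a couple of thousand rows should take only a few ms.
    subjects = {row.id: row.code for row in conn.execute(select(subject))}
    # One join over the two large tables streams the per-student rows,
    # and the dict supplies the subject details for the CSV.
    q = (select(student.c.id, student.c.name, s2s.c.subject_id)
         .join_from(student, s2s)
         .order_by(s2s.c.subject_id))
    rows = [(r.name, subjects[r.subject_id]) for r in conn.execute(q)]
```

This avoids re-fetching the same subject columns for every student row, which is likely part of why the narrower `with_entities` queries ran so much faster.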