I would rewrite

    p_here = [p for p in cpaths if loc_id in db.steps[int(p['steps'][0])].locations]

as

    ids = [int(p['steps'][0]) for p in cpaths]
    rpaths = db(db.steps.id.belongs(ids))(db.steps.locations.contains(loc_id)).select(db.steps.id).as_dict()
    p_here = [p for p in cpaths if int(p['steps'][0]) in rpaths]

if locations is searchable in the database. Otherwise:

    ids = [int(p['steps'][0]) for p in cpaths]
    rpaths = db(db.steps.id.belongs(ids)).select(db.steps.id, db.steps.locations).as_dict()
    p_here = [p for p in cpaths if int(p['steps'][0]) in rpaths and loc_id in rpaths[int(p['steps'][0])]['locations']]

In both cases you do a single select. In the second case it returns more records because the filter is done in Python, not in the db.

> The refactored version looks like this:
>
> pid_here = [p['path2steps']['path_id'] for p
>             in db((db.path2steps.step_id == db.steps.id) &
>                   (db.steps.locations.contains(loc_id))
>                   ).iterselect(db.path2steps.path_id, db.steps.locations)
>             if loc_id in p['steps']['locations']
>             ]
> p_here = [p for p in cpaths if p['id'] in pid_here]
>
> It looks less elegant, but it's *much* lighter on memory. Let me break
> down the changes I made.
>
> 1. I removed the db access (select) from the "if" condition, which is
>    called on every iteration of the loop. Instead I access the db once
>    and iterate over the result.
> 2. I use iterselect instead of select.
> 3. In the iterselect I specified just the fields I'm actually going to
>    use, so that useless data doesn't go into memory.
>
> In order to make these changes I had to reorganize the logic
> significantly. Rather than trying to pinpoint my desired data set in one
> pass through the list comprehension, I first gather a larger (!) data set
> and then refine it in a second step that doesn't require db access. The
> details aren't important here. What surprised me, though, is that it was
> far more memory efficient to iterate over a single, stripped-down
> iterselect than to make multiple selects. This is true even though the
> resulting list is larger and has to be pared down in a second stage.
>
> The larger takeaway for me is that db access is generally very expensive
> in terms of memory. It's worth it for me to organize my logic around
> minimizing db calls, even if the result is less elegant code.
>
> On Monday, September 2, 2019 at 3:44:18 PM UTC-4, Ian W. Scott wrote:
>>
>> I'm trying to lower the memory use of an app and have some general
>> questions about how memory is used in DAL selects:
>>
>> 1. Am I right that the memory used while performing the select isn't
>>    released right away, even if the select isn't assigned to a variable?
>> 2. I'm aware of iterselect. Am I right that with iterselect the
>>    memory used is just enough to store one row of data (instead of the
>>    whole selected set)?
>> 3. Does this mean that, generally, you want to perform as few
>>    separate selects as possible, unless you can use iterselect?
>> 4. Selects seem to occupy a significant amount of memory, even when
>>    the result set is only one row. Is the memory use for the select
>>    determined by the table size or the result size?
>>
>> Here's an example of the kind of situation I'm working with. I'm using a
>> list comprehension to loop through a list, performing a select in each
>> loop:
>>
>> p_here = [p for p in cpaths if loc_id in db.steps[int(p['steps'][0])].locations]
>>
>> When I run a memory profiler, this line results in over 1MB of memory
>> being occupied, and that memory isn't released for at least several
>> minutes. The table "steps" has about 3000 rows, so it's not enormous. The
>> result for each select is a single row and doesn't include a huge amount
>> of data (a few strings, ints, etc.). The "cpaths" list might have 50 or so
>> items. So is the memory issue emerging because (a) the memory use for each
>> select is determined by the table size, and (b) memory is being occupied
>> (and not released) separately for each iteration of the loop? Is there a
>> way to rewrite this so that it uses less memory?
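
For anyone who wants to try the pattern outside web2py, here is a minimal sketch of "one select per item" versus "one batched select". It uses the standard-library sqlite3 module as a stand-in for the DAL, so none of it is pyDAL API. The table layout, the '|'-delimited encoding of locations, and the helper names (build_db, paths_here_per_item, paths_here_batched) are assumptions made up for the illustration.

```python
import sqlite3


def build_db():
    """Create a small in-memory 'steps' table (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE steps (id INTEGER PRIMARY KEY, locations TEXT)")
    # Assumed encoding for this sketch: locations as a '|'-delimited string.
    rows = [(i, "|%d|%d|" % (i % 5, (i + 1) % 5)) for i in range(1, 101)]
    conn.executemany("INSERT INTO steps VALUES (?, ?)", rows)
    return conn


def paths_here_per_item(conn, cpaths, loc_id):
    """Original shape: one query for every element of cpaths."""
    token = "|%d|" % loc_id
    result = []
    for p in cpaths:
        row = conn.execute(
            "SELECT locations FROM steps WHERE id = ?",
            (int(p["steps"][0]),),
        ).fetchone()
        if row is not None and token in row[0]:
            result.append(p)
    return result


def paths_here_batched(conn, cpaths, loc_id):
    """Rewritten shape: a single query, then a filter in plain Python."""
    ids = [int(p["steps"][0]) for p in cpaths]
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    sql = (
        "SELECT id FROM steps WHERE id IN (%s) AND locations LIKE ?"
        % placeholders
    )
    pattern = "%|" + str(loc_id) + "|%"
    # Iterating the cursor pulls rows one at a time; only the ids are kept.
    matching = {row[0] for row in conn.execute(sql, ids + [pattern])}
    return [p for p in cpaths if int(p["steps"][0]) in matching]
```

Both functions should return the same paths. The second issues one query no matter how long cpaths is, and it selects only the id column, which mirrors points 1 and 3 in the quoted list above.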

