John Darrington <[email protected]> writes:

> On Sat, Mar 17, 2012 at 12:15:17PM -0700, Ben Pfaff wrote:
> John Darrington <[email protected]> writes:
>
> Ah, yes. I've been aware of related problems for a long time,
> but I haven't come up with a good solution. One must limit the
> total memory allocated, not the memory allocated per-instance, of
> course, but the proper way to distribute the available memory
> among the competing users is not obvious. I guess that the
> easiest way is first-come-first-served. That might be just fine
> in the common case, so perhaps we should implement it that way as
> a first cut.
>
> Unless the number of cases per instances is known a priori
> (which in general it isn't) I don't see any better alternative
> to first-come-first-served. -- perhaps decadically decreasing
> might be one way, in the assumption that if there are many
> instances, then hopefully they are small ones.
>
> Is it feasible to have workspaces which dynamically change
> their allocation or is that not possible?

For casereaders, it's easy enough to dynamically change, since
casereaders are able to dump all of their in-memory data to disk.

> For categoricals, though, what's the fallback if the memory usage
> becomes too high? Can we fall back to some kind of on-disk
> storage, or do we just fail? "Just fail" is probably not a good
> way to go, if first-come-first-served is the strategy we use,
> because it means that unrelated memory use (e.g. for cases) can
> cause even small number of categories to break.
>
> Maybe we should do the "just fail" option in the first instance and see
> if we can improve it later.

OK.

> Here's another idea that comes to mind: is there a maximum number
> of categories that makes sense? Would a "max categories" setting
> defaulting to, say, 1000, still allow most users to get real work
> done in realistic cases?
>
> 1000 would be much too high. How many machines can allocate 64GB of heap?
> "Realistic cases" is somewhat subjective. But I cannot envisage that in
> most instances more than 20 categories would be involved - but who knows?

I mean, 1000 categories per instance, not 1000 instances.
Presumably, 1000 categories do not need much memory (a few
kilobytes?) unless the space for categories is, say, O(n**2) in
the number of categories (I haven't looked).

-- 
Ben Pfaff
http://benpfaff.org

_______________________________________________
pspp-dev mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/pspp-dev
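
To make the first-come-first-served idea from the thread concrete, here is a minimal sketch in Python of a single shared budget with the "just fail" fallback. It is only an illustration: MemoryBudget, BudgetExceeded, reserve and release are hypothetical names, not PSPP code or API, and the byte counts are made up. It also reproduces the concern raised above, that unrelated memory use (e.g. for cases) can make even a small categorical request fail.

```python
class BudgetExceeded(MemoryError):
    """Raised when a request does not fit in the remaining shared budget."""


class MemoryBudget:
    """Hypothetical shared byte budget, granted first-come-first-served."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.used_bytes = 0

    @property
    def available(self):
        return self.total_bytes - self.used_bytes

    def reserve(self, n_bytes):
        # The limit applies to the total, not to any single instance.
        if n_bytes < 0:
            raise ValueError("n_bytes must be non-negative")
        if n_bytes > self.available:
            # "Just fail": no on-disk fallback in this sketch.
            raise BudgetExceeded(
                f"requested {n_bytes} bytes, only {self.available} available"
            )
        self.used_bytes += n_bytes

    def release(self, n_bytes):
        # Return memory to the shared pool, never dropping below zero.
        self.used_bytes = max(0, self.used_bytes - n_bytes)


# Assumed numbers, chosen only to show the failure mode.
budget = MemoryBudget(total_bytes=1000)
budget.reserve(990)      # e.g. cases held in memory by a casereader

try:
    budget.reserve(64)   # a small categorical workspace arrives later
    small_request_ok = True
except BudgetExceeded:
    small_request_ok = False
```

In this sketch the later, small request is refused purely because an earlier user took most of the pool. A user that can spill to disk, as casereaders can, could call release and let the small request through, which is roughly what dynamically changing allocations would look like.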
