On 26/07/12 17:22, Jeroen De Dauw wrote:
> Hey,
>
>> On the other hand, it would be even more useful to cache all results per (sub)query, ignoring the limit
>
> This can reduce computing overlapping results, but on the other hand is likely to compute results we'll never actually use. And it makes the implementation more complex. Since I'm not convinced the actual result would be better (I suspect that in fact it'd be worse), I prefer to keep it simple for now. And if you have a case where the "store everything" approach really makes sense, you can always use a concept, right?
>
>> I was thinking of caching the query result only, not the printouts. One could cache a list of results, instead of caching all data needed to display the query result.
>
> Similar arguments apply here. Any query obtaining a single property would automatically fetch all properties for all matching objects. Again, I don't think it's that much of an improvement. Especially considering the following:
>
>> Having the lists of query results that are displayed in one query now could be useful for updating (if you have a data blob, you cannot check quickly for which queries a page occurs as a displayed result).
>
> Sure, it'd make it easier to figure this out. At least, if you invalidate it whenever a single property changes. So now our query obtaining a single property does not only result in all properties getting obtained, but it'll also have its cache invalidated whenever one of those other properties is changed.
Not necessarily. One can still store the printout properties and look at the diff to see if any of them was affected.

> This seems like something we really should avoid, so we'll have to take into account the affected properties anyway, making the "just store all properties" approach not simpler to implement.

Not sure what "just store all properties" means. I was arguing for the opposite: not to store the properties again, since the printouts can easily be fetched from the DB in the (relatively rare) cases where the parser cache needs to be rebuilt. Mirroring all printout properties in the query cache would require more frequent updates to it and make it more specific to one single page.

But it does not matter much for now. The big issue with all of the query result caching is to limit the amount of cache invalidation that happens on updates. We need to think about how to get more specific information about queries than the properties that they refer to. Some wikis have thousands of pages with very similar queries, always using the same property (from a template), where each query has only a few results (Referata gives a good example). A property-based cache invalidation would kill most of the query caches on almost every property edit (there are often just a handful of properties).

Storing results for (sub)conditions as an exhaustive list could allow much more fine-grained control of cache invalidation. The challenge is to keep these sets small. Maybe there are other approaches as well, such as singling out certain "selective" subqueries for this purpose.

Markus
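
To make the (sub)condition result-set idea above a bit more concrete, here is a minimal sketch in Python. It is only an illustration of the invalidation bookkeeping, not SMW code: the class, the matches(condition, page_id) callback, the property bookkeeping and the size cap are all hypothetical; the cap stands in for the "keep these sets small" constraint.

# Hypothetical sketch of caching exhaustive (sub)condition result sets with
# fine-grained invalidation. Names and structure are illustrative only and
# do not correspond to actual SMW classes or hooks.

import hashlib
from collections import defaultdict


class SubconditionCache:
    def __init__(self, max_set_size=500):
        self.max_set_size = max_set_size      # "keep these sets small"
        self.entries = {}                     # key -> (condition, result set, referenced properties)
        self.by_property = defaultdict(set)   # property name -> keys of conditions that use it

    @staticmethod
    def _key(condition):
        return hashlib.sha1(condition.encode("utf-8")).hexdigest()

    def store(self, condition, result_page_ids, referenced_properties):
        """Cache the exhaustive result list of one (sub)condition."""
        if len(result_page_ids) > self.max_set_size:
            return False                      # too large: leave it to the normal query path
        key = self._key(condition)
        self.entries[key] = (condition, set(result_page_ids), set(referenced_properties))
        for prop in referenced_properties:
            self.by_property[prop].add(key)
        return True

    def lookup(self, condition):
        entry = self.entries.get(self._key(condition))
        return None if entry is None else entry[1]

    def on_page_edit(self, page_id, changed_properties, matches):
        """Update only the entries that the edit can actually affect.

        matches(condition, page_id) is a hypothetical callback that checks a
        single page against one (sub)condition -- a cheap per-page test, not a
        full query. Property-based invalidation would drop every entry whose
        condition mentions a changed property; here an entry is only touched
        when the edited page really moves into or out of its result set.
        """
        candidates = set()
        for prop in changed_properties:
            candidates |= self.by_property.get(prop, set())
        for key in candidates:
            condition, result_set, _props = self.entries[key]
            if matches(condition, page_id):
                result_set.add(page_id)
                if len(result_set) > self.max_set_size:
                    self._drop(key)           # the set grew too large, stop caching it
            else:
                result_set.discard(page_id)

    def _drop(self, key):
        _condition, _results, props = self.entries.pop(key)
        for prop in props:
            self.by_property[prop].discard(key)

The point of this layout is that the cached sets double as the invalidation index: instead of dropping every cached query that merely mentions an edited property, only the few small sets that the edited page actually enters or leaves change, and even those entries are repaired in place rather than discarded.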