David E Jones wrote: > On Dec 5, 2010, at 8:47 PM, Adam Heath wrote: > >> David E Jones wrote: >>> On Dec 5, 2010, at 8:11 PM, Adam Heath wrote: >>> >>>> David E Jones wrote: >>>>> There are other reasons why you might not want to cache something. Quite >>>>> a few of them, even without trying to be too creative. >>>>> >>>>> For example: memory use. Why cache things like contact mech or order >>>>> information that isn't likely to be read a very many times, but if cached >>>>> would take up memory until expired and cleared (and would only be cleared >>>>> if something is cleaning up expired entries) or until it makes it to the >>>>> end of the LRU queue or whatever, and the whole time requiring additional >>>>> resources to keep track of and making more legitimate caching less >>>>> efficient. >>>>> >>>>> Another example: freshness. If you want to be sure the data is good, get >>>>> it from the db. If it won't cause big problems to have stale data, then >>>>> the cache is okay. >>>>> >>>>> Am I the only one who things caching everything by default (as Marc >>>>> suggested) or popping out warning messages (as Adam suggests here) is a >>>>> little crazy? I haven't seen anyone else speak up, so just wondering... >>>> I've got a change queued that has deprecated getRelatedOne(as >>>> suggested earlier). While doing that, I also decided to go thru and >>>> fix all the remaining findByPrimaryKey, replacing with findOne. >>>> During that change, I noticed several FooWorker/FooHelper, that should >>>> really be using caching variants. >>> Yes, I don't doubt that there is code in OFBiz that could leverage caching >>> more effectively. >>> >>>> A read-only fetch should be cached. A read-only web request should >>>> never hit the database. >>> Where do these rules come from? How did they become rules? >>> >>>> If it's a question of memory, and a >>>> particular install is throwing away cache entries too quickly, then >>>> that install needs more memory. >>> I guess my points weren't very good if that's all the rebuttal you thought >>> was needed. >>> >>> How about this case: running a little ecommerce site with 100,000 >>> customers. The typical customer visits the site around once per week, >>> though some customers rarely visit the site, maybe a few times per year. A >>> customer logs in, and looks their customer profile, which is 100% read-only >>> (not data is changed as the page loads). They may look at the profile page >>> again if they decide to edit something, but generally will not otherwise. >>> Should all of the data on the profile be cached so that on repeated views >>> we can follow your two rules above (read-only data cached, read-only web >>> request never hit the database (I assume after the first hit, of course))? >>> in >>> So, we should all of the customer's contact, order, contact/mailing list, >>> etc information all cached? How much memory are you thinking of... >>> something roughly the same size as the database? Or is the whole point of >>> this to try to argue for in-memory databases... yeah... I see where you're >>> going... >>> >>> And, BTW, this isn't some weird case... for most data in an ERP or >>> ecommerce system this IS the case. In other words, it has a low >>> reads/time-period rate, and a low read to update ratio, and therefore >>> caching is a waste of system resources and more likely to cause issues with >>> bad/stale data (especially across multiple app servers... unless you want >>> to keep all caches perfectly in sync all the time and take the performance >>> hit from that goal, especially making sure other caches are updated before >>> you commit, ie involve the caches in the 2-phase commit so that they are >>> all updated before the data can be committed to the db). >> Where in the universe does 'should' mean always? I never said all >> read-only things must be cached. >> >> There are definite times during a production install where parts of >> the system perform exceptionally slow. My suggested change would make >> it simpler to identify where those slow points are, finding values >> that are being looked up repeatedly, and changing them to be cached. >> >> We currently have a client experiencing timeouts when submitting >> orders(ofbiz takes more than 2 minutes to process it). Yes, we could >> increase the default timeout; but that has it's own problems. If we >> could identify the hot-points, with an easy to use tool, then everyone >> would benefit. Where is the harm in that? > > If it's not really the problem, then maybe it would do more harm than good, > like massively increasing log message volume without helping you get valuable > information. > > I actually just recently worked with a client who had a big performance > problem on add to cart and it turned out that the problem was caching too > much stuff which resulted in less important data pushing important data out > of memory as the heap memory started getting full. > > There's always a trade-off... > >> Are you suggesting that because sometimes you don't want to cache some >> things, that we shouldn't cache anything? > > Is that what it sounded like I wrote, or is this just a random thought? > >> <segway> >> Lately, I've gotten very bad responses to suggestions of new >> features(not just from you, David). Someone is wanting to do work, >> not asking anyone else to implement something for them. Yet, the >> response to such suggestions are filled with vitriol. If someone is >> willing to do work, to improve ofbiz, then go ahead and let them. >> </segway> > > Just a total guess here, but maybe because different isn't always better? > Getting into more detail: > > Unfortunately OFBiz has suffered a lot from "improvements" that make more > things worse than they make better. And yes, there absolutely should be a > pretty good burden for contributors to convince others than changes are > indeed an improvement, and in fact IMO there should be a MUCH MUCH higher > burden of proof. Still, that's not the model this project runs under, so what > you're seeing it people reacting to the incredible burden of proof required > to keep new features out since any committer can throw stuff in whenever...
Maybe you missed where I said that this feature would not be turned on by default. It would have 0 impact during all runs, until such time as a problem is detected, and an admin turns it on. How could that cause problems? The feature would be used to find possible problems. I never said it would be used to force all issued warnings to go away, by adding caching everywhere. This feature would make it simpler to find the poor performance points. It won't make it easy, as those involved would still have to know what is going on. Again, why are people pushing back on this? It hurts no one, and could find cases where caching actually would help. Really, what is the harm?
