Ah, I overlooked this one. Thanks for the reminder. In theory all combinations of prefetches should be valid, so this is definitely a bug.
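For anyone following along, the kind of prefetch mixing under discussion looks roughly like this. This is only a sketch against the Cayenne 4.x `ObjectSelect` API under my assumptions: the `Artist` entity, its `PAINTINGS` and `GALLERY` relationship properties, and the surrounding context are hypothetical, and the snippet won't compile without a generated model.

```java
import java.util.List;

import org.apache.cayenne.ObjectContext;
import org.apache.cayenne.query.ObjectSelect;

// Hypothetical model: Artist with a to-many "paintings" and a to-one
// "gallery" relationship. Mixing prefetch semantics on a single query --
// in theory any combination should be valid, but per this thread some
// combinations currently stomp on or ignore each other's prefetched data.
public class PrefetchMixSketch {
    static List<Artist> fetchArtists(ObjectContext context) {
        return ObjectSelect.query(Artist.class)
                // disjointById: resolved via a separate query keyed by PK
                .prefetch(Artist.PAINTINGS.disjointById())
                // joint: resolved via an outer join in the main query
                .prefetch(Artist.GALLERY.joint())
                .select(context);
    }
}
```

Whether this exact combination triggers the reported behavior depends on the model; the point is only that the two semantics are declared on the same query.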
As to the overall caching architecture, the ORM can't work efficiently without a cache. But the issue is that the cache structure is complex, making it extremely hard to reason about as a user: shared vs. local; query vs. objects; objects vs. snapshots; read vs. write operations (each using the cache differently). Too many dimensions with unclear interactions among them. I am still looking for a single, simple model that could fully replace the current one.

Andrus

> On Jan 21, 2025, at 12:55 PM, John Huss <johnth...@gmail.com> wrote:
> 
> Thanks, that's helpful! The other prefetching problem I've had is when
> mixing *joint* prefetches with other types of prefetches (*disjoint* or
> *disjointById*), which is documented here:
> https://github.com/apache/cayenne/pull/624
> In that case the prefetched data is ignored and the relationship value
> appears as null when it isn't, which is *bad*.
> 
> From a higher level perspective, I think fetching objects into the
> snapshot cache, where they can live forever, is a bad default. It
> trades correctness (freshness) for performance. I'd like to have better
> ways of determining which entities are eligible for the snapshot cache and
> how long they are allowed to be there before they are stale, and by
> default I wouldn't allow anything there except for objects in the local
> context.
> 
> There are also problems with caching even for the local context when the
> app gets low on memory: prefetched objects or objects in the snapshot
> cache can be evicted, resulting in a ton of single-row fetches as they are
> re-resolved one by one. That tradeoff may be worth it to avoid crashing,
> but it would be nice for the programmer to have a more intentional way to
> control this behavior.
> 
> On Thu, Jan 16, 2025 at 3:30 PM Andrus Adamchik <aadamc...@gmail.com> wrote:
> 
>> Hi there,
>> 
>> I wanted to share some findings on our object graph refresh algorithms.
>> 
>> For many years I've mostly relied on the query cache to refresh data graphs,
>> almost never depending on on-demand faulting and the shared snapshot cache.
>> But recently I came across a few use cases that exposed pretty big holes in
>> our object graph management:
>> 
>> 1. https://issues.apache.org/jira/browse/CAY-2877
>> 
>> Here multiple somewhat unrelated queries, instead of collaborating in
>> retrieving data, stomp on each other, invalidating each other's prefetches.
>> 
>> 2. https://issues.apache.org/jira/browse/CAY-2878
>> 
>> When resolving a certain category of to-one relationships (optional PK to
>> PK), we run a query where we could have taken the object from the cache.
>> 
>> I probably wouldn't have easily identified #1 if #2 worked as expected,
>> as all those invalidated relationships would have been picked up
>> transparently from the cache. But that of course wouldn't have been very
>> efficient.
>> 
>> I suspect there may be more similar issues, but these are the ones I was
>> able to reproduce.
>> 
>> Andrus