Providing methods that work sometimes and don't work other times is generally a bad idea.
No matter how much you document it, users *will* try to use it and expect it to always work (either because they didn't read the docs that say otherwise, they think they'll stick to a configuration where it does work, etc.) And then when it doesn't work (because they pushed something to production which has a different configuration than dev, etc.) it's a frustrating experience.

-Dennis

On 03/11/2014 09:37 AM, Randall Hauch wrote:
> I’m struggling with this same question in ModeShape. The JCR API exposes a
> method that returns the number of results, but at least the spec allows the
> implementation to return -1 if the size is not known (or very expensive to
> compute). Yet this still does not satisfy all cases.
>
> Depending upon the technology, computing the **exact size** ranges from very
> cheap to extremely expensive to calculate. For example, consider a system
> that has to take into account access control limitations of the user. My
> current opinion is that few applications actually need an exact size, and if
> they do there may be alternatives (like counting as they iterate over the
> results).
>
> An alternative is to expose an **approximate size**, which is likely to be
> sufficient for generating display or other pre-computed information such as
> links or paging details. I think that this is sufficient for most needs, and
> that even an order of magnitude is sufficient. When the results are known to
> be small, the system might want to determine the exact size (e.g., by
> iterating).
>
> So one option is to expose both methods, but allow the exact size method to
> return -1 if the system can’t determine the size or if doing so is very
> expensive. This allows the system a way out for large/complex queries and
> flexibility in the implementation technology. The approximate size method
> probably always needs to return at least some usable value.
>
> BTW, computing an exact size by iterating can be expensive unless you can
> keep all the results in memory. That’s not ideal - a query with large results
> could fill up available memory. If you don’t keep all results in memory, then
> if you’re going to allow clients to access the results more than once you
> have to provide a way to buffer the results.
>
>
> On Mar 10, 2014, at 7:23 AM, Sanne Grinovero <[email protected]> wrote:
>
>> Hi all,
>> we are exposing a nice feature inherited from the Search engine via
>> the "simple" DSL version, the one which is also available via Hot Rod:
>>
>> org.infinispan.query.dsl.Query.getResultSize()
>>
>> To be fair I hadn't noticed we do expose this, I just noticed after a
>> recent PR review and I found it surprising.
>>
>> This method returns the size of the full resultset, disregarding
>> pagination options; you can imagine it fit for situations like:
>>
>> "found 6 million matches, these are the top 20: "
>>
>> A peculiarity of Hibernate Search is that the total number of matches
>> is extremely cheap to figure out as it's generally a side effect of
>> finding the 20 results. Essentially we're just exposing an int value
>> which was already computed: very cheap, and happens to be useful in
>> practice.
>>
>> This is not the case with a SQL statement, in this case you'd have to
>> craft 2 different SQL statements, often incurring the cost of 2 round
>> trips to the database. So this getResultSize() is not available on the
>> Hibernate ORM Query, only on our FullTextQuery extension.
>>
>> Now my doubt is if it is indeed a wise move to expose this method on
>> the simplified DSL. Of course some people might find it useful, still
>> I'm wondering how much we'll be swearing at needing to maintain this
>> feature vs its usefulness when we'll implement alternative execution
>> engines to run queries, not least on Map/Reduce based filtering, and
>> ultimately hybrid strategies.
>>
>> In case of Map/Reduce I think we'll need to keep track of possible
>> de-duplication of results, in case of a Teiid integration it might
>> need a second expensive query; so in this case I'd expect this method
>> to be lazily evaluated.
>>
>> Should we rather remove this functionality?
>>
>> Sanne
>> _______________________________________________
>> infinispan-dev mailing list
>> [email protected]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
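
For reference, the two-method option described in Randall's quoted message (an exact size that may return -1, plus an approximate size that always returns something usable) could be sketched roughly as below. This is only an illustration under stated assumptions: `SizedResults`, `exactSize()`, `approximateSize()`, and the two implementing classes are hypothetical names made up for this sketch, and are not part of the Infinispan, Hibernate Search, or JCR APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical contract for illustration only; not an existing Infinispan or JCR type. */
interface SizedResults {
    /** Exact number of matches, or -1 if it is unknown or too expensive to compute. */
    long exactSize();

    /** Rough estimate of the number of matches; always a usable value (>= 0). */
    long approximateSize();
}

/** Small in-memory result set: counting is cheap, so both methods agree. */
final class InMemoryResults implements SizedResults {
    private final List<String> rows;

    InMemoryResults(List<String> rows) {
        this.rows = new ArrayList<>(rows);
    }

    @Override
    public long exactSize() {
        return rows.size();
    }

    @Override
    public long approximateSize() {
        return rows.size();
    }
}

/** Result set from an engine where exact counting is costly: only an estimate is offered. */
final class EstimatedResults implements SizedResults {
    private final long estimate;

    EstimatedResults(long estimate) {
        this.estimate = Math.max(0, estimate);
    }

    @Override
    public long exactSize() {
        return -1; // signals "not known / too expensive"
    }

    @Override
    public long approximateSize() {
        return estimate;
    }
}

final class ResultSizeDemo {
    /** Builds a display string, falling back to the estimate when no exact size is available. */
    static String describe(SizedResults results) {
        long exact = results.exactSize();
        return exact >= 0
                ? "found " + exact + " matches"
                : "found about " + results.approximateSize() + " matches";
    }

    public static void main(String[] args) {
        System.out.println(describe(new InMemoryResults(Arrays.asList("a", "b", "c"))));
        System.out.println(describe(new EstimatedResults(6_000_000L)));
    }
}
```

Note that callers still have to check for -1 before using the exact size, which is the kind of "works sometimes" contract the reply at the top of this message warns about; the sketch shows the shape of the proposal and does not settle that concern.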
