Performance is ALWAYS an issue. But am I correct in assuming that the performance increase would be approximately inversely proportional to the number of shards queried?
My point is that the amount of work required to implement this should be justified by the gain we expect from it.

John G.
http://thediningphilosopher.blogspot.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, July 31, 2008 10:00 AM
To: hibernate-dev@lists.jboss.org
Subject: hibernate-dev Digest, Vol 25, Issue 28

Send hibernate-dev mailing list submissions to
    hibernate-dev@lists.jboss.org

To subscribe or unsubscribe via the World Wide Web, visit
    https://lists.jboss.org/mailman/listinfo/hibernate-dev
or, via email, send a message with subject or body 'help' to
    [EMAIL PROTECTED]

You can reach the person managing the list at
    [EMAIL PROTECTED]

When replying, please edit your Subject line so it is more specific than "Re: Contents of hibernate-dev digest..."

Today's Topics:

   1. HSearch: Using sharding and avoiding query on multiple shards (Emmanuel Bernard)
   2. Re: HSearch: Using sharding and avoiding query on multiple shards (Sanne Grinovero)

----------------------------------------------------------------------

Message: 1
Date: Wed, 30 Jul 2008 20:36:13 -0400
From: Emmanuel Bernard <[EMAIL PROTECTED]>
Subject: [hibernate-dev] HSearch: Using sharding and avoiding query on multiple shards
To: hibernate-dev@lists.jboss.org, Aaron Walker <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"

Today, in Hibernate Search, a query is applied on all shards. We use a MultiReader to wrap them together.
In some sharding scenarios, it makes sense to apply the query on a single shard or a subset of the shards.
We could add the following API to IndexShardingStrategy:

    public DirectoryProvider<?>[] getDirectoryProvidersForQuery(o.a.l.search.Query query);

The query could be analyzed by the sharding strategy to detect boolean queries on its sharding criteria:

    //query building
    BooleanQuery bQuery = new BooleanQuery();
    bQuery.add(regularQuery, Occur.MUST);
    bQuery.add( new TermQuery( new Term("distributor.id", "2") ), Occur.MUST ); //only occurs in shard 1

    public DirectoryProvider<?>[] getDirectoryProvidersForQuery(o.a.l.search.Query query) {
        //only boolean queries can be narrowed down; anything else hits all shards
        if ( !(query instanceof BooleanQuery) ) return providers;
        List<BooleanClause> clauses = BooleanQuery.class.cast(query).clauses();
        Integer restrictedShard = null;
        boolean isAllMust = true;
        for (BooleanClause clause : clauses) {
            //a non-MUST clause could match documents stored in other shards
            if (clause.getOccur() != Occur.MUST) { isAllMust = false; break; }
            if ( clause.getQuery() instanceof TermQuery ) {
                Term term = TermQuery.class.cast( clause.getQuery() ).getTerm();
                if (term.field().equals("distributor.id")) {
                    restrictedShard = Integer.parseInt(term.text());
                }
            }
        }
        if (isAllMust && restrictedShard != null) return new DirectoryProvider<?>[] { providers[restrictedShard-1] };
        else return providers;
    }

That's very flexible but quite hard to implement correctly, especially since the query tree structure might not be trivial.

The alternative strategy is to have the following API on IndexShardingStrategy:

    public DirectoryProvider<?>[] getDirectoryProvidersForQuery(Object hint);

and a corresponding fullTextQuery.setShardHint(Object);

A query could "know" it targets shard 2 and pass the information to the strategy through a standard language:

    fullTextQuery.setShardHint("Sony");

    public DirectoryProvider<?>[] getDirectoryProvidersForQuery(Object hint) {
        if (String.class.isInstance(hint) && String.class.cast(hint).equals("Sony")) {
            return new DirectoryProvider<?>[] { providers[2] };
        }
        else {
            return providers;
        }
    }

WDYT? How useful would that be?
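To make the hint-based variant a bit more concrete, here is a minimal, self-contained sketch. It deliberately does not use the Hibernate Search or Lucene classes: plain strings stand in for DirectoryProvider instances, and HintShardingStrategy, its register method, and the shard names are hypothetical names made up for illustration only. Instead of the hard-coded "Sony" check, it looks the hint up in a map and falls back to all shards when the hint is null or unknown.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Stand-alone sketch of hint-based shard selection.
 * Assumption: none of these types are Hibernate Search classes;
 * a String shard name stands in for a DirectoryProvider.
 */
final class ShardHintSketch {

    /** Hypothetical strategy that maps a hint (e.g. a distributor name) to one shard. */
    static final class HintShardingStrategy {
        private final List<String> providers; // all shards, in index order
        private final Map<Object, Integer> shardByHint = new HashMap<>();

        HintShardingStrategy(List<String> providers) {
            this.providers = List.copyOf(providers);
        }

        /** Records which shard index holds the documents for a given hint. */
        HintShardingStrategy register(Object hint, int shardIndex) {
            if (shardIndex < 0 || shardIndex >= providers.size()) {
                throw new IllegalArgumentException("no such shard: " + shardIndex);
            }
            shardByHint.put(hint, shardIndex);
            return this;
        }

        /** Returns the single matching shard, or every shard if the hint is null or unknown. */
        List<String> getDirectoryProvidersForQuery(Object hint) {
            Integer index = shardByHint.get(hint); // HashMap tolerates a null key lookup
            return (index == null) ? providers : List.of(providers.get(index));
        }
    }

    public static void main(String[] args) {
        HintShardingStrategy strategy =
            new HintShardingStrategy(List.of("shard-0", "shard-1", "shard-2"))
                .register("Sony", 2);

        System.out.println(strategy.getDirectoryProvidersForQuery("Sony"));    // one shard
        System.out.println(strategy.getDirectoryProvidersForQuery("Unknown")); // all shards
        System.out.println(strategy.getDirectoryProvidersForQuery(null));      // all shards
    }
}
```

The fallback to all shards matters: a wrong or missing hint then only costs performance, never results. A real implementation would return DirectoryProvider<?>[] and would need the hint passed down from the full-text query.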
--
Emmanuel Bernard
http://in.relation.to/Bloggers/Emmanuel | http://blog.emmanuelbernard.com | http://twitter.com/emmanuelbernard
Hibernate Search in Action (http://is.gd/Dl1)

------------------------------

Message: 2
Date: Thu, 31 Jul 2008 15:22:52 +0200
From: "Sanne Grinovero" <[EMAIL PROTECTED]>
Subject: Re: [hibernate-dev] HSearch: Using sharding and avoiding query on multiple shards
To: "Emmanuel Bernard" <[EMAIL PROTECTED]>
Cc: hibernate-dev@lists.jboss.org, Aaron Walker <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=UTF-8

Hello,
the feature is awesome, and I know of several real-world cases where it would have been both useful and faster.

About the API, wouldn't it make more sense to have it look like a filter?

regards,
Sanne

2008/7/31 Emmanuel Bernard <[EMAIL PROTECTED]>:
> Today, in Hibernate Search, a query is applied on all shards. We use a
> MultiReader to wrap them together.
> In some sharding scenario, it makes sense to apply the query on a single
> shard or a subset of the shards.
_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev

End of hibernate-dev Digest, Vol 25, Issue 28
*********************************************