[
https://issues.apache.org/jira/browse/BLUR-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835526#comment-13835526
]
Colton McInroy commented on BLUR-296:
-------------------------------------
Ok, sorry for the delayed response on this, I have been rather busy.
I originally posted this code on the mailing list, and I think it would help
explain a way to access a proper facet implementation...
public static void queryBlur(String queryString, String table) {
  Iface client = BlurClient.getClient(mainConfig.getString("controllers"));
  Query query = new Query();
  query.setQuery(queryString);
  Selector selector = new Selector();
  // Fetch all the columns in the "event" and "msg" families.
  selector.addToColumnFamiliesToFetch("event");
  selector.addToColumnFamiliesToFetch("msg");
  BlurQuery blurQuery = new BlurQuery();
  // Proposed API: each Facet names a column and how many of its values to return.
  int matches = 10;
  List<Facet> facets = Arrays.asList(new Facet("field1", matches), new Facet("field2", matches));
  blurQuery.setFacets(facets);
  blurQuery.setFetch(50);
  blurQuery.setQuery(query);
  blurQuery.setSelector(selector);
  try {
    BlurResults results = client.query(table, blurQuery);
    // Proposed API: one result per (field, value) pair, with its count.
    for (Facet facet : results.getFacetResults()) {
      System.out.println(facet.name + " " + facet.value + " " + facet.count);
    }
  } catch (BlurException e) {
    e.printStackTrace();
  } catch (TException e) {
    e.printStackTrace();
  }
}
To elaborate a bit: instead of each facet being a sub-query that returns a
count, the facet list contains column fields, each with the number of matches
to return for that field. It could also be done this way...
  List<Facet> facets = Arrays.asList(new Facet("field1"), new Facet("field2"));
  blurQuery.setFacets(facets, matches);
Instead of specifying the number of matches for each field, this would specify
a single number of matches that applies to every facet. Perhaps a combination
of the two would be ideal.
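One way the combination could look is a request-wide default with optional
per-field overrides. This is only a sketch: FacetRequestSketch and its methods
are hypothetical names made up for illustration and are not part of the Blur
API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical request object; not part of the Blur API. */
class FacetRequestSketch {
  private final int defaultMatches;
  // A null value means "use the request-wide default".
  private final Map<String, Integer> perField = new LinkedHashMap<>();

  FacetRequestSketch(int defaultMatches) {
    this.defaultMatches = defaultMatches;
  }

  /** Facet on a field using the request-wide default limit. */
  FacetRequestSketch add(String field) {
    perField.put(field, null);
    return this;
  }

  /** Facet on a field with its own limit. */
  FacetRequestSketch add(String field, int matches) {
    perField.put(field, matches);
    return this;
  }

  /** Number of facet values to return for the given field. */
  int matchesFor(String field) {
    if (!perField.containsKey(field)) {
      throw new IllegalArgumentException("not a facet field: " + field);
    }
    Integer matches = perField.get(field);
    return matches != null ? matches : defaultMatches;
  }

  public static void main(String[] args) {
    FacetRequestSketch request = new FacetRequestSketch(10).add("field1").add("field2", 3);
    System.out.println("field1 " + request.matchesFor("field1"));
    System.out.println("field2 " + request.matchesFor("field2"));
  }
}
```

Here field1 falls back to the default of 10 while field2 uses its own limit
of 3.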
The query should then return a list of facet results. For the println
statement above, the output would look something like the following...
field1 value1 4
field1 value2 2
field1 value3 1
field2 value1 4
field2 value2 2
field2 value3 1
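To make the expected output concrete, here is a minimal in-memory sketch of
the counting itself, assuming documents are plain field-to-value maps.
FacetCountSketch, FacetResult and facet() are hypothetical names for
illustration only; this is not how Blur or Lucene store or compute facets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory facet counter; none of these names exist in Blur. */
class FacetCountSketch {

  /** One row of facet output: field name, field value, number of matching documents. */
  static final class FacetResult {
    final String name;
    final String value;
    final long count;

    FacetResult(String name, String value, long count) {
      this.name = name;
      this.value = value;
      this.count = count;
    }

    @Override
    public String toString() {
      return name + " " + value + " " + count;
    }
  }

  /** Counts the values of each field across docs and keeps the top `matches` per field. */
  static List<FacetResult> facet(List<Map<String, String>> docs, List<String> fields, int matches) {
    List<FacetResult> out = new ArrayList<>();
    for (String field : fields) {
      Map<String, Long> counts = new HashMap<>();
      for (Map<String, String> doc : docs) {
        String value = doc.get(field);
        if (value != null) {
          counts.merge(value, 1L, Long::sum);
        }
      }
      // Highest count first; ties broken alphabetically so the order is stable.
      List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
      entries.sort((a, b) -> {
        int byCount = Long.compare(b.getValue(), a.getValue());
        return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
      });
      for (Map.Entry<String, Long> e : entries.subList(0, Math.min(matches, entries.size()))) {
        out.add(new FacetResult(field, e.getKey(), e.getValue()));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    String[] values = {"value1", "value1", "value1", "value1", "value2", "value2", "value3"};
    List<Map<String, String>> docs = new ArrayList<>();
    for (String v : values) {
      Map<String, String> doc = new HashMap<>();
      doc.put("field1", v);
      doc.put("field2", v);
      docs.add(doc);
    }
    for (FacetResult r : facet(docs, Arrays.asList("field1", "field2"), 3)) {
      System.out.println(r);
    }
  }
}
```

With the seven sample documents in main, this prints the same six lines as the
listing above.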
This would be fairly similar to the facet implementation in Lucene, and it
matches the usual definitions of facets found online.
As for what needs to be done to accomplish this, I am not entirely sure. When
I built my implementation, I created a subdirectory under each index that
contained a facet index. In my experience with Lucene, facet data is stored in
a separate index. So, as I understand Blur, I believe another index would need
to be created alongside each shard. With my still limited knowledge of Blur, I
am guessing that the following would need to be implemented.
- Some kind of flag needs to be associated with each table to indicate whether
it does facet indexing (perhaps set during table creation).
- Code that handles column declarations needs to check whether facet indexing
is enabled for a shard and, when a column is declared, start collecting facet
data for mutates.
- Controller/Shard servers need to support collecting facet data along with
queries when the query requests facets and the table supports them.
- Controller servers need to handle aggregating data from shard servers into
the final query response.
- The API for executing queries needs to be able to support the new facet
system.
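The controller aggregation step could be sketched roughly as follows, assuming
each shard server reports its counts as a field -> value -> count map.
FacetMergeSketch and merge() are hypothetical names, not Blur classes. Note
that if shards only send back their own top values, the merged counts are
approximate; this sketch assumes the shards send enough values for the sums to
be meaningful.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical controller-side merge; none of these names exist in Blur. */
class FacetMergeSketch {

  /**
   * Sums per-shard counts (field -> value -> count) and keeps the top
   * `matches` values per field, highest count first, ties broken by value.
   */
  static Map<String, Map<String, Long>> merge(
      List<Map<String, Map<String, Long>>> shardResults, int matches) {
    Map<String, Map<String, Long>> totals = new TreeMap<>();
    for (Map<String, Map<String, Long>> shard : shardResults) {
      for (Map.Entry<String, Map<String, Long>> field : shard.entrySet()) {
        Map<String, Long> fieldTotals =
            totals.computeIfAbsent(field.getKey(), k -> new HashMap<>());
        for (Map.Entry<String, Long> value : field.getValue().entrySet()) {
          fieldTotals.merge(value.getKey(), value.getValue(), Long::sum);
        }
      }
    }
    Map<String, Map<String, Long>> top = new TreeMap<>();
    for (Map.Entry<String, Map<String, Long>> field : totals.entrySet()) {
      List<Map.Entry<String, Long>> entries = new ArrayList<>(field.getValue().entrySet());
      entries.sort((a, b) -> {
        int byCount = Long.compare(b.getValue(), a.getValue());
        return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
      });
      // LinkedHashMap preserves the sorted order of the kept values.
      Map<String, Long> kept = new LinkedHashMap<>();
      for (Map.Entry<String, Long> e : entries.subList(0, Math.min(matches, entries.size()))) {
        kept.put(e.getKey(), e.getValue());
      }
      top.put(field.getKey(), kept);
    }
    return top;
  }

  public static void main(String[] args) {
    Map<String, Long> shard0Field1 = new HashMap<>();
    shard0Field1.put("value1", 3L);
    shard0Field1.put("value2", 1L);
    Map<String, Map<String, Long>> shard0 = new HashMap<>();
    shard0.put("field1", shard0Field1);

    Map<String, Long> shard1Field1 = new HashMap<>();
    shard1Field1.put("value1", 1L);
    shard1Field1.put("value2", 1L);
    shard1Field1.put("value3", 1L);
    Map<String, Map<String, Long>> shard1 = new HashMap<>();
    shard1.put("field1", shard1Field1);

    Map<String, Map<String, Long>> merged = merge(Arrays.asList(shard0, shard1), 2);
    for (Map.Entry<String, Map<String, Long>> field : merged.entrySet()) {
      for (Map.Entry<String, Long> value : field.getValue().entrySet()) {
        System.out.println(field.getKey() + " " + value.getKey() + " " + value.getValue());
      }
    }
  }
}
```

For the two sample shards in main, field1 ends up with value1 at 4 and value2
at 2, and value3 is dropped because only two matches were requested.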
> Facets are subqueries not facets
> --------------------------------
>
> Key: BLUR-296
> URL: https://issues.apache.org/jira/browse/BLUR-296
> Project: Apache Blur
> Issue Type: Improvement
> Components: Blur
> Affects Versions: experimental-dev, 0.2.0, 0.3.0, 0.2.1
> Environment: N/A
> Reporter: Colton McInroy
> Labels: features
>
> Based on the classification of what Facets are from Lucene and other search
> systems, the current implementation in Blur does not really support this
> functionality.
> http://en.wikipedia.org/wiki/Faceted_classification
> http://en.wikipedia.org/wiki/Faceted_search
> It is entirely different than anything really described in the Lucene
> documentation.
> http://lucene.apache.org/core/4_3_0/facet/org/apache/lucene/facet/doc-files/userguide.html