[ 
https://issues.apache.org/jira/browse/BLUR-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835526#comment-13835526
 ] 

Colton McInroy commented on BLUR-296:
-------------------------------------

Ok, sorry for the delayed response on this, I have been rather busy.

I originally posted this code on the mailing list, and I think it helps 
explain what using a proper facet implementation could look like:

    public static void queryBlur(String queryString, String table) {
        Iface client = BlurClient.getClient(mainConfig.getString("controllers"));
        Query query = new Query();
        query.setQuery(queryString);

        Selector selector = new Selector();

        // Fetch all the columns in the families "event" and "msg".
        selector.addToColumnFamiliesToFetch("event");
        selector.addToColumnFamiliesToFetch("msg");

        BlurQuery blurQuery = new BlurQuery();
        // Proposed API: each Facet names a column field and the number of
        // matches to return for that field.
        int matches = 10;
        List<Facet> facets = Arrays.asList(
                new Facet("field1", matches),
                new Facet("field2", matches));
        blurQuery.setFacets(facets);
        blurQuery.setFetch(50);
        blurQuery.setQuery(query);
        blurQuery.setSelector(selector);

        try {
            BlurResults results = client.query(table, blurQuery);
            // Proposed API: one result per (field, value) pair with its count.
            for (Facet facet : results.getFacetResults()) {
                System.out.println(facet.name + " " + facet.value + " " + facet.count);
            }
        } catch (BlurException e) {
            e.printStackTrace();
        } catch (TException e) {
            e.printStackTrace();
        }
    }

To elaborate a bit: instead of each facet being a subquery that returns a 
single count, the facet list names the column fields to facet on, along with 
the number of matches to return for each field. It could also be done this way:

    List<Facet> facets = Arrays.asList(new Facet("field1"), new Facet("field2"));
    blurQuery.setFacets(facets, matches);

Instead of specifying the number of matches for each field, this would specify 
the number of matches for all facets at once. Perhaps a combination of the two 
would be ideal.
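
As a rough sketch of what such a combination might look like, the request 
below carries one default number of matches plus optional per-field overrides. 
FacetRequest, field() and matchesFor() are hypothetical names made up for this 
illustration; nothing like this exists in the Blur API today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical request object, not part of the Blur API.
final class FacetRequest {
    private final int defaultMatches;
    // Field name -> per-field override, or null to use the default.
    private final Map<String, Integer> fields = new LinkedHashMap<>();

    FacetRequest(int defaultMatches) {
        this.defaultMatches = defaultMatches;
    }

    // Facet on a field using the default number of matches.
    FacetRequest field(String name) {
        fields.put(name, null);
        return this;
    }

    // Facet on a field with its own number of matches.
    FacetRequest field(String name, int matches) {
        fields.put(name, matches);
        return this;
    }

    // Number of matches to return for a field; falls back to the default
    // when no override was given.
    int matchesFor(String name) {
        Integer override = fields.get(name);
        return override != null ? override : defaultMatches;
    }
}
```

With new FacetRequest(10).field("field1").field("field2", 3), field1 would use 
the default of 10 matches and field2 its own limit of 3.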

On return, the query results should contain a list of facet results, which for 
the println statement above would look something like the following:
field1 value1 4
field1 value2 2
field1 value3 1
field2 value1 4
field2 value2 2
field2 value3 1

This would be fairly similar to the facet implementation in Lucene, and it 
matches what facets are according to the usual online definitions.
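
To make the counting semantics concrete, here is a minimal in-memory sketch: 
for one field, count how often each value occurs among the matching documents 
and keep the most frequent ones. FacetCounter and topValues() are illustrative 
names only, and documents are modelled as plain maps; this is neither Blur nor 
Lucene code and says nothing about how either would index the data.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: documents are plain field -> value maps.
final class FacetCounter {
    // Returns up to `matches` (value, count) pairs for `field`,
    // highest count first, ties broken by value.
    static List<Map.Entry<String, Integer>> topValues(
            List<Map<String, String>> docs, String field, int matches) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, String> doc : docs) {
            String value = doc.get(field);
            if (value != null) {
                Integer old = counts.get(value);
                counts.put(value, old == null ? 1 : old + 1);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Copy the top entries so the result does not depend on the map.
        List<Map.Entry<String, Integer>> top = new ArrayList<>();
        for (Map.Entry<String, Integer> e
                : entries.subList(0, Math.min(matches, entries.size()))) {
            top.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        return top;
    }
}
```

Calling topValues(docs, "field1", 10) once per requested field and printing 
the field name next to each entry gives lines in the same shape as the example 
output above.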

As for what needs to be done to accomplish this, I am not entirely sure. When 
I built my own implementation, I created a subdirectory under each index that 
contained a facet index. In my experience with Lucene, facet data is stored in 
a separate index. So, as I understand Blur, I believe another index would need 
to be created alongside each shard. With my still limited knowledge of Blur, I 
am guessing that the following would need to be implemented:

- Some kind of flag needs to be associated with each table to indicate whether 
it does facet indexing (perhaps something set in the create process).
- The code that handles column declarations needs to check whether facet 
indexing is enabled for a shard and, when a column is declared, start 
collecting facet data for mutates.
- Controller/shard servers need to support collecting facet data along with 
queries when the query requests facets and the table supports them.
- Controller servers need to handle aggregating the data from the shard 
servers into the final query response.
- The API for executing queries needs to be able to support the new facet 
system.
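
The controller-side aggregation step could be sketched roughly as follows, 
assuming each shard reports a value -> count map for one facet field. 
FacetMerger and merge() are hypothetical names for illustration, not existing 
Blur classes. One caveat: if each shard only reports its own top N values, the 
merged counts can be incomplete, so shards may need to report more than N.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical controller-side helper, not part of Blur.
final class FacetMerger {
    // Sums the per-shard counts for one facet field and returns the
    // `matches` values with the highest totals, highest first.
    static Map<String, Long> merge(List<Map<String, Long>> shardCounts, int matches) {
        // TreeMap keeps values in alphabetical order for stable tie-breaking.
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> shard : shardCounts) {
            for (Map.Entry<String, Long> e : shard.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(totals.entrySet());
        // List.sort is stable, so equal totals stay in alphabetical order.
        sorted.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        Map<String, Long> top = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e
                : sorted.subList(0, Math.min(matches, sorted.size()))) {
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }
}
```

For example, merging {value1=3, value2=1} from one shard with 
{value1=1, value2=1, value3=1} from another and keeping 2 matches yields 
value1 with a count of 4 and value2 with a count of 2.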

> Facets are subqueries not facets
> --------------------------------
>
>                 Key: BLUR-296
>                 URL: https://issues.apache.org/jira/browse/BLUR-296
>             Project: Apache Blur
>          Issue Type: Improvement
>          Components: Blur
>    Affects Versions: experimental-dev, 0.2.0, 0.3.0, 0.2.1
>         Environment: N/A
>            Reporter: Colton McInroy
>              Labels: features
>
> Based on the classification of what Facets are from Lucene and other search 
> systems, the current implementation in Blur does not really support this 
> functionality.
> http://en.wikipedia.org/wiki/Faceted_classification 
> http://en.wikipedia.org/wiki/Faceted_search
> It is entirely different than anything really described in the Lucene 
> documentation.
> http://lucene.apache.org/core/4_3_0/facet/org/apache/lucene/facet/doc-files/userguide.html



--
This message was sent by Atlassian JIRA
(v6.1#6144)
