Dennis Gove created SOLR-8530:
---------------------------------

             Summary: Add HavingStream to Streaming API and StreamingExpressions
                 Key: SOLR-8530
                 URL: https://issues.apache.org/jira/browse/SOLR-8530
             Project: Solr
          Issue Type: Improvement
          Components: SolrJ
    Affects Versions: Trunk
            Reporter: Dennis Gove
            Priority: Minor


The goal here is to support something similar to SQL's HAVING clause where one 
can filter documents based on data that is not available in the index. For 
example, filter the output of a reduce(....) based on the calculated metrics.

{code}
having(
  reduce(
    search(.....),
    sum(cost),
    on=customerId
  ),
  q="sum(cost):[500 TO *]"
)
{code}

This example would return one tuple for each distinct customer whose total 
spend is >= 500. The total spend is calculated via the sum(cost) metric in the 
reduce stream.
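
To make the intended semantics concrete, here is a minimal plain-Java sketch of what the expression above is meant to compute. It does not use the Streaming API: the Purchase class, the totalsAtLeast method and the sample data are hypothetical names invented for illustration, and the hard-coded >= comparison stands in for the q="sum(cost):[500 TO *]" filter.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HavingSemanticsSketch {

    // Hypothetical stand-in for one tuple coming out of search(...).
    public static final class Purchase {
        public final String customerId;
        public final double cost;

        public Purchase(String customerId, double cost) {
            this.customerId = customerId;
            this.cost = cost;
        }
    }

    // Sums cost per customerId (the reduce step), then keeps only the
    // customers whose total is >= minTotal (the having step).
    public static Map<String, Double> totalsAtLeast(List<Purchase> purchases, double minTotal) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Purchase p : purchases) {
            totals.merge(p.customerId, p.cost, Double::sum);
        }

        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : totals.entrySet()) {
            if (e.getValue() >= minTotal) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Purchase> purchases = List.of(
            new Purchase("c1", 300.0),
            new Purchase("c1", 250.0),
            new Purchase("c2", 120.0),
            new Purchase("c3", 500.0));

        // c1 totals 550, c2 totals 120, c3 totals 500; only c1 and c3 pass.
        System.out.println(totalsAtLeast(purchases, 500.0)); // {c1=550.0, c3=500.0}
    }
}
```

The filter runs on the computed total, not on any individual cost value, which is why it cannot be pushed down into the search(.....) query itself.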

The intent is for the filters in the having(...) clause to support the full 
query syntax of a search(...) clause. I see this being possible in one of two 
ways.

1. Use Lucene's MemoryIndex: as each tuple is read out of the underlying 
stream, create an instance of MemoryIndex for it and apply the query to it. If 
the resulting score is > 0 then the tuple matches and should be returned from 
the HavingStream.

2. Create an in-memory Solr index via something like RAMDirectory, read all 
tuples into that in-memory index using the UpdateStream, and then stream all 
the tuples matching the query out of it.

There are benefits to each approach, but I think the easiest and most direct 
one is the MemoryIndex approach. With MemoryIndex it isn't necessary to read 
all incoming tuples before returning a single tuple. It does require parsing 
the Solr query parameters and creating a valid Lucene query, but I suspect that 
can be done using existing QParser implementations.
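
The streaming property of the MemoryIndex approach can be sketched in plain Java as follows. This is only an illustration under stated assumptions: HavingIterator, tuple and tuplesRead are hypothetical names, tuples are modelled as simple maps, and a java.util.function.Predicate stands in for "build a MemoryIndex from the tuple and run the parsed query against it". No Lucene or Solr calls are made.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class LazyHavingSketch {

    // Wraps an underlying tuple iterator and yields only matching tuples,
    // pulling from the source one tuple at a time.
    public static final class HavingIterator implements Iterator<Map<String, Object>> {
        private final Iterator<Map<String, Object>> source;
        // Placeholder for the per-tuple MemoryIndex query check.
        private final Predicate<Map<String, Object>> matches;
        private Map<String, Object> pending;
        // Exposed only to show how much input has been consumed so far.
        public int tuplesRead = 0;

        public HavingIterator(Iterator<Map<String, Object>> source,
                              Predicate<Map<String, Object>> matches) {
            this.source = source;
            this.matches = matches;
        }

        @Override
        public boolean hasNext() {
            // Advance only until the next match is found, never further.
            while (pending == null && source.hasNext()) {
                Map<String, Object> candidate = source.next();
                tuplesRead++;
                if (matches.test(candidate)) {
                    pending = candidate;
                }
            }
            return pending != null;
        }

        @Override
        public Map<String, Object> next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            Map<String, Object> result = pending;
            pending = null;
            return result;
        }
    }

    // Hypothetical helper that builds a tuple shaped like the output of reduce(...).
    public static Map<String, Object> tuple(String customerId, double sumCost) {
        Map<String, Object> t = new HashMap<>();
        t.put("customerId", customerId);
        t.put("sum(cost)", sumCost);
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> reduced = List.of(
            tuple("c1", 550.0), tuple("c2", 120.0), tuple("c3", 500.0));

        HavingIterator having = new HavingIterator(
            reduced.iterator(),
            t -> ((Number) t.get("sum(cost)")).doubleValue() >= 500.0);

        // The first match is available after reading a single input tuple.
        System.out.println(having.next().get("customerId") + " after " + having.tuplesRead + " read");
        // The second match requires skipping c2, so three tuples have been read.
        System.out.println(having.next().get("customerId") + " after " + having.tuplesRead + " read");
    }
}
```

By contrast, the in-memory index approach would have to consume the whole underlying stream before the first matching tuple could be emitted.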



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
