> Short answer is that no, there isn't an aggregate
> function. And you shouldn't even try
If that is the case, why does a 'stats' component exist for Solr with the SUM function built in?
http://wiki.apache.org/solr/StatsComponent

On Thu, Jan 5, 2012 at 1:37 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> You will encounter endless grief until you stop
> thinking of Solr/Lucene as a replacement for
> an RDBMS. It is a *text search engine*.
> Whenever you start asking "how do I implement
> a SQL statement in Solr", you have to stop
> and reconsider *why* you are trying to do that.
> Then recast the question in terms of searching.
>
> Short answer is that no, there isn't an aggregate
> function. And you shouldn't even try.
>
> Best
> Erick
>
> On Thu, Jan 5, 2012 at 12:53 PM, prasenjit mukherjee
> <prasen....@gmail.com> wrote:
>> Thanks Erick for the response.
>>
>> Will lucene/solr provide me aggregations ( of field values ) satisfying
>> a query criteria ? e.g. SELECT SUM(price) WHERE item=fruits
>>
>> Or do I need to use hitCollector to achieve that ?
>>
>> Any sample solr/lucene query to compute aggregates ( like SUM ) will be great.
>>
>> -Thanks,
>> Prasenjit
>>
>> On Thu, Jan 5, 2012 at 7:10 PM, Erick Erickson <erickerick...@gmail.com>
>> wrote:
>>> the time interval is just a RangeQuery in the Lucene
>>> world. The rest is pretty standard search stuff.
>>>
>>> You probably want to have a look at the NRT
>>> (near real time) stuff in trunk.
>>>
>>> Your reads/writes are pretty high, so you'll need
>>> some experimentation to size your site
>>> correctly.
>>>
>>> Best
>>> Erick
>>>
>>> On Wed, Jan 4, 2012 at 12:17 AM, prasenjit mukherjee
>>> <prasen....@gmail.com> wrote:
>>>> I have a requirement where reads and writes are quite high ( @ 100-500
>>>> per-sec ). A document has the following fields : timestamp,
>>>> unique-docid, content-text, keyword. Average content-text length is ~
>>>> 20 bytes, there is only 1 keyword for a given docid.
>>>>
>>>> At runtime, given a query-term ( which could be null ) and a
>>>> time-interval, I need to find out top-k frequent keywords which
>>>> contain the query-term ( optional if it's null ) in their content-text
>>>> field within that time-interval. I can purge the data every day, hence
>>>> no need for me to have more than a day's data.
>>>>
>>>> I have quite a few options here : Starting with MySQL, NoSQLs (
>>>> Cassandra, Mongo, Couch, Riak, Redis ) , Search-Engine based (
>>>> lucene/solr ) each having its own pros/cons.
>>>>
>>>> In MySQL we can achieve this via : GROUP-BY/COUNT clause
>>>> In NoSQL I can probably write a map/reduce task to query these
>>>> numbers. Although I am not very sure about the query response time.
>>>> Not sure if we can achieve it via lucene/solr OOB.
>>>>
>>>> Any suggestions on what would be a good choice for this use case ?
>>>>
>>>> -Thanks,
>>>> prasenjit
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
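
For what it's worth, here is a rough sketch of how the two questions in this thread might be put to Solr over HTTP: the SUM(price) example via the StatsComponent linked above, and the top-k keywords question via field faceting with a time-range filter. The endpoint URL, the helper names, and the field name content_text (with an underscore instead of the hyphen used above) are assumptions for illustration only, not something confirmed in this thread. The request parameters (stats, stats.field, facet, facet.field, facet.limit, facet.sort, fq) are standard Solr parameters, but whether this performs acceptably at the read/write rates mentioned would need testing.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port, and core for your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def stats_sum_url(query, stats_field, base_url=SOLR_SELECT_URL):
    """Build a StatsComponent request, roughly SELECT SUM(field) WHERE query.

    The response reports several statistics for the field, the sum among them.
    """
    params = [
        ("q", query),                  # e.g. item:fruits
        ("rows", "0"),                 # only the statistics are wanted, no documents
        ("stats", "true"),             # switch the StatsComponent on
        ("stats.field", stats_field),  # field to compute statistics over
    ]
    return base_url + "?" + urlencode(params)


def top_keywords_url(term, start, end, k, base_url=SOLR_SELECT_URL):
    """Build a faceting request for the k most frequent keywords in a time range.

    term may be None, in which case every document in the range is counted.
    The term is inserted as-is; special query characters are not escaped here.
    """
    query = "content_text:" + term if term else "*:*"
    params = [
        ("q", query),
        ("fq", "timestamp:[%s TO %s]" % (start, end)),  # time interval as a range filter
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", "keyword"),    # count documents per keyword value
        ("facet.limit", str(k)),       # keep only the top k values
        ("facet.sort", "count"),       # order facet values by frequency
        ("facet.mincount", "1"),       # skip keywords with no matching documents
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(stats_sum_url("item:fruits", "price"))
    print(top_keywords_url("apple", "2012-01-04T00:00:00Z", "2012-01-05T00:00:00Z", 10))
```

Both helpers only build the request URL; fetching it and reading the statistics or facet counts out of the response is left out so the sketch stays independent of any particular Solr version or client library.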