> Short answer is that no, there isn't an aggregate
> function. And you shouldn't even try.
If that is the case, why does a 'stats' component exist for Solr with the
SUM function built in?

http://wiki.apache.org/solr/StatsComponent

On Thu, Jan 5, 2012 at 1:37 PM, Erick Erickson <[email protected]> wrote:
> You will encounter endless grief until you stop
> thinking of Solr/Lucene as a replacement for
> an RDBMS. It is a *text search engine*.
> Whenever you start asking "how do I implement
> a SQL statement in Solr", you have to stop
> and reconsider *why* you are trying to do that.
> Then recast the question in terms of searching.
>
> Short answer is that no, there isn't an aggregate
> function. And you shouldn't even try.
>
> Best
> Erick
>
> On Thu, Jan 5, 2012 at 12:53 PM, prasenjit mukherjee
> <[email protected]> wrote:
>> Thanks Erick for the response.
>>
>> Will lucene/solr provide me aggregations ( of field values ) satisfying
>> a query criteria ? e.g. SELECT SUM(price) WHERE item=fruits
>>
>> Or I need to use hitCollector to achieve that ?
>>
>> Any sample solr/lucene query to compute aggregates ( like SUM ) will be great.
>>
>> -Thanks,
>> Prasenjit
>>
>> On Thu, Jan 5, 2012 at 7:10 PM, Erick Erickson <[email protected]>
>> wrote:
>>> the time interval is just a RangeQuery in the Lucene
>>> world. The rest is pretty standard search stuff.
>>>
>>> You probably want to have a look at the NRT
>>> (near real time) stuff in trunk.
>>>
>>> Your reads/writes are pretty high, so you'll need
>>> some experimentation to size your site
>>> correctly.
>>>
>>> Best
>>> Erick
>>>
>>> On Wed, Jan 4, 2012 at 12:17 AM, prasenjit mukherjee
>>> <[email protected]> wrote:
>>>> I have a requirement where reads and writes are quite high ( @ 100-500
>>>> per-sec ). A document has the following fields : timestamp,
>>>> unique-docid, content-text, keyword. Average content-text length is ~
>>>> 20 bytes, there is only 1 keyword for a given docid.
>>>>
>>>> At runtime, given a query-term ( which could be null ) and a
>>>> time-interval, I need to find out top-k frequent keywords which
>>>> contain the query-term ( optional if it's null ) in their content-text
>>>> field within that time-interval. I can purge the data every day, hence
>>>> no need for me to have more than a day's data.
>>>>
>>>> I have quite a few options here : Starting with MySQL, NoSQLs (
>>>> Cassandra, Mongo, Couch, Riak, Redis ) , Search-Engine based (
>>>> lucene/solr ) each having its own pros/cons.
>>>>
>>>> In MySQL we can achieve this via : GROUP-BY/COUNT clause
>>>> In NoSQL I can probably write a map/reduce task to query these
>>>> numbers. Although I am not very sure about the query response time.
>>>> Not sure if we can achieve it via lucene/solr OOB.
>>>>
>>>> Any suggestions on what would be a good choice for this use case ?
>>>>
>>>> -Thanks,
>>>> prasenjit
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: [email protected]
>>>> For additional commands, e-mail: [email protected]
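
For reference, the two requests discussed in this thread (a SUM over a field via the StatsComponent, and the top-k keywords within a time interval via faceting) might be built roughly as below. This is only a sketch under stated assumptions: the base URL and the field names `price`, `item`, `keyword`, `timestamp` and `content_text` are hypothetical stand-ins for whatever the real schema defines, and the query term is passed through without any escaping. The request parameters themselves (`stats`, `stats.field`, `facet`, `facet.field`, `facet.limit`, `facet.mincount`, `fq`) are standard Solr parameters.

```python
from urllib.parse import urlencode

# Assumption: a Solr instance at this address; it is not named in the thread.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def stats_sum_url(query, field, base_url=SOLR_SELECT_URL):
    """Build a StatsComponent request for one numeric field.

    Roughly the search-engine counterpart of
    SELECT SUM(field) WHERE <query>; the sum is read from the
    stats section of the response, next to min, max and count.
    """
    params = [
        ("q", query),            # e.g. "item:fruits"
        ("rows", "0"),           # only the statistics are wanted, not documents
        ("stats", "true"),
        ("stats.field", field),  # e.g. "price" (hypothetical field name)
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


def top_keywords_url(start, end, k, term=None, base_url=SOLR_SELECT_URL):
    """Build a facet request for the k most frequent keywords in [start, end].

    The time interval becomes a range filter query and the optional
    term restricts matches on the (hypothetical) content_text field.
    """
    query = "content_text:" + term if term else "*:*"
    params = [
        ("q", query),
        ("fq", "timestamp:[%s TO %s]" % (start, end)),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", "keyword"),
        ("facet.limit", str(k)),   # top-k facet values, sorted by count
        ("facet.mincount", "1"),   # skip keywords with no matching documents
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(stats_sum_url("item:fruits", "price"))
    print(top_keywords_url("2012-01-04T00:00:00Z", "2012-01-05T00:00:00Z", 10))
```

Both helpers only build the URL; sending it and reading the JSON response is left to whatever HTTP client is already in use. Whether either request keeps up with 100-500 reads and writes per second would still need the sizing experiments suggested above.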
