Hi, Here's an example of raw data that would be in my Sales index:
*Affiliate / SaleDate / SaleAmount* * mike / 2010-03-01 / 10.00 * john / 2010-03-01 / 10.00 * mike / 2010-03-02 / 15.00 * john / 2010-03-02 / 5.00 * mike / 2010-03-03 / 20.00 * john / 2010-03-03 / 1.00 * mike / 2010-03-04 / 10.00 * john / 2010-03-04 / 10.00 * mike / 2010-03-05 / 15.00 * john / 2010-03-05 / 5.00 * mike / 2010-03-06 / 20.00 * john / 2010-03-06 / 1.00 So our 2 affiliates mike and john made 6 sales each between 2010-03-01 and 2010-03-06. My ultimate query should return this for a query between 2010-03-01 and 2010-03-06 : *Affiliate / TotalSales* * mike / 90.00 * john / 32.00 So it's exactly like - SELECT Affiliate, sum(SaleAmount) as TotalSales FROM Sales WHERE SaleDate >= '2010-03-01' AND SaleDate <= '2010-03-06' GROUP BY Affiliate ORDER BY TotalSales DESC; - Mike aka...@gmail.com On Thu, Apr 1, 2010 at 8:11 AM, prasenjit mukherjee <prasen....@gmail.com>wrote: > Not sure what you mean by "joining" in lucene , since conceptually > there is only 1 table ( with many field aka columns ) in lucene. A > representative query would be good to know the use case. > > Again didn't get the "sorting" part. SUM() will return only 1 > aggregated value, so what do you want to sort it on ? > > -Prasen > > On Thu, Apr 1, 2010 at 7:44 AM, Michel Nadeau <aka...@gmail.com> wrote: > > Are you planning to be able to sort by these SUMs? A SpanQuery would work > > great to get the integers... then you would loop and sum up... but what > > about "joining" with your other data and sorting? > > > > - Mike > > aka...@gmail.com > > > > > > On Wed, Mar 31, 2010 at 9:23 PM, prasenjit mukherjee > > <prasen....@gmail.com>wrote: > > > >> I too am trying to achieve something. > >> > >> I am thinking of storing the integer values in payloads and then > >> using spanquery classes to compute the respective SUMs > >> > >> -Prasen > >> > >> On Thu, Apr 1, 2010 at 6:47 AM, Michel Nadeau <aka...@gmail.com> wrote: > >> > Hi, > >> > > >> > We're currently in the process of switching many of our screens from > >> MySQL > >> > to Lucene because MySQL simply dies because we have too much data and > >> it's > >> > becoming too long to generate the stats we need. > >> > > >> > So here's one MySQL query that we use to find out our Top 10 > Affiliates : > >> > > >> > SELECT SUM(sale_amount) AS total_sales, affialiate_id FROM sales WHERE > >> > sale_date>='2010-03-01' AND sale_date<='2010-03-31' GROUP BY > >> affialiate_id > >> > ORDER BY total_sales DESC LIMIT 10; > >> > > >> > We currently have our "sales" index, containing all sales and all > fields > >> - > >> > and it's one big index (over 10M records). We could fetch all > documents > >> > within the date range, loop them and add up the total_sales, but it > would > >> be > >> > just crazy to do this all the time (we have a high volume of search). > >> > > >> > We made several tests with Solr (Facets, and even the beta > >> CollapseFields), > >> > but nothing is really helping us. We could pre-generate the > total_sales > >> for > >> > all possible date ranges... but that would be quite crazy too as the > date > >> > range possibilities quickly become endless. > >> > > >> > So - is there any known way to efficiently do SUM(), COUNT() (and even > >> AVG() > >> > ) using Lucene/Solr/others? I also checked Bobo Browse but it doesn't > >> seem > >> > to offer what I need either. > >> > > >> > Thanks for any hints!!! > >> > > >> > - Mike > >> > aka...@gmail.com > >> > > >> > >> --------------------------------------------------------------------- > >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: java-user-h...@lucene.apache.org > >> > >> > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >