Right, Solr will not do anything other than basic aggregations (facets) and range queries.
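To make the division of labour concrete, here is a minimal sketch of the kind of client-side aggregation Dan describes in the quoted message below. It assumes the query results have already been fetched and are available as a list of dicts (the shape of the "docs" list in a Solr JSON response). The sample documents, their values, and the helper names unique_values and average are hypothetical and only for illustration; the field names are the ones suggested in the thread.

```python
from statistics import mean

# Hypothetical documents, shaped like the "docs" list of a Solr JSON response.
# Field names follow the thread; the values are made up for illustration.
docs = [
    {"playername": "A. Example", "teamname": "Reds", "inning": 3, "outcome": "grand slam"},
    {"playername": "B. Sample", "teamname": "Blues", "inning": 7, "outcome": "grand slam"},
    {"playername": "A. Example", "teamname": "Reds", "inning": 8, "outcome": "grand slam"},
]


def unique_values(docs, field):
    """Return the sorted distinct values of `field`, skipping docs that lack it."""
    return sorted({d[field] for d in docs if field in d})


def average(docs, field):
    """Return the mean of a numeric `field`, or None if no doc carries it."""
    values = [d[field] for d in docs if field in d]
    return mean(values) if values else None


# "Show me only the unique player names returned by this Solr query"
print(unique_values(docs, "playername"))

# "Show me the average of field X returned by this Solr query"
print(average(docs, "inning"))
```

This only stays practical while the slice Solr returns fits comfortably in memory on the client, which is the same caveat Dan raises about result sizes.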

On Tue, Jun 21, 2011 at 3:16 PM, Dan Kuebrich <dan.kuebr...@gmail.com> wrote:

> Solandra is indeed distributed search, not distributed number-crunching.
> As a previous poster said, you could imagine structuring the data in a
> series of documents with fields containing playername, teamname, position,
> location, day, time, inning, at bat, outcome, etc. Then you could query to
> get a slice of the data that matches your predicate and run statistics on
> that subset.
>
> The statistics would have to come from other code (e.g. R), but Solr will
> filter the data for you. So this approach only works if the slices are
> reasonably small, but it gives you great granularity on search as long as
> you put all the info in. The users of this datastore (or you) must be
> willing to write their own simple aggregation functions ("show me only the
> unique player names returned by this Solr query", "show me the average of
> field X returned by this Solr query", ...)
>
> If the number of results is too great, MR may be the way to go.
>
> On Tue, Jun 21, 2011 at 3:04 PM, Victor K. <victor.kabde...@gmail.com> wrote:
>
>> If I may ask, Sasha, what exactly are you trying to achieve using Solr (or
>> Solandra, I guess it's about the same)?
>> Because from what I understood of your problem, you need to do statistics
>> on your matches, players, etc. Or do you just want to retrieve information
>> that has already been computed?
>> If it is the first thing you are trying to achieve (data aggregation,
>> statistics, etc.), Solr won't be of much use because it is not meant to do
>> statistics. If you want to achieve the second, then Solr is just the tool
>> for you.
>>
>> On 6/21/2011 2:47 PM, Sasha Dolgy wrote:
>>
>>> Without getting overly complicated and long winded ... are there
>>> practical references / examples I can review that demonstrate the
>>> Cassandra/Solandra benefits? I had a quick look at
>>> https://github.com/tjake/Solandra/wiki/Solandra-Wiki and it wasn't
>>> dead obvious to me....
>>>
>>> On Tue, Jun 21, 2011 at 8:19 PM, Jake Luciani <jak...@gmail.com> wrote:
>>>
>>>> Solandra can answer the question you used as an example, and it's more
>>>> of a fit for low-latency ad-hoc reporting than Pig. Pig queries will
>>>> take minutes, not seconds.
>>>>
>>>> On Tue, Jun 21, 2011 at 12:12 PM, Sasha Dolgy <sdo...@gmail.com> wrote:
>>>>
>>>>> Folks,
>>>>>
>>>>> Simple question ... Assuming my current use case is the ability to log
>>>>> lots of trivial and seemingly useless sports statistics ... I want a
>>>>> user to be able to query / compare .... For example:
>>>>>
>>>>> --> Show me all baseball players in Cheektowaga and Ontario,
>>>>> California who have hit a grand slam on Tuesdays where it was just a
>>>>> leap year.
>>>>>
>>>>> Each baseball player is represented by a single row in a CF:
>>>>>
>>>>> player_uuid, fullname, hometown, game1, game2, game3, game4
>>>>>
>>>>> Games are UUIDs that reference another row in the same CF
>>>>> that provides information about that game:
>>>>>
>>>>> location, final score, date (unix timestamp or ISO format), and
>>>>> statistics, which are represented as a new column timestamp:player_uuid
>>>>>
>>>>> I can use Pig, as I understand, to run a query to generate specific
>>>>> information about specific "things" and populate that data back into
>>>>> Cassandra in another CF ... similar to the hypothetical search
>>>>> above. As the information is structured already, I assume Pig is the
>>>>> right tool for the job, but it may not be ideal for a web application
>>>>> and enabling ad-hoc queries ... it could take anywhere from 2-....?
>>>>> seconds for that query to generate, populate, and return to the
>>>>> user...?
>>>>>
>>>>> On the other hand, I have started to read about Solr / Solandra /
>>>>> Lucandra .... Can this provide similar functionality or better? Or
>>>>> is it more geared towards full-text search and indexing ...
>>>>>
>>>>> I don't want to get into the habit of guessing what my potential users
>>>>> want to search for ... trying to think of ways to offload this to
>>>>> them.
>>>>>
>>>>> --
>>>>> Sasha Dolgy
>>>>> sasha.do...@gmail.com
>>>>
>>>> --
>>>> http://twitter.com/tjake

--
http://twitter.com/tjake