If at all possible, denormalize the data. Any time you find yourself trying to make Solr behave like a database, the probability is high that you're misusing Solr or the DB.
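As a rough sketch of what denormalizing might look like for tab-delimited table dumps: the table names, column names, and sample rows below are made up for illustration, and the code only builds flat documents in memory; it does not talk to Solr. The idea is one document per parent row, with the related child rows folded into a list that could map to a multi-valued field in schema.xml.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for two tab-delimited table dumps.
PROPERTIES_TSV = (
    "property_id\taddress\tlat\tlon\n"
    "1\t12 Oak St\t33.7\t-117.8\n"
    "2\t9 Elm Ave\t34.1\t-118.2\n"
)
OWNERS_TSV = (
    "property_id\towner_name\n"
    "1\tA. Smith\n"
    "1\tB. Jones\n"
    "2\tC. Lee\n"
)


def read_tsv(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def denormalize(parents, children, key, child_field, target_field):
    """Fold child rows into their parent row as a list-valued field."""
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[child_field])

    docs = []
    for parent in parents:
        doc = dict(parent)  # copy so the input rows stay untouched
        doc[target_field] = grouped.get(parent[key], [])
        docs.append(doc)
    return docs


# One flat document per property, with its owners collected into a list.
docs = denormalize(
    read_tsv(PROPERTIES_TSV),
    read_tsv(OWNERS_TSV),
    key="property_id",
    child_field="owner_name",
    target_field="owner_names",
)
```

Each entry in `docs` is then a single flat record, which is the shape Solr is happiest with; how it gets loaded into the index is a separate choice.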
Best
Erick

On Wed, Sep 29, 2010 at 12:40 PM, Sharma, Raghvendra <sraghven...@corelogic.com> wrote:

> Some questions.
>
> 1. I have about 3-5 tables. Now designing schema.xml for a single table
> looks ok, but whats the direction for handling multiple table structures is
> something I am not sure about. Would it be like a big huge xml, wherein
> those three tables (assuming its three) would show up as three different
> tag-trees, nullable.
>
> My source provides me a single flat file per table (tab delimited).
>
> Do you think having multiple indexes could be a solution for this case ??
> or do I really need to spend effort in denormalizing the data ?
>
> 2. Further, loading into solr can use some perf tuning.. any tips ? best
> practices ?
>
> 3. Also, is there a way to specify a xslt at the server side, and make it
> default, i.e. whenever a response is returned, that xslt is applied to the
> response automatically...
>
> 4. And last question for the day - :) there was one post saying that the
> spatial support is really basic in solr and is going to be improved in next
> versions... Can you ppl help me get a definitive yes or no on spatial
> support... in the current form, does it work on not ? I would store lat and
> long, and would need to make them searchable...
>
> --raghav..
>
> -----Original Message-----
> From: Sharma, Raghvendra [mailto:sraghven...@corelogic.com]
> Sent: Tuesday, September 28, 2010 11:45 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Is Solr right for my business situation ?
>
> Thanks for the responses people.
>
> @Grant
>
> 1. can you show me some direction on that.. loading data from an incoming
> stream.. do I need some third party tools, or need to build something
> myself...
>
> 4. I am basically attempting to build a very fast search interface for the
> existing data. The volume I mentioned is more like static one (data is
> already there). The sql statements I mentioned are daily updates coming.
> The good thing is that the history is not there, so the overall volume is not
> growing, but I need to apply the update statements.
>
> One workaround I had in mind is, (though not so great performance) is to
> apply the updates to a copy of rdbms, and then feed the rdbms extract to
> solr. Sounds like overkill, but I don't have another idea right now.
> Perhaps business discussions would yield something.
>
> @All -
>
> Some more questions guys.
>
> 1. I have about 3-5 tables. Now designing schema.xml for a single table
> looks ok, but whats the direction for handling multiple table structures is
> something I am not sure about. Would it be like a big huge xml, wherein
> those three tables (assuming its three) would show up as three different
> tag-trees, nullable.
>
> My source provides me a single flat file per table (tab delimited).
>
> 2. Further, loading into solr can use some perf tuning.. any tips ? best
> practices ?
>
> 3. Also, is there a way to specify a xslt at the server side, and make it
> default, i.e. whenever a response is returned, that xslt is applied to the
> response automatically...
>
> 4. And last question for the day - :) there was one post saying that the
> spatial support is really basic in solr and is going to be improved in next
> versions... Can you ppl help me get a definitive yes or no on spatial
> support... in the current form, does it work on not ? I would store lat and
> long, and would need to make them searchable...
>
> Looks like I m close to my solution.. :)
>
> --raghav
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:gsing...@apache.org]
> Sent: Tuesday, September 28, 2010 1:05 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Is Solr right for my business situation ?
>
> Inline.
>
> On Sep 27, 2010, at 1:26 PM, Walter Underwood wrote:
>
> > When do you need to deploy?
> >
> > As I understand it, the spatial search in Solr is being rewritten and is slated for Solr 4.0, the release after next.
>
> It will be in 3.x, the next release
>
> >
> > The existing spatial search has some serious problems and is deprecated.
> >
> > Right now, I think the only way to get spatial search in Solr is to deploy a nightly snapshot from the active development on trunk. If you are deploying a year from now, that might change.
> >
> > There is not any support for SQL-like statements or for joins. The best practice for Solr is to think of your data as a single table, essentially creating a view from your database. The rows become Solr documents, the columns become Solr fields.
>
> There is now group-by capabilities in trunk as well, which may or may not
> help.
>
> >
> > wunder
> >
> > On Sep 27, 2010, at 9:34 AM, Sharma, Raghvendra wrote:
> >
> >> I am sure these kind of questions keep coming to you guys, but I want to raise the same question in a different context...my own business situation.
> >> I am very very new to solr and though I have tried to read through the documentation, I have nowhere near completing the whole read.
> >>
> >> The need is like this -
> >>
> >> We have a huge rdbms database/table. A single table perhaps houses 100+ million rows. Though oracle is doing a fine job of handling the insertion and updation of data, the querying is where our main concerns lie. Since we have spatial data, the index building takes hours and hours for such tables.
> >>
> >> That's when we thought of moving away from standard rdbms and thought of trying something different and fast.
> >> My last week has been spent in a journey reading through bigtable to hadoop to hbase, to hive and then finally landed on solr. As far as I am in my tests, it looks pretty good, but I have a few unanswered questions still. Trying this group for them :) (I am sure I can find some answers if I read/google more on the topic, but now I m being lazy and feel asking the people who are already using it/or perhaps developing it is a better bet).
> >>
> >> 1.
> >> Can I get my solr instance to load data (fresh data for indexing) from a stream (imagine a mq kind of queue, or similar) ?
>
> Yes, with a little bit of work.
>
> >> 2. Can I host my solr instance to use hbase as the database/file system (read HDFS) ?
>
> Probably, but I doubt it will be fast. Local disk is usually the best.
> 100+ M rows is large but not unreasonable.
>
> >> 3. are there somewhere any reports available (as in benchmarks ) for a solr instance's performance ?
>
> You can probably search the web for these. I've personally seen several
> installs w/ 1B+ docs and subsecond search and faceting and heard of others.
> You might look at the stuff the Hathi trust has put up.
>
> >> 4. are there any APIs available which might help me apply ANSI sql kind of statements to my solr data ?
>
> No. Question back? What kinds of things are you trying to do?
>
> >>
> >> It would be great if people could help share their experience in the area... if it's too much trouble writing all of it, perhaps url would be easier... I welcome all kinds of help here... any advice/suggestions are good ...
> >>
> >> Looking forward to your viewpoints..
> >>
> >> --raghav..
> >>
> >> ******************************************************************************************
> >> This message may contain confidential or proprietary information intended only for the use of the
> >> addressee(s) named above or may contain information that is legally privileged. If you are
> >> not the intended addressee, or the person responsible for delivering it to the intended addressee,
> >> you are hereby notified that reading, disseminating, distributing or copying this message is strictly
> >> prohibited. If you have received this message by mistake, please immediately notify us by
> >> replying to the message and delete the original message and any copies immediately thereafter.
> >>
> >> Thank you.
> >> ******************************************************************************************
> >> CLLD
> >>
> >
>
> --------------------------
> Grant Ingersoll
> http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8