RE: Any realtime indexing plugin available for SOLR
If it is your app that is updating data in the DB, then you could have it update Solr at the same time.

Regards
Stefan Maric

-----Original Message-----
From: bbarani [mailto:bbar...@gmail.com]
Sent: Wednesday, May 26, 2010 10:39 AM
To: solr-user@lucene.apache.org
Subject: Any realtime indexing plugin available for SOLR

Hi,

Sorry if I am asking this question again in this forum.

Is there any plugin which I can use to do realtime indexing? I have a requirement where we have an application which sits on top of a SQL Server DB, and updates happen on a day-to-day basis. Users would like to see the changes made to the DB immediately in the search results.

I am thinking of using a JMS queue to achieve this, but before that I just want to check whether anyone has implemented a similar requirement before.

Any help / suggestions would be greatly appreciated.

Thanks,
bb

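A minimal Python sketch of the "update Solr at the same time" suggestion, assuming the application has a hook that runs after each DB write. It builds the XML <add> message that Solr's update handler accepts and posts it. The URL, the field names, and the function names (build_add_xml, post_xml, on_row_saved) are hypothetical and not taken from the thread.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical endpoint; adjust host, port, and path to your installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(doc):
    """Build a Solr XML <add> message for one document (dict of field -> value)."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(add, encoding="unicode")


def post_xml(xml_body, url=SOLR_UPDATE_URL):
    """POST an XML update message to Solr and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def on_row_saved(row):
    """Call this right after the application has committed the row to the DB."""
    post_xml(build_add_xml(row))
    post_xml("<commit/>")  # make the change visible to searches
```

Committing after every single document is the simplest way to make changes visible immediately, but it can be costly on a busy index; batching updates or relying on a periodic commit may be a better fit, depending on how "immediate" the results need to be.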
RE: How to not limit maximum number of documents?
I was just thinking along similar lines.

As far as I can tell, you can use the parameters start & rows in combination to control the retrieval of query results.

So
http://host:port/solr/select/?q=query
will retrieve results 1..10 (the default page of up to 10 results), and
http://host:port/solr/select/?q=query&start=10&rows=10
will retrieve results 11..20 (start is a zero-based offset).

So it is up to your application to control result traversal/pagination.

Question - does this mean that
http://host:port/solr/select/?q=query&start=10&rows=10
runs the query a 2nd time? And so on?

Regards
Stefan Maric

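A small Python sketch of the start/rows pagination described above. Solr's start parameter is a zero-based offset, so page 1 maps to start=0 and page 2 to start=10 when rows=10. The helper name page_url and the base URL constant are hypothetical; host:port is the same placeholder used in the message.

```python
from urllib.parse import urlencode

# Hypothetical base URL, mirroring the host:port placeholder in the message.
SOLR_SELECT_URL = "http://host:port/solr/select/"


def page_url(query, page, rows=10, base=SOLR_SELECT_URL):
    """Return the select URL for a 1-based page number.

    Solr's start parameter is a zero-based offset, so page 1 -> start=0.
    """
    if page < 1:
        raise ValueError("page must be >= 1")
    params = {"q": query, "start": (page - 1) * rows, "rows": rows}
    return base + "?" + urlencode(params)


print(page_url("query", 1))
print(page_url("query", 2))
```

Each URL is a separate request to Solr. Whether the second page is cheaper than the first depends on how Solr's result caches are configured, so it is safest to treat each page as its own query.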
RE: How to not limit maximum number of documents?
Egon

If you first run your query with q=query&rows=0, then you get back an indication of the total number of docs:

<result name="response" numFound="53" start="0"/>

Now your app can query again to get the 1st n rows & manage forward|backward traversal of results by subsequent queries.

Regards
Stefan Maric

-----Original Message-----
From: ego...@gmx.de [mailto:ego...@gmx.de]
Sent: 10 February 2010 14:08
To: solr-user@lucene.apache.org
Subject: Re: How to not limit maximum number of documents?

Hi Stefan,

you are right. I noticed this page-based result handling too. For web pages it is handy to maintain a number-of-results-per-page parameter together with an offset to browse result pages. Both can be done by Solr's 'start' and 'rows' parameters. But as I don't use Solr in a web context, it's important for me to get all results in one go.

While waiting for answers I was working on a work-around and came across the LukeRequestHandler (http://wiki.apache.org/solr/LukeRequestHandler). It allows you to query the index and obtain meta information about it. I found a parameter in the response called 'numDocs' which seems to contain the current number of index rows. So I was now thinking about first asking for the number of index rows via the LukeRequestHandler and then setting the 'rows' parameter to this value. Apparently, this is quite expensive, as one front-end query always leads to two back-end queries. So I'm still searching for a better way to do this!

Cheers,
Egon

-------- Original Message --------
Date: Wed, 10 Feb 2010 13:19:05 +
From: stefan.ma...@bt.com
To: solr-user@lucene.apache.org
Subject: RE: How to not limit maximum number of documents?
I was just thinking along similar lines.

As far as I can tell, you can use the parameters start & rows in combination to control the retrieval of query results.

So
http://host:port/solr/select/?q=query
will retrieve results 1..10 (the default page of up to 10 results), and
http://host:port/solr/select/?q=query&start=10&rows=10
will retrieve results 11..20 (start is a zero-based offset).

So it is up to your application to control result traversal/pagination.

Question - does this mean that
http://host:port/solr/select/?q=query&start=10&rows=10
runs the query a 2nd time? And so on?

Regards
Stefan Maric

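The rows=0 step suggested in the reply above can be sketched in Python: issue the query with rows=0, then read numFound from the XML response. The sketch only shows the parsing half. The sample body is a trimmed-down assumption of what Solr returns, and num_found is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def num_found(response_xml):
    """Extract numFound from a Solr XML response body."""
    root = ET.fromstring(response_xml)
    # The result element is a direct child of the top-level <response>.
    result = root.find("result[@name='response']")
    if result is None:
        raise ValueError('no <result name="response"> element found')
    return int(result.get("numFound"))


# Assumed, trimmed-down example of what a q=query&rows=0 request returns.
sample = (
    "<response>"
    '<result name="response" numFound="53" start="0"/>'
    "</response>"
)
print(num_found(sample))
```

This reads the count from the same search request that would otherwise return documents, so no separate call to the LukeRequestHandler is needed. Note that numFound is the number of matches for the query, whereas numDocs describes the whole index.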
RE: How to not limit maximum number of documents?
Yes, I tried q=query&rows=-1 the other day and gave up.

But as you say, it wouldn't help, because you might get
a) timeouts, because you have to wait a 'long' time for the large set of results to be returned
b) exceptions being thrown, because you're retrieving too much info to be thrown around the system

Regards
Stefan Maric

-----Original Message-----
From: ego...@gmx.de [mailto:ego...@gmx.de]
Sent: 10 February 2010 15:06
To: solr-user@lucene.apache.org
Subject: Re: How to not limit maximum number of documents?

Setting the 'rows' parameter to a number larger than the number of documents available requires that you know how many are available. That's what I intended to retrieve via the LukeRequestHandler.

Anyway, nice approach Stefan. I'm afraid I forgot this 'numFound' aspect. :) But still, it feels like a hack. Originally I was searching more for something like:

q=query&rows=-1

which leaves the API to do the job (efficiently!). :)

The question is: Does Solr support something like this? Or should we write a feature request?

Cheers,
Egon

-------- Original Message --------
Date: Wed, 10 Feb 2010 14:38:51 + (GMT)
From: Ron Chan <rc...@i-tao.com>
To: solr-user@lucene.apache.org
Subject: Re: How to not limit maximum number of documents?

Just set rows to a very large number, larger than the number of documents available.

It is useful to set the fl parameter to the fields required, to avoid memory problems if each document contains a lot of information.

----- Original Message -----
From: stefan maric <stefan.ma...@bt.com>
To: solr-user@lucene.apache.org
Sent: Wednesday, 10 February, 2010 2:14:05 PM
Subject: RE: How to not limit maximum number of documents?

Egon

If you first run your query with q=query&rows=0, then you get back an indication of the total number of docs:

<result name="response" numFound="53" start="0"/>

Now your app can query again to get the 1st n rows & manage forward|backward traversal of results by subsequent queries.

Regards
Stefan Maric

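Given the timeout and memory concerns raised above, one option is to read numFound once and then walk the results in fixed-size chunks instead of asking for everything in a single request. A minimal Python sketch of the window arithmetic follows; the function name windows and the chunk size of 1000 are hypothetical choices, not something Solr prescribes.

```python
def windows(total, chunk=1000):
    """Yield (start, rows) pairs that together cover `total` results."""
    if chunk < 1:
        raise ValueError("chunk must be >= 1")
    for start in range(0, total, chunk):
        # The last window may be shorter than the chunk size.
        yield start, min(chunk, total - start)


# Example: 53 matches fetched 20 at a time.
for start, rows in windows(53, chunk=20):
    print(f"q=query&start={start}&rows={rows}")
```

Each (start, rows) pair becomes one request, so no single response has to carry the whole result set. If the index changes between requests, the windows may skip or repeat documents, so this suits indexes that are stable while the traversal runs.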
RE: Indexing / querying multiple data types
Sven

In my data-config.xml I have the following:

<document>
  <entity name="name1" query="select id, atomID, name, description from v_1" />
  <entity name="name2" query="select id, atomID, name, description from V_2" />
</document>

In my schema.xml I have:

<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="name" type="text" indexed="true" stored="true"/>
<field name="atomId" type="string" indexed="false" stored="true" required="true" />
<field name="description" type="text" indexed="true" stored="true" />

And in my solrconfig.xml I have:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

<requestHandler name="name1" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">name^1.5 description^1.0</str>
  </lst>
</requestHandler>

<requestHandler name="contacts" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">name^1.5 description^1.0</str>
  </lst>
</requestHandler>

And the <requestHandler name="dismax" class="solr.SearchHandler"> has been untouched.

So when I run
http://localhost:7001/solr/select/?q=food&qt=name1
I was expecting to get results from the data that had been indexed by <entity name="name1">.

Regards
Stefan Maric

How to configure multiple data import types
I have got a dataimport request handler configured to index data by selecting data from a DB view.

I now need to index additional data sets from other views so that I can support other search queries.

I defined additional <entity ..> definitions within the <document ..> section of my data-config.xml, but I only seem to pull in data for the 1st <entity ..> and not both.

Is there an xsd (or dtd) for data-config.xml, schema.xml & solrconfig.xml? These might help with understanding how to construct usable conf files.

Regards
Stefan Maric
BT Innovate & Design | Collaboration Platform - Customer Innovation Solutions

RE: How to configure multiple data import types
No, my views have already taken care of pulling the related data together.

I've indexed my first data set and now want to configure a second (non-related) data set, so that a user can issue a query for data set #1 whilst another user might be querying for data set #2.

Should I be defining multiple <document ..> or <entity ..> entries? Or what??

Thanks
Stefan Maric

Indexing / querying multiple data types
OK - so I've now got my data-config.xml sorted so that I'm pulling in the expected number of indexed documents for my two data sets.

So I've defined two entities (name1 & name2) and they both make use of the same fields -- I'm not sure if this is a good thing to have done.

When I run a query I include qt=name1 (or qt=name2) and am expecting to only get the number of results from the appropriate data set -- in fact I'm getting the sum total from both.

Does the <entity name="name1"> equate to the query qt=name1?

In my solrconfig.xml I have defined two requestHandlers (name1 & name2) using the common set of fields.

So how do I ensure that my query
http://localhost:7001/solr/select/?q=food&qt=name1
or
http://localhost:7001/solr/select/?q=food&qt=name2
will operate on the correct data set as loaded via the data import -- <entity name="name1"> or <entity name="name2">?

Thanks
Stefan Maric
BT Innovate & Design | Collaboration Platform - Customer Innovation Solutions
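As far as I know, qt only selects which request handler processes the query, and both handlers search the same index, which would explain seeing the combined total. One common way to separate the data sets is a discriminator field plus a filter query (fq). The Python sketch below assumes a hypothetical field named datatype has been added to schema.xml and populated by each entity's SELECT (for example, select 'name1' as datatype, ...). Neither that field nor the helper name typed_query_url comes from the thread.

```python
from urllib.parse import urlencode

# Host and port taken from the message above.
BASE = "http://localhost:7001/solr/select/"


def typed_query_url(query, data_set, handler=None, base=BASE):
    """Build a select URL restricted to one data set.

    Assumes a hypothetical indexed field `datatype` that holds the name of
    the entity each document was imported from.
    """
    params = [("q", query), ("fq", f"datatype:{data_set}")]
    if handler:
        # qt still picks the request handler (e.g. its dismax defaults).
        params.append(("qt", handler))
    return base + "?" + urlencode(params)


print(typed_query_url("food", "name1", handler="name1"))
```

The fq value could also be placed in each handler's configuration in solrconfig.xml so that callers only need to pass qt, but that is a design choice to check against the Solr version in use.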