Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?
[X] ASF Mirrors (linked in our release announcements or via the Lucene website)
[X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
[X] I/we build them from source via an SVN/Git checkout.
[] Other (someone in your company mirrors them internally or via a downstream project)

2011/1/18 Ahmet Arslan iori...@yahoo.com

> [] ASF Mirrors (linked in our release announcements or via the Lucene website)
> [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
> [] I/we build them from source via an SVN/Git checkout.
> [] Other (someone in your company mirrors them internally or via a downstream project)
Re: Including Small Amounts of New Data in Searches (MultiSearcher ?)
Thanks, Lance, for mentioning the MergePolicies, and specifically this one contributed by LinkedIn.

2011/1/8 Lance Norskog goks...@gmail.com

> There are always slowdowns when merging new segments during indexing. A MergePolicy decides when to merge segments. The older MergePolicies followed a strategy which is quite disruptive in an NRT environment.
>
> There is a new feature in 3.x and the trunk called 'BalancedSegmentMergePolicy'. This new MergePolicy is designed for the near-real-time use case. It was contributed by LinkedIn. You may find it works well enough for your case.
>
> Lance
>
> On Thu, Jan 6, 2011 at 10:21 AM, Stephen Boesch java...@gmail.com wrote:
>
> > Thanks Yonik. Using a stable release of Solr, what would you suggest to do, given MultiSearcher's demise and that the other work is still ongoing?
> >
> > 2011/1/6 Yonik Seeley yo...@lucidimagination.com
> >
> > > On Thu, Jan 6, 2011 at 12:37 PM, Stephen Boesch java...@gmail.com wrote:
> > >
> > > > Solr/lucene newbie here .. We would like searches against a solr/lucene index to immediately be able to view data that was added. I stress small amount of new data given that any significant amount would require excessive latency.
> > >
> > > There has been significant ongoing work in lucene-core for NRT (near real time). We need to overhaul Solr's DirectUpdateHandler2 to take advantage of all this work. Mark Miller took a first crack at it (sharing a single IndexWriter, letting lucene handle the concurrency issues, etc) but if there's a JIRA issue, I'm having trouble finding it.
> > >
> > > > Looking around, i'm wondering if the direction would be a MultiSearcher living on top of our standard directory-based IndexReader as well as a custom Searchable that handles the newest documents - and then combines the two results?
> > >
> > > If you look at trunk, MultiSearcher has already gone away.
> > >
> > > -Yonik
> > > http://www.lucidimagination.com
>
> --
> Lance Norskog
> goks...@gmail.com
Including Small Amounts of New Data in Searches (MultiSearcher ?)
Solr/Lucene newbie here. We would like searches against a Solr/Lucene index to be able to see newly added data immediately. I stress a small amount of new data, given that any significant amount would incur excessive latency.

Looking around, I'm wondering if the direction would be a MultiSearcher living on top of our standard directory-based IndexReader, plus a custom Searchable that handles the newest documents, and then combining the two results?

Is that a way to go, and are there examples of similar implementations?

Thanks!
stephenb
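To make the "combine the two results" idea concrete, here is a minimal, library-free sketch in Python. It is not Lucene or Solr API: `merge_hits`, the `(score, doc_id)` tuples, and the rule that a recently re-added document shadows its older copy are all assumptions made for illustration. It also assumes the scores from the two sources are comparable, which is generally not true for two separately built indexes, because term statistics differ between them.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

# Hypothetical hit representation: (score, document id).
Hit = Tuple[float, str]


def merge_hits(main_hits: Iterable[Hit],
               recent_hits: Iterable[Hit],
               top_n: int = 10) -> List[Hit]:
    """Merge two hit lists that are each sorted by descending score.

    Assumption: if a document id appears in the recent list, the recent
    copy is the current one, so the copy from the main index is dropped.
    """
    recent = list(recent_hits)
    recent_ids = {doc_id for _, doc_id in recent}
    main = [hit for hit in main_hits if hit[1] not in recent_ids]
    # heapq.merge expects ascending order, so merge on the negated score.
    merged = heapq.merge(main, recent, key=lambda hit: -hit[0])
    return list(islice(merged, top_n))


if __name__ == "__main__":
    main_index = [(0.9, "a"), (0.5, "b"), (0.2, "c")]
    newest_docs = [(0.7, "b"), (0.1, "z")]
    print(merge_hits(main_index, newest_docs, top_n=3))
```

The merge itself is the easy part; keeping the small "newest documents" index consistent with deletes and updates in the main index is where most of the work would be.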
Re: Including Small Amounts of New Data in Searches (MultiSearcher ?)
Thanks Yonik. Using a stable release of Solr, what would you suggest to do, given MultiSearcher's demise and that the other work is still ongoing?

2011/1/6 Yonik Seeley yo...@lucidimagination.com

> On Thu, Jan 6, 2011 at 12:37 PM, Stephen Boesch java...@gmail.com wrote:
>
> > Solr/lucene newbie here .. We would like searches against a solr/lucene index to immediately be able to view data that was added. I stress small amount of new data given that any significant amount would require excessive latency.
>
> There has been significant ongoing work in lucene-core for NRT (near real time). We need to overhaul Solr's DirectUpdateHandler2 to take advantage of all this work. Mark Miller took a first crack at it (sharing a single IndexWriter, letting lucene handle the concurrency issues, etc) but if there's a JIRA issue, I'm having trouble finding it.
>
> > Looking around, i'm wondering if the direction would be a MultiSearcher living on top of our standard directory-based IndexReader as well as a custom Searchable that handles the newest documents - and then combines the two results?
>
> If you look at trunk, MultiSearcher has already gone away.
>
> -Yonik
> http://www.lucidimagination.com
Re: Luke for inspecting indexes on remote solr servers?
I am interested in the LukeRequestHandler. In fact, having been pointed to it, I will try to find a comprehensive list of Solr request handlers for future reference.

As regards the -X option: our Linux box does not have X Windows installed, for security reasons, which is why I did not try that approach.

2011/1/4 Peter Karich peat...@yahoo.de

> On 04.01.2011 at 21:43, Ahmet Arslan wrote:
>
> > > Is that supported? Pointer(s) to how to do it?
> >
> > perhaps http://wiki.apache.org/solr/LukeRequestHandler ?
>
> or via ssh u...@host -X ;-)
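Since the LukeRequestHandler is reached over plain HTTP, no X display is needed on the server. Below is a small Python sketch that only builds the request URL. The `/admin/luke` path and the `numTerms` and `fl` parameters are described on the wiki page linked above, but whether the handler is registered at that path depends on the core's solrconfig.xml; the host, core name, and the field name `title` are placeholders.

```python
from urllib.parse import urlencode


def luke_url(core_url: str, **params) -> str:
    """Build a LukeRequestHandler URL for one Solr core.

    core_url is the core's base URL (placeholder host and core below).
    Keyword arguments become query parameters, joined with '&'.
    """
    query = urlencode(params)
    return f"{core_url}/admin/luke" + (f"?{query}" if query else "")


if __name__ == "__main__":
    # Index-wide overview without per-field top terms.
    print(luke_url("http://mySolrHost:8983/solr/core0", numTerms=0))
    # Top 5 terms for a single (hypothetical) field.
    print(luke_url("http://mySolrHost:8983/solr/core0", fl="title", numTerms=5))
```

The resulting URL can be opened in a browser or fetched with any HTTP client from a machine that can reach the Solr port.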
solr newbie: Diagnose why DataImportHandler DIH not saving documents
I am requesting a full DataImport via a URL. It seems to be partially happy with the request: with debug=on it reports that 10 documents were indexed. The backend, however, actually has 440 records available for the query. I am not sure why only 10 records were selected, or why even those 10 records are not stored.

Here is the obfuscated URL used for invoking the DataImport:

mySolrHost:8983/solr/core0/dataimport?command=full-import&debug=on

Here is the output. It looks reasonable for the 10 records it does find; notice it says *Added/Updated: 10 documents*:

0360db-data-config.xmlfull-importdebugBrad is testing thisjava.math.BigDecimal:1java.math.BigDecimal:15000947 Wood Duck Lanejava.math.BigDecimal:3java.math.BigDecimal:15002Stanford Quad Sculpturejava.math.BigDecimal:3java.math.BigDecimal:15200Apple Store - Palo Altojava.math.BigDecimal:3java.math.BigDecimal:15201Fox Theaterjava.math.BigDecimal:3java.math.BigDecimal:15220java.math.BigDecimal:3java.math.BigDecimal:15222Knowtate promojava.math.BigDecimal:4welcome to Knowtatejava.math.BigDecimal:16163The Green Dragon Tavernjava.math.BigDecimal:5java.math.BigDecimal:15020The All New Infiniti Mjava.math.BigDecimal:5Introjava.math.BigDecimal:15100The All New Infiniti Mjava.math.BigDecimal:5To hear current specialsjava.math.BigDecimal:15100idleConfiguration Re-loaded sucessfully11002010-12-31 16:45:11Indexing completed. *Added/Updated: 10 documents.* Deleted 0 documents.100:0:0.331This response format is experimental. It is likely to change in the future.

But when I go to the Admin screen, it tells me:

Documents Processed: 10
*Total Documents Processed: 0*

So what is the difference between Documents Processed and Total Documents Processed? Note that there is presently *no* data in the indexes.

mySolrHost:8983/solr/core0/admin/stats.jsp

name: /dataimport
class: org.apache.solr.handler.dataimport.DataImportHandler
version: 1.0
description: Manage data import from databases to Solr
stats:
Status : IDLE
Documents Processed : 10
Requests made to DataSource : 1
Rows Fetched : 10
Documents Deleted : 0
Documents Skipped : 0
Total Documents Processed : 0
Total Requests made to DataSource : 0
Total Rows Fetched : 0
Total Documents Deleted : 0
Total Documents Skipped : 0
handlerStart : 1293831460260
requests : 2
Re: solr newbie: Diagnose why DataImportHandler DIH not saving documents
One extra piece of info: part of the stats page got omitted. Notably, the number of errors was reported as 0.

errors : 0
timeouts : 0
totalTime : 1963
avgTimePerRequest : 981.5
avgRequestsPerSecond : 0.0011371888

2010/12/31 Stephen Boesch java...@gmail.com

> I am asking for a full DataImport via a url. It seems to be partially happy with the request - with debug=on I can see it saying that 10 documents were indexed. The backend however realizes there are actually 440 records available for the query. Not sure why only 10 records were selected and then why even those 10 records are not stored.
>
> Here is the obfuscated url used for invoking the DataImport:
>
> mySolrHost:8983/solr/core0/dataimport?command=full-import&debug=on
>
> Here is the output: looks reasonable for the 10 records it does find: notice it says *added/updated 10 documents*
>
> 0360db-data-config.xmlfull-importdebugBrad is testing thisjava.math.BigDecimal:1java.math.BigDecimal:15000947 Wood Duck Lanejava.math.BigDecimal:3java.math.BigDecimal:15002Stanford Quad Sculpturejava.math.BigDecimal:3java.math.BigDecimal:15200Apple Store - Palo Altojava.math.BigDecimal:3java.math.BigDecimal:15201Fox Theaterjava.math.BigDecimal:3java.math.BigDecimal:15220java.math.BigDecimal:3java.math.BigDecimal:15222Knowtate promojava.math.BigDecimal:4welcome to Knowtatejava.math.BigDecimal:16163The Green Dragon Tavernjava.math.BigDecimal:5java.math.BigDecimal:15020The All New Infiniti Mjava.math.BigDecimal:5Introjava.math.BigDecimal:15100The All New Infiniti Mjava.math.BigDecimal:5To hear current specialsjava.math.BigDecimal:15100idleConfiguration Re-loaded sucessfully11002010-12-31 16:45:11Indexing completed. *Added/Updated: 10 documents.* Deleted 0 documents.100:0:0.331This response format is experimental. It is likely to change in the future.
Re: solr newbie: Diagnose why DataImportHandler DIH not saving documents
Sure, I'll try that.

2010/12/31 Ahmet Arslan iori...@yahoo.com

> It seems that with debug=on there is a hard coded default rows=10.
>
> http://knowtate.servehttp.com:8983/solr/core0/dataimport?command=full-import&debug=on&echoParams=all&rows=50
>
> returns Added/Updated: 50 documents. Deleted 0 documents.
>
> It seems that debug parameter is related to /solr/core0/admin/dataimport.jsp page. Don't know exact purpose of debug parameter but, can't you just ignore it and use
>
> http://knowtate.servehttp.com:8983/solr/core0/dataimport?command=full-import
>
> --- On Sat, 1/1/11, Stephen Boesch java...@gmail.com wrote:
>
> > From: Stephen Boesch java...@gmail.com
> > Subject: Re: solr newbie: Diagnose why DataImportHandler DIH not saving documents
> > To: solr-user@lucene.apache.org
> > Date: Saturday, January 1, 2011, 3:09 AM
> >
> > one little extra piece of info: part of the stats page got omitted - notably the number of errors was reported as 0.
> >
> > errors : 0
> > timeouts : 0
> > totalTime : 1963
> > avgTimePerRequest : 981.5
> > avgRequestsPerSecond : 0.0011371888
> >
> > 2010/12/31 Stephen Boesch java...@gmail.com
> >
> > > I am asking for a full DataImport via a url. It seems to be partially happy with the request - with debug=on I can see it saying that 10 documents were indexed. The backend however realizes there are actually 440 records available for the query. Not sure why only 10 records were selected and then why even those 10 records are not stored.
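Ahmet's observation (debug=on appears to imply a default of rows=10, and an explicit rows value overrides it) can be captured in a small helper that builds the DataImport URL with the parameters properly separated by `&`. This is a sketch only: the parameter names `command`, `debug`, and `rows` are the ones used in this thread, while the function name and the placeholder host are made up for illustration.

```python
from typing import Optional
from urllib.parse import urlencode


def dataimport_url(core_url: str,
                   command: str = "full-import",
                   debug: bool = False,
                   rows: Optional[int] = None) -> str:
    """Build a DataImportHandler request URL for one Solr core.

    Per this thread, debug=on seems to limit the import to 10 rows unless
    rows is given, so debug is off by default here.
    """
    params = {"command": command}
    if debug:
        params["debug"] = "on"
    if rows is not None:
        params["rows"] = str(rows)
    return f"{core_url}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://mySolrHost:8983/solr/core0"  # placeholder host and core
    print(dataimport_url(base))                       # plain full import
    print(dataimport_url(base, debug=True, rows=50))  # debug run, 50 rows
```

Leaving `debug` out for real imports and passing `rows` explicitly for debug runs avoids depending on the implicit limit.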
Re: solr newbie: Diagnose why DataImportHandler DIH not saving documents
Yes, that fixed the problem. Interesting: I usually think of setting debug as just changing the verbosity level, but in this case it caused docs not to be processed.

02db-data-config.xmlfull-importidle144002010-12-31 17:45:03Indexing completed. Added/Updated: 440 documents. Deleted 0 documents.2010-12-31 17:45:032010-12-31 17:45:034400:0:0.258This response format is experimental. It is likely to change in the future.

Now I am seeing the full 440 docs being processed. Cool!

name: /dataimport
class: org.apache.solr.handler.dataimport.DataImportHandler
version: 1.0
description: Manage data import from databases to Solr
stats:
Status : IDLE
Documents Processed : 440
Requests made to DataSource : 1
Rows Fetched : 440
Documents Deleted : 0
Documents Skipped : 0
Total Documents Processed : 880
Total Requests made to DataSource : 2
Total Rows Fetched : 880
Total Documents Deleted : 0
Total Documents Skipped : 0
handlerStart : 1293831460260
requests : 35
errors : 0
timeouts : 0
totalTime : 3170
avgTimePerRequest : 90.57143
avgRequestsPerSecond : 0.008557899

2010/12/31 Stephen Boesch java...@gmail.com

> sure I'll try that.
>
> 2010/12/31 Ahmet Arslan iori...@yahoo.com
>
> > It seems that with debug=on there is a hard coded default rows=10.
> >
> > http://knowtate.servehttp.com:8983/solr/core0/dataimport?command=full-import&debug=on&echoParams=all&rows=50
> >
> > returns Added/Updated: 50 documents. Deleted 0 documents.
> >
> > It seems that debug parameter is related to /solr/core0/admin/dataimport.jsp page. Don't know exact purpose of debug parameter but, can't you just ignore it and use
> >
> > http://knowtate.servehttp.com:8983/solr/core0/dataimport?command=full-import
> >
> > --- On Sat, 1/1/11, Stephen Boesch java...@gmail.com wrote:
> >
> > > From: Stephen Boesch java...@gmail.com
> > > Subject: Re: solr newbie: Diagnose why DataImportHandler DIH not saving documents
> > > To: solr-user@lucene.apache.org
> > > Date: Saturday, January 1, 2011, 3:09 AM
> > >
> > > one little extra piece of info: part of the stats page got omitted - notably the number of errors was reported as 0.
> > >
> > > errors : 0
> > > timeouts : 0
> > > totalTime : 1963
> > > avgTimePerRequest : 981.5
> > > avgRequestsPerSecond : 0.0011371888
> > >
> > > 2010/12/31 Stephen Boesch java...@gmail.com
> > >
> > > > I am asking for a full DataImport via a url. It seems to be partially happy with the request - with debug=on I can see it saying that 10 documents were indexed. The backend however realizes there are actually 440 records available for the query. Not sure why only 10 records were selected and then why even those 10 records are not stored.
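For anyone who wants to catch this kind of mismatch automatically (10 documents imported when 440 were expected), one option is to pull the counts out of the status message and compare them with the expected row count. The sketch below parses only the "Added/Updated: N documents. Deleted M documents." sentence exactly as it appears in this thread; the function name is made up, and the message wording may differ in other Solr versions.

```python
import re
from typing import Optional, Tuple

# Matches the summary sentence quoted in this thread. \W* tolerates the
# stray emphasis asterisk that the mail client put before "Deleted".
_SUMMARY = re.compile(
    r"Added/Updated:\s*(\d+)\s+documents\.\W*Deleted\s+(\d+)\s+documents"
)


def parse_import_summary(message: str) -> Optional[Tuple[int, int]]:
    """Return (added_or_updated, deleted), or None if no summary is found."""
    match = _SUMMARY.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    status = "Indexing completed. Added/Updated: 440 documents. Deleted 0 documents."
    expected_rows = 440  # row count reported by the backend in this thread
    added, _deleted = parse_import_summary(status)
    if added != expected_rows:
        print(f"import incomplete: {added} of {expected_rows} documents")
    else:
        print("import complete")
```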