[jira] Commented: (SOLR-1383) Replication causes master to fail to delete old index files
[ https://issues.apache.org/jira/browse/SOLR-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748682#action_12748682 ]

Lance Norskog commented on SOLR-1383:

I checked again: the files do not go away. Not after another commit, not after restarting Solr. The replication commit reservation code definitely has a bug.

Replication causes master to fail to delete old index files

Key: SOLR-1383
URL: https://issues.apache.org/jira/browse/SOLR-1383
Project: Solr
Issue Type: Bug
Components: replication (java)
Environment: Linux CentOS - latest Solr 1.4 trunk - Java 1.6
Reporter: Lance Norskog
Fix For: 1.4

I have developed a way to make replication leave old index files in the master's data/index directory. It is timing-dependent: the same sequence of commands runs correctly or fails depending on the timing between the commands.

Here is the test scenario:

Start a master and slave version of the Solr distributed example. I used 8080 for the slave. (See example/etc/jetty.xml.)
Be sure to start with empty solr/data/index files on both master and slave.
Open the replication administration jsp on the slave ( http://localhost:8080/solr/admin/replication/index.jsp ).
Disable polling.
In a text window, go to the example/exampledocs directory and run this script:

{code}
# Index each example file on the master, then ask the slave to fetch the index.
for x in *.xml
do
  echo "$x"
  sh post.sh "$x"
  sleep 15   # pause between indexing and replication
  curl 'http://localhost:8080/solr/replication?command=fetchindex'
done
{code}

This prints the name of each example file, indexes it, and issues a replication command. At the end of this exercise, the master and slave solr/data/index files will be identical.

Now kill the master and slave, remove the solr/data/index directories, and start over. This time, remove the sleep command from the script. In my environment, old Lucene index files were left in the master's data/index.

Here is what is left in the master data/index. The segments_? files vary across runs, but the index files left over are consistent. Note (courtesy of the Linux 'ls -l /proc/PID/fd' command) that the old files are not kept open by the master Solr; they are merely left behind.

In the master server:

{code}
% ls solr/data/index
_0.fdt _1.prx _2.tvx _4.nrm _5.tii _7.frq _8.tvd _a.tvx _c.nrm
_0.fdx _1.tii _3.fdt _4.prx _5.tis _7.nrm _8.tvf _b.fdt _c.prx
_0.fnm _1.tis _3.fdx _4.tii _6.fdt _7.prx _8.tvx _b.fdx _c.tii
_0.frq _2.fdt _3.fnm _4.tis _6.fdx _7.tii _a.fdt _b.fnm _c.tis
_0.nrm _2.fdx _3.frq _4.tvd _6.fnm _7.tis _a.fdx _b.frq segments.gen
_0.prx _2.fnm _3.nrm _4.tvf _6.frq _8.fdt _a.fnm _b.nrm segments_8
_0.tii _2.frq _3.prx _4.tvx _6.nrm _8.fdx _a.frq _b.prx segments_9
_0.tis _2.nrm _3.tii _5.fdt _6.prx _8.fnm _a.nrm _b.tii segments_a
_1.fdt _2.prx _3.tis _5.fdx _6.tii _8.frq _a.prx _b.tis segments_b
_1.fdx _2.tii _4.fdt _5.fnm _6.tis _8.nrm _a.tii _c.fdt segments_c
_1.fnm _2.tis _4.fdx _5.frq _7.fdt _8.prx _a.tis _c.fdx segments_d
_1.frq _2.tvd _4.fnm _5.nrm _7.fdx _8.tii _a.tvd _c.fnm
_1.nrm _2.tvf _4.frq _5.prx _7.fnm _8.tis _a.tvf _c.frq
{code}

{code}
% ls -l /proc/PID/fd
lr-x-- 1 root root 64 Aug 25 22:52 137 - /index/master/solr/data/index/_a.tis
lr-x-- 1 root root 64 Aug 25 22:52 138 - /index/master/solr/data/index/_a.frq
lr-x-- 1 root root 64 Aug 25 22:52 139 - /index/master/solr/data/index/_a.prx
lr-x-- 1 root root 64 Aug 25 22:52 140 - /index/master/solr/data/index/_a.fdt
lr-x-- 1 root root 64 Aug 25 22:52 141 - /index/master/solr/data/index/_a.fdx
lr-x-- 1 root root 64 Aug 25 22:52 142 - /index/master/solr/data/index/_a.tvx
lr-x-- 1 root root 64 Aug 25 22:52 143 - /index/master/solr/data/index/_a.tvd
lr-x-- 1 root root 64 Aug 25 22:52 144 - /index/master/solr/data/index/_a.tvf
lr-x-- 1 root root 64 Aug 25 22:52 145 - /index/master/solr/data/index/_a.nrm
lr-x-- 1 root root 64 Aug 25 22:52 72 - /index/master/solr/data/index/_b.tis
lr-x-- 1 root root 64 Aug 25 22:52 73 - /index/master/solr/data/index/_b.frq
lr-x-- 1 root root 64 Aug 25 22:52 74 - /index/master/solr/data/index/_b.prx
lr-x-- 1 root root 64 Aug 25 22:52 76 - /index/master/solr/data/index/_b.fdt
lr-x-- 1 root root 64 Aug 25 22:52 78 - /index/master/solr/data/index/_b.fdx
lr-x-- 1 root root 64 Aug 25 22:52 79 - /index/master/solr/data/index/_b.nrm
lr-x-- 1 root root 64 Aug 25 22:52 80 - /index/master/solr/data/index/_c.tis
lr-x-- 1 root root 64 Aug 25 22:52 81 - /index/master/solr/data/index/_c.frq
lr-x-- 1 root root 64 Aug 25 22:52 82 - /index/master/solr/data/index/_c.prx
lr-x-- 1 root root 64 Aug 25 22:52 83 -
[jira] Commented: (SOLR-1383) Replication causes master to fail to delete old index files
[ https://issues.apache.org/jira/browse/SOLR-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748688#action_12748688 ]

Noble Paul commented on SOLR-1383:

bq. the files do not go away. Not after another commit, not after restarting solr.

All the old files are necessary for the index to work; the latest commit is not the only one in use. Run an optimize and the old files will go away.
Solr nightly build failure
init-forrest-entities:
    [mkdir] Created dir: /tmp/apache-solr-nightly/build
    [mkdir] Created dir: /tmp/apache-solr-nightly/build/web

compile-solrj:
    [mkdir] Created dir: /tmp/apache-solr-nightly/build/solrj
    [javac] Compiling 84 source files to /tmp/apache-solr-nightly/build/solrj
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

compile:
    [mkdir] Created dir: /tmp/apache-solr-nightly/build/solr
    [javac] Compiling 373 source files to /tmp/apache-solr-nightly/build/solr
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

compileTests:
    [mkdir] Created dir: /tmp/apache-solr-nightly/build/tests
    [javac] Compiling 167 source files to /tmp/apache-solr-nightly/build/tests
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
junit:
    [mkdir] Created dir: /tmp/apache-solr-nightly/build/test-results
    [junit] Running org.apache.solr.BasicFunctionalityTest
    [junit] Tests run: 20, Failures: 0, Errors: 0, Time elapsed: 27.792 sec
    [junit] Running org.apache.solr.ConvertedLegacyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 16.196 sec
    [junit] Running org.apache.solr.DisMaxRequestHandlerTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 11.192 sec
    [junit] Running org.apache.solr.EchoParamsTest
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 4.433 sec
    [junit] Running org.apache.solr.MinimalSchemaTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 6.59 sec
    [junit] Running org.apache.solr.OutputWriterTest
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 4.107 sec
    [junit] Running org.apache.solr.SampleTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 6.926 sec
    [junit] Running org.apache.solr.SolrInfoMBeanTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.276 sec
    [junit] Running org.apache.solr.TestDistributedSearch
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 61.66 sec
    [junit] Running org.apache.solr.TestSolrCoreProperties
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.895 sec
    [junit] Running org.apache.solr.TestTrie
    [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 13.585 sec
    [junit] Running org.apache.solr.analysis.DoubleMetaphoneFilterFactoryTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.959 sec
    [junit] Running org.apache.solr.analysis.DoubleMetaphoneFilterTest
    [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.817 sec
    [junit] Running org.apache.solr.analysis.EnglishPorterFilterFactoryTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.433 sec
    [junit] Running org.apache.solr.analysis.HTMLStripCharFilterTest
    [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 1.143 sec
    [junit] Running org.apache.solr.analysis.LengthFilterTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.824 sec
    [junit] Running org.apache.solr.analysis.SnowballPorterFilterFactoryTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.999 sec
    [junit] Running org.apache.solr.analysis.TestBufferedTokenStream
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.122 sec
    [junit] Running org.apache.solr.analysis.TestCapitalizationFilter
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.52 sec
    [junit] Running org.apache.solr.analysis.TestDelimitedPayloadTokenFilterFactory
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 7.575 sec
    [junit] Running org.apache.solr.analysis.TestHyphenatedWordsFilter
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.658 sec
    [junit] Running org.apache.solr.analysis.TestKeepFilterFactory
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 3.589 sec
    [junit] Running org.apache.solr.analysis.TestKeepWordFilter
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.876 sec
    [junit] Running org.apache.solr.analysis.TestMappingCharFilterFactory
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.655 sec
    [junit] Running org.apache.solr.analysis.TestPatternReplaceFilter
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 3.489 sec
    [junit] Running
Build failed in Hudson: Solr-trunk #907
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/907/changes

Changes:

[noble] SOLR-1391 The XPath field in the XPathEntityResolver should use the resolver to replace possible tokens
[gsingers] Add a get started section to the front page
[yonik] AutoCommitTest: no more guessing about when a commit has finished

--
[...truncated 2227 lines...]
    [junit] Running org.apache.solr.analysis.TestPatternTokenizerFactory
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.848 sec
    [junit] Running org.apache.solr.analysis.TestPhoneticFilter
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.388 sec
    [junit] Running org.apache.solr.analysis.TestRemoveDuplicatesTokenFilter
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 3.182 sec
    [junit] Running org.apache.solr.analysis.TestStopFilterFactory
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.254 sec
    [junit] Running org.apache.solr.analysis.TestSynonymFilter
    [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 9.361 sec
    [junit] Running org.apache.solr.analysis.TestSynonymMap
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 4.836 sec
    [junit] Running org.apache.solr.analysis.TestTrimFilter
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 2.489 sec
    [junit] Running org.apache.solr.analysis.TestWordDelimiterFilter
    [junit] Tests run: 14, Failures: 0, Errors: 0, Time elapsed: 44.253 sec
    [junit] Running org.apache.solr.client.solrj.SolrExceptionTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.955 sec
    [junit] Running org.apache.solr.client.solrj.SolrQueryTest
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.517 sec
    [junit] Running org.apache.solr.client.solrj.TestBatchUpdate
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 24.938 sec
    [junit] Running org.apache.solr.client.solrj.TestLBHttpSolrServer
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 19.087 sec
    [junit] Running org.apache.solr.client.solrj.beans.TestDocumentObjectBinder
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.086 sec
    [junit] Running org.apache.solr.client.solrj.embedded.JettyWebappTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.272 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeBinaryJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 9.759 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 9.43 sec
    [junit] Running org.apache.solr.client.solrj.embedded.LargeVolumeJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 8.996 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 8.491 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MultiCoreEmbeddedTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 7.907 sec
    [junit] Running org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 12.219 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleEmbeddedTest
    [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 17.75 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleJettyTest
    [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 34.326 sec
    [junit] Running org.apache.solr.client.solrj.embedded.SolrExampleStreamingTest
    [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 33.91 sec
    [junit] Running org.apache.solr.client.solrj.embedded.TestSolrProperties
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.249 sec
    [junit] Running org.apache.solr.client.solrj.request.TestUpdateRequestCodec
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.526 sec
    [junit] Running org.apache.solr.client.solrj.response.AnlysisResponseBaseTest
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.512 sec
    [junit] Running org.apache.solr.client.solrj.response.DocumentAnalysisResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.688 sec
    [junit] Running org.apache.solr.client.solrj.response.FieldAnalysisResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.86 sec
    [junit] Running org.apache.solr.client.solrj.response.QueryResponseTest
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.292 sec
    [junit] Running org.apache.solr.client.solrj.response.TestSpellCheckResponse
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 14.777 sec
    [junit] Running org.apache.solr.client.solrj.util.ClientUtilsTest
    [junit] Tests run: 1, Failures: 0,
[jira] Created: (SOLR-1392) NPE on replication page on slave
NPE on replication page on slave

Key: SOLR-1392
URL: https://issues.apache.org/jira/browse/SOLR-1392
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 1.4
Reporter: Reuben Firmin

On our slave's replication page, I periodically see this exception.

java.lang.NullPointerException
    at _jsp._admin._replication._index__jsp._jspService(_index__jsp.java:265)
    at com.caucho.jsp.JavaPage.service(JavaPage.java:61)
    at com.caucho.jsp.Page.pageservice(Page.java:578)
    at com.caucho.server.dispatch.PageFilterChain.doFilter(PageFilterChain.java:192)
    at com.caucho.server.webapp.DispatchFilterChain.doFilter(DispatchFilterChain.java:97)
    at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241)
    at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:280)
    at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:108)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:264)
    at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76)
    at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158)
    at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178)
    at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241)
    at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:435)
    at com.caucho.server.port.TcpConnection.run(TcpConnection.java:586)
    at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:690)
    at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:612)
    at java.lang.Thread.run(Thread.java:619)

Date: Fri, 28 Aug 2009 13:53:59 GMT
Server: Apache/2.2.3 (Red Hat)
Content-Type: text/html; charset=utf-8
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Content-Length: 524
Connection: close

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Mucking with DocSlices
I'd like to be able to change the DocList in a SearchComponent, for instance to shorten or lengthen it. I can get a shorter one via the subset() method, but the new subset still reflects the number of matches, etc. of the parent, which seems a little odd to me. If I say String.substring().length(), I wouldn't expect the length returned to be the same as the parent's (unless, of course, the substring requested is the identity one), so I'm not sure why DocSlice.subset does. Likewise for the maxScore, etc.

Is there a reason why, if I know I have a DocSlice, I can't cast the docList to it and make some of these lower-level changes to the member variables? It would be a lot more efficient than having to copy over all the docs, etc. to a new DocSlice.

Thanks,
Grant
Re: [jira] Created: (SOLR-1392) NPE on replication page on slave
By any chance can you share that file _index__jsp.java?

--
Noble Paul | Principal Engineer | AOL | http://aol.com
[jira] Assigned: (SOLR-1392) NPE on replication page on slave
[ https://issues.apache.org/jira/browse/SOLR-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul reassigned SOLR-1392:

Assignee: Noble Paul
[jira] Created: (SOLR-1393) Allow more control over SearchComponents ordering in SearchHandler
Allow more control over SearchComponents ordering in SearchHandler

Key: SOLR-1393
URL: https://issues.apache.org/jira/browse/SOLR-1393
Project: Solr
Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
Fix For: 1.5

It would be useful to be able to add the notion of before/after when declaring search components. Currently, you can either explicitly declare all components or insert at the beginning or end. It would be nice to be able to say "this new component comes after the Query component" without having to declare all the components.
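The before/after idea in this issue can be illustrated with a small Java sketch. Everything here is hypothetical: `ComponentOrder`, `insertAfter` and `insertBefore` are names invented for illustration and are not part of Solr, and the component names in `main` are only sample values. The sketch shows one possible way to place a component relative to a named anchor without restating the whole chain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Solr: positions a component relative to a named one. */
final class ComponentOrder {
    private ComponentOrder() {}

    /** Returns a copy of components with added placed immediately after anchor. */
    static List<String> insertAfter(List<String> components, String anchor, String added) {
        return insertAt(components, anchor, added, 1);
    }

    /** Returns a copy of components with added placed immediately before anchor. */
    static List<String> insertBefore(List<String> components, String anchor, String added) {
        return insertAt(components, anchor, added, 0);
    }

    private static List<String> insertAt(List<String> components, String anchor,
                                         String added, int shift) {
        int i = components.indexOf(anchor);
        if (i < 0) {
            throw new IllegalArgumentException("No such component: " + anchor);
        }
        List<String> result = new ArrayList<>(components); // leave the caller's list untouched
        result.add(i + shift, added);
        return result;
    }

    public static void main(String[] args) {
        // Sample chain; the names are illustrative only.
        List<String> chain = Arrays.asList("query", "facet", "highlight", "debug");
        System.out.println(insertAfter(chain, "query", "myComponent"));
    }
}
```

Failing loudly on an unknown anchor is a design choice made for the sketch; a real implementation might prefer to fall back to appending at the end.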
Re: Mucking with DocSlices
On Fri, Aug 28, 2009 at 10:26 AM, Grant Ingersoll <gsing...@apache.org> wrote:
> If I say String.substring().length(), I wouldn't expect the length returned to be the same as the parent (unless of course the substring requested is the identity one), so I'm not sure why DocSlice.subset does.

.size() should reflect the new size.
.matches() always reflects the total number of matches that this DocList is a window into.

> Likewise for the maxScore, etc. Is there a reason why, if I know I have a DocSlice, I can't cast the docList to it and make some of these lower level changes to the member variables? It would be a lot more efficient than having to copy over all the docs, etc. to a new DocSlice.

Just make a new DocSlice - one shouldn't be modifying these since they can be cached.

-Yonik
http://www.lucidimagination.com
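The difference between size() and matches() described in this reply can be shown with a minimal Java sketch. `ResultWindow` is a hypothetical stand-in written for illustration; it is not Solr's DocSlice and makes no claim about how DocSlice is implemented. It assumes a window is defined by an offset and a length over a backing array that is never modified.

```java
/** Hypothetical stand-in for a windowed result list; not Solr's DocSlice. */
final class ResultWindow {
    private final int[] docs;   // backing array, shared between windows and never modified
    private final int offset;   // index of this window's first doc in the backing array
    private final int len;      // number of docs visible through this window
    private final int matches;  // total hits for the query, independent of the window

    ResultWindow(int[] docs, int offset, int len, int matches) {
        if (offset < 0 || len < 0 || offset + len > docs.length) {
            throw new IllegalArgumentException("window out of range");
        }
        this.docs = docs;
        this.offset = offset;
        this.len = len;
        this.matches = matches;
    }

    int size() { return len; }

    int matches() { return matches; }

    int docAt(int i) {
        if (i < 0 || i >= len) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        return docs[offset + i];
    }

    /** Narrower window over the same array: size changes, matches does not. */
    ResultWindow subset(int start, int length) {
        int s = Math.max(0, Math.min(start, len));
        int n = Math.max(0, Math.min(length, len - s));
        return new ResultWindow(docs, offset + s, n, matches);
    }

    public static void main(String[] args) {
        ResultWindow page = new ResultWindow(new int[] {4, 8, 15, 16, 23}, 0, 5, 1000);
        ResultWindow firstTwo = page.subset(0, 2);
        System.out.println(firstTwo.size() + " docs in window, " + firstTwo.matches() + " total matches");
    }
}
```

Because each window is immutable, a cached instance can be handed to several callers safely, which is the concern raised about modifying the member variables in place.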
Re: Mucking with DocSlices
On Aug 28, 2009, at 1:03 PM, Yonik Seeley wrote:
> On Fri, Aug 28, 2009 at 10:26 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>> If I say String.substring().length(), I wouldn't expect the length returned to be the same as the parent (unless of course the substring requested is the identity one), so I'm not sure why DocSlice.subset does.
>
> .size() should reflect the new size.
> .matches() always reflects the total number of matches that this DocList is a window into.
>
>> Likewise for the maxScore, etc. Is there a reason why, if I know I have a DocSlice, I can't cast the docList to it and make some of these lower level changes to the member variables? It would be a lot more efficient than having to copy over all the docs, etc. to a new DocSlice.
>
> Just make a new DocSlice - one shouldn't be modifying these since they can be cached.

Sure, but that requires creating a new int[] doc array, copying elements, etc. all over again, and I may not need to do that (for instance, if I am shortening the list based on some business rules).

My solution so far is a lightweight wrapper around DocList that seems to be working just fine.
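A wrapper of the kind mentioned here might look roughly like the following Java sketch. The actual wrapper was not posted, so this is a guess at the general shape only: `DocView`, `ArrayDocView` and `TruncatedDocView` are hypothetical names, and `DocView` is a deliberately tiny stand-in rather than Solr's DocList interface. The point is that the wrapper delegates reads and reports its own counts, so nothing is copied and the wrapped (possibly cached) list is never modified.

```java
/** Hypothetical read-only view of a result list; a tiny stand-in, not Solr's DocList. */
interface DocView {
    int size();
    int matches();
    int docAt(int index);
}

/** Simple array-backed view, present only so the wrapper has something to wrap. */
final class ArrayDocView implements DocView {
    private final int[] docs;
    private final int matches;

    ArrayDocView(int[] docs, int matches) {
        this.docs = docs;
        this.matches = matches;
    }

    public int size() { return docs.length; }
    public int matches() { return matches; }
    public int docAt(int index) { return docs[index]; }
}

/** Exposes only the first limit docs of another view and reports its own match count. */
final class TruncatedDocView implements DocView {
    private final DocView delegate;
    private final int size;
    private final int matches;

    TruncatedDocView(DocView delegate, int limit, int matches) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        this.delegate = delegate;
        this.size = Math.min(limit, delegate.size());
        this.matches = matches;
    }

    public int size() { return size; }

    public int matches() { return matches; }

    public int docAt(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return delegate.docAt(index); // read through to the wrapped view, no copy
    }
}
```

Whether the wrapper should report the parent's match count or a reduced one depends on the business rule in question; the sketch leaves that to the caller by taking it as a constructor argument.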
[jira] Commented: (SOLR-1301) Solr + Hadoop
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748935#action_12748935 ]

jv ning commented on SOLR-1301:

I have used this at a decent scale, and will be adding a few patches to allow multiple tasks per machine to build. The code currently uses the same directory in /tmp for the solr config, and if multiple tasks are running, the directory may be removed by earlier tasks that finish.

Solr + Hadoop

Key: SOLR-1301
URL: https://issues.apache.org/jira/browse/SOLR-1301
Project: Solr
Issue Type: Improvement
Affects Versions: 1.4
Reporter: Andrzej Bialecki
Attachments: hadoop-0.19.1-core.jar, hadoop.patch

This patch contains a contrib module that provides distributed indexing (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is twofold:

* provide an API that is familiar to Hadoop developers, i.e. that of OutputFormat
* avoid unnecessary export and (de)serialization of data maintained on HDFS. SolrOutputFormat consumes data produced by reduce tasks directly, without storing it in intermediate files. Furthermore, by using an EmbeddedSolrServer, the indexing task is split into as many parts as there are reducers, and the data to be indexed is not sent over the network.

Design
------

Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, which in turn uses SolrRecordWriter to write this data. SolrRecordWriter instantiates an EmbeddedSolrServer, and it also instantiates an implementation of SolrDocumentConverter, which is responsible for turning a Hadoop (key, value) pair into a SolrInputDocument. This data is then added to a batch, which is periodically submitted to EmbeddedSolrServer. When the reduce task completes and the OutputFormat is closed, SolrRecordWriter calls commit() and optimize() on the EmbeddedSolrServer.

The API provides facilities to specify an arbitrary existing solr.home directory, from which the conf/ and lib/ files will be taken.

This process results in the creation of as many partial Solr home directories as there were reduce tasks. The output shards are placed in the output directory on the default filesystem (e.g. HDFS). Such part-N directories can be used to run N shard servers. Additionally, users can specify the number of reduce tasks, in particular 1 reduce task, in which case the output will consist of a single shard.

An example application is provided that processes large CSV files and uses this API. It uses custom CSV processing to avoid (de)serialization overhead.

This patch relies on hadoop-core-0.19.1.jar. I attached the jar to this issue; you should put it in contrib/hadoop/lib.

Note: the development of this patch was sponsored by an anonymous contributor and approved for release under Apache License.
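The write path described under Design (convert each key/value pair, collect documents in a batch, submit the batch periodically, then commit and optimize on close) can be sketched in plain Java. This is not the code from the patch: `Converter`, `DocSink` and `BatchingWriter` are hypothetical names, `DocSink` merely stands in for whatever server receives the documents, and the batch size is an assumed parameter.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical converter: turns one (key, value) pair into a document of type D. */
interface Converter<K, V, D> {
    D convert(K key, V value);
}

/** Hypothetical stand-in for the server that receives the documents. */
interface DocSink<D> {
    void addBatch(List<D> docs);
    void commit();
    void optimize();
}

/** Buffers converted documents and hands them to the sink in fixed-size batches. */
final class BatchingWriter<K, V, D> {
    private final Converter<K, V, D> converter;
    private final DocSink<D> sink;
    private final int batchSize;
    private final List<D> batch = new ArrayList<>();

    BatchingWriter(Converter<K, V, D> converter, DocSink<D> sink, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.converter = converter;
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Converts one pair and submits the batch once it is full. */
    void write(K key, V value) {
        batch.add(converter.convert(key, value));
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    /** Submits any remaining documents, then commits and optimizes. */
    void close() {
        flush();
        sink.commit();
        sink.optimize();
    }

    private void flush() {
        if (batch.isEmpty()) {
            return;
        }
        sink.addBatch(new ArrayList<>(batch)); // pass a copy so the buffer can be reused
        batch.clear();
    }
}
```

With a batch size of 2 and three written pairs, the sink would see one batch of two documents, then a final batch of one when close() is called, followed by a single commit and optimize.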
[jira] Commented: (SOLR-1255) An attempt to visit the replication admin page when its not a defined handler should display an approp message
[ https://issues.apache.org/jira/browse/SOLR-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748939#action_12748939 ] Grant Ingersoll commented on SOLR-1255: --- Is this fixed? An attempt to visit the replication admin page when its not a defined handler should display an approp message -- Key: SOLR-1255 URL: https://issues.apache.org/jira/browse/SOLR-1255 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Noble Paul Priority: Trivial Fix For: 1.4 Attachments: SOLR-1255.patch
[jira] Commented: (SOLR-1221) Change Solr Highlighting to use the SpanScorer with MultiTerm expansion by default
[ https://issues.apache.org/jira/browse/SOLR-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748940#action_12748940 ] Grant Ingersoll commented on SOLR-1221: --- Is this going to make it into 1.4? Change Solr Highlighting to use the SpanScorer with MultiTerm expansion by default -- Key: SOLR-1221 URL: https://issues.apache.org/jira/browse/SOLR-1221 Project: Solr Issue Type: Improvement Components: highlighter Reporter: Mark Miller Assignee: Mark Miller Fix For: 1.4 To improve the out of the box experience of Solr 1.4, I really think we should make this change. You will still be able to turn both off. Comments?
Lucene RC2
Anyone tried out the new Lucene RC2 in Solr yet? Should we upgrade to it?
Re: Lucene RC2
I have not tried it yet, but we should certainly upgrade. The more testing the better!

On Aug 28, 2009, at 2:54 PM, Grant Ingersoll wrote:
> Anyone tried out the new Lucene RC2 in Solr yet? Should we upgrade to it?
[jira] Resolved: (SOLR-1091) phps (serialized PHP) writer produces invalid output
[ https://issues.apache.org/jira/browse/SOLR-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-1091.
-------------------------------
Resolution: Fixed

committed.

phps (serialized PHP) writer produces invalid output
----------------------------------------------------

Key: SOLR-1091
URL: https://issues.apache.org/jira/browse/SOLR-1091
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 1.3
Environment: Sun JRE 1.6.0 on Centos 5
Reporter: frank farmer
Priority: Minor
Fix For: 1.4
Attachments: SOLR-1091.patch

The serialized PHP output writer can output invalid string lengths for certain (unusual) input values. Specifically, I had a document containing the following 6 byte character sequence: \xED\xAF\x80\xED\xB1\xB8

I was able to create a document in the index containing this value without issue; however, when fetching the document back out using the serialized PHP writer, it returns a string like the following:

s:4:"􀁸";

Note that the string length specified is 4, while the string is actually 6 bytes long. When using PHP's native serialize() function, it correctly sets the length to 6:

# php -r 'var_dump(serialize("\xED\xAF\x80\xED\xB1\xB8"));'
string(13) "s:6:"􀁸";"

The wt=php writer, which produces output to be parsed with eval(), doesn't have any trouble with this string.
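To make the length mismatch in SOLR-1091 concrete: in PHP's serialized string format, s:N:"..."; carries N as a byte count, not a character count. The Java sketch below is only an illustration under stated assumptions, not the committed fix; the class and method names are hypothetical. It also assumes that the six bytes in the report are two three-byte sequences, one per UTF-16 surrogate, which Java holds as the pair \uDBC0\uDC78 and re-encodes to four bytes of standard UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not the Solr response writer.
final class PhpSerializedString {

    // Builds s:N:"value"; where N is the number of bytes that will actually
    // be written for the value when the response is encoded as UTF-8.
    static String serialize(String value) {
        int byteLength = value.getBytes(StandardCharsets.UTF_8).length;
        return "s:" + byteLength + ":\"" + value + "\";";
    }

    public static void main(String[] args) {
        // One supplementary character (U+100078) held as a surrogate pair.
        String supplementary = "\uDBC0\uDC78";
        System.out.println(supplementary.length()); // prints 2 (UTF-16 code units)
        System.out.println(supplementary.getBytes(StandardCharsets.UTF_8).length); // prints 4
        System.out.println(serialize("abc")); // prints s:3:"abc";
    }
}
```

The point of the sketch is that the declared length and the bytes on the wire must come from the same encoding; if the length is computed one way and the payload written another, PHP's unserialize() sees a mismatch like the one reported.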
[jira] Commented: (SOLR-1383) Replication causes master to fail to delete old index files
[ https://issues.apache.org/jira/browse/SOLR-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748964#action_12748964 ] Lance Norskog commented on SOLR-1383: - The old files did go away after an optimize. Thank you. Restarting did not remove them. I suggest that old index files should be removed after all runtime requirements for them disappear. They should definitely be removed by restarting. Restarting Solr should cure all runtime problems; this includes extra files. There are a lot of Solr sites that want continuous propagation from data source to indexing to query. If they use Java replication to poll continuously for updates, it will leave vast amounts of junk files behind. The current functionality is fine for a Solr 1.4 release, but this issue should be fixed after that. Please reopen it and mark it for 1.5. Thanks. Replication causes master to fail to delete old index files --- Key: SOLR-1383 URL: https://issues.apache.org/jira/browse/SOLR-1383 Project: Solr Issue Type: Bug Components: replication (java) Environment: Linux CentOS - latest Solr 1.4 trunk - Java 1.6 Reporter: Lance Norskog Fix For: 1.4 I have developed a way to make replication leave old index files in the master's data/index directory. It is timing-dependent. A sequence of commands runs correctly or fails, depending on the timing between the commands. Here is the test scenario: Start a master and slave version of the Solr distributed example. I used 8080 for the slave. (See example/etc/jetty.xml) Be sure to start with empty solr/data/index files on both master and slave. Open the replication administration jsp on the slave ( http://localhost:8080/solr/admin/replication/index.jsp ) Disable polling. 
In a text window, go to the example/exampledocs directory and run this script:

{code}
for x in *.xml
do
  echo "$x"
  sh post.sh "$x"
  sleep 15
  curl "http://localhost:8080/solr/replication?command=fetchindex"
done
{code}

This prints each example file, indexes it, and does a replication command. At the end of this exercise, the master and slave solr/data/index files will be identical.

Now, kill the master and slave, remove the solr/data/index directories, and start over. This time, remove the sleep command from the script. In my environment, old Lucene index files were left in the master's data/index.

Here is what is left in the master data/index. The segments_? files are random across runs, but the index files left over are consistent. Note (courtesy of the Linux 'ls -l /proc/PID/fd' command) that the old files are not kept open by the master solr; they are merely left behind.

In the master server:

{code}
% ls solr/data/index
_0.fdt  _1.prx  _2.tvx  _4.nrm  _5.tii  _7.frq  _8.tvd  _a.tvx  _c.nrm
_0.fdx  _1.tii  _3.fdt  _4.prx  _5.tis  _7.nrm  _8.tvf  _b.fdt  _c.prx
_0.fnm  _1.tis  _3.fdx  _4.tii  _6.fdt  _7.prx  _8.tvx  _b.fdx  _c.tii
_0.frq  _2.fdt  _3.fnm  _4.tis  _6.fdx  _7.tii  _a.fdt  _b.fnm  _c.tis
_0.nrm  _2.fdx  _3.frq  _4.tvd  _6.fnm  _7.tis  _a.fdx  _b.frq  segments.gen
_0.prx  _2.fnm  _3.nrm  _4.tvf  _6.frq  _8.fdt  _a.fnm  _b.nrm  segments_8
_0.tii  _2.frq  _3.prx  _4.tvx  _6.nrm  _8.fdx  _a.frq  _b.prx  segments_9
_0.tis  _2.nrm  _3.tii  _5.fdt  _6.prx  _8.fnm  _a.nrm  _b.tii  segments_a
_1.fdt  _2.prx  _3.tis  _5.fdx  _6.tii  _8.frq  _a.prx  _b.tis  segments_b
_1.fdx  _2.tii  _4.fdt  _5.fnm  _6.tis  _8.nrm  _a.tii  _c.fdt  segments_c
_1.fnm  _2.tis  _4.fdx  _5.frq  _7.fdt  _8.prx  _a.tis  _c.fdx  segments_d
_1.frq  _2.tvd  _4.fnm  _5.nrm  _7.fdx  _8.tii  _a.tvd  _c.fnm
_1.nrm  _2.tvf  _4.frq  _5.prx  _7.fnm  _8.tis  _a.tvf  _c.frq
{code}

{code}
% ls -l /proc/PID/fd
lr-x------ 1 root root 64 Aug 25 22:52 137 -> /index/master/solr/data/index/_a.tis
lr-x------ 1 root root 64 Aug 25 22:52 138 -> /index/master/solr/data/index/_a.frq
lr-x------ 1 root root 64 Aug 25 22:52 139 -> /index/master/solr/data/index/_a.prx
lr-x------ 1 root root 64 Aug 25 22:52 140 -> /index/master/solr/data/index/_a.fdt
lr-x------ 1 root root 64 Aug 25 22:52 141 -> /index/master/solr/data/index/_a.fdx
lr-x------ 1 root root 64 Aug 25 22:52 142 -> /index/master/solr/data/index/_a.tvx
lr-x------ 1 root root 64 Aug 25 22:52 143 -> /index/master/solr/data/index/_a.tvd
lr-x------ 1 root root 64 Aug 25 22:52 144 -> /index/master/solr/data/index/_a.tvf
lr-x------ 1 root root 64 Aug 25 22:52 145 -> /index/master/solr/data/index/_a.nrm
lr-x------ 1 root root 64 Aug 25 22:52 72 -> /index/master/solr/data/index/_b.tis
lr-x------ 1 root root 64 Aug 25 22:52 73 -> /index/master/solr/data/index/_b.frq
lr-x------ 1 root root 64 Aug 25 22:52 74 -> /index/master/solr/data/index/_b.prx
lr-x------ 1 root root 64 Aug 25
[jira] Commented: (SOLR-1392) NPE on replication page on slave
[ https://issues.apache.org/jira/browse/SOLR-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748985#action_12748985 ] Reuben Firmin commented on SOLR-1392: - Further debugging - this happens when the master url cannot be reached (i.e. does not resolve to a real URL). NPE on replication page on slave Key: SOLR-1392 URL: https://issues.apache.org/jira/browse/SOLR-1392 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 1.4 Reporter: Reuben Firmin Assignee: Noble Paul Fix For: 1.4 On our slave's replication page, I periodically see this exception. java.lang.NullPointerException at _jsp._admin._replication._index__jsp._jspService(_index__jsp.java:265) at com.caucho.jsp.JavaPage.service(JavaPage.java:61) at com.caucho.jsp.Page.pageservice(Page.java:578) at com.caucho.server.dispatch.PageFilterChain.doFilter(PageFilterChain.java:192) at com.caucho.server.webapp.DispatchFilterChain.doFilter(DispatchFilterChain.java:97) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:280) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:108) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:264) at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76) at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158) at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:435) at com.caucho.server.port.TcpConnection.run(TcpConnection.java:586) at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:690) at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:612) at java.lang.Thread.run(Thread.java:619) Date: Fri, 28 Aug 2009 
13:53:59 GMT Server: Apache/2.2.3 (Red Hat) Content-Type: text/html; charset=utf-8 Vary: Accept-Encoding,User-Agent Content-Encoding: gzip Content-Length: 524 Connection: close
[jira] Assigned: (SOLR-659) Explicitly set start and rows per shard for more efficient bulk queries across distributed Solr
[ https://issues.apache.org/jira/browse/SOLR-659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley reassigned SOLR-659: - Assignee: Yonik Seeley Explicitly set start and rows per shard for more efficient bulk queries across distributed Solr --- Key: SOLR-659 URL: https://issues.apache.org/jira/browse/SOLR-659 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.3 Reporter: Brian Whitman Assignee: Yonik Seeley Priority: Minor Fix For: 1.4 Attachments: shards.start_rows.patch, SOLR-659.patch The default behavior of setting start and rows on distributed solr (SOLR-303) is to set start at 0 across all shards and set rows to start+rows across each shard. This ensures all results are returned for any arbitrary start and rows setting, but during bulk queries (where start is incrementally increased and rows is kept consistent) the client would need finer control of the per-shard start and rows parameter as retrieving many thousands of documents becomes intractable as start grows higher. Attaching a patch that creates a shards.start and shards.rows parameter. If used, the logic that sets rows to start+rows per shard is overridden and each shard gets the exact start and rows set in shards.start and shards.rows. The client will receive up to shards.rows * nShards results and should set rows accordingly. This makes bulk queries across distributed solr possible.
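The arithmetic in the SOLR-659 description can be sketched as follows. This is a plain illustration of the numbers quoted in the issue, not code from the patch; the class and method names are hypothetical.

```java
// Hypothetical illustration of the per-shard paging described in SOLR-659.
final class ShardPaging {

    // Default distributed behavior per the issue text: every shard is asked
    // for start=0 and rows=start+rows. Returns {start, rows} for each shard.
    static int[] defaultPerShard(int start, int rows) {
        return new int[] {0, start + rows};
    }

    // With shards.start and shards.rows, each shard gets exactly those values.
    static int[] explicitPerShard(int shardsStart, int shardsRows) {
        return new int[] {shardsStart, shardsRows};
    }

    // Upper bound on merged results the client can receive per request.
    static long maxMergedRows(int shardsRows, int nShards) {
        return (long) shardsRows * nShards;
    }

    public static void main(String[] args) {
        // Deep page of a bulk export: start=100000, rows=100.
        int[] byDefault = defaultPerShard(100_000, 100);
        System.out.println(byDefault[1]); // prints 100100 rows requested from every shard

        int[] explicit = explicitPerShard(100_000, 100);
        System.out.println(explicit[1]); // prints 100 rows requested from every shard

        System.out.println(maxMergedRows(100, 4)); // prints 400
    }
}
```

The example shows why the default becomes costly as start grows: each shard returns start+rows documents for every page, while the explicit parameters keep the per-shard request size constant.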
[jira] Commented: (SOLR-659) Explicitly set start and rows per shard for more efficient bulk queries across distributed Solr
[ https://issues.apache.org/jira/browse/SOLR-659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12748986#action_12748986 ] Yonik Seeley commented on SOLR-659: --- I agree this makes sense to enable efficient bulk operations, and also fits in with a past idea I had about mapping shards.param=foo to param=foo during a sub-request. I'll give it a couple of days and commit if there are no objections. Explicitly set start and rows per shard for more efficient bulk queries across distributed Solr --- Key: SOLR-659 URL: https://issues.apache.org/jira/browse/SOLR-659 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.3 Reporter: Brian Whitman Priority: Minor Fix For: 1.4 Attachments: shards.start_rows.patch, SOLR-659.patch The default behavior of setting start and rows on distributed solr (SOLR-303) is to set start at 0 across all shards and set rows to start+rows across each shard. This ensures all results are returned for any arbitrary start and rows setting, but during bulk queries (where start is incrementally increased and rows is kept consistent) the client would need finer control of the per-shard start and rows parameter as retrieving many thousands of documents becomes intractable as start grows higher. Attaching a patch that creates a shards.start and shards.rows parameter. If used, the logic that sets rows to start+rows per shard is overridden and each shard gets the exact start and rows set in shards.start and shards.rows. The client will receive up to shards.rows * nShards results and should set rows accordingly. This makes bulk queries across distributed solr possible.
Re: HTML decoder is splitting tokens
Greetings.

I am moving this issue from the solr-user list. As can be seen in the messages below, I am having problems with the Solr HTML stripper. After some investigation, I have found the cause to be that the stripper is replacing the removed HTML with spaces. This obviously breaks when the HTML is in the middle of a word, like G&uuml;nther.

So, without knowing what I was doing, I hacked together a fix that uses offset correction instead. That seemed to work, except that closing tags and attributes still broke the positioning. With even less of a clue, I replaced read() with next() in the two methods handling those. Finally, invalid HTML also gave wrong offsets, and I fixed that by restoring numRead when rolling back the input stream.

At this point I stopped trying to break it, so there may still be more problems. Or I might have introduced some problem on my own. Anyway, I have put the three patches at the bottom of this mail, in case somebody wants to move along with this issue.

Regards,
Anders.

Anders Melchiorsen m...@spoon.kalibalik.dk writes:

> Hello. Thanks for the hints. Still some trouble, though.
>
> I added just the HTMLStripCharFilterFactory because, according to documentation, it should also replace HTML entities. It did, but still left a space after the entity, so I got two tokens from G&uuml;nther. That seems like a bug?
>
> Adding MappingCharFilterFactory in front of the HTML stripper (so that the latter will not see the entity) does work as expected. That is, until I try strings like "use &lt;p&gt; to mark a paragraph", where the HTML stripper will then remove parts of the actual text. So this approach will not work.
>
> Finally, I was happy that I could now use an arbitrary tokenizer with HTML input. The PatternTokenizer, however, seems to be using character offsets corresponding to the output of the char filters, and so the highlighting markers end up at the wrong place. Is that a bug, or a configuration issue?
>
> Cheers,
> Anders.
>
> Koji Sekiguchi wrote:
>
> > Hi Anders,
> >
> > Sorry, I don't know whether this is a bug or a feature, but I'd like to show an alternate way if you'd like.
> >
> > In Solr trunk, HTMLStripWhitespaceTokenizerFactory is marked as deprecated. Instead, using HTMLStripCharFilterFactory with an arbitrary TokenizerFactory is encouraged. And I'd recommend that you use MappingCharFilterFactory to convert character references to real characters. That is, you have:
> >
> > <fieldType name="textHtml" class="solr.TextField">
> >   <analyzer>
> >     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>
> >     <charFilter class="solr.HTMLStripCharFilterFactory"/>
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >   </analyzer>
> > </fieldType>
> >
> > where the contents of mapping.txt are:
> >
> > "&uuml;" => "ü"
> > "&auml;" => "ä"
> > "&iuml;" => "ï"
> > "&euml;" => "ë"
> > "&ouml;" => "ö"
> > : :
> >
> > Then run analysis.jsp and see the result.
> >
> > Thank you,
> > Koji
> >
> > Anders Melchiorsen wrote:
> >
> > > Hi.
> > >
> > > When indexing the string G&uuml;nther with HTMLStripWhitespaceTokenizerFactory (in analysis.jsp), I get two tokens, Gü and nther.
> > >
> > > Is this a bug, or am I doing something wrong?
> > >
> > > (Using a Solr nightly from 2009-05-29)
> > >
> > > Anders.

commit 1fb2d42181d8effb1b444aa2fa02d86df1d860d7
Author: Anders Melchiorsen m...@spoon.kalibalik.dk
Date:   Fri Aug 28 15:57:03 2009 +0200

    Use offset correction instead of inserting spaces into the stream

    Fixes G&uuml;nther turning into Gü nther.

diff --git a/HTMLStripCharFilter.java b/HTMLStripCharFilter.java
index 733d783..e473cef 100644
--- a/HTMLStripCharFilter.java
+++ b/HTMLStripCharFilter.java
@@ -37,7 +37,9 @@ public class HTMLStripCharFilter extends BaseCharFilter {
   private int readAheadLimit = DEFAULT_READ_AHEAD;
   private int safeReadAheadLimit = readAheadLimit - 3;
   private int numWhitespace = 0;
+  private int numWhitespaceCorrected = 0;
   private int numRead = 0;
+  private int numReadLast = 0;
   private int lastMark;
   private Set<String> escapedTags;
 
@@ -674,9 +676,11 @@ public class HTMLStripCharFilter extends BaseCharFilter {
     // where do we have to worry about them?
     // <![ CDATA [ unescaped markup ]]>
     if (numWhitespace > 0){
-      numWhitespace--;
-      return ' ';
+      addOffCorrectMap(numReadLast+1-numWhitespaceCorrected, numWhitespaceCorrected+numWhitespace);
+      numWhitespaceCorrected += numWhitespace;
+      numWhitespace = 0;
     }
+numReadLast = numRead;
     //do not limit this one by the READAHEAD
     while(true) {
       int lastNumRead = numRead;

commit 542f5734136bbfd72ae802c30b6c61361268bccf
Author: Anders Melchiorsen m...@spoon.kalibalik.dk
Date:   Fri Aug 28 15:57:29 2009 +0200

    Use next() in place of read()

    The read() method is our public interface, while next() is what we use internally to get the next character.

diff --git a/HTMLStripCharFilter.java b/HTMLStripCharFilter.java
index e473cef..ab14de5 100644
--- a/HTMLStripCharFilter.java
+++ b/HTMLStripCharFilter.java
@@ -537,13 +537,13 @@ public class
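For readers following the offset-correction idea in the message above, here is a deliberately small Java sketch of the principle: drop markup without emitting replacement spaces, and remember where each surviving character came from so that highlighting can still point into the original text. It is a toy under stated assumptions, not the HTMLStripCharFilter code and not the patch: the class name and fields are hypothetical, it handles only simple tags and the single entity &uuml;, and it keeps one original offset per output character, whereas the patch records corrections through addOffCorrectMap.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; names and structure are hypothetical.
final class TagStripSketch {
    final String text;            // stripped text handed to the tokenizer
    final int[] originalOffsets;  // originalOffsets[i] = position of text.charAt(i) in the input

    private TagStripSketch(String text, int[] originalOffsets) {
        this.text = text;
        this.originalOffsets = originalOffsets;
    }

    static TagStripSketch strip(String input) {
        StringBuilder out = new StringBuilder();
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '<') {
                int close = input.indexOf('>', i);
                if (close >= 0) {
                    i = close;   // skip the whole tag and emit nothing, not even a space
                    continue;
                }
            } else if (input.startsWith("&uuml;", i)) {
                out.append('\u00fc');            // decode the one entity this toy knows
                offsets.add(i);                  // it maps back to where the entity started
                i += "&uuml;".length() - 1;      // skip the rest of the entity
                continue;
            }
            out.append(c);
            offsets.add(i);
        }
        int[] map = new int[offsets.size()];
        for (int k = 0; k < map.length; k++) {
            map[k] = offsets.get(k);
        }
        return new TagStripSketch(out.toString(), map);
    }

    public static void main(String[] args) {
        TagStripSketch r = strip("G&uuml;nther");
        System.out.println(r.text);               // prints Günther (one token, no inserted space)
        System.out.println(r.originalOffsets[2]); // prints 7: 'n' sits at index 7 of the input
    }
}
```

Because nothing is inserted in place of the removed markup, a word with markup in the middle stays one token, and the offset table is what lets a highlighter translate token positions back to the original HTML.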
[jira] Updated: (SOLR-1392) NPE on replication page on slave
[ https://issues.apache.org/jira/browse/SOLR-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben Firmin updated SOLR-1392: Comment: was deleted (was: Further debugging - this happens when the master url cannot be reached (i.e. does not resolve to a real URL). ) NPE on replication page on slave Key: SOLR-1392 URL: https://issues.apache.org/jira/browse/SOLR-1392 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 1.4 Reporter: Reuben Firmin Assignee: Noble Paul Fix For: 1.4 On our slave's replication page, I periodically see this exception. java.lang.NullPointerException at _jsp._admin._replication._index__jsp._jspService(_index__jsp.java:265) at com.caucho.jsp.JavaPage.service(JavaPage.java:61) at com.caucho.jsp.Page.pageservice(Page.java:578) at com.caucho.server.dispatch.PageFilterChain.doFilter(PageFilterChain.java:192) at com.caucho.server.webapp.DispatchFilterChain.doFilter(DispatchFilterChain.java:97) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:280) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:108) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:264) at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76) at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158) at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:435) at com.caucho.server.port.TcpConnection.run(TcpConnection.java:586) at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:690) at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:612) at java.lang.Thread.run(Thread.java:619) Date: Fri, 28 Aug 2009 13:53:59 GMT Server: 
Apache/2.2.3 (Red Hat) Content-Type: text/html; charset=utf-8 Vary: Accept-Encoding,User-Agent Content-Encoding: gzip Content-Length: 524 Connection: close
[jira] Commented: (SOLR-1392) NPE on replication page on slave
[ https://issues.apache.org/jira/browse/SOLR-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12749008#action_12749008 ]

Reuben Firmin commented on SOLR-1392:
-------------------------------------

There's some issue on the master. What does host mean in this context?

http://master/replication?command=details&wt=xml

java.lang.IllegalArgumentException: host parameter is null
    at org.apache.commons.httpclient.HttpConnection.<init>(HttpConnection.java:206)
    at org.apache.commons.httpclient.HttpConnection.<init>(HttpConnection.java:155)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionWithReference.<init>(MultiThreadedHttpConnectionManager.java:1145)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool.createConnection(MultiThreadedHttpConnectionManager.java:762)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:476)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.solr.handler.SnapPuller.getNamedListResponse(SnapPuller.java:192)
    at org.apache.solr.handler.SnapPuller.getCommandResponse(SnapPuller.java:187)
    at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:589)
    at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:180)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76)
    at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158)
    at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178)
    at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241)
    at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:435)
    at com.caucho.server.port.TcpConnection.run(TcpConnection.java:586)
    at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:690)
    at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:612)
    at java.lang.Thread.run(Thread.java:619)

Date: Fri, 28 Aug 2009 22:22:53 GMT
Server: Apache/2.2.3 (Red Hat)
Cache-Control: no-cache, no-store
Pragma: no-cache
Expires: Sat, 01 Jan 2000 01:00:00 GMT
Content-Type: text/html; charset=UTF-8
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Content-Length: 713
Connection: close

NPE on replication page on slave
--------------------------------

Key: SOLR-1392
URL: https://issues.apache.org/jira/browse/SOLR-1392
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 1.4
Reporter: Reuben Firmin
Assignee: Noble Paul
Fix For: 1.4

On our slave's replication page, I periodically see this exception.
java.lang.NullPointerException at _jsp._admin._replication._index__jsp._jspService(_index__jsp.java:265) at com.caucho.jsp.JavaPage.service(JavaPage.java:61) at com.caucho.jsp.Page.pageservice(Page.java:578) at com.caucho.server.dispatch.PageFilterChain.doFilter(PageFilterChain.java:192) at com.caucho.server.webapp.DispatchFilterChain.doFilter(DispatchFilterChain.java:97) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:280) at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:108) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:264) at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76) at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158) at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178) at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241) at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:435) at com.caucho.server.port.TcpConnection.run(TcpConnection.java:586)
[jira] Commented: (SOLR-1343) HTMLStripCharFilter
[ https://issues.apache.org/jira/browse/SOLR-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12749021#action_12749021 ] Jason Rutherglen commented on SOLR-1343: I'm seeing a bug related to this patch going in. It's been hard to track down and I'm dealing with a JVM bug at the same time, so I haven't had time to write a test case yet. In summary, I reverted to the previous classes and the indexing goes back to normal. HTMLStripCharFilter --- Key: SOLR-1343 URL: https://issues.apache.org/jira/browse/SOLR-1343 Project: Solr Issue Type: Improvement Components: Analysis Affects Versions: 1.4 Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Trivial Fix For: 1.4 Attachments: SOLR-1343.patch Introducing HTMLStripCharFilter: * move html strip logic from HTMLStripReader to HTMLStripCharFilter * make HTMLStripReader deprecated * make HTMLStrip*TokenizerFactory deprecated
Re: HTML decoder is splitting tokens
Anders,

Thank you for attaching the patch. Sorry again, I don't have enough time to investigate the patch and the problem you have, but I would recommend that you open a JIRA issue and attach the patch so that I or someone else can look into it later.

And I didn't understand this part of your previous mail:

> Adding MappingCharFilterFactory in front of the HTML stripper (so that the latter will not see the entity) does work as expected. That is, until I try strings like "use &lt;p&gt; to mark a paragraph", where the HTML stripper will then remove parts of the actual text. So this approach will not work.

Thanks,
Koji
Re: [jira] Commented: (SOLR-1392) NPE on replication page on slave
does it work if you hit the url http://master/replication directly?

On Sat, Aug 29, 2009 at 3:56 AM, Reuben Firmin (JIRA) j...@apache.org wrote:

[ https://issues.apache.org/jira/browse/SOLR-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749008#action_12749008 ]

Reuben Firmin commented on SOLR-1392:

There's some issue on the master. What does host mean in this context?

http://master/replication?command=details&wt=xml

java.lang.IllegalArgumentException: host parameter is null
    at org.apache.commons.httpclient.HttpConnection.<init>(HttpConnection.java:206)
    at org.apache.commons.httpclient.HttpConnection.<init>(HttpConnection.java:155)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionWithReference.<init>(MultiThreadedHttpConnectionManager.java:1145)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool.createConnection(MultiThreadedHttpConnectionManager.java:762)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:476)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.solr.handler.SnapPuller.getNamedListResponse(SnapPuller.java:192)
    at org.apache.solr.handler.SnapPuller.getCommandResponse(SnapPuller.java:187)
    at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:589)
    at org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:180)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76)
    at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158)
    at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178)
    at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241)
    at com.caucho.server.hmux.HmuxRequest.handleRequest(HmuxRequest.java:435)
    at com.caucho.server.port.TcpConnection.run(TcpConnection.java:586)
    at com.caucho.util.ThreadPool$Item.runTasks(ThreadPool.java:690)
    at com.caucho.util.ThreadPool$Item.run(ThreadPool.java:612)
    at java.lang.Thread.run(Thread.java:619)

Date: Fri, 28 Aug 2009 22:22:53 GMT
Server: Apache/2.2.3 (Red Hat)
Cache-Control: no-cache, no-store
Pragma: no-cache
Expires: Sat, 01 Jan 2000 01:00:00 GMT
Content-Type: text/html; charset=UTF-8
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Content-Length: 713
Connection: close

NPE on replication page on slave

Key: SOLR-1392
URL: https://issues.apache.org/jira/browse/SOLR-1392
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 1.4
Reporter: Reuben Firmin
Assignee: Noble Paul
Fix For: 1.4

On our slave's replication page, I periodically see this exception.

java.lang.NullPointerException
    at _jsp._admin._replication._index__jsp._jspService(_index__jsp.java:265)
    at com.caucho.jsp.JavaPage.service(JavaPage.java:61)
    at com.caucho.jsp.Page.pageservice(Page.java:578)
    at com.caucho.server.dispatch.PageFilterChain.doFilter(PageFilterChain.java:192)
    at com.caucho.server.webapp.DispatchFilterChain.doFilter(DispatchFilterChain.java:97)
    at com.caucho.server.dispatch.ServletInvocation.service(ServletInvocation.java:241)
    at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:280)
    at com.caucho.server.webapp.RequestDispatcherImpl.forward(RequestDispatcherImpl.java:108)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:264)
    at com.caucho.server.dispatch.FilterFilterChain.doFilter(FilterFilterChain.java:76)
    at com.caucho.server.cache.CacheFilterChain.doFilter(CacheFilterChain.java:158)
    at com.caucho.server.webapp.WebAppFilterChain.doFilter(WebAppFilterChain.java:178)
    at
Re: [jira] Commented: (SOLR-1392) NPE on replication page on slave
BTW which build are you using?

2009/8/29 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com:

does it work if you hit the url http://master/replication directly?
[jira] Commented: (SOLR-1383) Replication causes master to fail to delete old index files
[ https://issues.apache.org/jira/browse/SOLR-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749076#action_12749076 ]

Noble Paul commented on SOLR-1383:

Lance, let me suggest one thing:
# Disable the ReplicationHandler.
# Run your program.
# Check the file list in the index.
# Repeat the same set of operations with replication on, and see if there is any difference in the number of files.

Replication causes master to fail to delete old index files

Key: SOLR-1383
URL: https://issues.apache.org/jira/browse/SOLR-1383
Project: Solr
Issue Type: Bug
Components: replication (java)
Environment: Linux CentOS - latest Solr 1.4 trunk - Java 1.6
Reporter: Lance Norskog
Fix For: 1.4

I have developed a way to make replication leave old index files in the master's data/index directory. It is timing-dependent: the same sequence of commands runs correctly or fails, depending on the timing between the commands.

Here is the test scenario:

Start a master and slave version of the Solr distributed example. I used 8080 for the slave. (See example/etc/jetty.xml) Be sure to start with empty solr/data/index files on both master and slave.

Open the replication administration jsp on the slave ( http://localhost:8080/solr/admin/replication/index.jsp ) and disable polling.

In a text window, go to the example/exampledocs directory and run this script:

{code}
for x in *.xml
do
  echo $x
  sh post.sh $x
  sleep 15
  curl 'http://localhost:8080/solr/replication?command=fetchindex'
done
{code}

This prints each example file, indexes it, and does a replication command. At the end of this exercise, the master and slave solr/data/index files will be identical.

Now, kill the master and slave, remove the solr/data/index directories, and start over. This time, remove the sleep command from the script. In my environment, old Lucene index files were left in the master's data/index.

Here is what is left in the master data/index. The segments_? files are random across runs, but the index files left over are consistent. Note (courtesy of the Linux 'ls -l /proc/PID/fd' command) that the old files are not kept open by the master solr; they are merely left behind.

In the master server:

{code}
% ls solr/data/index
_0.fdt  _1.prx  _2.tvx  _4.nrm  _5.tii  _7.frq  _8.tvd  _a.tvx  _c.nrm
_0.fdx  _1.tii  _3.fdt  _4.prx  _5.tis  _7.nrm  _8.tvf  _b.fdt  _c.prx
_0.fnm  _1.tis  _3.fdx  _4.tii  _6.fdt  _7.prx  _8.tvx  _b.fdx  _c.tii
_0.frq  _2.fdt  _3.fnm  _4.tis  _6.fdx  _7.tii  _a.fdt  _b.fnm  _c.tis
_0.nrm  _2.fdx  _3.frq  _4.tvd  _6.fnm  _7.tis  _a.fdx  _b.frq  segments.gen
_0.prx  _2.fnm  _3.nrm  _4.tvf  _6.frq  _8.fdt  _a.fnm  _b.nrm  segments_8
_0.tii  _2.frq  _3.prx  _4.tvx  _6.nrm  _8.fdx  _a.frq  _b.prx  segments_9
_0.tis  _2.nrm  _3.tii  _5.fdt  _6.prx  _8.fnm  _a.nrm  _b.tii  segments_a
_1.fdt  _2.prx  _3.tis  _5.fdx  _6.tii  _8.frq  _a.prx  _b.tis  segments_b
_1.fdx  _2.tii  _4.fdt  _5.fnm  _6.tis  _8.nrm  _a.tii  _c.fdt  segments_c
_1.fnm  _2.tis  _4.fdx  _5.frq  _7.fdt  _8.prx  _a.tis  _c.fdx  segments_d
_1.frq  _2.tvd  _4.fnm  _5.nrm  _7.fdx  _8.tii  _a.tvd  _c.fnm
_1.nrm  _2.tvf  _4.frq  _5.prx  _7.fnm  _8.tis  _a.tvf  _c.frq
{code}

{code}
% ls -l /proc/PID/fd
lr-x-- 1 root root 64 Aug 25 22:52 137 -> /index/master/solr/data/index/_a.tis
lr-x-- 1 root root 64 Aug 25 22:52 138 -> /index/master/solr/data/index/_a.frq
lr-x-- 1 root root 64 Aug 25 22:52 139 -> /index/master/solr/data/index/_a.prx
lr-x-- 1 root root 64 Aug 25 22:52 140 -> /index/master/solr/data/index/_a.fdt
lr-x-- 1 root root 64 Aug 25 22:52 141 -> /index/master/solr/data/index/_a.fdx
lr-x-- 1 root root 64 Aug 25 22:52 142 -> /index/master/solr/data/index/_a.tvx
lr-x-- 1 root root 64 Aug 25 22:52 143 -> /index/master/solr/data/index/_a.tvd
lr-x-- 1 root root 64 Aug 25 22:52 144 -> /index/master/solr/data/index/_a.tvf
lr-x-- 1 root root 64 Aug 25 22:52 145 -> /index/master/solr/data/index/_a.nrm
lr-x-- 1 root root 64 Aug 25 22:52 72 -> /index/master/solr/data/index/_b.tis
lr-x-- 1 root root 64 Aug 25 22:52 73 -> /index/master/solr/data/index/_b.frq
lr-x-- 1 root root 64 Aug 25 22:52 74 -> /index/master/solr/data/index/_b.prx
lr-x-- 1 root root 64 Aug 25 22:52 76 -> /index/master/solr/data/index/_b.fdt
lr-x-- 1 root root 64 Aug 25 22:52 78 -> /index/master/solr/data/index/_b.fdx
lr-x-- 1 root root 64 Aug 25 22:52 79 -> /index/master/solr/data/index/_b.nrm
lr-x-- 1 root root 64 Aug 25 22:52 80 -> /index/master/solr/data/index/_c.tis
lr-x-- 1 root root 64 Aug 25 22:52 81 -> /index/master/solr/data/index/_c.frq
lr-x-- 1 root root 64 Aug 25 22:52 82 ->
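
The comparison suggested in the comment on this issue (run the script once with the replication handler disabled, once with it enabled, and compare the index file lists) could be scripted roughly as follows. This is only a sketch: the function names `snapshot_index` and `extra_files` and the snapshot file names are hypothetical and are not part of Solr or of the original report.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two listings of an index directory.
# Nothing here talks to Solr; it only inspects file names on disk.

# Write a sorted list of the file names in an index directory to a snapshot file.
snapshot_index() {
    index_dir=$1
    out_file=$2
    ls -1 "$index_dir" | LC_ALL=C sort > "$out_file"
}

# Print the names that appear in the second snapshot but not in the first.
extra_files() {
    LC_ALL=C comm -13 "$1" "$2"
}
```

For example, after the run with replication disabled one might call `snapshot_index solr/data/index no-repl.txt`, after the run with replication enabled `snapshot_index solr/data/index repl.txt`, and then `extra_files no-repl.txt repl.txt` to list the files that only the replicated run left behind. Segment names can differ between runs, so the output is a starting point for inspection rather than a definitive list.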
[jira] Commented: (SOLR-1255) An attempt to visit the replication admin page when its not a defined handler should display an approp message
[ https://issues.apache.org/jira/browse/SOLR-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749077#action_12749077 ]

Noble Paul commented on SOLR-1255:

The original issue is fixed. However, if the user registers multiple replication handlers (RH), only one will be shown. Ideally, no more than one RH should be registered.

An attempt to visit the replication admin page when its not a defined handler should display an approp message

Key: SOLR-1255
URL: https://issues.apache.org/jira/browse/SOLR-1255
Project: Solr
Issue Type: Bug
Reporter: Mark Miller
Assignee: Noble Paul
Priority: Trivial
Fix For: 1.4
Attachments: SOLR-1255.patch

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.