[jira] [Reopened] (SOLR-6864) Support registering searcher listeners in SolrCoreAware.inform(SolrCore) method
[ https://issues.apache.org/jira/browse/SOLR-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson reopened SOLR-6864:
----------------------------------
    Assignee: Erick Erickson  (was: Tomás Fernández Löbbe)

Re-opening for inclusion in 4.10.5.

Support registering searcher listeners in SolrCoreAware.inform(SolrCore) method
-------------------------------------------------------------------------------

             Key: SOLR-6864
             URL: https://issues.apache.org/jira/browse/SOLR-6864
         Project: Solr
      Issue Type: Bug
Affects Versions: 5.0, Trunk
        Reporter: Tomás Fernández Löbbe
        Assignee: Erick Erickson
         Fix For: 5.0, Trunk
     Attachments: SOLR-6864-tests.patch, SOLR-6864.patch, fix.patch

I'm marking this Jira as a Bug because we already have components that do this (SuggestComponent and SpellcheckComponent); however, listeners registered at this stage do not always work. From https://issues.apache.org/jira/browse/SOLR-6845?focusedCommentId=14250350&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14250350

{quote}
Trying to add some unit tests to this feature I found another issue. SuggestComponent and SpellcheckComponent rely on a {{firstSearcherListener}} to load (and in this case, also build) some structures. These firstSearcherListeners are registered in {{SolrCoreAware.inform()}}; however, the first-searcher listener task is only added to the queue of warming tasks if there is at least one listener registered at the time of the first searcher's creation (before SolrCoreAware.inform() is ever called).
See {code:title=SolrCore.java}
if (currSearcher == null && firstSearcherListeners.size() > 0) {
  future = searcherExecutor.submit(new Callable() {
    @Override
    public Object call() throws Exception {
      try {
        for (SolrEventListener listener : firstSearcherListeners) {
          listener.newSearcher(newSearcher, null);
        }
      } catch (Throwable e) {
        SolrException.log(log, null, e);
        if (e instanceof Error) {
          throw (Error) e;
        }
      }
      return null;
    }
  });
}
{code}
I'll create a new Jira for this.
{quote}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
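The ordering problem quoted above can be modeled in a few lines. This is a minimal, self-contained sketch (not Solr code; all names are illustrative) showing why a first-searcher listener registered only after the first searcher has been created never fires: the warming work is guarded by the listener list being non-empty at searcher-creation time.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of the race described in SOLR-6864: listeners added
// "too late" (e.g. from SolrCoreAware.inform()) are silently skipped.
public class FirstSearcherRace {
    final List<Runnable> firstSearcherListeners = new ArrayList<>();

    // Mirrors the guarded submit in SolrCore:
    // if (currSearcher == null && firstSearcherListeners.size() > 0) { ... }
    void openFirstSearcher() {
        if (!firstSearcherListeners.isEmpty()) {
            for (Runnable l : firstSearcherListeners) {
                l.run(); // warming task: e.g. build suggester structures
            }
        }
    }

    // Returns whether the listener's warming code actually ran.
    public static boolean listenerFires(boolean registerBeforeOpen) {
        FirstSearcherRace core = new FirstSearcherRace();
        final boolean[] fired = {false};
        Runnable listener = () -> { fired[0] = true; };
        if (registerBeforeOpen) {
            core.firstSearcherListeners.add(listener);
        }
        core.openFirstSearcher(); // first searcher is created here
        if (!registerBeforeOpen) {
            // inform() runs after searcher creation: registration is too late
            core.firstSearcherListeners.add(listener);
        }
        return fired[0];
    }

    public static void main(String[] args) {
        System.out.println(listenerFires(true));  // listener registered in time
        System.out.println(listenerFires(false)); // the bug: never fires
    }
}
```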
[jira] [Reopened] (SOLR-6845) Add buildOnStartup option for suggesters
[ https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson reopened SOLR-6845:
----------------------------------

Reopening for inclusion in 4.10.5.

Add buildOnStartup option for suggesters
----------------------------------------

             Key: SOLR-6845
             URL: https://issues.apache.org/jira/browse/SOLR-6845
         Project: Solr
      Issue Type: Bug
        Reporter: Hoss Man
        Assignee: Erick Erickson
         Fix For: Trunk, 5.1
     Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch, tests-failures.txt

SOLR-6679 was filed to track the investigation into the following problem...

{panel}
The stock solrconfig provides a bad experience with a large index... start up Solr and it will spin at 100% CPU for minutes, unresponsive, while it apparently builds a suggester index.
...
This is what I did:
1) indexed 10M very small docs (only takes a few minutes).
2) shut down Solr
3) start up Solr and watch it be unresponsive for over 4 minutes!
I didn't even use any of the fields specified in the suggester config and I never called the suggest request handler.
{panel}

...but that issue ultimately focused on removing/disabling the suggester from the sample configs. Opening this new issue to focus on actually trying to identify the root problem and fix it.
[jira] [Commented] (SOLR-7124) Add delconfig command to zkcli
[ https://issues.apache.org/jira/browse/SOLR-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341217#comment-14341217 ]

Vamsee Yarlagadda commented on SOLR-7124:
-----------------------------------------

bq. It would not be required on upconfig, because upconfig will not delete any files from an existing config, it will only add or overwrite.

You are right. I was not aware of this; I was under the impression that it would throw an error if the config already exists. However, as you pointed out, it will not delete any files as part of this operation. But there is a chance that we end up with a mix of both old and new configs: say the old config has a file foo.txt and the new config is missing it, then the updated config in ZK contains all the files from the new config plus foo.txt. I am not sure whether this is a separate bug we need to fix.

bq. A slightly different check (making sure the config actually exists) might need to be performed for linkconfig.

Makes sense.

Add delconfig command to zkcli
------------------------------

             Key: SOLR-7124
             URL: https://issues.apache.org/jira/browse/SOLR-7124
         Project: Solr
      Issue Type: New Feature
      Components: SolrCloud
Affects Versions: 5.0
        Reporter: Shawn Heisey
        Priority: Minor
         Fix For: 5.1

As far as I know, there is no functionality included with Solr that can delete a SolrCloud config in zookeeper. A delconfig command should be added to ZkCli and the zkcli script that can accomplish this. It should refuse to delete a config that is in use by any current collection.
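The stale-file scenario discussed in the comment above (old foo.txt surviving an upconfig) comes down to merge semantics: upload only adds or overwrites, never deletes. A tiny illustration of that semantics, using plain maps as a stand-in for the files under a ZK config node (this is not ZkCLI code, just a model of the behavior):

```java
import java.util.HashMap;
import java.util.Map;

// Models the upconfig merge behavior from the SOLR-7124 discussion:
// files are added or overwritten under the config node, never removed,
// so a file present only in the old config survives the upload.
// A delconfig-then-upconfig sequence would avoid the stale leftover.
public class UpconfigMerge {
    public static Map<String, String> upconfig(Map<String, String> inZk,
                                               Map<String, String> local) {
        Map<String, String> result = new HashMap<>(inZk);
        result.putAll(local); // add or overwrite; no deletes
        return result;
    }

    // Demonstrates the mixed old/new config: foo.txt lingers.
    public static boolean staleSurvives() {
        Map<String, String> oldConfig = new HashMap<>();
        oldConfig.put("schema.xml", "v1");
        oldConfig.put("foo.txt", "stale");
        Map<String, String> newConfig = new HashMap<>();
        newConfig.put("schema.xml", "v2");
        Map<String, String> merged = upconfig(oldConfig, newConfig);
        return merged.containsKey("foo.txt") && "v2".equals(merged.get("schema.xml"));
    }

    public static void main(String[] args) {
        System.out.println(staleSurvives()); // the stale file is still there
    }
}
```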
[jira] [Updated] (LUCENE-6310) FilterScorer should delegate asTwoPhaseIterator
[ https://issues.apache.org/jira/browse/LUCENE-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-6310:
--------------------------------
    Attachment: LUCENE-6310.patch

updated patch:
* cleanup FilterScorer to have a final next/advance/doc so there are no performance or correctness traps.
* fix FilteredQuery queryFirst impl: it needed matches(), but nothing tested this!
* put a query with approximation support inside FilteredQuery in SearchEquivalenceTestBase.
* cleanup reqexcl etc. to no longer extend FilterScorer.

FilterScorer should delegate asTwoPhaseIterator
-----------------------------------------------

             Key: LUCENE-6310
             URL: https://issues.apache.org/jira/browse/LUCENE-6310
         Project: Lucene - Core
      Issue Type: Task
        Reporter: Robert Muir
     Attachments: LUCENE-6310.patch, LUCENE-6310.patch

FilterScorer is like FilterInputStream for a scorer. But today it doesn't delegate approximations; I think it should. Most things using this api are just modifying the score and so on.
[jira] [Updated] (SOLR-7174) Can't index a directory of files using DIH with BinFileDataSource and TikaEntityProcessor
[ https://issues.apache.org/jira/browse/SOLR-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexandre Rafalovitch updated SOLR-7174:
----------------------------------------
    Attachment: SOLR-7174.patch

Proposed patch that allows TikaEntityProcessor to reset correctly on reuse.

Can't index a directory of files using DIH with BinFileDataSource and TikaEntityProcessor
-----------------------------------------------------------------------------------------

             Key: SOLR-7174
             URL: https://issues.apache.org/jira/browse/SOLR-7174
         Project: Solr
      Issue Type: Bug
      Components: contrib - DataImportHandler
Affects Versions: 5.0
     Environment: Windows 7. Ubuntu 14.04.
        Reporter: Gary Taylor
          Labels: dataimportHandler, tika, text-extraction
     Attachments: SOLR-7174.patch

Downloaded Solr 5.0.0, on a Windows 7 PC. I ran solr start and then solr create -c hn2 to create a new core. I want to index a load of epub files that I've got in a directory. So I created a data-import.xml (in solr\hn2\conf):

<dataConfig>
  <dataSource type="BinFileDataSource" name="bin" />
  <document>
    <entity name="files" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor"
            baseDir="c:/Users/gt/Documents/epub" fileName=".*epub"
            onError="skip" recursive="true">
      <field column="fileAbsolutePath" name="id" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastModified" />
      <entity name="documentImport" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text"
              dataSource="bin" onError="skip">
        <field column="file" name="fileName"/>
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>

In my solrconfig.xml, I added a requestHandler entry to reference my data-import.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import.xml</str>
  </lst>
</requestHandler>

I renamed managed-schema to schema.xml, and ensured the following doc fields were setup:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="fileName" type="string" indexed="true" stored="true" />
<field name="author" type="string" indexed="true" stored="true" />
<field name="title" type="string" indexed="true" stored="true" />
<field name="size" type="long" indexed="true" stored="true" />
<field name="lastModified" type="date" indexed="true" stored="true" />
<field name="content" type="text_en" indexed="false" stored="true" multiValued="false"/>
<field name="text" type="text_en" indexed="true" stored="false" multiValued="true"/>
<copyField source="content" dest="text"/>

I copied all the jars from dist and contrib\* into server\solr\lib. Stopping and restarting solr then creates a new managed-schema file and renames schema.xml to schema.xml.back.

All good so far. Now I go to the web admin for dataimport (http://localhost:8983/solr/#/hn2/dataimport//dataimport) and try and execute a full import. But the results show Requests: 0, Fetched: 58, Skipped: 0, Processed: 1. That is, it only adds one document (the very first one) even though it has iterated over 58! No errors are reported in the logs. I can repeat this on Ubuntu 14.04 using the same steps, so it's not Windows specific.

If I change the data-import.xml to use FileDataSource and PlainTextEntityProcessor and parse txt files, e.g.:

<dataConfig>
  <dataSource type="FileDataSource" name="bin" />
  <document>
    <entity name="files" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor"
            baseDir="c:/Users/gt/Documents/epub" fileName=".*txt">
      <field column="fileAbsolutePath" name="id" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastModified" />
      <entity name="documentImport" processor="PlainTextEntityProcessor"
              url="${files.fileAbsolutePath}" format="text" dataSource="bin">
        <field column="plainText" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>

This works. So it's a combo of BinFileDataSource and TikaEntityProcessor that is failing. On Windows, I ran Process Monitor, and spotted that only the very first epub file is actually being read (repeatedly).
With verbose and debug on when running the DIH, I get the following response: verbose-output: [ entity:files, [ null,
Re: [VOTE] Release 4.10.4 RC0
SUCCESS! [0:40:55.528119]

+1

On Fri, Feb 27, 2015 at 6:27 PM, Michael McCandless <luc...@mikemccandless.com> wrote:

  Artifacts: http://people.apache.org/~mikemccand/staging_area/lucene-solr-4.10.4-RC0-rev1662817

  Smoke tester: python3 -u dev-tools/scripts/smokeTestRelease.py http://people.apache.org/~mikemccand/staging_area/lucene-solr-4.10.4-RC0-rev1662817 1662817 4.10.4 /tmp/smoke4104 True

  SUCCESS! [0:39:34.527017]

  I also confirmed Elasticsearch 1.x tests pass after upgrading to this. Here's my +1

  Mike McCandless
  http://blog.mikemccandless.com
[jira] [Updated] (LUCENE-6310) FilterScorer should delegate asTwoPhaseIterator
[ https://issues.apache.org/jira/browse/LUCENE-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-6310:
--------------------------------
    Attachment: LUCENE-6310.patch

simple patch:
* FilteredQuery.LeapFrog uses ConjunctionDISI and works with approximations.
* FilteredQuery.QueryFirst returns the query as the approximation.
* CustomScoreQuery and a few other oddball scorers pass thru approximation support where they did not before. I think this is ok default behavior, given how FilterScorer is typically used.

FilterScorer should delegate asTwoPhaseIterator
-----------------------------------------------

             Key: LUCENE-6310
             URL: https://issues.apache.org/jira/browse/LUCENE-6310
         Project: Lucene - Core
      Issue Type: Task
        Reporter: Robert Muir
     Attachments: LUCENE-6310.patch

FilterScorer is like FilterInputStream for a scorer. But today it doesn't delegate approximations; I think it should. Most things using this api are just modifying the score and so on.
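The delegation being proposed can be sketched without Lucene on the classpath. The types below are minimal stand-ins (not Lucene's actual Scorer/TwoPhaseIterator classes): the point is only that a FilterScorer-style wrapper which merely adjusts the score should forward asTwoPhaseIterator() to the wrapped scorer rather than hide it.

```java
// Stand-in sketch of LUCENE-6310's idea: a filtering scorer wrapper
// that passes approximation support (asTwoPhaseIterator) through to
// the scorer it wraps, so wrappers that only modify the score do not
// disable two-phase iteration.
public class DelegatingFilterScorer {
    interface TwoPhaseIterator { }

    interface Scorer {
        TwoPhaseIterator asTwoPhaseIterator(); // may be null
        float score();
    }

    static class FilterScorer implements Scorer {
        protected final Scorer in;
        FilterScorer(Scorer in) { this.in = in; }
        @Override
        public TwoPhaseIterator asTwoPhaseIterator() {
            return in.asTwoPhaseIterator(); // the delegation in question
        }
        @Override
        public float score() { return in.score(); }
    }

    // Wraps a scorer that exposes an approximation, doubles its score,
    // and checks the approximation survives the wrapping.
    public static boolean approxPreserved() {
        final TwoPhaseIterator approx = new TwoPhaseIterator() { };
        Scorer inner = new Scorer() {
            @Override public TwoPhaseIterator asTwoPhaseIterator() { return approx; }
            @Override public float score() { return 1.0f; }
        };
        Scorer boosted = new FilterScorer(inner) {
            @Override public float score() { return in.score() * 2; } // only changes the score
        };
        return boosted.asTwoPhaseIterator() == approx;
    }

    public static void main(String[] args) {
        System.out.println(approxPreserved());
    }
}
```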
[jira] [Created] (SOLR-7177) ConcurrentUpdateSolrClient should log connection information on http failures
Vamsee Yarlagadda created SOLR-7177:
------------------------------------

         Summary: ConcurrentUpdateSolrClient should log connection information on http failures
             Key: SOLR-7177
             URL: https://issues.apache.org/jira/browse/SOLR-7177
         Project: Solr
      Issue Type: Improvement
Affects Versions: 4.10.3
        Reporter: Vamsee Yarlagadda
        Priority: Minor

I notice that when there is an http connection failure, we simply log the error but not the connection information. It would be good to log this info to make debugging easier. e.g.:

1.
{code}
2015-02-27 08:56:51,503 ERROR org.apache.solr.update.StreamingSolrServers: error
java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:196)
	at java.net.SocketInputStream.read(SocketInputStream.java:122)
	at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
	at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
	at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
	at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
	at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
	at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
	at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
	at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
	at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
	at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
	at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
	at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:235)
{code}
2.
{code}
2015-02-27 10:26:12,363 ERROR org.apache.solr.update.StreamingSolrServers: error
org.apache.http.NoHttpResponseException: The target server failed to respond
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95)
	at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
	at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
	at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
	at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
	at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
	at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
	at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
	at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
	at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
	at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
	at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:235)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
{code}

We can see the exception, but we have no information about which server was the endpoint.
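The approach later described for the patch (catch, log with context, rethrow unchanged) can be sketched in a few lines. This is a self-contained illustration with hypothetical names, not the actual SolrJ change: the key property is that the endpoint's base URL is recorded before the original exception propagates, so existing error handling is untouched.

```java
// Sketch of the "log connection info on failure" idea from SOLR-7177:
// wrap the request, log which server the request was sent to, and
// rethrow the same exception to preserve the original behavior.
public class LogEndpointOnFailure {
    // Stand-in for the logger; a real patch would use SLF4J/log4j.
    public static String lastLog = null;

    public static void sendUpdate(String baseUrl, Runnable request) {
        try {
            request.run();
        } catch (RuntimeException e) {
            // The endpoint is now part of the error record.
            lastLog = "update failed against " + baseUrl + ": " + e.getMessage();
            throw e; // same exception, same control flow as before
        }
    }

    // Simulates a connection reset and returns what was logged.
    public static String demo() {
        try {
            sendUpdate("http://host1:8983/solr/collection1",
                       () -> { throw new RuntimeException("Connection reset"); });
        } catch (RuntimeException expected) {
            // swallowed only for the demo; callers still see the failure
        }
        return lastLog;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```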
[jira] [Updated] (SOLR-7177) ConcurrentUpdateSolrClient should log connection information on http failures
[ https://issues.apache.org/jira/browse/SOLR-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vamsee Yarlagadda updated SOLR-7177:
------------------------------------
    Affects Version/s: 5.0

ConcurrentUpdateSolrClient should log connection information on http failures
-----------------------------------------------------------------------------

             Key: SOLR-7177
             URL: https://issues.apache.org/jira/browse/SOLR-7177
         Project: Solr
      Issue Type: Improvement
Affects Versions: 4.10.3, 5.0
        Reporter: Vamsee Yarlagadda
        Priority: Minor
     Attachments: SOLR-7177.patch

(issue description and stack traces identical to the issue-creation message above)
[jira] [Updated] (SOLR-7177) ConcurrentUpdateSolrClient should log connection information on http failures
[ https://issues.apache.org/jira/browse/SOLR-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vamsee Yarlagadda updated SOLR-7177:
------------------------------------
    Attachment: SOLR-7177.patch

Here is the first revision of the patch. I tried to preserve the original behavior by catching the exception, logging the required details, and then throwing the same exception again.

ConcurrentUpdateSolrClient should log connection information on http failures
-----------------------------------------------------------------------------

             Key: SOLR-7177
             URL: https://issues.apache.org/jira/browse/SOLR-7177
         Project: Solr
      Issue Type: Improvement
Affects Versions: 4.10.3
        Reporter: Vamsee Yarlagadda
        Priority: Minor
     Attachments: SOLR-7177.patch

(issue description and stack traces identical to the issue-creation message above)
[jira] [Commented] (SOLR-7174) Can't index a directory of files using DIH with BinFileDataSource and TikaEntityProcessor
[ https://issues.apache.org/jira/browse/SOLR-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341313#comment-14341313 ]

Alexandre Rafalovitch commented on SOLR-7174:
---------------------------------------------

It looks like the TikaEntityProcessor is not capable of re-entry. This is only triggered when it is an inner entity, so the title of the JIRA should probably be renamed. The cause is a flag *done*, which is set to false in firstInit(), set to true at the end of the first run, and never reset before a second (reused) run. One solution is to override the *init* method (not just *firstInit*) and reset the flag there.

Can't index a directory of files using DIH with BinFileDataSource and TikaEntityProcessor
-----------------------------------------------------------------------------------------

             Key: SOLR-7174
             URL: https://issues.apache.org/jira/browse/SOLR-7174
         Project: Solr
      Issue Type: Bug
      Components: contrib - DataImportHandler
Affects Versions: 5.0
     Environment: Windows 7. Ubuntu 14.04.
        Reporter: Gary Taylor
          Labels: dataimportHandler, tika, text-extraction

(issue description identical to the earlier SOLR-7174 message above)
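The un-reset "done" flag described in the comment above can be modeled in isolation. This is a minimal, self-contained sketch (not the real TikaEntityProcessor; names are illustrative) showing why a processor reused as an inner entity skips every document after the first, and how resetting the flag in init() restores re-entry, matching the "Fetched: 58, Processed: 1" symptom:

```java
// Minimal model of the SOLR-7174 re-entry bug: a flag set once in
// firstInit() is never cleared when the processor is reused for the
// next outer-entity row, so nextRow() returns null for every document
// after the first. Resetting the flag in init() is the proposed fix.
public class DoneFlagReset {
    boolean done;                 // set to false in "firstInit" (the constructor here)
    final boolean resetInInit;    // whether the fix is applied

    DoneFlagReset(boolean resetInInit) { this.resetInInit = resetInInit; }

    void init() {                 // called once per entity run (per document)
        if (resetInInit) {
            done = false;         // the fix: make the processor re-entrant
        }
    }

    String nextRow(String doc) {
        if (done) return null;    // buggy path: stays true after the first doc
        done = true;
        return doc;
    }

    // Runs one processor instance over `docs` documents and counts
    // how many actually produce a row.
    public static int processed(boolean withFix, int docs) {
        DoneFlagReset p = new DoneFlagReset(withFix);
        int count = 0;
        for (int i = 0; i < docs; i++) {
            p.init();
            if (p.nextRow("doc" + i) != null) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(processed(false, 58)); // 1: only the first file is indexed
        System.out.println(processed(true, 58));  // 58: all files are indexed
    }
}
```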
[jira] [Assigned] (SOLR-6845) Add buildOnStartup option for suggesters
[ https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson reassigned SOLR-6845:
------------------------------------
    Assignee: Erick Erickson  (was: Tomás Fernández Löbbe)

Add buildOnStartup option for suggesters
----------------------------------------

             Key: SOLR-6845
             URL: https://issues.apache.org/jira/browse/SOLR-6845
         Project: Solr
      Issue Type: Bug
        Reporter: Hoss Man
        Assignee: Erick Erickson
         Fix For: Trunk, 5.1
     Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch, tests-failures.txt

(issue description identical to the earlier SOLR-6845 message above)
[jira] [Assigned] (LUCENE-5833) Suggestor Version 2 doesn't support multiValued fields
[ https://issues.apache.org/jira/browse/LUCENE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned LUCENE-5833: -- Assignee: Erick Erickson (was: Steve Rowe) Suggestor Version 2 doesn't support multiValued fields -- Key: LUCENE-5833 URL: https://issues.apache.org/jira/browse/LUCENE-5833 Project: Lucene - Core Issue Type: Bug Components: modules/other Affects Versions: 4.8.1 Reporter: Greg Harris Assignee: Erick Erickson Fix For: 4.10.4, 5.0, Trunk Attachments: LUCENE-5833.patch, LUCENE-5833.patch, LUCENE-5833.patch, LUCENE-5833_branch4_10.patch, SOLR-6210.patch So if you use a multiValued field in the new suggestor it will not pick up terms for any term after the first one. So it treats the first term as the only term it will make its dictionary from. This is the suggestor I'm talking about: https://issues.apache.org/jira/browse/SOLR-5378
[jira] [Commented] (LUCENE-5833) Suggestor Version 2 doesn't support multiValued fields
[ https://issues.apache.org/jira/browse/LUCENE-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341240#comment-14341240 ] Erick Erickson commented on LUCENE-5833: Jan: As it happens, I'm doing some other suggester stuff for 4.10.5 (didn't want to try to shoe-horn it in to 4.10.4 at the last second). I'll make it happen. Suggestor Version 2 doesn't support multiValued fields -- Key: LUCENE-5833 URL: https://issues.apache.org/jira/browse/LUCENE-5833 Project: Lucene - Core Issue Type: Bug Components: modules/other Affects Versions: 4.8.1 Reporter: Greg Harris Assignee: Steve Rowe Fix For: 4.10.4, 5.0, Trunk Attachments: LUCENE-5833.patch, LUCENE-5833.patch, LUCENE-5833.patch, LUCENE-5833_branch4_10.patch, SOLR-6210.patch So if you use a multiValued field in the new suggestor it will not pick up terms for any term after the first one. So it treats the first term as the only term it will make its dictionary from. This is the suggestor I'm talking about: https://issues.apache.org/jira/browse/SOLR-5378
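The bug being backported amounts to building the dictionary from only the first stored value of a multiValued field. A toy illustration of the buggy versus fixed behavior (hypothetical names, no Lucene APIs involved):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the LUCENE-5833 symptom: a multiValued field stores several
// values, but the dictionary is built from the first one only.
public class MultiValuedDictSketch {

    // buggy variant: only values.get(0) ever reaches the dictionary
    public static Set<String> fromFirstValueOnly(List<String> values) {
        Set<String> dict = new HashSet<>();
        if (!values.isEmpty()) {
            dict.add(values.get(0));   // bug: later values are silently dropped
        }
        return dict;
    }

    // fixed variant: every stored value feeds the dictionary
    public static Set<String> fromAllValues(List<String> values) {
        return new HashSet<>(values);
    }
}
```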
[jira] [Updated] (SOLR-7121) Solr nodes should go down based on configurable thresholds and not rely on resource exhaustion
[ https://issues.apache.org/jira/browse/SOLR-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sachin Goyal updated SOLR-7121: --- Attachment: SOLR-7121.patch The latest patch includes a test-case for a core going down when its configured number of threads is exceeded. The core is automatically brought up by the Health-Poller when the number of threads comes below that threshold. I will try to include a test for long-running-queries as well in the next few days but that should be independent of this patch's code-review. [~otis], great suggestion. I will surely add these metrics to JMX but can we handle that in a follow-up ticket to this one? Let me know. Solr nodes should go down based on configurable thresholds and not rely on resource exhaustion -- Key: SOLR-7121 URL: https://issues.apache.org/jira/browse/SOLR-7121 Project: Solr Issue Type: New Feature Reporter: Sachin Goyal Attachments: SOLR-7121.patch, SOLR-7121.patch, SOLR-7121.patch, SOLR-7121.patch Currently, there is no way to control when a Solr node goes down. If the server is having high GC pauses or too many threads or is just getting too many queries due to some bad load-balancer, the cores in the machine keep on serving unless they exhaust the machine's resources and everything comes to a stall. Such a slow-dying core can affect other cores as well by taking a huge amount of time to serve their distributed queries. There should be a way to specify some threshold values beyond which the targeted core can detect its ill-health and proactively go down to recover. When the load improves, the core should come up automatically.
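The proposal boils down to a poller that compares live metrics against configured thresholds and flips the core down or back up. A minimal sketch under that reading, with all names hypothetical (this is not the attached patch's code):

```java
// Hypothetical sketch of a threshold-based health poller: if a metric
// (here, thread count) exceeds its configured limit the core goes down
// proactively; when the metric recovers the core comes back up.
public class HealthPollerSketch {
    private final int maxThreads;
    private boolean up = true;

    public HealthPollerSketch(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    // called periodically by the (hypothetical) health poller
    public void poll(int currentThreads) {
        if (currentThreads > maxThreads) {
            up = false;   // go down instead of exhausting machine resources
        } else {
            up = true;    // load improved: come back automatically
        }
    }

    public boolean isUp() {
        return up;
    }
}
```

The same shape extends to other thresholds mentioned in the ticket (GC pause time, long-running queries), each feeding the same up/down decision.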
[jira] [Created] (LUCENE-6309) disable coord when scores are not needed, or the scoring fn does not use it
Robert Muir created LUCENE-6309: --- Summary: disable coord when scores are not needed, or the scoring fn does not use it Key: LUCENE-6309 URL: https://issues.apache.org/jira/browse/LUCENE-6309 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir In BooleanWeight when disableCoord is set, things are nicer. coord makes things complex (e.g. sometimes requires some crazy scorers in BooleanTopLevelScorers) and can block optimizations. We should also implicitly disableCoord when scores are not needed, or when the Similarity does not use coord().
[jira] [Created] (LUCENE-6310) FilterScorer should delegate asTwoPhaseIterator
Robert Muir created LUCENE-6310: --- Summary: FilterScorer should delegate asTwoPhaseIterator Key: LUCENE-6310 URL: https://issues.apache.org/jira/browse/LUCENE-6310 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir FilterScorer is like FilterInputStream for a scorer. But today it doesn't delegate approximations, I think it should. Most things using this api are just modifying the score and so on.
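The FilterInputStream analogy can be shown with plain Java stand-ins (these are not the Lucene Scorer APIs): a filter class that overrides only score() silently inherits the base class's "no approximation" default, which is the bug; forwarding the call is the proposed fix.

```java
// Stand-in types illustrating the LUCENE-6310 delegation issue.
public class FilterScorerSketch {

    public static abstract class ScorerStandIn {
        // base-class default: no two-phase approximation available
        public Object asTwoPhaseIterator() {
            return null;
        }
        public abstract float score();
    }

    public static class FilterScorerStandIn extends ScorerStandIn {
        protected final ScorerStandIn in;

        public FilterScorerStandIn(ScorerStandIn in) {
            this.in = in;
        }

        @Override
        public float score() {
            return in.score();   // score is delegated today
        }

        // the proposed fix: forward the approximation instead of inheriting
        // the null default and silently losing the wrapped scorer's one
        @Override
        public Object asTwoPhaseIterator() {
            return in.asTwoPhaseIterator();
        }
    }
}
```

Without the last override, wrapping a scorer that does expose an approximation would make it disappear, even if the filter only tweaks the score.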
[jira] [Commented] (SOLR-7177) ConcurrentUpdateSolrClient should log connection information on http failures
[ https://issues.apache.org/jira/browse/SOLR-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14341311#comment-14341311 ] Vamsee Yarlagadda commented on SOLR-7177: - It might be costly to always log this connection info before every request. https://github.com/apache/lucene-solr/blob/trunk/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.java#L234 I will try to make sure we only log if there is an exception from the above line. ConcurrentUpdateSolrClient should log connection information on http failures -- Key: SOLR-7177 URL: https://issues.apache.org/jira/browse/SOLR-7177 Project: Solr Issue Type: Improvement Affects Versions: 4.10.3 Reporter: Vamsee Yarlagadda Priority: Minor I notice when there is an http connection failure, we simply log the error but not the connection information. It would be good to log this info to make debugging easier. e.g: 1. {code} 2015-02-27 08:56:51,503 ERROR org.apache.solr.update.StreamingSolrServers: error java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:196) at java.net.SocketInputStream.read(SocketInputStream.java:122) at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166) at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90) at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289) at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252) at 
org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:235) {code} 2. {code} 2015-02-27 10:26:12,363 ERROR org.apache.solr.update.StreamingSolrServers: error org.apache.http.NoHttpResponseException: The target server failed to respond at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95) at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62) at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254) at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289) at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252) at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191) at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300) at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127) at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715) at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:235) at
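The change Vamsee describes - logging connection details only when a request fails, rather than before every request - can be sketched like this (hypothetical names, not the ConcurrentUpdateSolrClient code):

```java
// Hypothetical sketch: capture the target URL up front, but only pay the
// logging cost when the request actually throws.
public class RequestLoggingSketch {

    public static String describeFailure(String baseUrl, Exception e) {
        return "request to " + baseUrl + " failed: "
                + e.getClass().getSimpleName() + ": " + e.getMessage();
    }

    // returns null on success, or the failure description (to be logged)
    public static String execute(String baseUrl, Runnable request) {
        try {
            request.run();
            return null;                        // success: nothing extra logged
        } catch (RuntimeException e) {
            return describeFailure(baseUrl, e); // failure: include the endpoint
        }
    }
}
```

This keeps the hot path free of logging overhead while making stack traces like the "Connection reset" one above attributable to a concrete server.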
[jira] [Updated] (LUCENE-6309) disable coord when scores are not needed, or the scoring fn does not use it
[ https://issues.apache.org/jira/browse/LUCENE-6309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-6309: Attachment: LUCENE-6309.patch Here is a patch. We already precompute coord(0..N, N), I just moved this to the ctor and added checks. disable coord when scores are not needed, or the scoring fn does not use it --- Key: LUCENE-6309 URL: https://issues.apache.org/jira/browse/LUCENE-6309 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Attachments: LUCENE-6309.patch In BooleanWeight when disableCoord is set, things are nicer. coord makes things complex (e.g. sometimes requires some crazy scorers in BooleanTopLevelScorers) and can block optimizations. We should also implicitly disableCoord when scores are not needed, or when the Similarity does not use coord().
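The precomputation the patch mentions - filling a coord(0..N, N) table in the constructor and checking whether the Similarity's coord is effectively a no-op - can be sketched as follows. CoordFn is a hypothetical stand-in for Similarity.coord; this is not the attached patch:

```java
// Sketch: cache coord(i, n) for i = 0..n once in the constructor, and
// detect the case where every factor is 1 (coord can then be disabled).
public class CoordCacheSketch {

    public interface CoordFn {
        float coord(int overlap, int maxOverlap);
    }

    public final float[] coords;
    public final boolean coordIsNoOp;

    public CoordCacheSketch(int maxCoord, CoordFn sim) {
        coords = new float[maxCoord + 1];
        boolean allOnes = true;
        for (int i = 0; i <= maxCoord; i++) {
            coords[i] = sim.coord(i, maxCoord);
            if (coords[i] != 1f) {
                allOnes = false;
            }
        }
        // if the Similarity never scales scores, coord adds nothing and the
        // simpler non-coord scorers can be used
        coordIsNoOp = allOnes;
    }
}
```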
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2690 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2690/ 6 tests failed. REGRESSION: org.apache.solr.cloud.MultiThreadedOCPTest.test Error Message: We have a failed SPLITSHARD task Stack Trace: java.lang.AssertionError: We have a failed SPLITSHARD task at __randomizedtesting.SeedInfo.seed([3E4EFF5DB2807E0A:B61AC0871C7C13F2]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.cloud.MultiThreadedOCPTest.testTaskExclusivity(MultiThreadedOCPTest.java:144) at org.apache.solr.cloud.MultiThreadedOCPTest.test(MultiThreadedOCPTest.java:71) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:945) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:920) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at
[jira] [Created] (LUCENE-6307) Rename SegmentInfo.docCount -> .maxDoc
Michael McCandless created LUCENE-6307: -- Summary: Rename SegmentInfo.docCount -> .maxDoc Key: LUCENE-6307 URL: https://issues.apache.org/jira/browse/LUCENE-6307 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: Trunk, 5.x We already have maxDoc and numDocs; I think it's crazy we have a 3rd one, docCount. We should just rename to maxDoc.
[jira] [Commented] (SOLR-7151) SolrClient.query() methods should throw IOException
[ https://issues.apache.org/jira/browse/SOLR-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340075#comment-14340075 ] ASF subversion and git services commented on SOLR-7151: --- Commit 1662671 from [~romseygeek] in branch 'dev/trunk' [ https://svn.apache.org/r1662671 ] SOLR-7151: CHANGES.txt attribution SolrClient.query() methods should throw IOException --- Key: SOLR-7151 URL: https://issues.apache.org/jira/browse/SOLR-7151 Project: Solr Issue Type: Bug Components: SolrJ Reporter: Alan Woodward Assignee: Alan Woodward Priority: Minor Fix For: Trunk, 5.1 Attachments: SOLR-7151.patch All the methods on SolrClient are declared as throwing SolrServerException (thrown if there's an error somewhere on the server), and IOException (thrown if there's a communication error), except for the QueryRequest methods. These swallow up IOException and repackage them in a SolrServerException. I think these are useful distinctions to make (you might want to retry on an IOException, but not on a SolrServerException), and we should make the query methods fall in line with the others. I'm not sure if this should go into 5.x as well as trunk, as it's a backwards-breaking change. I'm leaning towards yes, as it's a sufficiently useful API change that it's worth the break, but I'm not going to insist on it.
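The reason the IOException/SolrServerException distinction matters to callers is retry logic: transport errors are often transient, server-side errors are not. A hedged sketch of that caller-side pattern (hypothetical helper, not SolrJ code):

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch: retry only on IOException (communication failure); any other
// exception - e.g. a server-side error - propagates immediately.
public class QueryRetrySketch {

    public static <T> T runWithRetry(Callable<T> query, int attempts) throws Exception {
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return query.call();   // success
            } catch (IOException e) {
                last = e;              // transport failure: try again
            }
            // non-IOException exceptions are not caught here and so
            // escape on the first occurrence - no pointless retries
        }
        throw last;                    // retries exhausted
    }
}
```

If query() instead wrapped every IOException in SolrServerException, as it does today, this kind of caller could no longer tell the two cases apart without unwrapping causes.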
[jira] [Updated] (SOLR-7175) &lt;optimize maxSegments="2"/&gt; results in more than 2 segments after optimize finishes
[ https://issues.apache.org/jira/browse/SOLR-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Burton-West updated SOLR-7175: -- Attachment: solr4.shotz solrconfig.xml file &lt;optimize maxSegments="2"/&gt; results in more than 2 segments after optimize finishes --- Key: SOLR-7175 URL: https://issues.apache.org/jira/browse/SOLR-7175 Project: Solr Issue Type: Bug Affects Versions: 4.10.2 Environment: linux Reporter: Tom Burton-West Priority: Minor Attachments: build-1.indexwriterlog.2015-02-23.gz, solr4.shotz After finishing indexing and running a commit, we issue an &lt;optimize maxSegments="2"/&gt; to Solr. With Solr 4.10.2 we are seeing one or two shards (out of 12) with 3 or 4 segments after the optimize finishes. There are no errors in the Solr logs or indexwriter logs.
[JENKINS] Lucene-Solr-5.x-Linux (32bit/jdk1.8.0_31) - Build # 11729 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11729/ Java: 32bit/jdk1.8.0_31 -server -XX:+UseG1GC 1 tests failed. FAILED: org.apache.solr.cloud.RecoveryAfterSoftCommitTest.test Error Message: Didn't see all replicas for shard shard1 in collection1 come up within 3 ms! ClusterState: { collection1:{ replicationFactor:1, shards:{shard1:{ range:8000-7fff, state:active, replicas:{ core_node1:{ core:collection1, base_url:http://127.0.0.1:58731/hdw/cd;, node_name:127.0.0.1:58731_hdw%2Fcd, state:active, leader:true}, core_node2:{ core:collection1, base_url:http://127.0.0.1:33761/hdw/cd;, node_name:127.0.0.1:33761_hdw%2Fcd, state:recovering, router:{name:compositeId}, maxShardsPerNode:1, autoAddReplicas:false, autoCreated:true}, control_collection:{ replicationFactor:1, shards:{shard1:{ range:8000-7fff, state:active, replicas:{core_node1:{ core:collection1, base_url:http://127.0.0.1:39580/hdw/cd;, node_name:127.0.0.1:39580_hdw%2Fcd, state:active, leader:true, router:{name:compositeId}, maxShardsPerNode:1, autoAddReplicas:false, autoCreated:true}} Stack Trace: java.lang.AssertionError: Didn't see all replicas for shard shard1 in collection1 come up within 3 ms! 
ClusterState: { collection1:{ replicationFactor:1, shards:{shard1:{ range:8000-7fff, state:active, replicas:{ core_node1:{ core:collection1, base_url:http://127.0.0.1:58731/hdw/cd;, node_name:127.0.0.1:58731_hdw%2Fcd, state:active, leader:true}, core_node2:{ core:collection1, base_url:http://127.0.0.1:33761/hdw/cd;, node_name:127.0.0.1:33761_hdw%2Fcd, state:recovering, router:{name:compositeId}, maxShardsPerNode:1, autoAddReplicas:false, autoCreated:true}, control_collection:{ replicationFactor:1, shards:{shard1:{ range:8000-7fff, state:active, replicas:{core_node1:{ core:collection1, base_url:http://127.0.0.1:39580/hdw/cd;, node_name:127.0.0.1:39580_hdw%2Fcd, state:active, leader:true, router:{name:compositeId}, maxShardsPerNode:1, autoAddReplicas:false, autoCreated:true}} at __randomizedtesting.SeedInfo.seed([9FC25D7E38A3E2AC:179662A4965F8F54]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.cloud.AbstractFullDistribZkTestBase.ensureAllReplicasAreActive(AbstractFullDistribZkTestBase.java:1951) at org.apache.solr.cloud.RecoveryAfterSoftCommitTest.test(RecoveryAfterSoftCommitTest.java:103) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:945) at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:920) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
Commits for 4.10.4
Hi Mike, I think your commits for 4.10.4 broke ant precommit. https://svn.apache.org/viewvc?view=revision&revision=1662742 https://svn.apache.org/viewvc?view=revision&revision=1662746 https://svn.apache.org/viewvc?view=revision&revision=1662750 lucene_solr_4_10/solr/build.xml:240: Some example solrconfig.xml files do not refer to the correct luceneMatchVersion: 4.10.4 -- Anshum Gupta http://about.me/anshumgupta
[jira] [Commented] (SOLR-6845) Add buildOnStartup option for suggesters
[ https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340366#comment-14340366 ] Tomás Fernández Löbbe commented on SOLR-6845: - No issues that I can think of, let's backport it. Add buildOnStartup option for suggesters Key: SOLR-6845 URL: https://issues.apache.org/jira/browse/SOLR-6845 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Tomás Fernández Löbbe Fix For: Trunk, 5.1 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch, tests-failures.txt SOLR-6679 was filed to track the investigation into the following problem... {panel} The stock solrconfig provides a bad experience with a large index... start up Solr and it will spin at 100% CPU for minutes, unresponsive, while it apparently builds a suggester index. ... This is what I did: 1) indexed 10M very small docs (only takes a few minutes). 2) shut down Solr 3) start up Solr and watch it be unresponsive for over 4 minutes! I didn't even use any of the fields specified in the suggester config and I never called the suggest request handler. {panel} ...but ultimately focused on removing/disabling the suggester from the sample configs. Opening this new issue to focus on actually trying to identify the root problem and fix it.
[jira] [Updated] (SOLR-7175) &lt;optimize maxSegments="2"/&gt; results in more than 2 segments after optimize finishes
[ https://issues.apache.org/jira/browse/SOLR-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Burton-West updated SOLR-7175: -- Attachment: build-1.indexwriterlog.2015-02-23.gz Attached is an indexwriter log where, after a large merge down to 2 segments, startFullFlush was called and found additional docs in ram which were then written to 2 new segments. These new segments were not merged, so the end result of calling &lt;optimize maxSegments="2"/&gt; was a shard with 4 segments. Attached also is our solrconfig.xml file in case the problem is caused by some configuration error that overrides the maxSegments=2. &lt;optimize maxSegments="2"/&gt; results in more than 2 segments after optimize finishes --- Key: SOLR-7175 URL: https://issues.apache.org/jira/browse/SOLR-7175 Project: Solr Issue Type: Bug Affects Versions: 4.10.2 Environment: linux Reporter: Tom Burton-West Priority: Minor Attachments: build-1.indexwriterlog.2015-02-23.gz After finishing indexing and running a commit, we issue an &lt;optimize maxSegments="2"/&gt; to Solr. With Solr 4.10.2 we are seeing one or two shards (out of 12) with 3 or 4 segments after the optimize finishes. There are no errors in the Solr logs or indexwriter logs.
[jira] [Resolved] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-6294. -- Resolution: Fixed Fix Version/s: 5.1 Trunk Thanks David, Mike and Ryan for the reviews! Generalize how IndexSearcher parallelizes collection execution -- Key: LUCENE-6294 URL: https://issues.apache.org/jira/browse/LUCENE-6294 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Fix For: Trunk, 5.1 Attachments: LUCENE-6294.patch IndexSearcher takes an ExecutorService that can be used to parallelize collection execution. This is useful if you want to trade throughput for latency. However, this executor service will only be used if you search for top docs. In that case, we will create one collector per slice and call TopDocs.merge in the end. If you use search(Query, Collector), the executor service will never be used. But there are other collectors that could work the same way as top docs collectors, eg. TotalHitCountCollector. And maybe also some of our users' collectors. So maybe IndexSearcher could expose a generic way to take advantage of the executor service?
[jira] [Resolved] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-5299. -- Resolution: Fixed Fix Version/s: 5.1 Trunk Refactor Collector API for parallelism -- Key: LUCENE-5299 URL: https://issues.apache.org/jira/browse/LUCENE-5299 Project: Lucene - Core Issue Type: Improvement Reporter: Shikhar Bhushan Fix For: Trunk, 5.1 Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt h2. Motivation We should be able to scale-up better with Solr/Lucene by utilizing multiple CPU cores, and not have to resort to scaling-out by sharding (with all the associated distributed system pitfalls) when the index size does not warrant it. Presently, IndexSearcher has an optional constructor arg for an ExecutorService, which gets used for searching in parallel for call paths where one of the TopDocCollectors is created internally. The per-atomic-reader search happens in parallel and then the TopDocs/TopFieldDocs results are merged with locking around the merge bit. However there are some problems with this approach: * If arbitrary Collector args come into play, we can't parallelize. Note that even if ultimately results are going to a TopDocCollector it may be wrapped inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both. * The special-casing with parallelism baked on top does not scale; there are many Collectors that could potentially lend themselves to parallelism, and special-casing means the parallelization has to be re-implemented if a different permutation of collectors is to be used. h2. Proposal A refactoring of collectors that allows for parallelization at the level of the collection protocol.
Some requirements that should guide the implementation: * easy migration path for collectors that need to remain serial * the parallelization should be composable (when collectors wrap other collectors) * allow collectors to pick the optimal solution (e.g. there might be memory tradeoffs to be made) by advising the collector about whether a search will be parallelized, so that the serial use-case is not penalized. * encourage use of non-blocking constructs and lock-free parallelism, blocking is not advisable for the hot-spot of a search, besides wasting pooled threads.
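The requirements above - per-slice collectors with no shared state, composable parallelism, and a final merge - suggest a manager pattern: create one collector per slice, collect in parallel, then reduce. A sketch under those assumptions (illustrative names and types, not the committed Lucene API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;

// Sketch of the generalization both issues converge on: a manager that
// hands out one collector per index slice and reduces the partial
// results once every slice has finished.
public class CollectorManagerSketch {

    public interface Manager<C, R> {
        C newCollector();            // one fresh collector per slice: no shared mutable state
        R reduce(List<C> partials);  // merged once, after all slices complete
    }

    public static <C, R> R search(List<int[]> slices, Manager<C, R> mgr,
                                  BiConsumer<C, int[]> collectSlice) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, slices.size()));
        try {
            List<Future<C>> futures = new ArrayList<>();
            for (int[] slice : slices) {
                futures.add(pool.submit(() -> {
                    C collector = mgr.newCollector();
                    collectSlice.accept(collector, slice);  // lock-free: private collector
                    return collector;
                }));
            }
            List<C> partials = new ArrayList<>();
            for (Future<C> f : futures) {
                partials.add(f.get());
            }
            return mgr.reduce(partials);
        } finally {
            pool.shutdown();
        }
    }
}
```

Serial collectors migrate by running the same manager over a single slice, which satisfies the "easy migration path" requirement without penalizing the serial case.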
[jira] [Comment Edited] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340345#comment-14340345 ] Ryan Ernst edited comment on LUCENE-6294 at 2/27/15 4:23 PM: - +1 In the javadocs for {{IndexSearcher.search}} I think you mean "In contrast to" instead of "On the contrary to"? was (Author: rjernst): +1 In the javadocs for {{IndexSearch.search}} I think you mean "In contrast to" instead of "On the contrary to"? Generalize how IndexSearcher parallelizes collection execution -- Key: LUCENE-6294 URL: https://issues.apache.org/jira/browse/LUCENE-6294 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Attachments: LUCENE-6294.patch IndexSearcher takes an ExecutorService that can be used to parallelize collection execution. This is useful if you want to trade throughput for latency. However, this executor service will only be used if you search for top docs. In that case, we will create one collector per slice and call TopDocs.merge in the end. If you use search(Query, Collector), the executor service will never be used. But there are other collectors that could work the same way as top docs collectors, eg. TotalHitCountCollector. And maybe also some of our users' collectors. So maybe IndexSearcher could expose a generic way to take advantage of the executor service?
[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340345#comment-14340345 ] Ryan Ernst commented on LUCENE-6294: +1 In the javadocs for {{IndexSearch.search}} I think you mean "In contrast to" instead of "On the contrary to"? Generalize how IndexSearcher parallelizes collection execution -- Key: LUCENE-6294 URL: https://issues.apache.org/jira/browse/LUCENE-6294 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Attachments: LUCENE-6294.patch IndexSearcher takes an ExecutorService that can be used to parallelize collection execution. This is useful if you want to trade throughput for latency. However, this executor service will only be used if you search for top docs. In that case, we will create one collector per slice and call TopDocs.merge in the end. If you use search(Query, Collector), the executor service will never be used. But there are other collectors that could work the same way as top docs collectors, eg. TotalHitCountCollector. And maybe also some of our users' collectors. So maybe IndexSearcher could expose a generic way to take advantage of the executor service?
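The generic mechanism the issue asks for can be sketched with plain Java (hypothetical names, not the committed Lucene API): a small "manager" creates one fresh collector per slice and reduces the per-slice results at the end, mirroring what the top-docs path already does with TopDocs.merge.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ObjIntConsumer;

// Illustrative sketch (made-up names, no Lucene types): one collector
// instance per slice, run on a pool, with a final reduce step.
public class PerSliceCollection {
    interface Manager<C, R> {
        C newCollector();              // fresh, unshared state per slice
        R reduce(List<C> perSlice);    // merge step, like TopDocs.merge
    }

    static <C, R> R collectParallel(List<int[]> slices, Manager<C, R> mgr,
                                    ObjIntConsumer<C> collect) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, slices.size()));
        try {
            List<Future<C>> futures = new ArrayList<>();
            for (int[] slice : slices) {
                futures.add(pool.submit(() -> {
                    C c = mgr.newCollector();          // no shared mutable state
                    for (int doc : slice) collect.accept(c, doc);
                    return c;
                }));
            }
            List<C> done = new ArrayList<>();
            for (Future<C> f : futures) done.add(f.get());
            return mgr.reduce(done);
        } finally {
            pool.shutdown();
        }
    }

    // A TotalHitCountCollector-like use: each slice counts into its own cell.
    static int countDocs(List<int[]> slices) throws Exception {
        return collectParallel(slices, new Manager<int[], Integer>() {
            public int[] newCollector() { return new int[1]; }
            public Integer reduce(List<int[]> cells) {
                int total = 0;
                for (int[] c : cells) total += c[0];
                return total;
            }
        }, (cell, doc) -> cell[0]++);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countDocs(List.of(new int[]{1, 2, 3}, new int[]{4, 5})));
    }
}
```

Because each slice gets its own collector, no synchronization is needed during collection; only the reduce step sees all results, which is exactly why a TotalHitCountCollector could be parallelized the same way as top-docs collectors.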
[jira] [Created] (SOLR-7175) &lt;optimize maxSegments="2"/&gt; results in more than 2 segments after optimize finishes
Tom Burton-West created SOLR-7175: - Summary: &lt;optimize maxSegments="2"/&gt; results in more than 2 segments after optimize finishes Key: SOLR-7175 URL: https://issues.apache.org/jira/browse/SOLR-7175 Project: Solr Issue Type: Bug Affects Versions: 4.10.2 Environment: linux Reporter: Tom Burton-West Priority: Minor After finishing indexing and running a commit, we issue an &lt;optimize maxSegments="2"/&gt; to Solr. With Solr 4.10.2 we are seeing one or two shards (out of 12) with 3 or 4 segments after the optimize finishes. There are no errors in the Solr logs or indexwriter logs.
[jira] [Updated] (SOLR-7175) &lt;optimize maxSegments="2"/&gt; results in more than 2 segments after optimize finishes
[ https://issues.apache.org/jira/browse/SOLR-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Burton-West updated SOLR-7175: -- Attachment: build-4.iw.2015-02-25.txt.gz Previous file did not have an explicit commit. This file: build-4.iw.2015-02-25.txt includes a restart of Solr, a commit, and then the &lt;optimize maxSegments="2"/&gt;. Same scenario where after the major merge down to 2 segments a flush finds docs in ram and additional segments are written to disk. &lt;optimize maxSegments="2"/&gt; results in more than 2 segments after optimize finishes --- Key: SOLR-7175 URL: https://issues.apache.org/jira/browse/SOLR-7175 Project: Solr Issue Type: Bug Affects Versions: 4.10.2 Environment: linux Reporter: Tom Burton-West Priority: Minor Attachments: build-1.indexwriterlog.2015-02-23.gz, build-4.iw.2015-02-25.txt.gz, solr4.shotz After finishing indexing and running a commit, we issue an &lt;optimize maxSegments="2"/&gt; to Solr. With Solr 4.10.2 we are seeing one or two shards (out of 12) with 3 or 4 segments after the optimize finishes. There are no errors in the Solr logs or indexwriter logs.
[jira] [Commented] (LUCENE-6303) CachingWrapperFilter -> CachingWrapperQuery
[ https://issues.apache.org/jira/browse/LUCENE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340401#comment-14340401 ] Robert Muir commented on LUCENE-6303: - +1, this is awesome. CachingWrapperFilter -> CachingWrapperQuery --- Key: LUCENE-6303 URL: https://issues.apache.org/jira/browse/LUCENE-6303 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-6303.patch As part of the filter -> query migration, we should migrate the caching wrappers (including the filter cache). I think the behaviour should be to delegate to the wrapped query when scores are needed and cache otherwise like CachingWrapperFilter does today. Also the cache should ignore query boosts so that field:value^2 and field:value^3 are considered equal if scores are not needed.
Welcome Anshum Gupta to the PMC
I'm pleased to announce that Anshum Gupta has accepted the PMC’s invitation to join. Welcome Anshum! Steve
Re: Welcome Anshum Gupta to the PMC
Congratulations, Anshum! Alan Woodward www.flax.co.uk On 27 Feb 2015, at 17:41, Steve Rowe wrote: I'm pleased to announce that Anshum Gupta has accepted the PMC’s invitation to join. Welcome Anshum! Steve
[jira] [Commented] (LUCENE-6300) Remove multi-term filters
[ https://issues.apache.org/jira/browse/LUCENE-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340349#comment-14340349 ] ASF subversion and git services commented on LUCENE-6300: - Commit 1662740 from [~jpountz] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1662740 ] LUCENE-6300: Remove multi-term filters. Remove multi-term filters - Key: LUCENE-6300 URL: https://issues.apache.org/jira/browse/LUCENE-6300 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-6300.patch, LUCENE-6300.patch We have TermRangeFilter, NumericRangeFilter, ... that we should remove in favour of their equivalent queries (TermRangeQuery, NumericRangeQuery, ...). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-5.x - Build # 771 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-5.x/771/ 7 tests failed. FAILED: org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.test Error Message: IOException occured when talking to server at: http://127.0.0.1:60184/lvl/jl/collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:60184/lvl/jl/collection1 at __randomizedtesting.SeedInfo.seed([1F64C8A296BD6261:9730F77838410F99]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:572) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:214) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:210) at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:131) at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:309) at org.apache.solr.cloud.CloudInspectUtil.compareResults(CloudInspectUtil.java:224) at org.apache.solr.cloud.CloudInspectUtil.compareResults(CloudInspectUtil.java:166) at org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.testIndexingBatchPerRequestWithHttpSolrClient(FullSolrCloudDistribCmdsTest.java:671) at org.apache.solr.cloud.FullSolrCloudDistribCmdsTest.test(FullSolrCloudDistribCmdsTest.java:151) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:945) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:920) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
[jira] [Resolved] (LUCENE-6300) Remove multi-term filters
[ https://issues.apache.org/jira/browse/LUCENE-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-6300. -- Resolution: Fixed Fix Version/s: 5.1 Trunk Remove multi-term filters - Key: LUCENE-6300 URL: https://issues.apache.org/jira/browse/LUCENE-6300 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: Trunk, 5.1 Attachments: LUCENE-6300.patch, LUCENE-6300.patch We have TermRangeFilter, NumericRangeFilter, ... that we should remove in favour of their equivalent queries (TermRangeQuery, NumericRangeQuery, ...). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340439#comment-14340439 ] ASF subversion and git services commented on LUCENE-6294: - Commit 1662761 from [~jpountz] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1662761 ] LUCENE-6294: Generalize how IndexSearcher parallelizes collection execution. Generalize how IndexSearcher parallelizes collection execution -- Key: LUCENE-6294 URL: https://issues.apache.org/jira/browse/LUCENE-6294 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Fix For: Trunk, 5.1 Attachments: LUCENE-6294.patch IndexSearcher takes an ExecutorService that can be used to parallelize collection execution. This is useful if you want to trade throughput for latency. However, this executor service will only be used if you search for top docs. In that case, we will create one collector per slide and call TopDocs.merge in the end. If you use search(Query, Collector), the executor service will never be used. But there are other collectors that could work the same way as top docs collectors, eg. TotalHitCountCollector. And maybe also some of our users' collectors. So maybe IndexSearcher could expose a generic way to take advantage of the executor service? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2687 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2687/ 6 tests failed. REGRESSION: org.apache.solr.handler.component.DistributedMLTComponentTest.test Error Message: Timeout occured while waiting response from server at: http://127.0.0.1:45014//collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:45014//collection1 at __randomizedtesting.SeedInfo.seed([F62C797CA756A128:7E7846A609AACCD0]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:568) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:214) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:210) at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:131) at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:309) at org.apache.solr.BaseDistributedSearchTestCase.queryServer(BaseDistributedSearchTestCase.java:543) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:591) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:573) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:552) at org.apache.solr.handler.component.DistributedMLTComponentTest.test(DistributedMLTComponentTest.java:126) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:945) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:920) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
Re: Commits for 4.10.4
Woops, sorry, I'll fix ... trying to prepare for 4.10.4 release. Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:44 PM, Anshum Gupta ans...@anshumgupta.net wrote: Hi Mike, I think your commits for 4.10.4 broke ant precommit. https://svn.apache.org/viewvc?view=revision&revision=1662742 https://svn.apache.org/viewvc?view=revision&revision=1662746 https://svn.apache.org/viewvc?view=revision&revision=1662750 lucene_solr_4_10/solr/build.xml:240: Some example solrconfig.xml files do not refer to the correct luceneMatchVersion: 4.10.4 -- Anshum Gupta http://about.me/anshumgupta
Re: Welcome Anshum Gupta to the PMC
Welcome and congratulations Anshum! On Fri, Feb 27, 2015 at 6:41 PM, Steve Rowe sar...@gmail.com wrote: I'm pleased to announce that Anshum Gupta has accepted the PMC’s invitation to join. Welcome Anshum! Steve -- Adrien
[jira] [Updated] (SOLR-7174) Can't index a directory of files using DIH with BinFileDataSource and TikaEntityProcessor
[ https://issues.apache.org/jira/browse/SOLR-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Taylor updated SOLR-7174: -- Summary: Can't index a directory of files using DIH with BinFileDataSource and TikaEntityProcessor (was: DIH using BinFileDataSource, FileListEntityProcessor and TikaEntityProcessor only reads first document) Can't index a directory of files using DIH with BinFileDataSource and TikaEntityProcessor - Key: SOLR-7174 URL: https://issues.apache.org/jira/browse/SOLR-7174 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 5.0 Environment: Windows 7. Ubuntu 14.04. Reporter: Gary Taylor Labels: dataimportHandler, tika, text-extraction
Downloaded Solr 5.0.0, on a Windows 7 PC. I ran solr start and then solr create -c hn2 to create a new core. I want to index a load of epub files that I've got in a directory. So I created a data-import.xml (in solr\hn2\conf):

<dataConfig>
  <dataSource type="BinFileDataSource" name="bin" />
  <document>
    <entity name="files" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor" baseDir="c:/Users/gt/Documents/epub"
            fileName=".*epub" onError="skip" recursive="true">
      <field column="fileAbsolutePath" name="id" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastModified" />
      <entity name="documentImport" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text" dataSource="bin" onError="skip">
        <field column="file" name="fileName"/>
        <field column="Author" name="author" meta="true"/>
        <field column="title" name="title" meta="true"/>
        <field column="text" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>

In my solrconfig.xml, I added a requestHandler entry to reference my data-import.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import.xml</str>
  </lst>
</requestHandler>

I renamed managed-schema to schema.xml, and ensured the following doc fields were set up:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="fileName" type="string" indexed="true" stored="true" />
<field name="author" type="string" indexed="true" stored="true" />
<field name="title" type="string" indexed="true" stored="true" />
<field name="size" type="long" indexed="true" stored="true" />
<field name="lastModified" type="date" indexed="true" stored="true" />
<field name="content" type="text_en" indexed="false" stored="true" multiValued="false"/>
<field name="text" type="text_en" indexed="true" stored="false" multiValued="true"/>
<copyField source="content" dest="text"/>

I copied all the jars from dist and contrib\* into server\solr\lib. Stopping and restarting solr then creates a new managed-schema file and renames schema.xml to schema.xml.back. All good so far. Now I go to the web admin for dataimport (http://localhost:8983/solr/#/hn2/dataimport//dataimport) and try to execute a full import. But the results show Requests: 0, Fetched: 58, Skipped: 0, Processed: 1 - i.e. it only adds one document (the very first one) even though it's iterated over 58! No errors are reported in the logs. I can repeat this on Ubuntu 14.04 using the same steps, so it's not Windows specific. If I change the data-import.xml to use FileDataSource and PlainTextEntityProcessor and parse txt files, e.g.:

<dataConfig>
  <dataSource type="FileDataSource" name="bin" />
  <document>
    <entity name="files" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor" baseDir="c:/Users/gt/Documents/epub"
            fileName=".*txt">
      <field column="fileAbsolutePath" name="id" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastModified" />
      <entity name="documentImport" processor="PlainTextEntityProcessor"
              url="${files.fileAbsolutePath}" format="text" dataSource="bin">
        <field column="plainText" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>

This works. So it's the combination of BinFileDataSource and TikaEntityProcessor that is failing. On Windows, I ran Process Monitor, and spotted that only the very first epub file is actually being read (repeatedly).
With verbose and debug on when running the DIH, I get the following response: verbose-output: [
[jira] [Updated] (LUCENE-6303) CachingWrapperFilter -> CachingWrapperQuery
[ https://issues.apache.org/jira/browse/LUCENE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-6303: - Attachment: LUCENE-6303.patch Here is a patch which:
- replaces CachingWrapperFilter with CachingWrapperQuery
- replaces FilterCache with QueryCache and caches weights instead of filters
- removes DocIdSet.isCacheable since this method is not used anymore
- adds built-in query caching to IndexSearcher (enabled by default): weights in the query tree that do not need scores are cached.
CachingWrapperFilter -> CachingWrapperQuery --- Key: LUCENE-6303 URL: https://issues.apache.org/jira/browse/LUCENE-6303 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-6303.patch As part of the filter -> query migration, we should migrate the caching wrappers (including the filter cache). I think the behaviour should be to delegate to the wrapped query when scores are needed and cache otherwise like CachingWrapperFilter does today. Also the cache should ignore query boosts so that field:value^2 and field:value^3 are considered equal if scores are not needed.
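The boost-ignoring behaviour described in the issue can be sketched abstractly (made-up types, not Lucene's real cache classes): the cache key is built from field and value only, so field:value^2 and field:value^3 share one cached entry when scores are not needed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hedged sketch of a boost-insensitive cache key (hypothetical classes).
public class BoostBlindCache {
    static class BoostedTerm {            // minimal stand-in for a boosted term query
        final String field, value;
        final float boost;
        BoostedTerm(String field, String value, float boost) {
            this.field = field; this.value = value; this.boost = boost;
        }
    }

    private final Map<String, Set<Integer>> cache = new HashMap<>();

    // Only field and value participate in the key; boost is deliberately ignored,
    // since boosts cannot change which documents match.
    static String key(BoostedTerm q) { return q.field + ":" + q.value; }

    Set<Integer> docIdSet(BoostedTerm q, Supplier<Set<Integer>> compute) {
        return cache.computeIfAbsent(key(q), k -> compute.get());
    }

    int size() { return cache.size(); }

    public static void main(String[] args) {
        BoostBlindCache c = new BoostBlindCache();
        c.docIdSet(new BoostedTerm("f", "v", 2f), () -> Set.of(1, 2));
        c.docIdSet(new BoostedTerm("f", "v", 3f), () -> Set.of(1, 2));
        System.out.println(c.size());     // boosts differ, cache entry is shared
    }
}
```

The design point is that a boost only scales scores; when the consumer does not need scores, two queries differing only in boost produce identical doc-id sets, so caching them separately would waste memory.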
[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340376#comment-14340376 ] ASF subversion and git services commented on LUCENE-6294: - Commit 1662751 from [~jpountz] in branch 'dev/trunk' [ https://svn.apache.org/r1662751 ] LUCENE-6294: Generalize how IndexSearcher parallelizes collection execution. Generalize how IndexSearcher parallelizes collection execution -- Key: LUCENE-6294 URL: https://issues.apache.org/jira/browse/LUCENE-6294 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Attachments: LUCENE-6294.patch IndexSearcher takes an ExecutorService that can be used to parallelize collection execution. This is useful if you want to trade throughput for latency. However, this executor service will only be used if you search for top docs. In that case, we will create one collector per slide and call TopDocs.merge in the end. If you use search(Query, Collector), the executor service will never be used. But there are other collectors that could work the same way as top docs collectors, eg. TotalHitCountCollector. And maybe also some of our users' collectors. So maybe IndexSearcher could expose a generic way to take advantage of the executor service? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7151) SolrClient.query() methods should throw IOException
[ https://issues.apache.org/jira/browse/SOLR-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340071#comment-14340071 ] ASF subversion and git services commented on SOLR-7151: --- Commit 1662670 from [~romseygeek] in branch 'dev/trunk' [ https://svn.apache.org/r1662670 ] SOLR-7151: SolrClient query methods throw IOException SolrClient.query() methods should throw IOException --- Key: SOLR-7151 URL: https://issues.apache.org/jira/browse/SOLR-7151 Project: Solr Issue Type: Bug Components: SolrJ Reporter: Alan Woodward Assignee: Alan Woodward Priority: Minor Fix For: Trunk, 5.1 Attachments: SOLR-7151.patch All the methods on SolrClient are declared as throwing SolrServerException (thrown if there's an error somewhere on the server), and IOException (thrown if there's a communication error), except for the QueryRequest methods. These swallow up IOException and repackage them in a SolrServerException. I think these are useful distinctions to make (you might want to retry on an IOException, but not on a SolrServerException), and we should make the query methods fall in line with the others. I'm not sure if this should go into 5.x as well as trunk, as it's a backwards-breaking change. I'm leaning towards yes, as it's a sufficiently useful API change that it's worth the break, but I'm not going to insist on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
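The retry distinction motivating this change can be sketched with plain Java (simulated exception types; SolrJ's real classes are not used here): transient IOExceptions are retried, while server-side errors are rethrown immediately because retrying them will not help.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch of the policy the issue enables: IOException signals a transport
// problem worth retrying; a server-side error (simulated here) does not.
public class RetryOnIO {
    static class ServerException extends Exception {
        ServerException(String m) { super(m); }
    }

    // Assumes maxAttempts >= 1; rethrows the last IOException if all attempts fail.
    static <T> T queryWithRetry(Callable<T> query, int maxAttempts) throws Exception {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return query.call();
            } catch (IOException e) {
                last = e;            // transient: worth another attempt
            } catch (ServerException e) {
                throw e;             // server rejected the request: fail fast
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = queryWithRetry(() -> {
            if (calls[0]++ == 0) throw new IOException("connection reset");
            return "ok";
        }, 3);
        System.out.println(result);  // first attempt fails, second succeeds
    }
}
```

When IOException is swallowed and repackaged in SolrServerException, as the pre-patch query methods did, the caller cannot make this distinction without inspecting the cause chain.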
[jira] [Resolved] (SOLR-7151) SolrClient.query() methods should throw IOException
[ https://issues.apache.org/jira/browse/SOLR-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward resolved SOLR-7151. - Resolution: Fixed SolrClient.query() methods should throw IOException --- Key: SOLR-7151 URL: https://issues.apache.org/jira/browse/SOLR-7151 Project: Solr Issue Type: Bug Components: SolrJ Reporter: Alan Woodward Assignee: Alan Woodward Priority: Minor Fix For: Trunk, 5.1 Attachments: SOLR-7151.patch All the methods on SolrClient are declared as throwing SolrServerException (thrown if there's an error somewhere on the server), and IOException (thrown if there's a communication error), except for the QueryRequest methods. These swallow up IOException and repackage them in a SolrServerException. I think these are useful distinctions to make (you might want to retry on an IOException, but not on a SolrServerException), and we should make the query methods fall in line with the others. I'm not sure if this should go into 5.x as well as trunk, as it's a backwards-breaking change. I'm leaning towards yes, as it's a sufficiently useful API change that it's worth the break, but I'm not going to insist on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7151) SolrClient.query() methods should throw IOException
[ https://issues.apache.org/jira/browse/SOLR-7151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340085#comment-14340085 ] ASF subversion and git services commented on SOLR-7151: --- Commit 1662672 from [~romseygeek] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1662672 ] SOLR-7151: CHANGES.txt attribution SolrClient.query() methods should throw IOException --- Key: SOLR-7151 URL: https://issues.apache.org/jira/browse/SOLR-7151 Project: Solr Issue Type: Bug Components: SolrJ Reporter: Alan Woodward Assignee: Alan Woodward Priority: Minor Fix For: Trunk, 5.1 Attachments: SOLR-7151.patch All the methods on SolrClient are declared as throwing SolrServerException (thrown if there's an error somewhere on the server), and IOException (thrown if there's a communication error), except for the QueryRequest methods. These swallow up IOException and repackage them in a SolrServerException. I think these are useful distinctions to make (you might want to retry on an IOException, but not on a SolrServerException), and we should make the query methods fall in line with the others. I'm not sure if this should go into 5.x as well as trunk, as it's a backwards-breaking change. I'm leaning towards yes, as it's a sufficiently useful API change that it's worth the break, but I'm not going to insist on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7168) TestSolrConfigHandler test failure: Could not remove the following files
[ https://issues.apache.org/jira/browse/SOLR-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya updated SOLR-7168: --- Attachment: SOLR-7168.patch Even after this fix, this test was still failing on Windows. http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4508/testReport/junit/junit.framework/TestSuite/org_apache_solr_core_TestSolrConfigHandler/ Attached a patch to fix this failure. Putting the OutputStream into a try with resources (instead of org.apache.commons.io.IOUtils.closeQuietly()) and replacing FileUtils.sync() to oal.util.IOUtils.fsync() fixes the failure. Just a thought, since the FileUtils.sync() mentions it has been copied from FSDirectory.fsync() and that FSDirectory.fsync() itself is now just a wrapper to oal.util.IOUtils.fsync(), shouldn't we just remove FileUtils.sync() altogether? It seems to be referred to from SnapPuller and ManagedIndexSchema. TestSolrConfigHandler Test failure :Could not remove the following files - Key: SOLR-7168 URL: https://issues.apache.org/jira/browse/SOLR-7168 Project: Solr Issue Type: Bug Environment: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4508/ {noformat} Stack Trace: java.io.IOException: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 7373A4FF3B396841-001\tempDir-010\collection1\conf\params.json: {noformat} Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Fix For: Trunk, 5.1 Attachments: SOLR-7168.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
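The pattern described in the patch (try-with-resources plus an explicit fsync, instead of closeQuietly) can be shown with the JDK alone; FileChannel.force plays the role of oal.util.IOUtils.fsync in this sketch.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Stdlib-only sketch: write, fsync, and guarantee the handle is released
// via try-with-resources, so Windows can later delete the file.
public class SyncedWrite {
    static void writeSynced(Path path, String content) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ch.write(ByteBuffer.wrap(content.getBytes(StandardCharsets.UTF_8)));
            ch.force(true);    // flush data and metadata to disk, like an fsync
        }                      // channel closed here even if write/force throws
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("params", ".json");
        writeSynced(p, "{\"known\":{}}");
        System.out.println(Files.readString(p));
        Files.delete(p);       // succeeds because no handle is left open
    }
}
```

closeQuietly-style cleanup can silently leave a channel open when an earlier call throws; try-with-resources closes it on every path, which is exactly what the "could not remove files" failure on Windows needed.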
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2686 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2686/ 5 tests failed. FAILED: org.apache.solr.cloud.HttpPartitionTest.test Error Message: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:33429/c8n_1x2_shard1_replica2 Stack Trace: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:33429/c8n_1x2_shard1_replica2 at __randomizedtesting.SeedInfo.seed([1E0A66FEB87D15D4:965E59241681782C]:0) at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:597) at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:918) at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:809) at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:752) at org.apache.solr.cloud.HttpPartitionTest.doSendDoc(HttpPartitionTest.java:484) at org.apache.solr.cloud.HttpPartitionTest.sendDoc(HttpPartitionTest.java:501) at org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:193) at org.apache.solr.cloud.HttpPartitionTest.test(HttpPartitionTest.java:106) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:945) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:920) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Resolved] (LUCENE-6001) DrillSideways throws NullPointerException for some searches
[ https://issues.apache.org/jira/browse/LUCENE-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-6001. Resolution: Fixed Fix Version/s: 5.1 Trunk

Thank you Dragan and jane!

DrillSideways throws NullPointerException for some searches
Key: LUCENE-6001 URL: https://issues.apache.org/jira/browse/LUCENE-6001 Project: Lucene - Core Issue Type: Bug Components: modules/facet Affects Versions: 4.10.1 Reporter: Dragan Jotanovic Priority: Blocker Fix For: 4.10.4, Trunk, 5.1 Attachments: LUCENE-6001.patch

For some DrillSideways searches I get a NullPointerException. I have tracked the problem to the DrillSidewaysScorer class, on line 126 in DrillSidewaysScorer.java: long baseQueryCost = baseScorer.cost(); On some of my index segments, this call throws a NullPointerException. baseScorer is an instance of ReqExclScorer. In ReqExclScorer.java: public long cost() { return reqScorer.cost(); } throws a NullPointerException because reqScorer is null.
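The failure mode above is a wrapper whose cost() blindly forwards to a delegate that can legitimately be null. A toy delegation chain reproduces the mechanism; the null guard shown is only an illustration of why the call site NPEs, not the actual Lucene fix (the classes here are hypothetical stand-ins):

```java
// Minimal illustration of the delegation NPE described in the report:
// a scorer-like wrapper whose cost() forwards to a possibly-null delegate.
// Hypothetical types, not Lucene's; the real fix changes how/when the
// scorer is constructed rather than adding this guard.
public class CostDelegation {
    interface Scorer { long cost(); }

    static class ReqExcl implements Scorer {
        private final Scorer req;  // may be null in some states
        ReqExcl(Scorer req) { this.req = req; }
        public long cost() {
            // an unguarded 'return req.cost();' is exactly the reported NPE
            return req == null ? 0 : req.cost();
        }
    }

    public static void main(String[] args) {
        System.out.println(new ReqExcl(null).cost());  // 0, not an NPE
        System.out.println(new ReqExcl(() -> 42L).cost());  // 42
    }
}
```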
[jira] [Commented] (SOLR-7124) Add delconfig command to zkcli
[ https://issues.apache.org/jira/browse/SOLR-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340205#comment-14340205 ] Shawn Heisey commented on SOLR-7124:

bq. We should also keep in mind that a user can replace existing configs for a collection and call collection RELOAD. So adding this check can prevent one from easily replacing configs. Thoughts?

The check to see whether the config is in use would only be required on delconfig. It would not be required on upconfig, because upconfig will not delete any files from an existing config; it will only add or overwrite. A slightly different check (making sure the config actually exists) *might* need to be performed for linkconfig. I know it is possible to link a config to a collection that doesn't exist yet, so that when it is created it will already have the correct config. It is probably also possible to make that link to a config that doesn't exist, but I think we should prevent that.

Add delconfig command to zkcli
Key: SOLR-7124 URL: https://issues.apache.org/jira/browse/SOLR-7124 Project: Solr Issue Type: New Feature Components: SolrCloud Affects Versions: 5.0 Reporter: Shawn Heisey Priority: Minor Fix For: 5.1

As far as I know, there is no functionality included with Solr that can delete a SolrCloud config in zookeeper. A delconfig command should be added to ZkCli and the zkcli script that can accomplish this. It should refuse to delete a config that is in use by any current collection.
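The in-use check being discussed can be sketched against a plain map standing in for the collection-to-configName linkage that lives in ZooKeeper. This is a hypothetical helper showing the shape of the policy, not any real zkcli API:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the delconfig safety check: refuse to delete a config that
// any collection still points at. The Map stands in for the
// collection -> configName linkage read from ZooKeeper; all names and
// methods here are illustrative, not Solr's.
public class DelConfigCheck {
    public static Set<String> collectionsUsing(Map<String, String> collectionToConfig,
                                               String configName) {
        Set<String> users = new HashSet<>();
        for (Map.Entry<String, String> e : collectionToConfig.entrySet()) {
            if (configName.equals(e.getValue())) users.add(e.getKey());
        }
        return users;
    }

    public static void delConfig(Map<String, String> collectionToConfig, String configName) {
        Set<String> users = collectionsUsing(collectionToConfig, configName);
        if (!users.isEmpty()) {
            throw new IllegalStateException(
                "Config '" + configName + "' is in use by collections " + users);
        }
        // ... only now actually remove /configs/<configName> from ZooKeeper ...
    }

    public static void main(String[] args) {
        Map<String, String> links = Map.of("collection1", "myconf");
        System.out.println(collectionsUsing(links, "myconf"));  // contains collection1
    }
}
```

Note that, as the comment says, upconfig needs no such check (it only adds or overwrites files), while linkconfig might want the inverse check: that the target config actually exists.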
[jira] [Commented] (LUCENE-6306) Merging of doc values, norms is not abortable
[ https://issues.apache.org/jira/browse/LUCENE-6306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340215#comment-14340215 ] Robert Muir commented on LUCENE-6306: +1

Merging of doc values, norms is not abortable
Key: LUCENE-6306 URL: https://issues.apache.org/jira/browse/LUCENE-6306 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Priority: Blocker Fix For: 4.10.4 Attachments: LUCENE-6306.patch

When you call IW.rollback, IW asks all running merges to abort, and the merges should periodically check their abort flags (it's a cooperative mechanism, like thread interruption in Java). In 5.x/trunk we have a nice clean solution where the Directory checks the abort bit during writes, so the codec doesn't have to bother with this. But in 4.x, we have to call MergeState.checkAbort.work, and I noticed that neither DVs nor norms call this. Typically this is not a problem since merging DVs and norms is usually fast, but for a very large merge / very many DVs and norm'd fields, it could take non-trivial time to merge.
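The cooperative-abort pattern described above (the merge loop periodically checking a flag that rollback sets, in the spirit of MergeState.checkAbort.work) can be sketched with plain JDK types. This illustrates the mechanism only; it is not Lucene's code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Cooperative abort: nothing is forcibly interrupted; the worker checks
// a shared flag every N units of work and bails out itself. Omitting the
// periodic check (as 4.x DV/norms merging did) makes the merge unabortable.
public class CooperativeMerge {
    static class MergeAbortedException extends RuntimeException {}

    private final AtomicBoolean aborted = new AtomicBoolean(false);

    public void abort() { aborted.set(true); }  // what rollback would call

    private void checkAbort() {
        if (aborted.get()) throw new MergeAbortedException();
    }

    /** Merges totalItems entries, checking the abort flag every checkEvery items. */
    public int mergeValues(int totalItems, int checkEvery) {
        int done = 0;
        for (int i = 0; i < totalItems; i++) {
            // ... merge one doc-value / norm entry here ...
            done++;
            if (done % checkEvery == 0) checkAbort();  // the call the bug adds
        }
        return done;
    }

    public static void main(String[] args) {
        CooperativeMerge merge = new CooperativeMerge();
        System.out.println(merge.mergeValues(100, 10));  // 100: ran to completion
    }
}
```

The checkEvery granularity is the trade-off the issue alludes to: checking too often adds overhead to fast merges, while never checking makes a large merge block rollback indefinitely.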
[jira] [Commented] (SOLR-7168) TestSolrConfigHandler Test failure :Could not remove the following files
[ https://issues.apache.org/jira/browse/SOLR-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340109#comment-14340109 ] ASF subversion and git services commented on SOLR-7168:

Commit 1662677 from [~noble.paul] in branch 'dev/trunk' [ https://svn.apache.org/r1662677 ] SOLR-7168: Test failure :Could not remove the files in windows

TestSolrConfigHandler Test failure :Could not remove the following files
Key: SOLR-7168 URL: https://issues.apache.org/jira/browse/SOLR-7168 Project: Solr Issue Type: Bug Environment: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4508/ {noformat} Stack Trace: java.io.IOException: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 7373A4FF3B396841-001\tempDir-010\collection1\conf\params.json: {noformat} Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Fix For: Trunk, 5.1 Attachments: SOLR-7168.patch
[JENKINS] Lucene-Solr-5.x-Windows (32bit/jdk1.7.0_76) - Build # 4402 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Windows/4402/ Java: 32bit/jdk1.7.0_76 -server -XX:+UseConcMarkSweepGC 2 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.core.TestSolrConfigHandler Error Message: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1\conf\params.json: java.nio.file.FileSystemException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1\conf\params.json: The process cannot access the file because it is being used by another process. C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1\conf: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1\conf C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001 Stack Trace: java.io.IOException: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1\conf\params.json: java.nio.file.FileSystemException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1\conf\params.json: The process cannot access the file because it is being used by another process. 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1\conf: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1\conf C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010\collection1 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 17FA3B9EECB6DA52-001: java.nio.file.DirectoryNotEmptyException:
[jira] [Commented] (LUCENE-6001) DrillSideways throws NullPointerException for some searches
[ https://issues.apache.org/jira/browse/LUCENE-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340092#comment-14340092 ] ASF subversion and git services commented on LUCENE-6001:

Commit 1662674 from [~mikemccand] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1662674 ] LUCENE-6001: DrillSideways hits NullPointerException for some BooleanQuery searches

DrillSideways throws NullPointerException for some searches
Key: LUCENE-6001 URL: https://issues.apache.org/jira/browse/LUCENE-6001 Project: Lucene - Core Issue Type: Bug Components: modules/facet Affects Versions: 4.10.1 Reporter: Dragan Jotanovic Priority: Blocker Fix For: 4.10.4 Attachments: LUCENE-6001.patch

For some DrillSideways searches I get a NullPointerException. I have tracked the problem to the DrillSidewaysScorer class, on line 126 in DrillSidewaysScorer.java: long baseQueryCost = baseScorer.cost(); On some of my index segments, this call throws a NullPointerException. baseScorer is an instance of ReqExclScorer. In ReqExclScorer.java: public long cost() { return reqScorer.cost(); } throws a NullPointerException because reqScorer is null.
[jira] [Created] (SOLR-7174) DIH using BinFileDataSource, FileListEntityProcessor and TikaEntityProcessor only reads first document
Gary Taylor created SOLR-7174:

Summary: DIH using BinFileDataSource, FileListEntityProcessor and TikaEntityProcessor only reads first document
Key: SOLR-7174 URL: https://issues.apache.org/jira/browse/SOLR-7174 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 5.0 Environment: Windows 7. Ubuntu 14.04. Reporter: Gary Taylor

Downloaded Solr 5.0.0 on a Windows 7 PC. I ran solr start and then solr create -c hn2 to create a new core. I want to index a load of epub files that I've got in a directory. So I created a data-import.xml (in solr\hn2\conf):

<dataConfig>
  <dataSource type="BinFileDataSource" name="bin" />
  <document>
    <entity name="files" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor" baseDir="c:/Users/gt/Documents/epub"
            fileName=".*epub" onError="skip" recursive="true">
      <field column="fileAbsolutePath" name="id" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastModified" />
      <entity name="documentImport" processor="TikaEntityProcessor"
              url="${files.fileAbsolutePath}" format="text" dataSource="bin" onError="skip">
        <field column="file" name="fileName" />
        <field column="Author" name="author" meta="true" />
        <field column="title" name="title" meta="true" />
        <field column="text" name="content" />
      </entity>
    </entity>
  </document>
</dataConfig>

In my solrconfig.xml, I added a requestHandler entry to reference my data-import.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import.xml</str>
  </lst>
</requestHandler>

I renamed managed-schema to schema.xml, and ensured the following doc fields were set up:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="fileName" type="string" indexed="true" stored="true" />
<field name="author" type="string" indexed="true" stored="true" />
<field name="title" type="string" indexed="true" stored="true" />
<field name="size" type="long" indexed="true" stored="true" />
<field name="lastModified" type="date" indexed="true" stored="true" />
<field name="content" type="text_en" indexed="false" stored="true" multiValued="false" />
<field name="text" type="text_en" indexed="true" stored="false" multiValued="true" />
<copyField source="content" dest="text" />

I copied all the jars from dist and contrib\* into server\solr\lib. Stopping and restarting solr then creates a new managed-schema file and renames schema.xml to schema.xml.back. All good so far.

Now I go to the web admin for dataimport (http://localhost:8983/solr/#/hn2/dataimport//dataimport) and try to execute a full import. But the results show Requests: 0, Fetched: 58, Skipped: 0, Processed: 1 - ie. it only adds one document (the very first one) even though it's iterated over 58! No errors are reported in the logs. I can repeat this on Ubuntu 14.04 using the same steps, so it's not Windows specific.

If I change the data-import.xml to use FileDataSource and PlainTextEntityProcessor and parse txt files, eg:

<dataConfig>
  <dataSource type="FileDataSource" name="bin" />
  <document>
    <entity name="files" dataSource="null" rootEntity="false"
            processor="FileListEntityProcessor" baseDir="c:/Users/gt/Documents/epub"
            fileName=".*txt">
      <field column="fileAbsolutePath" name="id" />
      <field column="fileSize" name="size" />
      <field column="fileLastModified" name="lastModified" />
      <entity name="documentImport" processor="PlainTextEntityProcessor"
              url="${files.fileAbsolutePath}" format="text" dataSource="bin">
        <field column="plainText" name="content" />
      </entity>
    </entity>
  </document>
</dataConfig>

This works. So it's the combination of BinFileDataSource and TikaEntityProcessor that is failing. On Windows, I ran Process Monitor, and spotted that only the very first epub file is actually being read (repeatedly). With verbose and debug on when running the DIH, I get the following response:

verbose-output: [ entity:files, [ null, --- row #1-, fileSize, 2609004, fileLastModified, 2015-02-25T11:37:25.217Z, fileAbsolutePath, c:\\Users\\gt\\Documents\\epub\\issue018.epub, fileDir, c:\\Users\\gt\\Documents\\epub, file, issue018.epub, null, -, entity:documentImport, [ document#1, [ query,
[JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_31) - Build # 4510 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4510/ Java: 64bit/jdk1.8.0_31 -XX:-UseCompressedOops -XX:+UseG1GC 2 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.core.TestSolrConfigHandler Error Message: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1\conf\params.json: java.nio.file.FileSystemException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1\conf\params.json: The process cannot access the file because it is being used by another process. C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1\conf: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1\conf C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001 Stack Trace: java.io.IOException: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1\conf\params.json: java.nio.file.FileSystemException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1\conf\params.json: The process cannot access the file because it is being used by another process. 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1\conf: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1\conf C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010\collection1 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 9CE4F674E83EED72-001: java.nio.file.DirectoryNotEmptyException:
[jira] [Commented] (LUCENE-6001) DrillSideways throws NullPointerException for some searches
[ https://issues.apache.org/jira/browse/LUCENE-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340090#comment-14340090 ] ASF subversion and git services commented on LUCENE-6001:

Commit 1662673 from [~mikemccand] in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1662673 ] LUCENE-6001: DrillSideways hits NullPointerException for some BooleanQuery searches

DrillSideways throws NullPointerException for some searches
Key: LUCENE-6001 URL: https://issues.apache.org/jira/browse/LUCENE-6001 Project: Lucene - Core Issue Type: Bug Components: modules/facet Affects Versions: 4.10.1 Reporter: Dragan Jotanovic Priority: Blocker Fix For: 4.10.4 Attachments: LUCENE-6001.patch

For some DrillSideways searches I get a NullPointerException. I have tracked the problem to the DrillSidewaysScorer class, on line 126 in DrillSidewaysScorer.java: long baseQueryCost = baseScorer.cost(); On some of my index segments, this call throws a NullPointerException. baseScorer is an instance of ReqExclScorer. In ReqExclScorer.java: public long cost() { return reqScorer.cost(); } throws a NullPointerException because reqScorer is null.
[jira] [Commented] (SOLR-7168) TestSolrConfigHandler Test failure :Could not remove the following files
[ https://issues.apache.org/jira/browse/SOLR-7168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340113#comment-14340113 ] ASF subversion and git services commented on SOLR-7168:

Commit 1662678 from [~noble.paul] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1662678 ] SOLR-7168: Test failure :Could not remove the files in windows

TestSolrConfigHandler Test failure :Could not remove the following files
Key: SOLR-7168 URL: https://issues.apache.org/jira/browse/SOLR-7168 Project: Solr Issue Type: Bug Environment: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4508/ {noformat} Stack Trace: java.io.IOException: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\solr\build\solr-core\test\J0\temp\solr.core.TestSolrConfigHandler 7373A4FF3B396841-001\tempDir-010\collection1\conf\params.json: {noformat} Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Fix For: Trunk, 5.1 Attachments: SOLR-7168.patch
[jira] [Updated] (SOLR-7128) Two phase distributed search is fetching extra fields in GET_TOP_IDS phase
[ https://issues.apache.org/jira/browse/SOLR-7128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-7128: Attachment: SOLR-7128-addendum.patch

Hoss prodded me privately about the duplicate field names being requested in shard requests (thanks Hoss!) so I refactored the field-modifying logic so that duplicates aren't possible. I also added more tests with no 'fl', fl=* and fl=*,score in both single-pass and regular search.

Two phase distributed search is fetching extra fields in GET_TOP_IDS phase
Key: SOLR-7128 URL: https://issues.apache.org/jira/browse/SOLR-7128 Project: Solr Issue Type: Bug Components: search Affects Versions: 4.10.2, 4.10.3 Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: Trunk, 5.1 Attachments: SOLR-7128-addendum.patch, SOLR-7128.patch, SOLR-7128.patch, SOLR-7128.patch, SOLR-7128.patch

[~pqueixalos] reported this to me privately so I am creating this issue on his behalf.

{quote} We found an issue in versions 4.10.+ (4.10.2 and 4.10.3 for sure). When processing a two-phase distributed query with an explicit fl parameter, the two phases are well processed, but GET_TOP_IDS retrieves the matching documents' fields, even if a GET_FIELDS shard request is getting executed just after. /solr/someCollectionCore?collection=someOtherCollection&q=*:*&debug=true&fl=id,title => id is retrieved during the GET_TOP_IDS phase; that's ok: it's our uniqueKeyField. => title is also retrieved during the GET_TOP_IDS phase; that's not ok. {quote}

I'm able to reproduce this. This is a pretty bad performance bug that was introduced in SOLR-5768 or its subsequent related issues. I plan to fix this bug and add substantial tests to assert such things.
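The duplicate-free field-list construction mentioned in the comment can be sketched with an insertion-ordered set: the first phase should ask only for the uniqueKey (plus score if requested), deferring everything else to GET_FIELDS. The method and its policy here are illustrative, not Solr's actual implementation:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch of first-phase 'fl' assembly for GET_TOP_IDS: fetch only the
// uniqueKey (and score if the user asked for it); a LinkedHashSet makes
// duplicate field names impossible even when the same field is added
// from several code paths. Illustrative, not Solr's code.
public class TopIdsFieldList {
    public static String firstPhaseFl(String uniqueKey, String userFl) {
        Set<String> fields = new LinkedHashSet<>();
        fields.add(uniqueKey);  // always needed to merge/deduplicate hits
        for (String f : userFl.split(",")) {
            if (f.trim().equals("score")) fields.add("score");  // cheap, ranking-relevant
            // anything else (e.g. 'title') waits for the GET_FIELDS phase
        }
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        // id requested explicitly too, title deferred, score kept, no duplicates:
        System.out.println(firstPhaseFl("id", "id,title,score"));  // id,score
    }
}
```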
[jira] [Commented] (LUCENE-6300) Remove multi-term filters
[ https://issues.apache.org/jira/browse/LUCENE-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340152#comment-14340152 ] ASF subversion and git services commented on LUCENE-6300:

Commit 1662682 from [~jpountz] in branch 'dev/trunk' [ https://svn.apache.org/r1662682 ] LUCENE-6300: Remove multi-term filters.

Remove multi-term filters
Key: LUCENE-6300 URL: https://issues.apache.org/jira/browse/LUCENE-6300 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-6300.patch, LUCENE-6300.patch

We have TermRangeFilter, NumericRangeFilter, ... that we should remove in favour of their equivalent queries (TermRangeQuery, NumericRangeQuery, ...).
[jira] [Commented] (LUCENE-6001) DrillSideways throws NullPointerException for some searches
[ https://issues.apache.org/jira/browse/LUCENE-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340146#comment-14340146 ] ASF subversion and git services commented on LUCENE-6001:

Commit 1662681 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1662681 ] LUCENE-6001: DrillSideways hits NullPointerException for some BooleanQuery searches

DrillSideways throws NullPointerException for some searches
Key: LUCENE-6001 URL: https://issues.apache.org/jira/browse/LUCENE-6001 Project: Lucene - Core Issue Type: Bug Components: modules/facet Affects Versions: 4.10.1 Reporter: Dragan Jotanovic Priority: Blocker Fix For: 4.10.4, Trunk, 5.1 Attachments: LUCENE-6001.patch

For some DrillSideways searches I get a NullPointerException. I have tracked the problem to the DrillSidewaysScorer class, on line 126 in DrillSidewaysScorer.java: long baseQueryCost = baseScorer.cost(); On some of my index segments, this call throws a NullPointerException. baseScorer is an instance of ReqExclScorer. In ReqExclScorer.java: public long cost() { return reqScorer.cost(); } throws a NullPointerException because reqScorer is null.
[jira] [Updated] (LUCENE-6307) Rename SegmentInfo.docCount - .maxDoc
[ https://issues.apache.org/jira/browse/LUCENE-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-6307: --- Attachment: LUCENE-6307.patch Simple patch just touches a lot of files ... Rename SegmentInfo.docCount - .maxDoc -- Key: LUCENE-6307 URL: https://issues.apache.org/jira/browse/LUCENE-6307 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: Trunk, 5.x Attachments: LUCENE-6307.patch We already have maxDoc and numDocs, I think it's crazy we have a 3rd one docCount. We should just rename to maxDoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta to the PMC
Welcome Anshum! Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:41 PM, Steve Rowe sar...@gmail.com wrote: I'm pleased to announce that Anshum Gupta has accepted the PMC’s invitation to join. Welcome Anshum! Steve - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (SOLR-7033) RecoveryStrategy should not publish any state when closed / cancelled.
[ https://issues.apache.org/jira/browse/SOLR-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe reopened SOLR-7033: -- Assignee: Steve Rowe (was: Mark Miller) Reopening to backport to 4.10.4 RecoveryStrategy should not publish any state when closed / cancelled. -- Key: SOLR-7033 URL: https://issues.apache.org/jira/browse/SOLR-7033 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Steve Rowe Priority: Blocker Fix For: 4.10.4, 5.0, Trunk Attachments: SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch Currently, when closed / cancelled, RecoveryStrategy can publish a recovery failed state. In a bad loop (like when no one can become leader because no one had a last state of active) this can cause very fast looped publishing of this state to zk. It's an outstanding item to improve that specific scenario anyway, but regardless, we should fix the close / cancel path to never publish any state to zk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7033) RecoveryStrategy should not publish any state when closed / cancelled.
[ https://issues.apache.org/jira/browse/SOLR-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-7033: - Fix Version/s: 4.10.4 RecoveryStrategy should not publish any state when closed / cancelled. -- Key: SOLR-7033 URL: https://issues.apache.org/jira/browse/SOLR-7033 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Steve Rowe Priority: Blocker Fix For: 4.10.4, 5.0, Trunk Attachments: SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch Currently, when closed / cancelled, RecoveryStrategy can publish a recovery failed state. In a bad loop (like when no one can become leader because no one had a last state of active) this can cause very fast looped publishing of this state to zk. It's an outstanding item to improve that specific scenario anyway, but regardless, we should fix the close / cancel path to never publish any state to zk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7033) RecoveryStrategy should not publish any state when closed / cancelled.
[ https://issues.apache.org/jira/browse/SOLR-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340556#comment-14340556 ] ASF subversion and git services commented on SOLR-7033: --- Commit 1662784 from [~steve_rowe] in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1662784 ] SOLR-7033, SOLR-5961: RecoveryStrategy should not publish any state when closed / cancelled and there should always be a pause between recoveries even when recoveries are rapidly stopped and started as well as when a node attempts to become the leader for a shard. (merged branch_5x r1658237) RecoveryStrategy should not publish any state when closed / cancelled. -- Key: SOLR-7033 URL: https://issues.apache.org/jira/browse/SOLR-7033 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Steve Rowe Priority: Blocker Fix For: 4.10.4, 5.0, Trunk Attachments: SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch Currently, when closed / cancelled, RecoveryStrategy can publish a recovery failed state. In a bad loop (like when no one can become leader because no one had a last state of active) this can cause very fast looped publishing of this state to zk. It's an outstanding item to improve that specific scenario anyway, but regardless, we should fix the close / cancel path to never publish any state to zk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-7033) RecoveryStrategy should not publish any state when closed / cancelled.
[ https://issues.apache.org/jira/browse/SOLR-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-7033. -- Resolution: Fixed Committed to lucene_solr_4_10. RecoveryStrategy should not publish any state when closed / cancelled. -- Key: SOLR-7033 URL: https://issues.apache.org/jira/browse/SOLR-7033 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Steve Rowe Priority: Blocker Fix For: 4.10.4, Trunk, 5.0 Attachments: SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch Currently, when closed / cancelled, RecoveryStrategy can publish a recovery failed state. In a bad loop (like when no one can become leader because no one had a last state of active) this can cause very fast looped publishing of this state to zk. It's an outstanding item to improve that specific scenario anyway, but regardless, we should fix the close / cancel path to never publish any state to zk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta to the PMC
Congrats! On Feb 27, 2015 9:41 AM, Steve Rowe sar...@gmail.com wrote: I'm pleased to announce that Anshum Gupta has accepted the PMC’s invitation to join. Welcome Anshum! Steve - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5961) Solr gets crazy on /overseer/queue state change
[ https://issues.apache.org/jira/browse/SOLR-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-5961. -- Resolution: Fixed Fix Version/s: 5.0 4.10.4 Fixed as part of SOLR-7033. Solr gets crazy on /overseer/queue state change --- Key: SOLR-5961 URL: https://issues.apache.org/jira/browse/SOLR-5961 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.7.1 Environment: CentOS, 1 shard - 3 replicas, ZK cluster with 3 nodes (separate machines) Reporter: Maxim Novikov Assignee: Shalin Shekhar Mangar Priority: Critical Fix For: 4.10.4, 5.0 No idea how to reproduce it, but sometimes Solr stars littering the log with the following messages: 419158 [localhost-startStop-1-EventThread] INFO org.apache.solr.cloud.DistributedQueue ? LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged 419190 [Thread-3] INFO org.apache.solr.cloud.Overseer ? Update state numShards=1 message={ operation:state, state:recovering, base_url:http://${IP_ADDRESS}/solr;, core:${CORE_NAME}, roles:null, node_name:${NODE_NAME}_solr, shard:shard1, collection:${COLLECTION_NAME}, numShards:1, core_node_name:core_node2} It continues spamming these messages with no delay and the restarting of all the nodes does not help. I have even tried to stop all the nodes in the cluster first, but then when I start one, the behavior doesn't change, it gets crazy nuts with this /overseer/queue state again. PS The only way to handle this was to stop everything, manually clean up all the data in ZooKeeper related to Solr, and then rebuild everything from scratch. As you should understand, it is kinda unbearable in the production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7033) RecoveryStrategy should not publish any state when closed / cancelled.
[ https://issues.apache.org/jira/browse/SOLR-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340493#comment-14340493 ] Steve Rowe commented on SOLR-7033: -- [~markrmil...@gmail.com], looks like your last patch on this issue was never committed? RecoveryStrategy should not publish any state when closed / cancelled. -- Key: SOLR-7033 URL: https://issues.apache.org/jira/browse/SOLR-7033 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Mark Miller Priority: Blocker Fix For: 5.0, Trunk Attachments: SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch, SOLR-7033.patch Currently, when closed / cancelled, RecoveryStrategy can publish a recovery failed state. In a bad loop (like when no one can become leader because no one had a last state of active) this can cause very fast looped publishing of this state to zk. It's an outstanding item to improve that specific scenario anyway, but regardless, we should fix the close / cancel path to never publish any state to zk. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6301) Deprecate Filter
[ https://issues.apache.org/jira/browse/LUCENE-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340510#comment-14340510 ] Adrien Grand commented on LUCENE-6301: -- We are getting closer: QueryWrapperFilter is now the last Filter impl in lucene/core. Filter would be hard to remove from trunk because there are lots of module that implement or consume filters, but we should be able to remove FilteredQuery by using a BooleanQuery with a FILTER clause instead. I will give it a try soon. Deprecate Filter Key: LUCENE-6301 URL: https://issues.apache.org/jira/browse/LUCENE-6301 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Fix For: Trunk, 5.1 It will still take time to completely remove Filter, but I think we should start deprecating it now to state our intention and encourage users to move to queries as soon as possible? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-7147) Introduce new TrackingShardHandlerFactory for monitoring what requests are sent to shards during tests
[ https://issues.apache.org/jira/browse/SOLR-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-7147. - Resolution: Fixed Introduce new TrackingShardHandlerFactory for monitoring what requests are sent to shards during tests -- Key: SOLR-7147 URL: https://issues.apache.org/jira/browse/SOLR-7147 Project: Solr Issue Type: Improvement Components: SolrCloud, Tests Reporter: Hoss Man Assignee: Shalin Shekhar Mangar Fix For: 4.10.4, Trunk, 5.1 Attachments: SOLR-7147.patch, SOLR-7147.patch, SOLR-7147.patch, SOLR-7147.patch, SOLR-7147.patch, SOLR-7147.patch this is an idea shalin proposed as part of the testing for SOLR-7128... bq. I created a TrackingShardHandlerFactory which can record shard requests sent from any node. There are a few helper methods to get requests by shard and by purpose. ... bq. I will likely move the TrackingShardHandlerFactory into its own issue because it is helpful for other distributed tests as well. I also need to decouple it from the MiniSolrCloudCluster abstraction. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5961) Solr gets crazy on /overseer/queue state change
[ https://issues.apache.org/jira/browse/SOLR-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340567#comment-14340567 ] ASF subversion and git services commented on SOLR-5961: --- Commit 1662784 from [~steve_rowe] in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1662784 ] SOLR-7033, SOLR-5961: RecoveryStrategy should not publish any state when closed / cancelled and there should always be a pause between recoveries even when recoveries are rapidly stopped and started as well as when a node attempts to become the leader for a shard. (merged branch_5x r1658237) Solr gets crazy on /overseer/queue state change --- Key: SOLR-5961 URL: https://issues.apache.org/jira/browse/SOLR-5961 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.7.1 Environment: CentOS, 1 shard - 3 replicas, ZK cluster with 3 nodes (separate machines) Reporter: Maxim Novikov Assignee: Shalin Shekhar Mangar Priority: Critical Fix For: 4.10.4, 5.0 No idea how to reproduce it, but sometimes Solr stars littering the log with the following messages: 419158 [localhost-startStop-1-EventThread] INFO org.apache.solr.cloud.DistributedQueue ? LatchChildWatcher fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged 419190 [Thread-3] INFO org.apache.solr.cloud.Overseer ? Update state numShards=1 message={ operation:state, state:recovering, base_url:http://${IP_ADDRESS}/solr;, core:${CORE_NAME}, roles:null, node_name:${NODE_NAME}_solr, shard:shard1, collection:${COLLECTION_NAME}, numShards:1, core_node_name:core_node2} It continues spamming these messages with no delay and the restarting of all the nodes does not help. I have even tried to stop all the nodes in the cluster first, but then when I start one, the behavior doesn't change, it gets crazy nuts with this /overseer/queue state again. 
PS The only way to handle this was to stop everything, manually clean up all the data in ZooKeeper related to Solr, and then rebuild everything from scratch. As you should understand, it is kinda unbearable in the production environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta to the PMC
Congrats Anshum! -Yonik On Fri, Feb 27, 2015 at 9:41 AM, Steve Rowe sar...@gmail.com wrote: I'm pleased to announce that Anshum Gupta has accepted the PMC’s invitation to join. Welcome Anshum! Steve - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Commits for 4.10.4
Sure, else I can. I have it fixed, just running the tests now. On Fri, Feb 27, 2015 at 9:50 AM, Michael McCandless luc...@mikemccandless.com wrote: Woops, sorry, I'll fix ... trying to prepare for 4.10.4 release. Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:44 PM, Anshum Gupta ans...@anshumgupta.net wrote: Hi Mike, I think your commits for 4.10.4 broke ant precommit. https://svn.apache.org/viewvc?view=revisionrevision=1662742 https://svn.apache.org/viewvc?view=revisionrevision=1662746 https://svn.apache.org/viewvc?view=revisionrevision=1662750 lucene_solr_4_10/solr/build.xml:240: Some example solrconfig.xml files do not refer to the correct luceneMatchVersion: 4.10.4 -- Anshum Gupta http://about.me/anshumgupta - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Anshum Gupta http://about.me/anshumgupta
[jira] [Commented] (SOLR-7147) Introduce new TrackingShardHandlerFactory for monitoring what requests are sent to shards during tests
[ https://issues.apache.org/jira/browse/SOLR-7147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340526#comment-14340526 ] ASF subversion and git services commented on SOLR-7147: --- Commit 1662777 from sha...@apache.org in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1662777 ] SOLR-7147: Introduce new TrackingShardHandlerFactory for monitoring what requests are sent to shards during tests Introduce new TrackingShardHandlerFactory for monitoring what requests are sent to shards during tests -- Key: SOLR-7147 URL: https://issues.apache.org/jira/browse/SOLR-7147 Project: Solr Issue Type: Improvement Components: SolrCloud, Tests Reporter: Hoss Man Assignee: Shalin Shekhar Mangar Fix For: 4.10.4, Trunk, 5.1 Attachments: SOLR-7147.patch, SOLR-7147.patch, SOLR-7147.patch, SOLR-7147.patch, SOLR-7147.patch, SOLR-7147.patch this is an idea shalin proposed as part of the testing for SOLR-7128... bq. I created a TrackingShardHandlerFactory which can record shard requests sent from any node. There are a few helper methods to get requests by shard and by purpose. ... bq. I will likely move the TrackingShardHandlerFactory into its own issue because it is helpful for other distributed tests as well. I also need to decouple it from the MiniSolrCloudCluster abstraction. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6303) CachingWrapperFilter - CachingWrapperQuery
[ https://issues.apache.org/jira/browse/LUCENE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340601#comment-14340601 ] ASF subversion and git services commented on LUCENE-6303: - Commit 1662791 from [~jpountz] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1662791 ] LUCENE-6303: CachingWrapperFilter - CachingWrapperQuery, FilterCache - QueryCache and added caching to IndexSearcher. CachingWrapperFilter - CachingWrapperQuery --- Key: LUCENE-6303 URL: https://issues.apache.org/jira/browse/LUCENE-6303 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-6303.patch As part of the filter - query migration, we should migrate the caching wrappers (including the filter cache). I think the behaviour should be to delegate to the wrapped query when scores are needed and cache otherwise like CachingWrapperFilter does today. Also the cache should ignore query boosts so that field:value^2 and field:value^3 are considered equal if scores are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (SOLR-6969) When opening an HDFSTransactionLog for append we must first attempt to recover it's lease to prevent data loss.
[ https://issues.apache.org/jira/browse/SOLR-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta reopened SOLR-6969: Reopening for backporting to 4.10.4. When opening an HDFSTransactionLog for append we must first attempt to recover it's lease to prevent data loss. --- Key: SOLR-6969 URL: https://issues.apache.org/jira/browse/SOLR-6969 Project: Solr Issue Type: Bug Components: hdfs Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 5.0, Trunk Attachments: SOLR-6969-4.10.4-backport.patch, SOLR-6969.patch, SOLR-6969.patch This can happen after a hard crash and restart. The current workaround is to stop and wait it out and start again. We should retry and wait a given amount of time as we do when we detect safe mode though. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6847) LeaderInitiatedRecoveryThread compares wrong replica's state with lirState
[ https://issues.apache.org/jira/browse/SOLR-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340617#comment-14340617 ] ASF subversion and git services commented on SOLR-6847: --- Commit 1662797 from [~steve_rowe] in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1662797 ] SOLR-6847: LeaderInitiatedRecoveryThread compares wrong replica's state with lirState (merged branch_5x r1653880) LeaderInitiatedRecoveryThread compares wrong replica's state with lirState -- Key: SOLR-6847 URL: https://issues.apache.org/jira/browse/SOLR-6847 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.10.2 Reporter: Shalin Shekhar Mangar Assignee: Steve Rowe Priority: Minor Fix For: 4.10.4, 5.0, Trunk Attachments: SOLR-6847.patch LeaderInitiatedRecoveryThread looks at a random replica to figure out if it should re-publish LIR state to down. It does however publish the LIR state for the correct replica. The bug has always been there. The thread used ZkStateReader.getReplicaProps method with the coreName to find the correct replica. However, the coreName parameter in getReplicaProps was un-used and I removed it in SOLR-6240 but I didn't find and fix this bug then. The possible side-effects of this bug would be that we may be republish LIR state multiple times and/or in rare cases, cause double 'requestrecovery' to be executed on a replica. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6969) When opening an HDFSTransactionLog for append we must first attempt to recover it's lease to prevent data loss.
[ https://issues.apache.org/jira/browse/SOLR-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-6969: --- Attachment: SOLR-6969-4.10.4-backport.patch 4.10.4 backport patch. When opening an HDFSTransactionLog for append we must first attempt to recover it's lease to prevent data loss. --- Key: SOLR-6969 URL: https://issues.apache.org/jira/browse/SOLR-6969 Project: Solr Issue Type: Bug Components: hdfs Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 5.0, Trunk Attachments: SOLR-6969-4.10.4-backport.patch, SOLR-6969.patch, SOLR-6969.patch This can happen after a hard crash and restart. The current workaround is to stop and wait it out and start again. We should retry and wait a given amount of time as we do when we detect safe mode though. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Commits for 4.10.4
No issues Mike and thanks for fixing this. P.S: Your computer is surely faster than most! On Fri, Feb 27, 2015 at 10:02 AM, Michael McCandless luc...@mikemccandless.com wrote: OK my computer was faster than yours :) Should be fixed now! Sorry for the hassle. Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:57 PM, Anshum Gupta ans...@anshumgupta.net wrote: Sure, else I can. I have it fixed, just running the tests now. On Fri, Feb 27, 2015 at 9:50 AM, Michael McCandless luc...@mikemccandless.com wrote: Woops, sorry, I'll fix ... trying to prepare for 4.10.4 release. Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:44 PM, Anshum Gupta ans...@anshumgupta.net wrote: Hi Mike, I think your commits for 4.10.4 broke ant precommit. https://svn.apache.org/viewvc?view=revisionrevision=1662742 https://svn.apache.org/viewvc?view=revisionrevision=1662746 https://svn.apache.org/viewvc?view=revisionrevision=1662750 lucene_solr_4_10/solr/build.xml:240: Some example solrconfig.xml files do not refer to the correct luceneMatchVersion: 4.10.4 -- Anshum Gupta http://about.me/anshumgupta - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Anshum Gupta http://about.me/anshumgupta - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Anshum Gupta http://about.me/anshumgupta
Re: Welcome Anshum Gupta to the PMC
Welcome Anshum! On Fri, Feb 27, 2015 at 11:11 PM, Steve Rowe sar...@gmail.com wrote: I'm pleased to announce that Anshum Gupta has accepted the PMC’s invitation to join. Welcome Anshum! Steve - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Regards, Shalin Shekhar Mangar.
[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340544#comment-14340544 ] Shikhar Bhushan commented on LUCENE-6294: - This is great. I saw some improvements when testing LUCENE-5299 with the addition of a configurable parallelism throttle at the search request level using a semaphore, that might be useful to have here too. I.e. being able to cap how many segments are concurrently searched. That can help ensure resources for concurrent search requests, or reduce context switching if using an unbounded pool. Generalize how IndexSearcher parallelizes collection execution -- Key: LUCENE-6294 URL: https://issues.apache.org/jira/browse/LUCENE-6294 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Fix For: Trunk, 5.1 Attachments: LUCENE-6294.patch IndexSearcher takes an ExecutorService that can be used to parallelize collection execution. This is useful if you want to trade throughput for latency. However, this executor service will only be used if you search for top docs. In that case, we will create one collector per slide and call TopDocs.merge in the end. If you use search(Query, Collector), the executor service will never be used. But there are other collectors that could work the same way as top docs collectors, eg. TotalHitCountCollector. And maybe also some of our users' collectors. So maybe IndexSearcher could expose a generic way to take advantage of the executor service? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-6847) LeaderInitiatedRecoveryThread compares wrong replica's state with lirState
[ https://issues.apache.org/jira/browse/SOLR-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-6847. -- Resolution: Fixed Committed to lucene_solr_4_10 LeaderInitiatedRecoveryThread compares wrong replica's state with lirState -- Key: SOLR-6847 URL: https://issues.apache.org/jira/browse/SOLR-6847 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.10.2 Reporter: Shalin Shekhar Mangar Assignee: Steve Rowe Priority: Minor Fix For: 4.10.4, Trunk, 5.0 Attachments: SOLR-6847.patch LeaderInitiatedRecoveryThread looks at a random replica to figure out if it should re-publish LIR state to down. It does however publish the LIR state for the correct replica. The bug has always been there. The thread used ZkStateReader.getReplicaProps method with the coreName to find the correct replica. However, the coreName parameter in getReplicaProps was un-used and I removed it in SOLR-6240 but I didn't find and fix this bug then. The possible side-effects of this bug would be that we may be republish LIR state multiple times and/or in rare cases, cause double 'requestrecovery' to be executed on a replica. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
[ https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lee Hinman updated LUCENE-6304: --- Attachment: LUCENE-6304.patch bq. is the hashcode/equals stuff needed here or can the superclass impls in Query be used? The hashcode is required at least, because otherwise the QueryUtils.check(q) fails because both the MatchNoDocsQuery and the superclass Query have the same hashcode, and the anonymous WhackyQuery that QueryUtils creates shares the same hash code, so QueryUtils.checkUnequal() fails. The .equals() stuff is not required though, it can use the superclass implementation. I've attached a new patch that does this. Add MatchNoDocsQuery that matches no documents -- Key: LUCENE-6304 URL: https://issues.apache.org/jira/browse/LUCENE-6304 Project: Lucene - Core Issue Type: Improvement Components: core/search Affects Versions: 5.0 Reporter: Lee Hinman Priority: Minor Attachments: LUCENE-6304.patch, LUCENE-6304.patch, LUCENE-6304.patch As a followup to LUCENE-6298, it would be nice to have an explicit MatchNoDocsQuery to indicate that no documents should be matched. This would hopefully be a better indicator than a BooleanQuery with no clauses or (even worse) null. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Commits for 4.10.4
OK my computer was faster than yours :) Should be fixed now! Sorry for the hassle. Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:57 PM, Anshum Gupta ans...@anshumgupta.net wrote: Sure, else I can. I have it fixed, just running the tests now. On Fri, Feb 27, 2015 at 9:50 AM, Michael McCandless luc...@mikemccandless.com wrote: Woops, sorry, I'll fix ... trying to prepare for 4.10.4 release. Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:44 PM, Anshum Gupta ans...@anshumgupta.net wrote: Hi Mike, I think your commits for 4.10.4 broke ant precommit. https://svn.apache.org/viewvc?view=revisionrevision=1662742 https://svn.apache.org/viewvc?view=revisionrevision=1662746 https://svn.apache.org/viewvc?view=revisionrevision=1662750 lucene_solr_4_10/solr/build.xml:240: Some example solrconfig.xml files do not refer to the correct luceneMatchVersion: 4.10.4 -- Anshum Gupta http://about.me/anshumgupta - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Anshum Gupta http://about.me/anshumgupta - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6303) CachingWrapperFilter - CachingWrapperQuery
[ https://issues.apache.org/jira/browse/LUCENE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14340486#comment-14340486 ] ASF subversion and git services commented on LUCENE-6303: - Commit 1662774 from [~jpountz] in branch 'dev/trunk' [ https://svn.apache.org/r1662774 ] LUCENE-6303: CachingWrapperFilter - CachingWrapperQuery, FilterCache - QueryCache and added caching to IndexSearcher. CachingWrapperFilter - CachingWrapperQuery --- Key: LUCENE-6303 URL: https://issues.apache.org/jira/browse/LUCENE-6303 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-6303.patch As part of the filter - query migration, we should migrate the caching wrappers (including the filter cache). I think the behaviour should be to delegate to the wrapped query when scores are needed and cache otherwise like CachingWrapperFilter does today. Also the cache should ignore query boosts so that field:value^2 and field:value^3 are considered equal if scores are not needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5299) Refactor Collector API for parallelism
[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340521#comment-14340521 ] Shikhar Bhushan commented on LUCENE-5299: - LUCENE-6294 is definitely a less intrusive approach. I think the tradeoff is that by moving the parallelization into the {{Collector}} API itself, we can make it composable and work for any arbitrary permutation of parallelizable collectors. Refactor Collector API for parallelism -- Key: LUCENE-5299 URL: https://issues.apache.org/jira/browse/LUCENE-5299 Project: Lucene - Core Issue Type: Improvement Reporter: Shikhar Bhushan Fix For: Trunk, 5.1 Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt h2. Motivation We should be able to scale up better with Solr/Lucene by utilizing multiple CPU cores, and not have to resort to scaling out by sharding (with all the associated distributed-system pitfalls) when the index size does not warrant it. Presently, IndexSearcher has an optional constructor arg for an ExecutorService, which gets used for searching in parallel for call paths where one of the TopDocCollectors is created internally. The per-atomic-reader search happens in parallel and then the TopDocs/TopFieldDocs results are merged, with locking around the merge bit. However, there are some problems with this approach: * If arbitrary Collector args come into play, we can't parallelize. Note that even if results are ultimately going to a TopDocCollector, it may be wrapped inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both. * The special-casing with parallelism baked on top does not scale; there are many Collectors that could potentially lend themselves to parallelism, and special-casing means the parallelization has to be re-implemented if a different permutation of collectors is to be used. h2.
Proposal A refactoring of collectors that allows for parallelization at the level of the collection protocol. Some requirements that should guide the implementation: * easy migration path for collectors that need to remain serial * the parallelization should be composable (when collectors wrap other collectors) * allow collectors to pick the optimal solution (e.g. there might be memory tradeoffs to be made) by advising the collector about whether a search will be parallelized, so that the serial use case is not penalized * encourage use of non-blocking constructs and lock-free parallelism; blocking is not advisable for the hot spot of a search, besides wasting pooled threads.
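The core idea of parallelizing at the collection protocol level can be sketched in plain Java, independent of Lucene's actual Collector API: each segment gets its own independent per-leaf state (so the hot per-doc loop is lock-free), and results are merged only once, after collection. The class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hedged sketch of "parallelization at the level of the collection
// protocol" from LUCENE-5299, not the real Collector API: one independent
// leaf collector per segment, no shared mutable state in the per-doc loop,
// and a single lock-free merge at the end.
final class ParallelCountSketch {
    static int parallelCount(List<int[]> segments, ExecutorService pool)
            throws Exception {
        List<Future<Integer>> futures = new ArrayList<>();
        for (int[] segment : segments) {
            // Each task owns its segment: no locking in the hot loop.
            futures.add(pool.submit(() -> segment.length));
        }
        int total = 0;
        for (Future<Integer> f : futures) {
            total += f.get();  // merge step happens once, after collection
        }
        return total;
    }
}
```

A wrapping collector (early termination, time limits) composes naturally here: it wraps each per-segment task rather than the merged result, which is the composability property the comment above argues for.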
[jira] [Commented] (LUCENE-6294) Generalize how IndexSearcher parallelizes collection execution
[ https://issues.apache.org/jira/browse/LUCENE-6294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340598#comment-14340598 ] Adrien Grand commented on LUCENE-6294: -- I think a better approach than the semaphore would be to just cap the number of slices of your searcher (see IndexSearcher.slices). Generalize how IndexSearcher parallelizes collection execution -- Key: LUCENE-6294 URL: https://issues.apache.org/jira/browse/LUCENE-6294 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Fix For: Trunk, 5.1 Attachments: LUCENE-6294.patch IndexSearcher takes an ExecutorService that can be used to parallelize collection execution. This is useful if you want to trade throughput for latency. However, this executor service will only be used if you search for top docs. In that case, we will create one collector per slice and call TopDocs.merge in the end. If you use search(Query, Collector), the executor service will never be used. But there are other collectors that could work the same way as top-docs collectors, e.g. TotalHitCountCollector. And maybe also some of our users' collectors. So maybe IndexSearcher could expose a generic way to take advantage of the executor service?
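The slice-capping idea in the comment above can be sketched as a simple partitioning step: distribute the index leaves over at most N slices, so at most N tasks are ever submitted to the executor. This is an illustration of the idea only, not IndexSearcher.slices itself; the round-robin assignment is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of "cap the number of slices of your searcher": partition
// the leaves into at most maxSlices groups, so that parallelism is bounded
// by construction instead of by a semaphore. Not Lucene's actual
// IndexSearcher.slices implementation.
final class SliceCapSketch {
    static List<List<String>> slices(List<String> leaves, int maxSlices) {
        List<List<String>> result = new ArrayList<>();
        if (leaves.isEmpty()) {
            return result;
        }
        for (int i = 0; i < Math.min(maxSlices, leaves.size()); i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < leaves.size(); i++) {
            result.get(i % result.size()).add(leaves.get(i));  // round-robin
        }
        return result;
    }
}
```

With the slice count capped, each slice is one unit of work for the executor, so no extra throttling mechanism is needed.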
Re: Welcome Anshum Gupta to the PMC
Congratulations Anshum! On Fri, Feb 27, 2015 at 10:53 AM, Yonik Seeley ysee...@gmail.com wrote: Congrats Anshum! -Yonik On Fri, Feb 27, 2015 at 9:41 AM, Steve Rowe sar...@gmail.com wrote: I'm pleased to announce that Anshum Gupta has accepted the PMC’s invitation to join. Welcome Anshum! Steve
[jira] [Reopened] (SOLR-6847) LeaderInitiatedRecoveryThread compares wrong replica's state with lirState
[ https://issues.apache.org/jira/browse/SOLR-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe reopened SOLR-6847: -- Assignee: Steve Rowe (was: Shalin Shekhar Mangar) Reopening to backport to 4.10.4 LeaderInitiatedRecoveryThread compares wrong replica's state with lirState -- Key: SOLR-6847 URL: https://issues.apache.org/jira/browse/SOLR-6847 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.10.2 Reporter: Shalin Shekhar Mangar Assignee: Steve Rowe Priority: Minor Fix For: 4.10.4, 5.0, Trunk Attachments: SOLR-6847.patch LeaderInitiatedRecoveryThread looks at a random replica to figure out if it should re-publish LIR state to 'down'. It does, however, publish the LIR state for the correct replica. The bug has always been there. The thread used the ZkStateReader.getReplicaProps method with the coreName to find the correct replica. However, the coreName parameter in getReplicaProps was unused and I removed it in SOLR-6240, but I didn't find and fix this bug then. The possible side effects of this bug are that we may republish LIR state multiple times and/or, in rare cases, cause a double 'requestrecovery' to be executed on a replica.
[jira] [Updated] (SOLR-6847) LeaderInitiatedRecoveryThread compares wrong replica's state with lirState
[ https://issues.apache.org/jira/browse/SOLR-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-6847: - Fix Version/s: 4.10.4 LeaderInitiatedRecoveryThread compares wrong replica's state with lirState -- Key: SOLR-6847 URL: https://issues.apache.org/jira/browse/SOLR-6847 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.10.2 Reporter: Shalin Shekhar Mangar Assignee: Steve Rowe Priority: Minor Fix For: 4.10.4, 5.0, Trunk Attachments: SOLR-6847.patch LeaderInitiatedRecoveryThread looks at a random replica to figure out if it should re-publish LIR state to 'down'. It does, however, publish the LIR state for the correct replica. The bug has always been there. The thread used the ZkStateReader.getReplicaProps method with the coreName to find the correct replica. However, the coreName parameter in getReplicaProps was unused and I removed it in SOLR-6240, but I didn't find and fix this bug then. The possible side effects of this bug are that we may republish LIR state multiple times and/or, in rare cases, cause a double 'requestrecovery' to be executed on a replica.
[jira] [Resolved] (LUCENE-6303) CachingWrapperFilter -> CachingWrapperQuery
[ https://issues.apache.org/jira/browse/LUCENE-6303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-6303. -- Resolution: Fixed Fix Version/s: 5.1, Trunk CachingWrapperFilter -> CachingWrapperQuery --- Key: LUCENE-6303 URL: https://issues.apache.org/jira/browse/LUCENE-6303 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Fix For: Trunk, 5.1 Attachments: LUCENE-6303.patch As part of the filter -> query migration, we should migrate the caching wrappers (including the filter cache). I think the behaviour should be to delegate to the wrapped query when scores are needed and cache otherwise, like CachingWrapperFilter does today. Also, the cache should ignore query boosts so that field:value^2 and field:value^3 are considered equal if scores are not needed.
Re: Commits for 4.10.4
Thank you for pointing it out! Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 1:14 PM, Anshum Gupta ans...@anshumgupta.net wrote: No issues Mike, and thanks for fixing this. P.S.: Your computer is surely faster than most! On Fri, Feb 27, 2015 at 10:02 AM, Michael McCandless luc...@mikemccandless.com wrote: OK, my computer was faster than yours :) Should be fixed now! Sorry for the hassle. Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:57 PM, Anshum Gupta ans...@anshumgupta.net wrote: Sure, else I can. I have it fixed, just running the tests now. On Fri, Feb 27, 2015 at 9:50 AM, Michael McCandless luc...@mikemccandless.com wrote: Woops, sorry, I'll fix ... trying to prepare for the 4.10.4 release. Mike McCandless http://blog.mikemccandless.com On Fri, Feb 27, 2015 at 12:44 PM, Anshum Gupta ans...@anshumgupta.net wrote: Hi Mike, I think your commits for 4.10.4 broke ant precommit. https://svn.apache.org/viewvc?view=revision&revision=1662742 https://svn.apache.org/viewvc?view=revision&revision=1662746 https://svn.apache.org/viewvc?view=revision&revision=1662750 lucene_solr_4_10/solr/build.xml:240: Some example solrconfig.xml files do not refer to the correct luceneMatchVersion: 4.10.4 -- Anshum Gupta http://about.me/anshumgupta
[jira] [Commented] (LUCENE-6306) Merging of doc values, norms is not abortable
[ https://issues.apache.org/jira/browse/LUCENE-6306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340274#comment-14340274 ] ASF subversion and git services commented on LUCENE-6306: - Commit 1662723 from [~mikemccand] in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1662723 ] LUCENE-6306: allow doc values and norms merging to be aborted in IW.rollback Merging of doc values, norms is not abortable - Key: LUCENE-6306 URL: https://issues.apache.org/jira/browse/LUCENE-6306 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Priority: Blocker Fix For: 4.10.4 Attachments: LUCENE-6306.patch When you call IW.rollback, IW asks all running merges to abort, and the merges should periodically check their abort flags (it's a cooperative mechanism, like thread interrupting in Java). In 5.x/trunk we have a nice clean solution where the Directory checks the abort bit during writes, so the codec doesn't have to bother with this. But in 4.x we have to call MergeState.checkAbort.work, and I noticed that neither DVs nor norms call this. Typically this is not a problem, since merging DVs and norms is usually fast, but for a very large merge / very many DV and norm'd fields, it could take non-trivial time to merge.
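The cooperative-abort mechanism described above follows a standard pattern, sketched here in plain Java. This is a hypothetical stand-in for illustration, not the real MergeState.checkAbort API: long-running merge work periodically checks a volatile flag and bails out, which is exactly the check DV and norms merging were missing in 4.x.

```java
// Hedged sketch of cooperative merge abort (LUCENE-6306), not Lucene's
// actual MergeState.checkAbort: the merging thread checks a volatile flag
// between units of work, so a rollback can stop it promptly.
final class CheckAbortSketch {
    private volatile boolean aborted = false;

    void abort() {
        aborted = true;  // called by the rolling-back thread
    }

    // Returns how many units completed before an abort was observed.
    int mergeUnits(int totalUnits) {
        int done = 0;
        for (int i = 0; i < totalUnits; i++) {
            if (aborted) {
                break;  // the periodic check that DV/norms merging lacked
            }
            done++;     // stand-in for merging one block of values
        }
        return done;
    }
}
```

Without the in-loop check, an abort requested mid-merge would only be noticed after the whole loop finished, which is the "non-trivial time" problem the issue describes for large merges.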
[jira] [Commented] (LUCENE-6304) Add MatchNoDocsQuery that matches no documents
[ https://issues.apache.org/jira/browse/LUCENE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340308#comment-14340308 ] Robert Muir commented on LUCENE-6304: - Is the hashcode/equals stuff needed here, or can the superclass impls in Query be used? They seem to already have this logic. In the tests, I would add a call to QueryUtils.check(q) to one of your MatchNoDocsQuery instances. This will do some tests on hashcode/equals. Add MatchNoDocsQuery that matches no documents -- Key: LUCENE-6304 URL: https://issues.apache.org/jira/browse/LUCENE-6304 Project: Lucene - Core Issue Type: Improvement Components: core/search Affects Versions: 5.0 Reporter: Lee Hinman Priority: Minor Attachments: LUCENE-6304.patch, LUCENE-6304.patch As a follow-up to LUCENE-6298, it would be nice to have an explicit MatchNoDocsQuery to indicate that no documents should be matched. This would hopefully be a better indicator than a BooleanQuery with no clauses or (even worse) null.
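The point about reusing superclass equals/hashCode can be sketched in plain Java: a stateless query like MatchNoDocsQuery has no per-instance fields, so class-based equality (the kind a superclass could provide) is sufficient. The class names here are hypothetical stand-ins, not Lucene's actual Query hierarchy.

```java
// Hedged sketch of Robert's suggestion on LUCENE-6304: a base class with
// class-based equals/hashCode, which a stateless subclass can inherit
// without writing its own. Not Lucene's real Query class.
class QueryBase {
    @Override
    public boolean equals(Object other) {
        // Two queries are equal iff they are the same concrete class;
        // stateless subclasses need nothing more.
        return other != null && getClass() == other.getClass();
    }

    @Override
    public int hashCode() {
        return getClass().hashCode();
    }
}

// Matches nothing, so there is no per-instance state to compare.
final class MatchNoDocsSketch extends QueryBase {
}
```

Any query subclass that does carry state (a term, a boost) would of course still need to override both methods to include that state.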
[jira] [Updated] (SOLR-7143) MoreLikeThis Query Parser does not handle multiple field names
[ https://issues.apache.org/jira/browse/SOLR-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitaliy Zhovtyuk updated SOLR-7143: --- Attachment: SOLR-7143.patch Local parameters do not support multiple-value syntax like {!mlt qf=field1 qf=field2}, but a qf list is required in MoreLikeThis. Added support for comma-separated fields: {!mlt qf=field1,field2}. Also, compared to the MLT handler, the query parser does not have any boost support on fields. This could be extended in the qf parameter syntax. MoreLikeThis Query Parser does not handle multiple field names -- Key: SOLR-7143 URL: https://issues.apache.org/jira/browse/SOLR-7143 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 5.0 Reporter: Jens Wille Attachments: SOLR-7143.patch The newly introduced MoreLikeThis Query Parser (SOLR-6248) does not return any results when supplied with multiple fields in the {{qf}} parameter. To reproduce within the techproducts example, compare: {code} curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name%7DMA147LL/A' curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=features%7DMA147LL/A' curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name,features%7DMA147LL/A' {code} The first two queries return 8 and 5 results, respectively. The third query doesn't return any results (not even the matched document).
In contrast, the MoreLikeThis Handler works as expected (accounting for the default {{mintf}} and {{mindf}} values in SimpleMLTQParser): {code} curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name&mlt.mintf=1&mlt.mindf=1' curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=features&mlt.mintf=1&mlt.mindf=1' curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name,features&mlt.mintf=1&mlt.mindf=1' {code} After adding the following line to {{example/techproducts/solr/techproducts/conf/solrconfig.xml}}: {code:language=XML} <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" /> {code} the first two queries return 7 and 4 results, respectively (excluding the matched document). The third query returns 7 results, as one would expect.
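The comma-separated qf support described in the patch comment can be sketched as a simple splitting step. This is an illustration of the idea only; the method name and trimming behavior are assumptions, not the actual SOLR-7143 patch code.

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the SOLR-7143 idea: since local params can't repeat qf
// ({!mlt qf=field1 qf=field2} is not supported), accept a comma-separated
// value and split it into individual field names.
final class QfSplitSketch {
    static List<String> parseQf(String qf) {
        List<String> fields = new ArrayList<>();
        for (String field : qf.split(",")) {
            String trimmed = field.trim();
            if (!trimmed.isEmpty()) {
                fields.add(trimmed);  // skip empty segments like "a,,b"
            }
        }
        return fields;
    }
}
```

With this parsing, {!mlt qf=name,features} yields two fields for the MoreLikeThis query instead of one literal field name "name,features", which is what caused the empty result set above.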